Progress toward new books…

by CJ | Oct 15, 2012 | Journal | 18 comments

I’ve gotten the Foreigner short story through the html check, and gotten as far as ePub conversion. Only mobi and pdf yet to go…Jane’s going to show me how to do that.
I’ve gotten through 98% of the html check for the Yvgenie body text. One little unmatched pointy-bracket and the whole text changes fonts. Finding the pointy-bracket at fault in a 120,000-word ms takes some hair-pulling, especially if it occurs at several points. Plus, html has more than one way to give a command, but only one of those ways works well in a text conversion through various formats, so a future problem won’t show up in Preview, only by eyeballing the code. As an example, pointy-bracket i will give you italics in html, but won’t work well in conversion: you have to have it as a ‘statement’ that’s a whole line long, i.e., the long command that involves a font appearance change. Then you find silly things like a command invoking red ink and bold text, followed by two pointy-bracket slash spans that cut it off and make it not display. So that wouldn’t show in Preview either, but it could cause future troubles in conversion.
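For readers who want to see the italics example in code, here is a minimal Python sketch. It assumes the "long command" is a span with an inline font-style, which is a guess at the markup rather than the exact form used for these books. The function names are hypothetical. It also flags spans that open and close with nothing between them, in the spirit of the red-ink-and-bold command that never displays.

```python
import re

# Assumed long form of the italics command; the exact markup that converts
# best is not given in the post, so treat this string as a placeholder.
ITALIC_OPEN = '<span style="font-style: italic;">'


def expand_italics(html):
    """Replace short <i>...</i> tags with the long span form."""
    # "<i\s*>" matches <i> but not <img ...>, because "i" must be followed by ">".
    html = re.sub(r"<i\s*>", ITALIC_OPEN, html, flags=re.IGNORECASE)
    html = re.sub(r"</i\s*>", "</span>", html, flags=re.IGNORECASE)
    return html


def find_empty_spans(html):
    """Return spans that close immediately, so their formatting never displays."""
    return re.findall(r"<span\b[^>]*>\s*</span>", html, flags=re.IGNORECASE)


sample = 'She said <i>no</i>.<span style="color: red; font-weight: bold;"></span>'
print(expand_italics(sample))
print(find_empty_spans(sample))
```

A pass like this only reports or rewrites what it is told to look for, so eyeballing the code afterward is still worthwhile.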
I’ll finish that this evening and start working with the image placement and front-matter html.

All of which is to say—we’re getting there. And I’m working on the current novel during the day. The other stuff is after supper.

18 Comments

paul on October 15, 2012 at 9:46 am

Yeah, that’s what happens. 😉 🙂 But the mistake is usually near where the font change initially happens.

I know your dismay. Years ago I once tried the “Save as HTML” option from Word something or other. It put all the “end tags” in the same order as the “start tags”, not the proper inside-out reverse order. I decided it would just be easier to learn HTML and hand-coded everything after that with just a regular old text editor.
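The "inside-out reverse order" rule is easy to check mechanically. Below is a rough Python sketch using the standard library's `html.parser`. The class and function names are made up for illustration, and the list of tags that never close is deliberately short. It keeps a stack of open tags and complains when a closing tag does not match the most recently opened one.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag (a partial list, for illustration only).
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


class NestingChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []     # (tag, line) for every tag that is still open
        self.problems = []  # human-readable messages, in document order

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append((tag, self.getpos()[0]))

    def handle_startendtag(self, tag, attrs):
        pass  # self-closed tags such as <br/> open and close at once

    def handle_endtag(self, tag):
        line = self.getpos()[0]
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()  # proper inside-out order
        elif self.stack:
            open_tag, open_line = self.stack[-1]
            self.problems.append(
                f"line {line}: </{tag}> closes while <{open_tag}> "
                f"from line {open_line} is still open"
            )
        else:
            self.problems.append(f"line {line}: </{tag}> has no opening tag")


def check_nesting(html):
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    for tag, line in checker.stack:
        checker.problems.append(f"line {line}: <{tag}> is never closed")
    return checker.problems


print(check_nesting("<b><i>text</b></i>"))   # Word-style same-order closing
print(check_nesting("<b><i>text</i></b>"))   # proper reverse order
```

After the first mismatch the later messages tend to cascade, so the first one reported is the one to look at.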
Apf on October 15, 2012 at 10:32 am

I remember the good old days when we thought SMLs like HTML were going to make things easier and make such problems a thing of the past. In much the same way that computers delivered on the paperless office….

Good to hear that you are making progress. And yay Foreigner short…. (does happy dance)
dhawktx on October 15, 2012 at 12:59 pm

Yes, Word is heinous in its abuse of HTML code. They even suggest in Calibre to save it in Open Office ODT instead before converting to epub!
smartcat on October 15, 2012 at 1:10 pm

Looking forward to all this. Foreigner story YUM! I’ve held off rereading the Rusalkas until you have it all up.

It must feel good to be back in writing, creative mode.

Completely Off Subject: It’s the 107th anniversary of Winsor McCay’s Little Nemo In Slumberland. Have you Googled today?
- WOL on October 16, 2012 at 5:35 am
  
  I’m with you, Smartcat. I’m holding off on reading the Rusalkas until I have all three. I like to read the whole shebang from start to finish whenever possible. When I’ve had to read a trilogy (or a fourlogy or fivelogy!) piecemeal, I usually wait a while and then read them again back to back.
CJ on October 15, 2012 at 1:22 pm

One thing you can say about html—it has a logic, once you ‘get’ it, that makes sense. It’s futzy, but it’s reassuringly basic. The reason for the hairpulling is trying to ferret out a missing less-than symbol in a nest of TOC (table of contents) commands, which is a Medusa’s-hairdo of parens and formatting commands. But nay, it was not there, it was at the bottom of that, and then repeated at every chapter heading through the text.

I won’t touch Word if I can help it. I wonder that, in this day of e-script and readers, they do not have a ‘clean up my code’ toggle that lets you output clean html, instead of the incredibly messy pages and pages of formatting they use…. You can translate to Open Office, or you can use Word Perfect. WP produces very clean code, and, one of the most valuable things, you can ‘reveal code’ to see what’s buried but not manifesting (fossils, in other words), and you can search on more criteria. I use Word Perfect for the text, NAMO for the html editor, and Calibre for the initial conversion.
- A.Beth on October 16, 2012 at 5:53 pm
  
  Instead of looking in Preview, have you tacked on a .html to the end and tried opening in a web-browser? I recently had to futz with some complex HTML and that helped me find rogue unclosed formatting. (Though at one point I could not find the dadgum error, and wound up just stuffing some close-codes in until it went away. *sigh* I have no idea if that would work when making an epub; it did work for Amazon’s own Kindle-izer conversion from HTML to mobi, though.)
  
  It’s entirely possible that if Preview can’t see the oddness — or if you’re using a different version of Preview that isn’t the Mac one! — neither will a browser, but it should work on simpler stuff, at least.
- BlueCatShip on October 18, 2012 at 11:14 pm
  
  I have often wondered lately why there isn’t a word processor that uses epub or html + css as its basic format, instead of something proprietary.
  
  I devoutly wish for nice, clean html5+css3 output from a word processor. Devoutly. I will try the .odt to .html or .epub route. Anything that will give me less cleanup of spaghetti junk code will help. (Thanks, dhawktx, for the suggestion!)
  
  I use LibreOffice these days for my word processor. But I’ve been saving to Word .doc format for compatibility. LibreOffice does not do .doc to .html well, and its output to .xhtml is all squeezed together, no tidy human-readable coding style with indents and spacing. But LibreOffice works, as a word processor, pretty well for most things.
  
  I have been using CoffeeCup.com’s HTML Editor for over a year now, perhaps two. They update frequently, and they have a forum for users. I get the feeling they are a small- to medium-sized company focused on their work (programming) and their customers (geeky coder types *and* rank beginners). The program is good, but not perfect. The current version and my erstwhile Adobe CS4 Suite (ugh) argue over “ownership” (the ability to open by default) of some file types, and something is causing a bug or two in the CoffeeCup install on my laptop. But their programs are very affordable, their upgrades are free, and they keep improving them. (I think the current trouble is from growing pains or from some oddity with my laptop and CS4.)
  
  —–
  
  One thing about HTML and CSS, they do have a logic to them by their nature. If you find an error somewhere in your code, like a “missing” closing tag because of a typo with the less-than or greater-than angle brackets or a missing slash, or some other typo faux pas, you can do a time-honored test-and-check programming strategy: Add a closing tag halfway through, or comment out a portion, and see if that solves it. Then try guessing high or low to get hotter or colder, and repeat. It’s a simple mathematical strategy of dividing the problem in halves until you isolate the error. If you have some educated guess as to where the error really is, what caused it, then you can improve your debugging strategy to only a step or two, perhaps. Yes, not easy when it’s some unknown spot in 120K words of text. But — a preview of HTML in a browser or an HTML editor will almost always give a ready clue to where the bug first shows up, and that means the error is *right there* or shortly right before it. — Since you and Jane and Lynn are brighter than the average bears about such things, you probably are already doing that, but learning your way around HTML and CSS, which are what’s underlying EPUB, any trick can help.
  
  —–
  
  One of the main things I’ve gathered from personal experience and from advice from books is to divide up a novel into chapter files and make sure the chapters all use the same word processor stylesheet and page layout template / master pages. This was very true in the old days of yore, back when Word and PageMaker and such could not handle large, book-length files readily. It’s still true (and helpful) for HTML and EPUB, because it reduces any bug-squashing down to chapter-sized chunks. The books I’ve seen so far on EPUB seem to prefer one file per chapter instead of all-in-one, underneath at the EPUB prep level, for this reason and for the sake of better treatment of chapters by ebook readers. Of course, once the EPUB is done, it’s all bundled up in a flavor of .zip file to deliver to the avid reading public.
  
  (I’ve been playing with files to learn the ropes of EPUB, besides the usual playing I do, design-wise, with my files. — I think I have the hang of using webfonts, mostly, now too. Currently learning SVG and EPUB.)
  
  I really, really want it to be something that programs handle better, outputting nice, clean code files, instead of having to muck about so much to clean up and get things doing like they ought to do. I *like* writing and drawing more than I like the geeky HTML and CSS…but yes, obviously I like doing the HTML and CSS too, or I wouldn’t be doing so much of it. 🙂 It suits me. Something about web design suits both the artistic side and the techie side of me.
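The divide-in-halves strategy BlueCatShip describes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented, the "is it broken?" test simply counts angle brackets, and the search assumes that once the text looks broken it stays broken as more lines are added.

```python
def first_bad_line(lines, looks_broken):
    """Return the 1-based number of the first line whose inclusion makes
    the text look broken, or None if the whole text looks fine.

    Assumes looks_broken(lines[:n]) stays True once it first becomes True.
    """
    if not looks_broken(lines):
        return None
    low, high = 0, len(lines)  # lines[:low] looks fine, lines[:high] looks broken
    while high - low > 1:
        mid = (low + high) // 2
        if looks_broken(lines[:mid]):
            high = mid  # the error is in the first half
        else:
            low = mid   # the error is in the second half
    return high


def unbalanced_brackets(chunk):
    """Stand-in test: do the less-than and greater-than counts disagree?"""
    text = "\n".join(chunk)
    return text.count("<") != text.count(">")


manuscript = [
    "<p>Chapter one begins.</p>",
    "<p>Some <i>italic</i> text.</p>",
    "<p>A paragraph with a typo in its closing tag</p",
    "<p>Everything after this changes font.</p>",
]
print(first_bad_line(manuscript, unbalanced_brackets))
```

Each pass halves the stretch of text still under suspicion, so even a 120,000-word manuscript needs only a couple of dozen checks. Any test can stand in for the bracket count, including "open it in a browser and look," as long as it gives a yes or no for a given portion of the text.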
jhutchins on October 15, 2012 at 2:34 pm

A dedicated HTML editor might be worth learning for HTML review/cleanup. Unfortunately my favorite, Quanta, is no longer under development. Many of them use a program called “html tidy” ( http://tidy.sourceforge.net/ ) which will find things like unmatched brackets and double-closed tags.
CJ on October 15, 2012 at 4:32 pm

I’ve run one of them that was freeware, but it seemed to cause as many problems as it tried to clean up. I’ll take a look at that one. Thanks for the suggestion.
pholy on October 15, 2012 at 9:20 pm

CJ, I think you may find that sourceforge page a bit confusing. The original tidy was developed back about 2000, and since then many people and groups have had their hand in various updates. What we have here is the Library Project, which has to be, and has been, incorporated into some kind of calling program.
I use the batch version currently distributed with Ubuntu.
It has also been incorporated into several editors which run under Windows or Wine (apparently). The two which seem to be most recently or currently maintained are HTML-Kit (or HTML-Tools) from http://www.chami.com/html-kit/ and NoteTab from http://www.notetab.com . Neither is FOSS free, although both have $-free downloads available. Their costs range from $40 to $60, although you can donate more 🙂 I’ve not tried either of them, although I might try them out under Wine here.
cwg on October 16, 2012 at 2:27 am

Here is a tutorial that describes step by step the use of global search-and-replace to fix your source file for use by calibre for conversion to the format(s) of your choice. It’s not too wordy, and the advice is excellent. I suggest Notepad++ rather than Sublime as the editor, but the method will be the same. SciTE is another good choice of editor, and it works on all platforms.

I’d offer to do the conversion for you but you’d probably figure I just wanted a chance to read the book before it’s published 😉
- cwg on October 16, 2012 at 2:30 am
  
  I forgot to include the link: http://guidohenkel.com/2010/12/take-pride-in-your-ebook-formatting/
cwg on October 16, 2012 at 2:28 am

Oops, forgot the link to the tutorial.
http://guidohenkel.com/2010/12/take-pride-in-your-ebook-formatting/
Hawke on October 16, 2012 at 7:55 am

And they say writing is easy. Pah to that, no? 😛
chondrite on October 16, 2012 at 11:21 am

CJ, regarding the books that you and Jane are still dealing with a publisher over, do you find that you are expected to pick up more of the work that you used to hand off to an editor or copyreader? My FiL was a senior editor at one of the big publishing houses, and he complains bitterly about the quality of the stuff that is slinking through to be published. Often it should have been hit hard with the editing mallet, but it makes it past without more than a quick run by a spellchecker, which is NOT a substitute for spending some sit down time with the author and going over the manuscript. It’s especially crucial with more books going directly to e-publishing, taking an entire level of review out of the process.
P J Evans on October 16, 2012 at 9:11 pm

One of my friends swears by (and possibly at) UltraEdit. They want (moderate amount) of money for it, though.
CJ on October 19, 2012 at 12:04 am

You aren’t kidding re lack of editing in e-books. I’m a good grammarian: comparative linguistics and all that jazz. But unfortunately the quality of editing was already declining before e-books, like the copyeditor who mis-‘corrected’ all the subjunctives in a particular book. I hit the ceiling and swore if this person came near my mss. again I’d really be peeved, but I’m sure that they’re out there, like Titanic’s iceberg, just ready to hand out advice. Now a lot of people who haven’t had the advantages of a grammatical education are out there on their own trying to compose subjunctives…sigh.