HTML Validator and e-pubs

Posted: Sat Dec 06, 2014 4:04 pm
by JohnVeit
I used HTML Validator 14.0 Standard this year to make my web site files compliant with HTML 5. Changes had to be made to all the files to accommodate HTML 5's "new" rules for displaying images and linking video clips, and its dropping of support for tags such as "font" and "center". HTML Validator served as an excellent and extra helpful assistant that is much smarter than me, as I am new to HTML 5, very old, forgetful, and don't work as well as I used to. :)
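
For anyone facing the same conversion, here is a rough sketch of the kind of change involved, not my actual pages. The file names (photo.jpg, clip.mp4), the class name "caption", and the sizes and colors are made up for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example page</title>
  <style>
    /* Takes the place of the old <center> tag around the heading */
    h1 { text-align: center; }
    /* Takes the place of an old <font face="Arial" color="navy"> tag */
    p.caption { font-family: Arial, sans-serif; color: navy; }
  </style>
</head>
<body>
  <h1>Example heading</h1>
  <p class="caption">Caption text styled with CSS instead of the font tag.</p>
  <!-- Images carry alt text; width and height are optional -->
  <img src="photo.jpg" alt="Description of the photo" width="400" height="300">
  <!-- The HTML 5 video element; the inner text is shown by browsers without video support -->
  <video src="clip.mp4" controls width="400">Your browser does not support the video element.</video>
</body>
</html>
```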

I also have a few e-pubs that were produced in 2010 using the html of that time and Mobipocket Creator, a tool that combines an e-pub's index, chapter files, and images into a .prc file. That file was then uploaded and an e-book produced.

I am in the process of revising one of the e-pubs, and have found that Mobipocket Creator and .prc files are no longer supported.

A .mobi file is now the means used to produce an e-pub, and an Amazon tool is freely available for use in generating and reviewing .mobi files. The tool is KindlePreviewer.

The rub is that when using the KindlePreviewer to create a .mobi file from html 5 compliant files consisting of an index, its linked-to sub files, and images, the files and images are not all combined in the output .mobi file, as was the case with a Mobipocket-created .prc file.

The .mobi file contains only the index file and its images. I plan to investigate this and hopefully find a solution in the KindlePreviewer instructions.

However, I have the original html pub files which conform to the rules in effect in 2010, and which also were linked together with the use of "name" anchors.

So, I used the KindlePreviewer to open the index file, and the result was a .mobi file containing all the e-pub files and images, which according to the instructions, should be able to be uploaded and a "new" e-pub produced.

Now my plan is to use HTML Validator 10.0 Lite for editing the pub files, as that version accepts pre html 5 coding which works with the KindlePreviewer.

Note: As an aside, the KindlePreviewer info also states that .jpg images that are bigger than those accepted previously, can be used in making .mobi files. The previous files were .gifs. How that will work out is yet to be seen. Also, the KindlePreviewer imprints the file name of the "index" file on the top of each page produced as output. So, I changed the index file name to the book name, and the book name shows on the top of each page. It does not seem to matter if the name of the file being opened is not index.htm.

As to my use of "name" anchors back in 2010, they were used so that when a chapter was read, the reader was given the options of jumping back to the TOP, the MAIN INDEX, the LAST CHAPTER, or to continue reading the next chapter of the total of 40.
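
As a rough sketch of that arrangement (not copied from my book), the bottom of a chapter file might look like the following. The file names chapter04.htm, chapter05.htm, and chapter06.htm and the anchor names "top" and "contents" are made up for illustration; the index file is assumed to contain its own anchor named "contents".

```html
<!-- chapter05.htm (hypothetical file name), 2010-era markup -->
<a name="top"></a>
<h2>Chapter 5</h2>
<p>Chapter text goes here.</p>

<!-- Navigation offered to the reader at the end of the chapter -->
<p>
  <a href="#top">TOP</a> |
  <a href="index.htm#contents">MAIN INDEX</a> |
  <a href="chapter04.htm#top">LAST CHAPTER</a> |
  <a href="chapter06.htm#top">NEXT CHAPTER</a>
</p>
```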

FYI, the 2010 html version files linked with the use of the "name" tag can be used to produce a .pdf file of all the files:
1. Open the index file in Word (I have a 2003 version). It will suck up all of the e-pub files linked with the use of a "name" tag, and their images.
2. Save this document as a .doc file.
3. Open the .doc file in Open Office and export it to a .pdf file.
This does not work if the files are not linked with the use of the "name" tag.

I started this thread to clarify my understanding of the situation, for the info of others, as a push back against not supporting what has worked in the past, and for those like me that find that the "old" rules are easier to understand and use, than what is said to be newer, faster, better.

Comments are welcome and will be appreciated.

Re: HTML Validator and e-pubs

Posted: Mon Dec 08, 2014 11:12 am
by Albert Wiersch

CSE HTML Validator is very configurable so it is probably possible to configure it to your needs.

If you could send or post some small example documents or snippets along with the exact text of any validator message in question, then I would be happy to look into how the results can be improved for that particular situation/document.

Re: HTML Validator and e-pubs

Posted: Tue Dec 09, 2014 3:03 am
by JohnVeit
After spending the past few days reading and rereading info and articles on html, and in particular html 5, as well as the Amazon/Kindle publishing guides, I have come to the conclusion that this thread can be deleted.

The "new" Amazon/Kindle tool (Kindle Previewer), that is used to produce a .mobi file for use in publishing an e-book, accepts most if not all of the "old html" tags that are deprecated in HTML 4 and obsolete in HTML 5.

So the CSE HTML Validator Lite v10.01 can be used in editing and updating old html files that were created prior to HTML 4 and 5.

My old e-book file has 5,566 lines, and when I validated it with the CSE HTML Validator Lite, the validator reported: 0 errors, 0 warnings, 0 messages, and 1 comment. Also, that file was converted into a .mobi file using the new Amazon/Kindle tool. So all I need to do is edit, and update as needed.


The Amazon/Kindle publishing guide says that their new tool can accept HTML 5 files, so that may be an option for future use.

I think the CSE HTML Validator is a great tool, and I used it in updating my web pages so they would be in compliance with HTML 5.

FYI, here are the "messages" generated when validating the index file.

Possibly misspelled words (102, 55 unique): AAR (1x), Chiodo's (1x), CHP (1x), .... List limited to first 50 unique words. 2864 total words checked (0 in comments).
HTML 5 document detected.
31747 bytes; 22.0s@14.4Kbps, 11.0s@28.8, 6.3s@50, 5.0s@64, 2.5s@128, 0.8s@384, 0.6s@512, 0.4s@768, 0.2s@1.5Mbps, 0.0s@10Mbps.
0.33s, 0 errors, 12 warnings, 20 messages, 16 validator comments, 813 lines, 842 elements (with 286 end tags), 0 document comments, 52 character references, 2864 words spell checked (0 in comments), 2003 TNPL programs run.

Re: HTML Validator and e-pubs

Posted: Tue Dec 09, 2014 8:05 am
by Albert Wiersch
I'm glad you find CSE HTML Validator to be a great tool!

My question that remains is why can't the latest version be used? The latest version (v14 for the lite edition and v15 for the standard+ edition) does support new HTML5 tags and attributes and obsoletes old tags and attributes, but if you use an HTML 4.01 DOCTYPE then it should not check to HTML5 standards. I'm not familiar with the format of your e-book file but perhaps something can be easily done to make CSE HTML Validator check it as HTML 4.01 instead of HTML5.
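
For reference, a minimal sketch of a document that declares HTML 4.01 Transitional, which is the variant that still permits tags such as "font" and "center". The title and body content are placeholders, and whether Kindle Previewer treats this DOCTYPE any differently is not something verified here.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
  "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Example e-book index</title>
</head>
<body>
  <!-- Presentational tags like these are allowed by the Transitional DTD -->
  <center><font face="Arial">Example heading</font></center>
  <a name="top"></a>
  <p>Book text goes here.</p>
</body>
</html>
```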

If you'd like to pursue the above then a small sample document that reproduces the main issue would be very helpful. You should not have to use an old version.


Re: HTML Validator and e-pubs

Posted: Fri Dec 19, 2014 9:53 am
by JohnVeit
In my effort to read about and learn how to use the new Kindle publishing software, I mistakenly thought that one could now add photos and videos to an e-book.

So, as I have lots of photos of WW II planes and fly-by video clips, I put aside my plan to update a pub, and I got busy in putting together a book of photos and videos using html5.

Then I came to understand that video cannot be added to the platform being used. Also, even if that could be done, the e-book would be way too big at over 600 MB.

So, I added photo galleries, and put in links to YouTube and my site for playing the fly-by videos.

Here's a link to the e-book's "look inside" info where you can view the index plus some of the photos. You can click on the links to the fly-by videos to see them if you are interested in that -- I like them :-)

Also, note that some of the text is not left justified as it should be and the font is not Arial as I prefer. ... B00R78F8NC

I now plan to re-look at updating an older book. But we shall see about that.

In putting together the new pub, I used CSE HTML Validator Std v14.05 and it was a great help.

FYI on the use of the "style" attribute.

The "border" attribute is not allowed in html5 as I understand it. So, you have to use "style" to add a border to pics, which I did 305 times. And the Validator frowns on that.
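
Here is a small sketch of the three ways of putting a border on a picture, for comparison. The file name p51.jpg, the class name "framed", and the border width and color are made up for illustration. The third form is only an assumption on my part as a way to avoid repeating the inline style; I have not checked how Kindle Previewer handles it.

```html
<!-- Old way: the presentational attribute that html5 validation flags -->
<img src="p51.jpg" alt="P-51 fly-by" border="2">

<!-- What I did: an inline style attribute on each picture -->
<img src="p51.jpg" alt="P-51 fly-by" style="border: 2px solid black;">

<!-- Possible alternative: one rule in the head, reused through a class -->
<style>
  img.framed { border: 2px solid black; }
</style>
<img src="p51.jpg" alt="P-51 fly-by" class="framed">
```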

Also the use of "id" instead of "name" works out OK.
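
As a sketch of what that swap looks like (the anchor name "chapter5" is made up), the two forms below are alternatives, so only one of them would appear in a given file. The link that jumps to the spot is written the same way in both cases.

```html
<!-- 2010 style: an empty anchor with a name attribute -->
<a name="chapter5"></a>
<h2>Chapter 5</h2>

<!-- html5 style: the id goes directly on the element being linked to -->
<h2 id="chapter5">Chapter 5</h2>

<!-- The link is the same for either form -->
<a href="#chapter5">Go to Chapter 5</a>
```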

Re: HTML Validator and e-pubs

Posted: Fri Dec 19, 2014 12:09 pm
by Albert Wiersch
Looks like a great book for those interested in WW II planes!

I'm glad you found CSE HTML Validator a big help.

As for HTML5, it may be better to use HTML 4.01 for your purposes; CSE HTML Validator then shouldn't complain about HTML5 issues.