Errors when trying to use HTML Tidy

Post by generatorassociates » Thu Dec 01, 2011 11:55 am

Hello, I wonder if anyone can help.
I used the trial version last year without problems, but I have now bought the full version and am trying to validate and use HTML Tidy for the first time, and I am having no luck.

The error message says that the tidied document has more errors than the original, and that perhaps HTML Tidy was unable to work.

Can anyone help? I purchased this to help my web pages have fewer errors, because I was spending lots of time trying to fix them manually and getting nowhere, but I have no idea what to do about this.

Can you help?

Thanks
Attachments
captured.jpg (screenshot of tidy, 44.15 KiB)

Post by Albert Wiersch » Thu Dec 01, 2011 2:19 pm

Hello,

Unfortunately, HTML Tidy can be "quirky". Automatic fixing of documents is prone to problems and is not possible in many cases.

I can take a closer look, but I would need both the original and the tidied document.

Thank you.

Post by generatorassociates » Fri Dec 02, 2011 5:22 am

Hi,

Thanks. What is the best way to upload the versions, and in what format?

Post by Albert Wiersch » Fri Dec 02, 2011 9:47 am

generatorassociates wrote:
Hi,

Thanks. What is the best way to upload the versions, and in what format?

Hello,

Please email them to me (via attachment) at support at htmlvalidator dot com and I'll take a look.

Post by Albert Wiersch » Mon Dec 05, 2011 10:05 pm

Thank you. I received the sample documents. You sent one as RTF, but CSE HTML Validator doesn't check RTF files, so I loaded it in Microsoft Word, copied the text to the clipboard, and pasted it into CSE HTML Validator.

When I used HTML Tidy, it reported that the tidied document generated 17 fewer errors and 23 fewer warnings than the original, so the tidied version seemed to be a definite improvement.

I realized that the before-and-after comparison can be thrown off if validation is terminated because of too many errors or warnings. In that case the results may not be accurate because the entire document wasn't checked. I will try to address this issue in an update.

This looks like what might be causing the confusion in your case. The screenshot you provided is difficult to read. Is it showing 20 warnings for both the original and the tidied version? If so, it is likely that validation was aborted because of too many warnings. Because validation of the original document may not have stopped at the same place as validation of the tidied document, the results can be misleading or not comparable.

You could try increasing the maximum number of errors and warnings to 50 or so, in the Options->Validator Engine Options, Validator Engine->Message Output page. You could also try turning off accessibility checking. This may improve the comparison results.
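
To illustrate why capped counts can mislead, here is a minimal Python sketch. It is not how CSE HTML Validator works internally; the function name `compare_message_counts` is made up for this example, and it assumes each validation run simply stops once it has reported `cap` messages. The counts 40 and 17 are hypothetical.

```python
def compare_message_counts(original: int, tidied: int, cap: int) -> str:
    """Compare message counts from two validation runs.

    Assumption: each run stops as soon as it has reported `cap` messages,
    so a count that reaches the cap says nothing about the rest of the
    document.
    """
    if original >= cap or tidied >= cap:
        return "inconclusive"
    if tidied < original:
        return "improved"
    if tidied > original:
        return "worse"
    return "unchanged"


# Both runs stopped at a cap of 20, so the totals cannot be compared.
print(compare_message_counts(20, 20, cap=20))  # inconclusive
# Hypothetical counts with the cap raised to 50: neither run was cut short.
print(compare_message_counts(40, 17, cap=50))  # improved
```

Raising the limit only helps if both documents then finish below it; if either run still hits the cap, the comparison stays inconclusive.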

By the way, I noticed in the original document there was this line with a missing '=' after the "content" attribute:

<meta name="robots" content"index, follow">

HTML Tidy "corrects" this to:

<meta name="robots">

Notice that it removed the content"index, follow" part completely, probably because it assumed it was just an invalid attribute or construct. This is a good example of how these tools can do things you really don't want, so always watch for things like this when using any auto-fix tool.
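
One way to catch this kind of slip before running an auto-fix tool is a quick text check. The Python sketch below is only a rough heuristic for a single tag with quoted attribute values; it is not what HTML Tidy or CSE HTML Validator do, and the name `find_missing_equals` is invented for this example.

```python
import re

# A well-formed attribute: name, '=', then a quoted value.
WELL_FORMED = re.compile(r'[A-Za-z][\w:-]*\s*=\s*(?:"[^"]*"|\'[^\']*\')')
# A suspect attribute: name immediately followed by a quoted value, no '='.
SUSPECT = re.compile(r'([A-Za-z][\w:-]*)(?:"[^"]*"|\'[^\']*\')')


def find_missing_equals(tag: str) -> list[str]:
    """Return attribute names in `tag` that run straight into a quoted value."""
    # Blank out the well-formed attributes first so that words inside
    # their quoted values cannot be mistaken for attribute names.
    remainder = WELL_FORMED.sub(" ", tag)
    return SUSPECT.findall(remainder)


print(find_missing_equals('<meta name="robots" content"index, follow">'))
# ['content']
print(find_missing_equals('<meta name="robots" content="index, follow">'))
# []
```

A check like this only flags the line for review; the fix (adding the '=') is still best made by hand.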

Post by MikeGale » Tue Dec 06, 2011 8:07 pm

It seems to me that there is another way to tidy. I've not seen anything written about it, but can't imagine why it shouldn't work.

Every time a browser reads a document, it repairs any bad markup it finds. The result is represented as a DOM document.

That DOM document could be serialized back to text (say using JavaScript to get a standard format) and the text saved.

This could be done with a variety of browsers (same / similar code so that it's easily comparable) to get a "tidied" version. Diff tools could be used to assemble an "authorised version" ready for final clean up.
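
The diff step could look roughly like the Python sketch below. It assumes the serialized text from each browser has already been saved somehow; the two sample strings are made-up placeholders, not real output from any browser, and `diff_serializations` is a hypothetical helper name.

```python
import difflib


def diff_serializations(a: str, b: str,
                        name_a: str = "browser A",
                        name_b: str = "browser B") -> list[str]:
    """Return unified-diff lines between two serialized documents."""
    return list(difflib.unified_diff(
        a.splitlines(), b.splitlines(),
        fromfile=name_a, tofile=name_b, lineterm=""))


# Placeholder text standing in for two saved DOM serializations.
from_a = '<meta name="robots">\n<p>Hello</p>'
from_b = '<meta name="robots" content="index, follow">\n<p>Hello</p>'

for line in diff_serializations(from_a, from_b):
    print(line)
```

Lines that differ between the two versions are the places that would need a manual decision when assembling the final document.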

Has anybody here done that?

Have code to share?

Stories of experiences?