Thank you. I received the sample documents. I see you sent one as RTF, but CSE HTML Validator doesn't check RTF files, so I loaded it in Microsoft Word, copied the text to the clipboard, and pasted it into CSE HTML Validator.
When I used HTML Tidy, it reported that the tidied document generated 17 fewer errors and 23 fewer warnings than the original, so the tidied version seems to be a definite improvement.
I realized that one thing that might be affecting the before-and-after comparison is validation being terminated because of too many errors or warnings. In that case the results may not be accurate because the entire document wasn't checked. I will try to address this issue in an update.
This looks like what might be causing the confusion in your case. The screenshot you provided is difficult to read. Is it showing 20 warnings for both the original and the tidied version? If so, the validation was likely aborted because of too many warnings. Because validation of the original document may not have stopped at the same place as validation of the tidied document, the results can be misleading or not comparable.
You could try increasing the maximum number of errors and warnings to 50 or so, in Options->Validator Engine Options, on the Validator Engine->Message Output page. You could also try turning off accessibility checking. This may improve the comparison results.
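To illustrate why a capped run hides the difference, here is a minimal Python sketch. It is not how CSE HTML Validator works internally; `reported_count` is a hypothetical helper, and the warning totals (60 and 37) are made-up numbers chosen only so that they differ by 23. It assumes a checker that simply stops once it has emitted `limit` messages.

```python
def reported_count(actual, limit):
    """Return (count shown, aborted?) for a checker that stops after `limit` messages.

    `actual` is how many messages a full, uncapped run would produce.
    This models the cap only; it is an assumption, not the validator's real logic.
    """
    aborted = actual > limit
    return min(actual, limit), aborted


# Hypothetical totals for a full run of each document.
original_warnings = 60
tidied_warnings = 37

# With a low cap, both runs are cut off and show the same number.
print(reported_count(original_warnings, 20))   # (20, True)
print(reported_count(tidied_warnings, 20))     # (20, True)

# With a cap above both totals, the real difference becomes visible.
print(reported_count(original_warnings, 100))  # (60, False)
print(reported_count(tidied_warnings, 100))    # (37, False)
```

Whenever both runs hit the cap, identical counts say nothing about which document is cleaner, so the cap needs to be above the larger total for the comparison to mean anything.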
By the way, I noticed that the original document has this line, which is missing the '=' after the "content" attribute name:
<meta name="robots" content"index, follow">
HTML Tidy "corrects" this to:
<meta name="robots">
Notice that it removed the content"index, follow" part completely, probably because it assumed it was just an invalid attribute or construct. This is a good example of how these tools can do things you really don't want, so always watch for changes like this when using any auto-fix tool.
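One way to catch this kind of silent removal is to compare what sits inside the tags before and after the auto-fix. The Python sketch below is deliberately crude and is not part of HTML Tidy or CSE HTML Validator: it does not parse HTML, and the helper names `words_in_tags` and `dropped_words` are hypothetical. It only lists words that appear inside some tag of the original but in no tag of the fixed output.

```python
import re

TAG_RE = re.compile(r"<[^>]*>")                 # anything between < and >
WORD_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")  # word-like tokens


def words_in_tags(html):
    """Return the set of lowercase word tokens found inside <...> tags."""
    words = set()
    for tag in TAG_RE.findall(html):
        words.update(w.lower() for w in WORD_RE.findall(tag))
    return words


def dropped_words(original, fixed):
    """Words inside tags of `original` that appear in no tag of `fixed`."""
    return sorted(words_in_tags(original) - words_in_tags(fixed))


original = '<meta name="robots" content"index, follow">'
tidied = '<meta name="robots">'

print(dropped_words(original, tidied))  # ['content', 'follow', 'index']
```

A non-empty result is only a hint to go and look at the affected tag by hand; it cannot tell whether the removal was intended. In this case it would point at the meta tag, where the fix you actually want is to restore the missing '=' rather than lose the attribute.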