Developer Information

Validating with .NET (not supported by AI Internet Solutions)

This download enables .NET programmers to build CSE HTML Validator validation into their unit tests. It works with version 7 and provides source code, Visual Studio 2005 projects, and compiled help for the experienced .NET programmer. The download is about 630 KB in size. It is not developed or supported by AI Internet Solutions. More about these .NET Web Page Testing Files.

COM DLL Wrapper (by third party) is available (not supported by AI Internet Solutions)

Download COM DLL Wrapper

Integrating Using the DLL Method

A DLL interface has been developed that lets you work directly with CSE HTML Validator's validation engine. This is a more direct and powerful approach. It is only available for CSE HTML Validator Std/Pro 4.0 and later and CSE HTML Validator Lite 6.52 and later. For more information, please visit the CSE HTML Validator Developer DLL Information page.

Using cmdlineprocessor.exe

The integration methods detailed on this page require using cmdlineprocessor.exe, which is included with CSE HTML Validator. To do this, first make sure a compatible version of CSE HTML Validator is installed on the user's system, then find the full path to cmdlineprocessor.exe so it can be called.

First, Check for a Compatible Version

Check ExternalCapability in the key HKEY_CURRENT_USER\SOFTWARE\AI Internet Solutions\CSE 3310 HTML Validator. It holds a string value containing an integer, which should be "2" or above. Compare this value against the minimum your integration requires to make sure that a compatible version of the Validator is installed. For example, standard output support, which is new in v10.0032+, requires ExternalCapability to be 9 or above.
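Because ExternalCapability is stored as a string, it is worth converting it to a number before comparing (as text, "10" would sort before "9"). The following is a minimal sketch of that comparison only. The function name meetsCapability is hypothetical, and reading the value from the registry is assumed to be done by the caller.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns true when the ExternalCapability string
// (for example "9") is a whole number that is at least `minimum`.
// The caller is assumed to have read the string from the registry already.
bool meetsCapability(const std::string& externalCapability, long minimum) {
    if (externalCapability.empty()) {
        return false;  // value missing: treat as "no compatible version"
    }
    char* end = nullptr;
    const long value = std::strtol(externalCapability.c_str(), &end, 10);
    // Reject strings that are not entirely numeric, such as "9x" or "abc".
    if (*end != '\0') {
        return false;
    }
    return value >= minimum;
}
```

For standard output support, a caller would use meetsCapability(value, 9).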

Next, Get the Full Path to cmdlineprocessor.exe

To get the full path to cmdlineprocessor.exe, try these, in order:

  1. Get the path from InstallDir in HKEY_CURRENT_USER\Software\AI Internet Solutions\CSE HTML Validator v4, and the name of the executable from ValidatorEXE (should typically be cmdlineprocessor.exe)
  2. If the above fails, then try getting the full path to cmdlineprocessor.exe from the default value for the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\App Paths\htmlval.exe
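The two-step lookup above can be sketched as a small fallback function. This is an illustration under assumptions: resolveProcessorPath is a hypothetical name, the three arguments are whatever the caller read from the registry values named above, and an empty string stands for a value that could not be read.

```cpp
#include <string>

// Hypothetical helper. Arguments are registry values already read by the
// caller; an empty string means the value was missing.
// Returns the full path to the executable, or an empty string if both
// lookups failed.
std::string resolveProcessorPath(const std::string& installDir,
                                 const std::string& validatorExe,
                                 const std::string& appPathsDefault) {
    // Step 1: InstallDir joined with ValidatorEXE.
    if (!installDir.empty() && !validatorExe.empty()) {
        std::string path = installDir;
        const char last = path[path.size() - 1];
        if (last != '\\' && last != '/') {
            path += '\\';  // add a separator only when InstallDir lacks one
        }
        return path + validatorExe;
    }
    // Step 2: default value of the App Paths\htmlval.exe key.
    return appPathsDefault;
}
```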

Integrating Using Standard Output (stdout) (v10.0032+)

This is a new, simplified method to integrate with CSE HTML Validator. It causes the output to be sent to the standard output stream, so Windows messaging and global atoms are not needed. See the output file format section for the format of the easily parsable output.

The syntax for calling cmdlineprocessor.exe is

cmdlineprocessor.exe -e,(stdout),0,<flags> <html filename to validate>

Or, with no flags parameter:

cmdlineprocessor.exe -e,(stdout) <html filename to validate>

Examples:

cmdlineprocessor.exe -e,(stdout),0,4 "c:\temp\index.html"
cmdlineprocessor.exe -e,(stdout),0,4 "c:\temp\index.html" >"c:\temp\CSE Output.txt"
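An application would typically assemble these arguments in code rather than by hand. The sketch below builds only the argument string for the two syntaxes shown above. The function name buildStdoutArguments is hypothetical, and always quoting the filename is an assumption made so that paths with spaces stay a single argument. Launching the process and reading its standard output are left to the caller.

```cpp
#include <string>

// Hypothetical helper: builds the arguments for the stdout method.
// Pass a negative `flags` to omit the flags parameter (the shorter syntax).
std::string buildStdoutArguments(const std::string& htmlFile, int flags = -1) {
    std::string args = "-e,(stdout)";
    if (flags >= 0) {
        // The flags value follows the literal 0 shown in the syntax above.
        args += ",0," + std::to_string(flags);
    }
    // Quote the filename so a path containing spaces stays one argument.
    args += " \"" + htmlFile + "\"";
    return args;
}
```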

Flag Parameter

This is a bitmapped value.

Important Notes

Integrating Using the Old Method (with Global Atoms)

CSE HTML Validator can output a validation result file in a simple format that an application can easily parse. Windows messaging and global atoms are used for communication.

To get this file, call cmdlineprocessor.exe (or htmlval.exe for older versions without cmdlineprocessor.exe) with command line arguments and the name of an HTML file to validate. HTML Validator validates the HTML file supplied by the command line arguments and sends the name of the output file to the calling program by sending a message to a window of the calling program. The message contains a handle to a global atom containing the filename of the validation results file.

The syntax for calling cmdlineprocessor.exe is

cmdlineprocessor.exe -e,<classname>,<message> <html filename to validate>

Example:

cmdlineprocessor.exe -e,SuperHTMLEditor,43 index.html

This will send the message WM_USER+43 to the window class named 'SuperHTMLEditor' using Windows API calls: PostMessage(FindWindow("SuperHTMLEditor", NULL), (UINT)(WM_USER+atol("43")), 0, (LPARAM)hAtom). The class name cannot contain space characters. Note that 'SuperHTMLEditor' and 43 are arbitrary; use your own name and message. There is no need to know the path to cmdlineprocessor.exe if you use a proper API command (like ShellExecute()) and htmlval.exe instead of cmdlineprocessor.exe because the registry contains the information that Windows needs to locate cmdlineprocessor.exe.

For instance, in Borland C++Builder when you receive a message from Windows, you should receive a TMessage class object. You should then be able to get the filename of the results file with code similar to the following:

ATOM hAtom;
char resultsfilename[256];
hAtom=(ATOM)(Message.LParam);
GlobalGetAtomName(hAtom,resultsfilename,256);
GlobalDeleteAtom(hAtom);

The above copies the results filename to the character buffer resultsfilename and then deletes the global atom. For more information about the atom functions, please reference a Win32 API help file.

IMPORTANT: It is the calling program's responsibility to delete the global atom and to delete the output file of the validator.

Status Update Messages

Placing -n lines between the other lines in the text file keeps your application notified of the status of the processing. For example:

-e,SuperHTMLEditor,43 c:\htmlfiles\document1.html
-n SuperHTMLEditor,1,25
-e,SuperHTMLEditor,43 c:\htmlfiles\document2.html
-n SuperHTMLEditor,1,50
-e,SuperHTMLEditor,43 c:\htmlfiles\document3.html
-n SuperHTMLEditor,1,75
-e,SuperHTMLEditor,43 c:\htmlfiles\document4.html
-n SuperHTMLEditor,1,100

Here, the -n keeps SuperHTMLEditor notified of the percentage complete. Message numbers 43 and 1 are arbitrary.
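A text file of this shape can be generated from a list of documents. The following is a rough sketch, assuming one -e line and one -n line per document with the percentage rounded down. The function name buildBatchLines is hypothetical, and the filenames are written unquoted as in the example above (how paths containing spaces should be written is not covered here).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: produces the lines of the text file, alternating a
// validation line (-e) with a status line (-n) carrying the percentage done.
std::vector<std::string> buildBatchLines(const std::vector<std::string>& files,
                                         const std::string& className,
                                         int resultMessage,
                                         int statusMessage) {
    std::vector<std::string> lines;
    for (std::size_t i = 0; i < files.size(); ++i) {
        lines.push_back("-e," + className + "," +
                        std::to_string(resultMessage) + " " + files[i]);
        // Percentage of documents processed once this one is finished.
        const std::size_t percent = (i + 1) * 100 / files.size();
        lines.push_back("-n " + className + "," +
                        std::to_string(statusMessage) + "," +
                        std::to_string(percent));
    }
    return lines;
}
```

Called with the four documents above, class name SuperHTMLEditor, and messages 43 and 1, this yields the same eight lines as the example.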

Other Options

Output Format (Original)

Sample Output

VALIDATOR=CSE HTML Validator Professional
VERSION=10.00
REGISTERED=YES
REGISTRATIONNAME=Albert Wiersch
FILENAME=t:\sampleoutput.html
BYTESIZE=361
CHARSIZE=361
LINESCHECKED=12
PERCENTLINESCHECKED=100.0
LINESINFILE=12
LINESIGNORED=0
NUMBEROFENTITIES=0
NUMBEROFTAGNAMES=6
NUMBEROFCLOSINGTAGS=4
PERCENTCLOSED=66.7
SERVERSECTIONS=0
NUMBEROFHTMLCOMMENTS=0
NUMBEROFVALIDATORCOMMENTS=15
NUMBEROFMESSAGES=1
NUMBEROFERRORS=5
NUMBEROFWARNINGS=2
MAXMESSAGECHARS=633
MESSAGETYPE=WARNING
MESSAGECATEGORY=Search Engine
MESSAGENUMBER=2
MESSAGE=No title tag was found. Each page on a site should have its own unique title. Every title should contain appropriate keywords and search terms that are relevant to the page. Don't just stuff keywords in the title. The first words in the title are more likely to result in higher rankings than subsequent words, so use important keywords at the very beginning when reasonable. The title should also be something that a user will want to click on when it's listed on a search engine. A good title is also important when a visitor bookmarks a page.
MESSAGETYPE=ERROR
MESSAGECATEGORY=Web Content Accessibility Guidelines 2.0 (Level A)
MESSAGENUMBER=1
LINENUMBER=2
CHARLOCATION=2
CHARLOCATIONLENGTH=4
MESSAGE=The default human language of each web page should be identified [A, 3.1.1]. Consider using the "lang" and/or (for XHTML) "xml:lang" attributes with the "html" tag to specify the default human language. For example, lang="en" for English or lang="fr" for French. Specifying the language assists braille translation software, speech synthesizers, translation software, and has other benefits. Visit http://www.w3.org/TR/WCAG20-TECHS/H57 for more information.
MESSAGETYPE=MESSAGE
MESSAGECATEGORY=Web Content Accessibility Guidelines 1.0 (Priority 3)
MESSAGENUMBER=1
LINENUMBER=2
CHARLOCATION=2
CHARLOCATIONLENGTH=4
MESSAGE=The natural primary language of a document should be identified [P3, 4.3]. Use the "lang" and/or (for XHTML) "xml:lang" attributes with the "html" tag to specify the language. For example, lang="en" for English or lang="fr" for French. Note that the language may also be specified by the web server through HTTP headers in which case checkpoint 4.3 would be satisfied without the "lang" or "xml:lang" attributes. Specifying the language assists braille translation software, speech synthesizers, translation software, and has other benefits.
MESSAGETYPE=ERROR
MESSAGENUMBER=2
LINENUMBER=4
CHARLOCATION=3
CHARLOCATIONLENGTH=6
MESSAGE=The "tittle" element is not a recognized element. Is it misspelled?
MESSAGETYPE=ERROR
MESSAGENUMBER=3
LINENUMBER=4
CHARLOCATION=10
CHARLOCATIONLENGTH=20
MESSAGE=Text is contained in a "head" element. Because it is contained here, it must also be contained in a "title" element.
MESSAGETYPE=ERROR
MESSAGENUMBER=4
LINENUMBER=4
CHARLOCATION=31
CHARLOCATIONLENGTH=6
MESSAGE=The end tag for "title" was found, but no start tag for "title" was found. This appears to be a misplaced end tag that should be removed.
MESSAGETYPE=WARNING
MESSAGECATEGORY=Web Content Accessibility Guidelines 1.0 (Priority 2)
MESSAGENUMBER=1
LINENUMBER=7
CHARLOCATION=3
CHARLOCATIONLENGTH=4
MESSAGE=This document has a "head" section but it does not contain a title element. Metadata should be provided to add semantic information to pages and sites [P2, 13.2]. Such metadata, like that supplied by the "head" element, can provide important orientation information to users. Metadata is information about your document.
MESSAGETYPE=ERROR
MESSAGECATEGORY=Web Content Accessibility Guidelines 2.0 (Level A)
MESSAGENUMBER=5
LINENUMBER=7
CHARLOCATION=3
CHARLOCATIONLENGTH=4
MESSAGE=This document has a "head" section but it does not contain a title element. Web pages should have titles that describe topic or purpose [A, 2.4.2]. Visit http://www.w3.org/TR/WCAG20-TECHS/H25 for more information.
MESSAGETYPE=COMMENT
MESSAGECATEGORY=Section 508 Accessibility Standards
MESSAGENUMBER=1
MESSAGE=[73] Section 508 accessibility checking is disabled.
MESSAGETYPE=COMMENT
MESSAGECATEGORY=Search Engine
MESSAGENUMBER=2
MESSAGE=Keyword density: quick (1x - 20.0%), test (1x - 20.0%). Complete list. 3 words in exclude list.
MESSAGETYPE=COMMENT
MESSAGENUMBER=3
MESSAGE=HTML 4.01 Transitional document detected.
MESSAGETYPE=COMMENT
MESSAGECATEGORY=Search Engine
MESSAGENUMBER=4
MESSAGE=[8] <meta name="description" content="(actual description)"> should be used in the "head" section to provide a brief description of what is contained on this page. Although descriptions may not be used directly for rankings, search engines may display descriptions in search results, with bolding of the relevent keywords. Therefore, a good description can help boost click-through rates and thus increase traffic to a website. If you're using HTML Validator's integrated editor, then this can be added from the 'Tags' menu or from the Tag Inserter.
MESSAGETYPE=COMMENT
MESSAGECATEGORY=Search Engine
MESSAGENUMBER=5
MESSAGE=[8] <meta name="keywords" content="(actual keyword list)"> should be used in the "head" section to provide a list of keywords that are relevant to this page. This information may be used by search engines when indexing a site, however some experts now say meta keywords are no longer useful and may even be harmful if used by a competitor for research, so you may or may not want to use this tag. Our current recommendation is to use it, but avoid spending too much time on it. If you're using HTML Validator's integrated editor, then this can be added from the 'Tags' menu or from the Tag Inserter.
MESSAGETYPE=COMMENT
MESSAGECATEGORY=Search Engine
MESSAGENUMBER=6
MESSAGE=No "h1" or "h2" header tag was found. Using these header tags (preferably with important keywords) to describe sub-topics of a page may improve search engine rankings.
MESSAGETYPE=COMMENT
MESSAGECATEGORY=Search Engine
MESSAGENUMBER=7
MESSAGE=No italicizing, emphasizing, bolding, or strong tags were used. Emphasizing or italicizing keywords (with the "em" element) may improve rankings. Similarly, using strong text or bolding keywords (with the "strong" element) may also improve rankings. Some sources say that italicizing may have more benefit than bolding.
MESSAGETYPE=COMMENT
MESSAGECATEGORY=Search Engine
MESSAGENUMBER=8
MESSAGE=[113] Random Search Engine Tip #29 - Important! Keep sites and content crawlable. Content that search engines can't access cannot be indexed.
MESSAGETYPE=COMMENT
MESSAGECATEGORY=Accessibility Tips
MESSAGENUMBER=9
MESSAGE=[124] Random Accessibility Tip #17 - Provide text alternatives for ASCII art, emoticons and leetspeak. Visit http://www.w3.org/TR/WCAG20-TECHS/H86 for more information.
MESSAGETYPE=COMMENT
MESSAGECATEGORY=Web Content Accessibility Guidelines 1.0 (Priority 3)
MESSAGENUMBER=10
MESSAGE=Provide keyboard shortcuts to important links (including those in client-side image maps), form controls, and groups of form controls. This is often done using the "accesskey" attribute. [P3, 9.5]
MESSAGETYPE=COMMENT
MESSAGENUMBER=11
MESSAGE=An ICRA RDF label was not found in the "head" section of this document. Browsers that are enabled with this free, self-regulating, content rating system may not display documents that have not been labeled. Currently, however, ICRA labels are not widely used. Consider if it's worth including an ICRA label in this document. For more information and online ICRA tools, please visit http://www.icra.org/webmasters/.
MESSAGETYPE=COMMENT
MESSAGENUMBER=12
MESSAGE=[10] CSE HTML Validator Std/Pro allows you to disable certain messages (like this example message) and groups of related messages by disabling flags. For instance, the [10] at the beginning of this message indicates that you can disable this message by disabling validator flag 10. If you are using HTML Validator's integrated editor, then you can simply use your mouse on this message to open the context menu (usually done by right-clicking the mouse on this message) and select 'Disable Flag 10' to disable this message. For more information about disabling messages, please look at the Configuration section in the documentation.
MESSAGETYPE=COMMENT
MESSAGENUMBER=13
MESSAGE=CSE HTML Validator Std/Pro allows you to disable many messages on an individual basis without using flags. For instance, you can disable this message by using HTML Validator's integrated editor to open the context menu for this message (usually done by right-clicking the mouse on this message) and selecting 'Options for this Message->Disable Message' to disable this message. For more information about disabling messages, please look at the Configuration section in the documentation.
MESSAGETYPE=COMMENT
MESSAGENUMBER=14
MESSAGE=367 bytes; 0.3s@14.4Kbps, 0.1s@28.8, 0.1s@50, 0.1s@64, 0.0s@128, 0.0s@384, 0.0s@512, 0.0s@768, 0.0s@1.5Mbps, 0.0s@10Mbps.
MESSAGETYPE=COMMENT
MESSAGENUMBER=15
MESSAGE=0.01s, 5 errors, 2 warnings, 1 message, 15 validator comments, 12 lines, 6 tags (4 closed), 0 document comments, 0 entities, 5 words spell checked (0 in comments), 16 programs run.
ENDOFFILE=YES

Output Format Notes

  1. Statistical and header information will always come before the first MESSAGETYPE but the order of the statistical information may change. Lines can also get added and deleted. For example, the number of character entities is given only when the user has "Validate entities" checked.
  2. Every message begins with MESSAGETYPE and ends with another MESSAGETYPE or ENDOFFILE=YES. That is, all the message information for each message is between the beginning MESSAGETYPE line and the next MESSAGETYPE line or between the beginning MESSAGETYPE line and the ENDOFFILE line.
  3. The message information can come in any order, and items may be omitted when they are not available (such as LINENUMBER for comments).
  4. The file will always end with ENDOFFILE=YES.
  5. There are four MESSAGETYPES: ERROR, WARNING, MESSAGE, and COMMENT.
  6. Each message type can have message numbers of 1, 2, 3, etc. (i.e. there can be an error message with a message number 1 as well as a warning message with a message number of 1, but there can't be two error messages with the same message number).
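The notes above suggest a simple line-oriented parser. The following is a minimal sketch, not a reference implementation: it assumes one KEY=VALUE item per line, splits each line at the first '=' only (message text can itself contain '='), and stores every key it sees so that added or reordered lines do not break it. The names ValidatorReport and parseReport are hypothetical.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for a parsed results file.
struct ValidatorReport {
    std::map<std::string, std::string> header;                 // items before the first MESSAGETYPE
    std::vector<std::map<std::string, std::string> > messages; // one map per validator message
    bool complete;                                             // true once ENDOFFILE=YES is seen
    ValidatorReport() : complete(false) {}
};

ValidatorReport parseReport(std::istream& in) {
    ValidatorReport report;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line[line.size() - 1] == '\r') {
            line.erase(line.size() - 1);  // tolerate CRLF line endings
        }
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) {
            continue;  // not a KEY=VALUE line: skip it
        }
        const std::string key = line.substr(0, eq);
        const std::string value = line.substr(eq + 1);
        if (key == "ENDOFFILE") {
            report.complete = (value == "YES");
            break;
        }
        if (key == "MESSAGETYPE") {
            // Every message begins with MESSAGETYPE.
            report.messages.push_back(std::map<std::string, std::string>());
        }
        if (report.messages.empty()) {
            report.header[key] = value;
        } else {
            report.messages.back()[key] = value;
        }
    }
    return report;
}
```

A caller can treat a report whose complete member is false as truncated, and should look items up by key rather than by position since items such as LINENUMBER may be absent.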

Parsing Recommendations

When parsing the output file, we recommend the following for best compatibility:

New Lines/Changes to Output File

Output Format (JSON)

CSE HTML Validator v10.0034 and above support JSON output with the appropriate option flag set (flag 16).

Example:

cmdlineprocessor.exe -e,(stdout),0,16 "c:\temp\index.html"

Output Format Notes

  1. Variable names are similar to the original format, except the names are in all lowercase.
  2. The "messages" variable is an array of messages, with each validator message being an object element of the array (unless the flag is set to NOT output the validator messages).
  3. Remember that ['messages'][i]['messagenumber'] is based on the message type, so there could be more than one message with the same message number as long as the messages have different types (one could be an error message and the other could be a warning message), but there will never be two error messages (or two warning, comment, etc. messages) with the same message number. Also, messages may be listed out of 'messagenumber' order.
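Since a message is identified by its type and number together, and the array order is not guaranteed, it can help to sort and look up messages by that pair. The sketch below assumes the JSON has already been decoded with a JSON library of your choice into plain structs. The Message struct and both function names are hypothetical, and the field names "messagetype" and "message" are inferred from note 1 rather than confirmed here.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical decoded form of one element of the "messages" array.
struct Message {
    std::string type;  // assumed to come from "messagetype"
    int number;        // from "messagenumber"
    std::string text;  // assumed to come from "message"
};

// Orders messages by type first, then by message number within each type.
void sortMessages(std::vector<Message>& messages) {
    std::stable_sort(messages.begin(), messages.end(),
                     [](const Message& a, const Message& b) {
                         return std::tie(a.type, a.number) <
                                std::tie(b.type, b.number);
                     });
}

// Looks a message up by the (type, number) pair; returns nullptr if absent.
const Message* findMessage(const std::vector<Message>& messages,
                           const std::string& type, int number) {
    for (const Message& m : messages) {
        if (m.type == type && m.number == number) {
            return &m;
        }
    }
    return nullptr;
}
```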

Other Useful Information

CSE HTML Validator Lite versions prior to v6.52 do not support any integration (via any method, old or new) with other applications. Version 6.52 and above, however, do support integration with third-party programs.

To see what command line arguments CSE HTML Validator and above can accept, see the documentation page about the Command Line Arguments and cmdlineprocessor.exe.