Sometimes the Batch Wizard requests too many pages from a server in a short amount of time. Slowing down the requests can resolve some problems with downloading pages. To slow down the page requests made by the Batch Wizard, change the options in the Limits Page of the Batch Wizard Options.
If the Batch Wizard is not following all the links that you expect it to follow, see the possible solutions below.
Make sure that the follow links option is checked in the Follow Links tab of the target's properties. See the Target Properties topic.
Links may not be followed if a validation is terminated due to too many errors or too many warnings. When a validation is terminated, CSS HTML Validator stops parsing the document and stops extracting links. To reduce the chance of this happening on websites whose pages contain large numbers of errors and/or warnings, we recommend performing an "Errors only" validation first. You can select "Errors only" in the Tool to Use Page of the Batch Wizard Options. You may also want to try increasing the maximum number of errors in the Message Output Page of the Validator Engine Options; a value of 50 or 100 may work better. Once enough errors are corrected that the validations no longer terminate, an "errors and warnings only" validation or a "full" validation can be performed to find further issues. Alternatively, fix the errors and warnings in the pages that the Batch Wizard was able to check, then process the target list again. Repeat this cycle of fixing and re-processing (revalidating) until you are satisfied with the results. The key is to get the number of errors and warnings low enough that validations aren't terminated prematurely, so that all the links can be extracted, followed, and validated.
Make sure that the Limit to text in the Follow Links tab of the target's properties is correct. See the Target Properties topic. One cause of incorrect Limit to text is adding a URL like https://domain.com that is then redirected to https://www.domain.com (with the www hostname). In this case the Limit to text might be *://domain.com/, which doesn't include the www hostname. Use *://www.domain.com/ instead.
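The mismatch between the two Limit to values can be illustrated with a short Python sketch. It assumes that the Limit to text behaves like a wildcard pattern that must match the beginning of a URL; that reading is an assumption made for illustration, and the function name url_within_limit is hypothetical. It uses Python's fnmatch module and is not CSS HTML Validator's own matching code.

```python
from fnmatch import fnmatchcase


def url_within_limit(url: str, limit_to: str) -> bool:
    """Return True if url starts with text matching the wildcard limit_to.

    Assumption: the limit is a prefix pattern, so a trailing "*" is
    appended to accept anything that follows it.
    """
    return fnmatchcase(url, limit_to + "*")


# URL as it looks after the redirect to the www hostname.
redirected = "https://www.domain.com/products/index.html"

print(url_within_limit(redirected, "*://domain.com/"))      # False
print(url_within_limit(redirected, "*://www.domain.com/"))  # True
```

Under this assumption, every link on the redirected site falls outside the first limit, which would explain why none of them are followed.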
An extra or missing quotation mark in or around attribute values may confuse the parser. If this happens, the document cannot be parsed properly and links may not be extracted. In that case, many warning messages like the following will be generated: "A quoted string spans more than one line (this is not recommended). A possible cause is a nearby attribute value that is missing its start or end quotation mark. To allow multi-line quoted strings, check the option in the Validator Engine Options. This message is displayed up to 10 times.", or "A quoted string spans more than one line (this is not recommended). Check nearby attribute values for proper quotes. This message is displayed up to 10 times." If you see one or more of these validator messages, look for and correct any mismatched quotation marks, then re-process (revalidate) the target list. NOTE: These messages will not be generated if the Allow multi-line quoted strings option is checked in the Attribute Options Page of the Validator Engine Options.
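To locate candidate quoting problems before revalidating, a rough scan such as the following Python sketch may help. It is a heuristic based on a regular expression, not the validator's parser: it reports the line on which any quoted value that contains a line break begins. The function name multiline_quoted_values and the sample markup are made up for illustration.

```python
import re

# Matches ="..." or ='...' and lets the value run across line breaks.
QUOTED_VALUE = re.compile(r'=\s*(["\'])(.*?)\1', re.DOTALL)


def multiline_quoted_values(html: str) -> list[int]:
    """Return 1-based line numbers where a multi-line quoted value starts."""
    found = []
    for match in QUOTED_VALUE.finditer(html):
        if "\n" in match.group(2):
            # Count the line breaks before the match to get its line number.
            found.append(html.count("\n", 0, match.start()) + 1)
    return found


# The first link is missing the closing quotation mark after page1.html.
sample = '<p>\n<a href="page1.html>One</a>\n<a href="page2.html">Two</a>\n</p>\n'
print(multiline_quoted_values(sample))  # [2]
```

A reported line is only a starting point for inspection; legitimate multi-line attribute values are flagged as well.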
If a page's navigation uses JavaScript and/or the links are embedded in JavaScript, then CSS HTML Validator won't be able to extract and follow them. Some possible workarounds:
•Create a new target that is a simple HTML document with links to the URLs that you want to crawl. Add this HTML document to the target list so the Batch Wizard will parse it and extract the additional links.
•Add normal links with no link text using the "a" element at the end of the document, just before the </body> tag, so the Batch Wizard can extract the links. CSS HTML Validator may generate a message about no link text for these elements, but this message can be disabled.
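The first workaround, a simple HTML document that links to the URLs to crawl, can be generated with a small script. The Python sketch below is one possible way to do it; the URL list, the function name build_link_page, and the output filename batch-wizard-links.html are placeholders chosen for illustration.

```python
from html import escape
from pathlib import Path


def build_link_page(urls, title="Links for the Batch Wizard"):
    """Build a minimal HTML document with one plain link per URL."""
    items = "\n".join(
        f'    <li><a href="{escape(u, quote=True)}">{escape(u)}</a></li>'
        for u in urls
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head>\n"
        '  <meta charset="utf-8">\n'
        f"  <title>{escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )


# Placeholder URLs; replace them with the pages reachable only via JavaScript.
urls = [
    "https://www.domain.com/products/",
    "https://www.domain.com/search?x=1&y=2",
]
page = build_link_page(urls)

if __name__ == "__main__":
    # Save the page, then add the saved file to the target list.
    Path("batch-wizard-links.html").write_text(page, encoding="utf-8")
```

Each URL is escaped before it is written, so characters such as & in query strings do not produce invalid markup in the generated page.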