Follow Links (available in Pro; not available in Std or Lite)


Follow Links Tab

Follow links - Check this box to follow and validate the links in the target, that is, to 'crawl' or 'spider' a site. Links that meet certain requirements (such as those specified by the 'limit to' text and the extension lists) are automatically added to the target list being processed. This option has an effect only when validating, spell checking, or checking links.

Limit to - Links that do not start with the text in this box will not be validated or followed. Relative links are changed to absolute links before checking whether the link begins with the limit to text or not.
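The 'limit to' check can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the program's actual code: the function name `within_limit` and the example URLs are hypothetical, the case-insensitive comparison is taken from the decision order later on this page, and treating an empty 'limit to' box as "no restriction" is an assumption.

```python
from urllib.parse import urljoin

def within_limit(base_url: str, link: str, limit_to: str) -> bool:
    """Resolve a (possibly relative) link, then test the 'limit to' prefix."""
    # Relative links are made absolute before the prefix is checked.
    absolute = urljoin(base_url, link)
    # Assumption: an empty 'limit to' box places no restriction on links.
    if not limit_to:
        return True
    # Case-insensitive "begins with" comparison.
    return absolute.lower().startswith(limit_to.lower())

base = "http://www.example.com/docs/index.html"
limit = "http://www.example.com/docs/"
print(within_limit(base, "guide/intro.html", limit))    # resolves inside /docs/
print(within_limit(base, "../other/page.html", limit))  # resolves outside /docs/
```

In this sketch the first link resolves to a URL under the 'limit to' text and passes, while the second resolves to a sibling folder and is rejected.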

Follow ext - A comma-separated list of extensions that can be followed. Links whose extension is not listed here will not be checked, unless the link is a folder. If the link is a folder, then it is followed if it meets all the other requirements. NOTE: Links that do not have an extension are followed. The default, represented by "(default)", is "htm,html,shtm,shtml,asa,asp,aspx,cfm,cfml,css,js,jsp,php,php3,php4,wml".

No follow - A comma-separated list of extensions that will not be followed. The default, represented by "(default)", is "au,avi,bmp,exe,gif,jpeg,jpg,mid,mp3,pdf,png,scr,ttf,txt,wav,zip". (New v11.9923)

MIME - A comma-separated list of MIME types that can be followed. If a MIME type is available for the target (as it is for HTTP/HTTPS targets), then it overrides (has priority over) any extension settings. The default, represented by "(default)", is "text/html,text/css,text/javascript,application/javascript,application/x-javascript,application/xhtml+xml". (New v11.9917)

Depth limit - The maximum depth to which links are followed. This limits the number of links that are followed. Set to -1 for no depth limit. A limit of 0 is the same as not following any links. A limit of 1 will follow the links in the target, but will not follow the links in the followed links.
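The effect of the depth limit can be illustrated with a small breadth-first traversal. This is a sketch under stated assumptions, not the program's implementation: `crawl`, `get_links`, and the toy `site` mapping are hypothetical names, and breadth-first order is assumed only for illustration.

```python
from collections import deque

def crawl(start, get_links, depth_limit):
    """Return the targets processed, honoring a depth limit.

    get_links is a hypothetical callable that returns the links in a target.
    depth_limit: -1 = no limit, 0 = follow nothing, 1 = follow only the
    links found in the starting target.
    """
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        target, depth = queue.popleft()
        if depth_limit != -1 and depth >= depth_limit:
            continue  # links found in this target are not followed
        for link in get_links(target):
            if link not in seen:
                seen.add(link)
                visited.append(link)
                queue.append((link, depth + 1))
    return visited

# Toy site: a links to b and c, b links to d, d links to e.
site = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["e"], "e": []}
for limit in (0, 1, 2, -1):
    print(limit, crawl("a", lambda t: site.get(t, []), limit))
```

With this toy site, a limit of 0 processes only the starting target, a limit of 1 adds the links it contains, and -1 reaches every linked target.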

Determining Whether a Link Will be Followed

The determination of whether a link will be followed is made in the following order:

1. If the link is from the "src" attribute of an "img" element, then the link will not be followed. These common links are always automatically excluded.

2. If 'limit to' text is specified and the link does not begin with it (using a case-insensitive compare), then the link will not be followed.

3. If the link matches an "exclude string", then the link will not be followed.

4. If the link has an extension and the extension is a "no follow" extension, then the link will not be followed.

5. If the link has an extension and the extension is listed as a "follow ext" extension, then the link will be followed.

6. If the target is a URL and MIME types to follow are specified, then the link will be requested to obtain the MIME type. When the headers are received, the MIME type is checked. If the MIME type is one that can be followed, then the link will be followed; if it isn't, then the request is aborted and the link is not followed.

7. If none of the above apply, then the link is not followed.
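The order of checks above can be written out as a single function. The following is a minimal sketch, not the program's actual logic. The function and parameter names are hypothetical, and the link is assumed to be absolute already. An "exclude string" match is assumed to be a plain substring match. "Is a URL" is approximated by an http/https prefix, and the MIME lookup is a caller-supplied callable (`fetch_mime`) so that no real network request is made. The default lists are the ones given on this page.

```python
import posixpath
from urllib.parse import urlparse

FOLLOW_EXT = "htm,html,shtm,shtml,asa,asp,aspx,cfm,cfml,css,js,jsp,php,php3,php4,wml".split(",")
NO_FOLLOW = "au,avi,bmp,exe,gif,jpeg,jpg,mid,mp3,pdf,png,scr,ttf,txt,wav,zip".split(",")
FOLLOW_MIME = ("text/html,text/css,text/javascript,application/javascript,"
               "application/x-javascript,application/xhtml+xml").split(",")

def extension_of(link):
    """Lower-cased extension of the link's path, without the dot ('' if none)."""
    return posixpath.splitext(urlparse(link).path)[1].lstrip(".").lower()

def should_follow(link, from_img_src=False, limit_to="", exclude_strings=(),
                  no_follow=NO_FOLLOW, follow_ext=FOLLOW_EXT,
                  follow_mime=FOLLOW_MIME, fetch_mime=None):
    # 1. Links from <img src="..."> are never followed.
    if from_img_src:
        return False
    # 2. Case-insensitive 'limit to' prefix check.
    if limit_to and not link.lower().startswith(limit_to.lower()):
        return False
    # 3. Exclude strings (assumption: substring match).
    if any(s in link for s in exclude_strings):
        return False
    ext = extension_of(link)
    # 4. "No follow" extensions.
    if ext and ext in no_follow:
        return False
    # 5. "Follow ext" extensions.
    if ext and ext in follow_ext:
        return True
    # 6. For URLs, ask for the MIME type (hypothetical fetch_mime callable).
    is_url = link.lower().startswith(("http://", "https://"))
    if is_url and follow_mime and fetch_mime is not None:
        return fetch_mime(link) in follow_mime
    # 7. Nothing applied: do not follow.
    return False

print(should_follow("http://www.example.com/a.html"))
print(should_follow("http://www.example.com/logo.png"))
print(should_follow("http://www.example.com/page", fetch_mime=lambda u: "text/html"))
```

Passing the MIME lookup in as a callable keeps the sketch testable offline; a real crawler would issue the request and abort it once the headers show a MIME type that cannot be followed.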