Is rel="canonical" supported?

ktp
Rank III - Intermediate
Posts: 60
Joined: Sat Oct 29, 2016 10:34 am

Re: Is rel="canonical" supported?

Post by ktp »

Question: if the URL referenced by rel="canonical" is broken (e.g. it returns a 404), the validator does not seem to report the error. I hit this problem during one run.

Could the admin confirm the problem?
Albert Wiersch
Site Admin
Posts: 3785
Joined: Sat Dec 11, 2004 9:23 am
Location: Near Dallas, TX

Re: Is rel="canonical" supported?

Post by Albert Wiersch »

ktp wrote: Sat Dec 19, 2020 11:29 am Question: if the URL referenced by rel="canonical" is broken (e.g. it returns a 404), the validator does not seem to report the error. I hit this problem during one run.

Could the admin confirm the problem?
If the link is bad (404) then it should be detected by the link checker, so the link checker will need to be used to generate link reports in the Batch Wizard. You won't see a validator message for a bad link because that's the link checker's job.
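For readers who want to spot-check a single page outside the Batch Wizard, the idea of "find the canonical URL, then ask the server whether it exists" can be sketched in a few lines of Python. This is only an illustration using the standard library, not how CSS HTML Validator's link checker is implemented; the names find_canonical, http_status and canonical_is_broken are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import urllib.error
import urllib.request


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel is a space-separated, case-insensitive token list
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical(html, base_url):
    """Return the absolute canonical URL declared in html, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return None
    return urljoin(base_url, finder.canonical)


def http_status(url, timeout=10):
    """Issue a HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # urllib raises 4xx/5xx responses as exceptions


def canonical_is_broken(html, base_url, status_of=http_status):
    """True if the page declares a canonical URL that answers with 4xx/5xx."""
    url = find_canonical(html, base_url)
    return url is not None and status_of(url) >= 400
```

The status_of parameter is there so the logic can be tried without network access, for example canonical_is_broken(page_html, page_url, status_of=lambda u: 404).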
Albert Wiersch, CSS HTML Validator Developer • Download CSS HTML Validator FREE Trial

Re: Is rel="canonical" supported?

Post by ktp »

In Batch Wizard options, I had:
Always generate a link report...: not checked
Include in the target report: Error links (404) : checked

So I believe I have to check the "Always generate a link report" option.

Re: Is rel="canonical" supported?

Post by Albert Wiersch »

ktp wrote: Sat Dec 19, 2020 12:22 pm In Batch Wizard options, I had:
Always generate a link report...: not checked
Include in the target report: Error links (404) : checked

So I believe I have to check the "Always generate a link report" option.
Hello,

I'm sorry for the confusion.

You shouldn't need to check that, but you can if you want. From the documentation:
Always generate a link report - Check this box to always generate a link report for every target, including for documents without any errors or warnings even when you have asked not to generate reports for documents without any errors or warnings (in this case, a link report will be generated but not a message report). This option is not enabled by default.

The primary thing you need to do is to enable link checking in the 'Tool to Use' options page in the Batch Wizard Options.

Re: Is rel="canonical" supported?

Post by ktp »

Excellent! Thank you very much for the tip. I wrongly assumed that the link checker was enabled by default, because detecting link errors is very important. Why is it not enabled by default?

Re: Is rel="canonical" supported?

Post by Albert Wiersch »

ktp wrote: Sat Dec 19, 2020 12:42 pm Excellent! Thank you very much for the tip. I wrongly assumed that the link checker was enabled by default, because detecting link errors is very important. Why is it not enabled by default?
I can understand wanting it enabled, but it's disabled by default because it takes time (can be a lot slower than validation), requires an Internet connection, and can make a lot of server requests (which may not be wanted)... so to "play it safe" the user needs to turn it on.

Re: Is rel="canonical" supported?

Post by ktp »

Thank you for the explanation. But I always run the validator (editor or Batch Wizard) against a URL, not against a file or a folder (I test with a virtual machine for development, or with a real Internet server for production). So what difference does enabling the link checker make in that situation?

By the way, I am running with link checking enabled. The statistics in the Batch Wizard window title show: Overall N% (xxx <> yyy), Links M% (aaa <> bbb). Note that aaa <= bbb, and xxx <= yyy. In my case, aaa and bbb are much greater than xxx and yyy. For example: bbb = 327 451, and yyy = 29 470.

An explanation of these figures (aaa, bbb, xxx, yyy, M, N) would be welcome.
Edit (add):
I understand that N% is xxx divided by yyy, as a percentage, and likewise M% is aaa divided by bbb. yyy is the number of links found, and xxx (always <= yyy) is the number of links processed. I need to understand aaa and bbb, and why their values are much greater than xxx or yyy.

Re: Is rel="canonical" supported?

Post by Albert Wiersch »

ktp wrote: Sat Dec 19, 2020 1:19 pm Thank you for the explanation. But I always run the validator (editor or Batch Wizard) against a URL, not against a file or a folder (I test with a virtual machine for development, or with a real Internet server for production). So what difference does enabling the link checker make in that situation?
I'm not sure I understand the question. With the link checker enabled, it will issue requests to the servers to check the URL (make sure it's not a 404), will take more time, and will generate/include the necessary/relevant link reports in the Batch Wizard report. Without the link checker enabled the Batch Wizard will still crawl a site but it won't do link checking or generate link reports.
ktp wrote: Sat Dec 19, 2020 1:19 pm By the way, I am running with link checking enabled. The statistics in the Batch Wizard window title show: Overall N% (xxx <> yyy), Links M% (aaa <> bbb). Note that aaa <= bbb, and xxx <= yyy. In my case, aaa and bbb are much greater than xxx and yyy. For example: bbb = 327 451, and yyy = 29 470.

An explanation of these figures (aaa, bbb, xxx, yyy, M, N) would be welcome.
Edit (add):
I understand that N% is xxx divided by yyy, as a percentage, and likewise M% is aaa divided by bbb. yyy is the number of links found, and xxx (always <= yyy) is the number of links processed. I need to understand aaa and bbb, and why their values are much greater than xxx or yyy.
xxx and yyy count the targets that the Batch Wizard is crawling, while aaa and bbb count the links being checked by the link checker, which include links to images and to external documents.

xxx - number of targets processed/validated (this will increase as targets are processed/validated)
yyy - total number of targets to process/validate (this will increase as targets are discovered and queued for processing)
aaa - number of links checked by the link checker (this will increase as links are checked by the link checker)
bbb - total number of links to be checked by the link checker (this will increase as new links are discovered and queued for link checking)
N% is xxx/yyy
M% is aaa/bbb

aaa and bbb are much greater because there are more links to check than targets to be validated. When I say "link to check" here, it means that the link is just checked to make sure it exists and isn't a 404 link; it isn't downloaded and validated. The xxx and yyy are targets that are actually downloaded and run through the validator engine to check for HTML/CSS syntax issues.
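As a worked example of the arithmetic, here is a small Python sketch that plugs in the two totals quoted above (yyy = 29 470 targets and bbb = 327 451 links). The "processed so far" values for xxx and aaa are invented for illustration, and whether the Batch Wizard rounds or truncates the percentage is an assumption; this sketch truncates.

```python
def percent(done, total):
    """Whole-number progress percentage; 0 when nothing is queued yet."""
    return 0 if total == 0 else done * 100 // total


# Totals reported in the thread
yyy = 29_470    # targets discovered and queued for validation
bbb = 327_451   # links discovered and queued for link checking

# Hypothetical "done so far" counters, chosen only for this example
xxx = 14_735    # targets already validated
aaa = 81_863    # links already checked

n = percent(xxx, yyy)   # the "Overall N%" figure
m = percent(aaa, bbb)   # the link checker's "M%" figure

# Average number of queued links per queued target
links_per_target = round(bbb / yyy, 1)

print(f"N = {n}%, M = {m}%, about {links_per_target} links per target")
```

With these numbers the crawl is half done while the link checker is a quarter done, and there are roughly eleven links queued for every target, which is why aaa and bbb dwarf xxx and yyy.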

I hope this helps. Please let me know if you have any more questions.

Re: Is rel="canonical" supported?

Post by Albert Wiersch »

I've incorporated the user function I posted earlier in this thread into the next update of CSS HTML Validator. It won't abort the validation or generate a message for documents with canonical URLs but it will automatically exclude them from Batch Wizard duplicate page title and duplicate meta description reports (if the canonical URL is not the same as the source document URL).

So there will be no need to use the user function from earlier in this thread in v21.0001 and later, unless you want to generate a validator message or abort the validation, in which case the user function is still required for that customization.
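To make the exclusion rule concrete, here is a rough Python sketch of a duplicate-title report that skips any page whose canonical URL differs from its own URL. It is not the Batch Wizard's actual code; the duplicate_titles function and the (url, title, canonical) tuple layout are assumptions made for this example, and URLs are compared as plain strings without any normalization.

```python
from collections import defaultdict


def duplicate_titles(pages):
    """Group page URLs by title, skipping pages canonicalised elsewhere.

    pages: iterable of (url, title, canonical) tuples; canonical may be None.
    Returns {title: [urls]} for titles shared by two or more remaining pages.
    """
    by_title = defaultdict(list)
    for url, title, canonical in pages:
        if canonical is not None and canonical != url:
            continue  # page defers to another URL, so leave it out
        by_title[title].append(url)
    return {t: urls for t, urls in by_title.items() if len(urls) > 1}


pages = [
    ("https://example.com/a", "Home", None),
    ("https://example.com/a?ref=1", "Home", "https://example.com/a"),
    ("https://example.com/b", "Home", "https://example.com/b"),
    ("https://example.com/c", "About", None),
]
report = duplicate_titles(pages)
```

Here the ?ref=1 variant is dropped because it points at /a as its canonical URL, while /a and /b are still reported as sharing the title "Home". The same grouping would apply to meta descriptions.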