Bad links reported by Link checker though no error

ktp
Rank III - Intermediate
Posts: 60
Joined: Sat Oct 29, 2016 10:34 am

Post by ktp »

Hello,

The link checker reports 5255 bad links (17.8% of total docs), but when I click on each reported link I do not see any error. For example, one link report says:
267 links extracted; 1 bad, 0 warning, 266 good, 0 not checked, 0 excluded, 1 in this link report
but it displays only one line, and its Status column says OK. Is this a false report? Please help.
Albert Wiersch
Site Admin
Posts: 3785
Joined: Sat Dec 11, 2004 9:23 am
Location: Near Dallas, TX

Post by Albert Wiersch »

Hello,

Sorry, I am not sure. I need more information.

Can you ZIP up your Batch Wizard report files and email them to me at support at htmlvalidator dot com? Thanks.
Albert Wiersch, CSS HTML Validator Developer • Download CSS HTML Validator FREE Trial

Post by ktp »

Hello, I just sent the ZIP.

Post by Albert Wiersch »

Got it, thank you.

Something is weird as the "OK" links are being considered bad. It looks like there is no HTTP status code (like 200 for OK links). I am wondering if your server isn't sending one.

Please go to 'File > Open from the Web' and try opening https://vm7raid.example.com/anh/chinois_33.gif

It should tell you it can't be opened but please copy and paste into a reply what's in the 'Open Progress' window so I can see the "Response>" lines.
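
For readers following along, here is a minimal Python sketch of the idea being tested here: a checker reads the status code from the first response line, and if no code can be found it cannot treat the link as good. This is only an illustration, not CSS HTML Validator's actual code; the function names `parse_status_line` and `is_ok` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches a status line such as "HTTP/1.1 200 OK" or "HTTP/2 200".
_STATUS_LINE = re.compile(r"^HTTP/\d(?:\.\d)?\s+(\d{3})(?:\s+(.*))?$")


def parse_status_line(line: str) -> Optional[Tuple[int, str]]:
    """Return (code, reason) from a status line, or None if no code is present."""
    match = _STATUS_LINE.match(line.strip())
    if match is None:
        return None
    return int(match.group(1)), (match.group(2) or "")


def is_ok(line: str) -> bool:
    """Treat only 2xx responses as good; a missing status code is not good."""
    parsed = parse_status_line(line)
    return parsed is not None and 200 <= parsed[0] < 300
```

With this sketch, `is_ok("HTTP/1.1 200 OK")` is true, while a response line without a recognizable code, such as a bare `"OK"`, is rejected.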

Post by ktp »

Here is the screen capture. The Chrome inspector also shows status code 200 OK.
Attachment: CSS_html_validator_2020-12-20_062432.jpg (307.2 KiB)

Post by ktp »

It seems that the Batch Wizard somehow misreads the status code for all URLs, not only for images.
For example, on page 535 of the report there are 6 plain HTML pages that are flagged as bad even though their status is 200 OK.

Post by Albert Wiersch »

Thank you. Please try this: go to 'File > Open from the Web' and open one of the URLs that has one or more bad links with an "OK" status. Validate it (F6), go to the 'Links' tab in the Results Window, and press the "Play" button (the green triangle button) to check the links. Does the 'Links' tab then show the same links as bad, even though they have an "OK" status?

Post by ktp »

I did as you said; most links have status OK. The exceptions (in the Status column) are:
- Some XML URLs: Expected media type "application/xml" but got "text/xml" instead. (status: check media type)
- Some JavaScript src URLs: Media type "text/javascript" is obsolete, recommend "application/javascript" instead. (status: obsolete media type)
---
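
To make the two non-OK statuses above concrete, here is a rough Python sketch of how such a media type check could be classified. It is an assumption about the logic, not the validator's real implementation; `classify_media_type`, `normalize`, and the `OBSOLETE` table are hypothetical names, and the table simply mirrors the message quoted in this thread rather than any authoritative registry.

```python
from typing import Mapping, Optional

# Example data copied from the link checker message quoted in this thread.
# It is not an authoritative list of obsolete media types.
OBSOLETE = {"text/javascript": "application/javascript"}


def normalize(media_type: str) -> str:
    """Drop parameters such as '; charset=utf-8' and lowercase the type."""
    return media_type.split(";", 1)[0].strip().lower()


def classify_media_type(
    actual: str,
    expected: Optional[str],
    obsolete: Mapping[str, str] = OBSOLETE,
) -> str:
    """Return a status string similar to the ones shown in the Status column."""
    actual_n = normalize(actual)
    if actual_n in obsolete:
        return "obsolete media type"
    if expected is not None and actual_n != normalize(expected):
        return "check media type"
    return "OK"
```

Note that neither status means the link is broken: in both cases the server answered, and only the Content-Type differs from what the checker expected.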

Post by Albert Wiersch »

ktp wrote: Sun Dec 20, 2020 1:29 am I did as you said; most links have status OK. The exceptions (in the Status column) are:
- Some XML URLs: Expected media type "application/xml" but got "text/xml" instead. (status: check media type)
- Some JavaScript src URLs: Media type "text/javascript" is obsolete, recommend "application/javascript" instead. (status: obsolete media type)
---
So the status column says "OK" but does the link show up red as an error? Can you provide a screenshot? Thank you.

Post by ktp »

I sent the information by private message.
I hope to hear back soon.

Post by Albert Wiersch »

Thank you! I received it and responded.

Post by ktp »

In response to your question in the private message: there is no difference in results between the test web server and the production web server.

Here is what I can say, in chronological order:
- With v20, for me the editor (Results Window) and the Batch Wizard are in sync in reporting error and warning messages.
- With v21, there is a mismatch in message reporting, which is still being addressed in this topic: viewtopic.php?f=1&t=3023
- I discovered the link checker ('Tool to use') in the Batch Wizard options and enabled it. The Batch Wizard gives me a huge number of bad links (but for me there is no error with those links), which is the subject of this topic. The 'Links' tab in the Results Window shows no error, but the Batch Wizard reports errors, although all the mentioned URLs have OK in the Status column. This is the problem.
- Now I found something interesting: I discovered that the default Link checker option is Full, so I set it to "Errors only" (which is the default in my editor). With this setting there are no more false reports of bad links: I now get 80 real bad links instead of 5255 false bad links as before.

Hope this helps. For me the problem is solved, since I now get the real bad links reported.

Note: by the way, still about the mismatch in option settings. If the editor is set to "Errors only", why does the Batch Wizard not follow it? Ah OK, maybe the link checker is another component and has different settings? But this is confusing, since both components should obey the same "Validator Engine Options". OK, this is why it is confusing: there are Validator Engine Options and Batch Wizard Options (with the gear icon). The Batch Wizard options should be named "Wizard options" with the gear icon, instead of only "Options". This would avoid confusion (at least for me), since there is Options in the main window's menu and also Options in the Batch Wizard's menu.

Edit (add):
Ah, in fact I think I understand now (hope so :-)). There are different options. "Errors only" in the editor (Validator Engine Options) is for checking HTML content, and "Errors only" in the Link checker of the Batch Wizard is for checking URLs (404 etc.). Is this correct?

Post by Albert Wiersch »

ktp wrote: Sun Dec 20, 2020 11:41 pm - I discovered the link checker ('Tool to use') in the Batch Wizard options and enabled it. The Batch Wizard gives me a huge number of bad links (but for me there is no error with those links), which is the subject of this topic. The 'Links' tab in the Results Window shows no error, but the Batch Wizard reports errors, although all the mentioned URLs have OK in the Status column. This is the problem.
When you say the 'Links' tab in the Results Window has no errors, do you mean it's all green? Or are there any "red" links (the colored square in the 'Order' column is red when CSS HTML Validator considers the link to be bad and green when it's good)? This is where a screenshot would be very useful, especially because I cannot seem to reproduce this issue on my end. From the Batch Wizard report you sent, it seems that links are being reported with a status of "OK" but still registering as an error. So when you say that the links are OK, I don't know whether you mean the status is being reported as "OK" but still counted as an error, or the status is "OK" and it's not being counted as an error.
ktp wrote: Sun Dec 20, 2020 11:41 pm - Now I found something interesting: I discovered that the default Link checker option is Full, so I set it to "Errors only" (which is the default in my editor). With this setting there are no more false reports of bad links: I now get 80 real bad links instead of 5255 false bad links as before.
The 'Errors only' setting is for the validator, not the link checker. I do not know why this setting would fix the link checker problem. Perhaps it's a coincidence and something has changed on the server and it is now reporting correctly to the link checker?
ktp wrote: Sun Dec 20, 2020 11:41 pm Hope this helps. For me the problem is solved, since I now get the real bad links reported.
That's great but again, I do not know why changing the Batch Wizard 'Tool to Use' setting to 'Errors only' would fix this link checker issue.
ktp wrote: Sun Dec 20, 2020 11:41 pm Note: by the way, still about the mismatch in option settings. If the editor is set to "Errors only", why does the Batch Wizard not follow it?
The Batch Wizard uses the settings in the Batch Wizard Options 'Tool to Use' page while the editor uses the other setting. This allows you to have two different validation mode settings, because often you may want to set the Batch Wizard to use a setting that generates fewer validator messages (like 'Errors only').
ktp wrote: Sun Dec 20, 2020 11:41 pm OK, this is why it is confusing: there are Validator Engine Options and Batch Wizard Options (with the gear icon). The Batch Wizard options should be named "Wizard options" with the gear icon, instead of only "Options". This would avoid confusion (at least for me), since there is Options in the main window's menu and also Options in the Batch Wizard's menu.
I think I understand what you are saying, but the "Options" button in the editor is for options tailored to the editor, and the "Options" button in the Batch Wizard is for options that apply to the Batch Wizard. The buttons look the same, with the same image, but they bring up different options depending on which window they are in.
ktp wrote: Sun Dec 20, 2020 11:41 pm Edit (add):
Ah, in fact I think I understand now (hope so :-)). There are different options. "Errors only" in the editor (Validator Engine Options) is for checking HTML content, and "Errors only" in the Link checker of the Batch Wizard is for checking URLs (404 etc.). Is this correct?
No, the 'Errors only' option in the Batch Wizard Options 'Tool to Use' page is for the validator and not the link checker, and the 'Errors only' option in the editor is also for the validator and not the link checker. There is no selection of 'Errors only' or 'Errors and warnings only' for the link checker because that is only for the validator. They are 'validation modes' and do not apply to the link checker. You can read more about the validation modes here:
https://www.htmlvalidator.com/current/d ... _modes.htm

I hope this helps clarify things. I will be doing further research on the link checker issues you are having on Monday (later today because it is 12:15 AM on Monday right now :) ).

Post by ktp »

Albert Wiersch wrote: Mon Dec 21, 2020 12:14 am
ktp wrote: Sun Dec 20, 2020 11:41 pm - Now I found something interesting: I discovered that the default Link checker option is Full, so I set it to "Errors only" (which is the default in my editor). With this setting there are no more false reports of bad links: I now get 80 real bad links instead of 5255 false bad links as before.
The 'Errors only' setting is for the validator, not the link checker. I do not know why this setting would fix the link checker problem. Perhaps it's a coincidence and something has changed on the server and it is now reporting correctly to the link checker?
Here are two screenshots of my runs:

1) Dec 19th 2020, 21h28, 5255 false bad links reported in 541 pages. Batch Wizard option/Tool to use/Validator/Full.
2) Dec 20th 2020, 21h05, 80 real bad links reported in 23 pages. Batch Wizard option/Tool to use/Validator/Errors only.

False bad links: they are reported as bad, although the Status column marks them all as OK (you can see this in my zipped report). This is the big problem.
Real bad links: I agree that the reported links are bad. This is the information I need, and I have to fix them.
As far as I remember, I did not change anything on the server between the two runs.

On Dec 20th 2020, I changed the option to "Errors only" in the hope of getting rid of the false bad links that cluttered the Batch Wizard report and could have made me miss some real bad links (it is difficult for a human to browse through 541 pages).
Attachments:
CSS_html_5255_false_bad_links_2020-12-19_21-28_541pages.jpg (171.31 KiB)
CSS_html_80_real_bad_links_2020-12-20_21-05_23pages.jpg (176.25 KiB)

Post by ktp »

Here is another screen capture, from what the tool itself calls the Link report. It says: 1001 links extracted: 3 bad, 0 warning, 997 good, 1 not checked, 0 excluded, 3 in this link report. Then it shows 3 lines, each with a link, and the status is OK for all 3 links. Note: on other pages of the report, if 6 links are listed, then it says 6 bad, and so on.

So for an ordinary user, it is contradictory! It is as if the Link report were a person saying to me: "Hey, I checked the links for you, there are 3 bad links, and here is the proof: all 3 links have status 200 OK. Do you agree?" You can guess my answer! Unless there are recent changes in the HTTP protocol that I am unaware of, so that 2xx codes are now swapped with 4xx or 5xx :-). By the way, the server supports HTTP/2.
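
The contradiction described above can be checked mechanically. Below is a small Python sketch that parses a summary line in the format quoted in this thread, verifies that the categories add up to the total, and lists rows that appear in the bad-link report while showing an OK status. The function names are hypothetical, and the rule that the five categories sum to the number of extracted links is an assumption inferred from the two summaries quoted in this thread.

```python
import re
from typing import Dict, List, Tuple

# Labels as they appear in the summary lines quoted in this thread.
_COUNT = re.compile(
    r"(\d+)\s+(links extracted|bad|warning|good|not checked|excluded|"
    r"in this link report)"
)


def parse_summary(summary: str) -> Dict[str, int]:
    """Turn '267 links extracted; 1 bad, ...' into a label -> count mapping."""
    return {label: int(number) for number, label in _COUNT.findall(summary)}


def counts_add_up(counts: Dict[str, int]) -> bool:
    """Assumed invariant: the five categories sum to the extracted total."""
    parts = ("bad", "warning", "good", "not checked", "excluded")
    return counts["links extracted"] == sum(counts[p] for p in parts)


def contradictory_rows(rows: List[Tuple[str, str]]) -> List[str]:
    """Return URLs listed in a bad-link report whose status is 'OK'."""
    return [url for url, status in rows if status.strip().upper() == "OK"]
```

Applied to the summary above, the counts are internally consistent (3 + 0 + 997 + 1 + 0 = 1001), so the inconsistency lies only between the "bad" count and the OK status shown on each listed row.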
Attachment: CSS_html_bad_links_report_2020-12-21_055952.jpg (64.49 KiB)