Batch Wizard doesn't propagate jsessionid
I am trying to link check a site that uses JSESSIONID. I'm not sure what's wrong with how the links are being generated, but I'm getting lists of hundreds of pages when the whole site probably contains fewer than 50. The pages are generated dynamically, so I don't have an exact count. Is there a way to make the batch link checker ignore the JSESSIONID when deciding whether it has already crawled a link? I have 9.0302
- Albert Wiersch
- Site Admin
- Posts: 3785
- Joined: Sat Dec 11, 2004 9:23 am
- Location: Near Dallas, TX
- Contact:
Re: Batch Wizard doesn't propagate jsessionid
I'm sorry for the trouble.
Could you send me your target list and a sample Batch Wizard report? Send them to support at htmlvalidator dot com. Thank you.
Albert Wiersch, CSS HTML Validator Developer • Download CSS HTML Validator FREE Trial
Re: Batch Wizard doesn't propagate jsessionid
I can't send a target report because so far the batch wizard has never completed. I've just set it off to run again.
I'll send the site by private email.
...
After running the link checker again for 23 minutes (with a 1 second delay between requests), the list had grown past 1400 links and pages had started to time out. I just canceled the run. I'll send what I have.
- Albert Wiersch
Re: Batch Wizard doesn't propagate jsessionid
bknights wrote: I can't send a target report because so far the batch wizard has never completed. I've just set it off to run again.
I'll send the site by private email.
...
after running the link checker again for 23 minutes (1 second delay between requests) the list was up above 1400 links and pages had started to time out. I just canceled the run. I'll send what I have.
Thanks. I got it and plan on checking it out tomorrow. A canceled Batch Wizard report should be fine to help in finding the problem.
- Albert Wiersch
Re: Batch Wizard doesn't propagate jsessionid
Hi Brett,
I've reviewed the report files. It seems there is more than one session ID. This could possibly be solved with an option to strip the jsessionid out of the URL when comparing whether a link has already been checked. Do you think this would address the problem? If so, I may be able to add this in the next major release, but I can't make any guarantees.
A workaround you could use now would be to add each link you want to check manually to the Batch Wizard and not have it follow links.
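To illustrate the comparison idea described above, here is a minimal Python sketch. It is not how CSS HTML Validator works internally; the function names `strip_jsessionid` and `unique_links` are hypothetical, and it assumes the session ID appears either as a `;jsessionid=...` path parameter or as a `jsessionid=...` query parameter.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches a ";jsessionid=..." path parameter (case-insensitive).
_PATH_PARAM = re.compile(r";jsessionid=[^;/?#]*", re.IGNORECASE)


def strip_jsessionid(url: str) -> str:
    """Return the URL with any jsessionid removed, for comparison only."""
    parts = urlsplit(url)
    # Remove the path-parameter form, e.g. /page.jsp;jsessionid=ABC123
    path = _PATH_PARAM.sub("", parts.path)
    # Remove the query-parameter form, e.g. ?jsessionid=ABC123&id=5
    query = "&".join(
        piece
        for piece in parts.query.split("&")
        if piece and not piece.lower().startswith("jsessionid=")
    )
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


def unique_links(urls):
    """Keep the first URL seen for each session-independent address."""
    seen = set()
    result = []
    for url in urls:
        key = strip_jsessionid(url)
        if key not in seen:
            seen.add(key)
            result.append(url)
    return result
```

With this kind of normalization, two links that differ only in their session ID compare as equal, so a crawler would visit the page once instead of once per session.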
Re: Batch Wizard doesn't propagate jsessionid
Albert,
Yes, stripping out or ignoring the jsessionid would work quite well.
Regarding your workaround: can I load a list of links or do I have to do that one-by-one?
- Albert Wiersch
Re: Batch Wizard doesn't propagate jsessionid
bknights wrote: Albert,
Yes stripping out or ignoring the jsessionid would work quite well.
Regarding your workaround: can I load a list of links or do I have to do that one-by-one?
OK, I will consider an option to strip "jsessionid" when comparing URLs for a future version, though I am unable to make any guarantees. Thanks for the suggestion!
As for loading a list of links, yes, you can make your own Batch Wizard target list file. For more information, please see:
https://www.htmlvalidator.com/current/d ... ormats.htm
The easiest format is something like this:
URL : http://www.htmlvalidator.com/page1.html
URL : http://www.htmlvalidator.com/page2.html
URL : http://www.htmlvalidator.com/page3.html
URL : http://www.htmlvalidator.com/page4.html
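If the list of pages is long, a file in the format above could be generated with a short script rather than typed by hand. The Python sketch below assumes only the `URL : <address>` line format shown above; the function name `build_target_list` is hypothetical, and the file name and extension the Batch Wizard expects should be taken from the documentation linked above.

```python
def build_target_list(urls):
    """Return target list text with one 'URL : <address>' line per URL."""
    return "".join(f"URL : {url}\n" for url in urls)


if __name__ == "__main__":
    # Example pages; replace with the pages you actually want checked.
    pages = [
        "http://www.htmlvalidator.com/page1.html",
        "http://www.htmlvalidator.com/page2.html",
    ]
    # Print the list; redirect the output to a file to save it.
    print(build_target_list(pages), end="")
```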