Batch Wizard doesn't propagate jsessionid

For technical support for all editions of CSS HTML Validator. Includes bug reports.
bknights
Rank 0 - Newcomer
Posts: 5
Joined: Mon Jan 16, 2006 2:20 pm

Batch Wizard doesn't propagate jsessionid

Post by bknights » Wed Sep 30, 2009 3:45 pm

I am trying to link check a site that uses JSESSIONID. I'm not sure what's wrong with how the links are being generated, but I'm getting lists of hundreds of pages when the whole site probably contains fewer than 50. The pages are generated dynamically, so I don't have an exact count. Is there a way to make the batch link checker ignore the JSESSIONID when deciding whether it has already crawled a link? I have version 9.0302.

Albert Wiersch
Site Admin
Posts: 3451
Joined: Sat Dec 11, 2004 9:23 am
Location: Near Dallas, TX

Re: Batch Wizard doesn't propagate jsessionid

Post by Albert Wiersch » Wed Sep 30, 2009 5:56 pm

I'm sorry for the trouble.

Could you send me your target list and a sample Batch Wizard report? Please send them to support at htmlvalidator dot com. Thank you.
Albert Wiersch

bknights
Rank 0 - Newcomer
Posts: 5
Joined: Mon Jan 16, 2006 2:20 pm

Re: Batch Wizard doesn't propagate jsessionid

Post by bknights » Thu Oct 01, 2009 1:53 pm

I can't send a target report because so far the batch wizard has never completed. I've just set it off to run again.
I'll send the site by private email.

...
After running the link checker again for 23 minutes (with a 1-second delay between requests), the list had grown to more than 1400 links and pages had started to time out. I just canceled the run. I'll send what I have.

Albert Wiersch
Site Admin
Posts: 3451
Joined: Sat Dec 11, 2004 9:23 am
Location: Near Dallas, TX

Re: Batch Wizard doesn't propagate jsessionid

Post by Albert Wiersch » Thu Oct 01, 2009 3:20 pm

bknights wrote:
I can't send a target report because so far the batch wizard has never completed. I've just set it off to run again.
I'll send the site by private email.

...
after running the link checker again for 23 minutes (1 second delay between requests) the list was up above 1400 links and pages had started to time out. I just canceled the run. I'll send what I have.

Thanks. I got it and plan on checking it out tomorrow. A canceled Batch Wizard report should be fine to help in finding the problem.
Albert Wiersch

Albert Wiersch
Site Admin
Posts: 3451
Joined: Sat Dec 11, 2004 9:23 am
Location: Near Dallas, TX

Re: Batch Wizard doesn't propagate jsessionid

Post by Albert Wiersch » Fri Oct 02, 2009 10:30 am

Hi Brett,

I've reviewed the report files. It seems there is more than one session ID, so the same page is being treated as a new link each time it appears with a different ID. It is possible this could be solved with an option to strip the jsessionid out of the URL when comparing whether a link has already been checked. Do you think this would address the problem? If so, I may be able to add this in the next major release, but I can't make any guarantees.
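
To illustrate the comparison idea described above, here is a minimal Python sketch. It is not how the Batch Wizard works internally; the function names `strip_jsessionid` and `unique_pages` are hypothetical, and it assumes the session ID appears either as a `;jsessionid=...` path parameter or as a `jsessionid` query parameter.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Matches a ";jsessionid=..." path parameter, in any letter case.
_PATH_SESSION = re.compile(r";jsessionid=[^/?#;]*", re.IGNORECASE)


def strip_jsessionid(url):
    """Return the URL with any jsessionid removed (hypothetical helper)."""
    parts = urlsplit(url)
    # urlsplit leaves ";params" inside the path, so the regex can remove it.
    path = _PATH_SESSION.sub("", parts.path)
    # Also drop a jsessionid passed as an ordinary query parameter.
    query = urlencode(
        [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
         if k.lower() != "jsessionid"]
    )
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


def unique_pages(urls):
    """Keep the first occurrence of each page, comparing without session IDs."""
    seen = set()
    result = []
    for url in urls:
        key = strip_jsessionid(url)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result
```

With this kind of normalization, two links to the same page that differ only in their session ID compare as equal, so a crawler would visit the page once.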

A solution you could use now would be to add each link you want to check manually to the Batch Wizard and not have it follow links.
Albert Wiersch

bknights
Rank 0 - Newcomer
Posts: 5
Joined: Mon Jan 16, 2006 2:20 pm

Re: Batch Wizard doesn't propagate jsessionid

Post by bknights » Mon Oct 05, 2009 7:27 pm

Albert,
Yes, stripping out or ignoring the jsessionid would work quite well.
Regarding your workaround: can I load a list of links, or do I have to add them one by one?

Albert Wiersch
Site Admin
Posts: 3451
Joined: Sat Dec 11, 2004 9:23 am
Location: Near Dallas, TX

Re: Batch Wizard doesn't propagate jsessionid

Post by Albert Wiersch » Tue Oct 06, 2009 9:07 am

bknights wrote:
Albert,
Yes stripping out or ignoring the jsessionid would work quite well.
Regarding your workaround: can I load a list of links or do I have to do that one-by-one?

OK, I will consider an option to strip "jsessionid" when comparing URLs for a future version, though I am unable to make any guarantees. Thanks for the suggestion!

As for loading a list of links, yes, you can make your own Batch Wizard target list file. For more information, please see:
http://www.htmlvalidator.com/htmlval/v9 ... ormats.htm

The easiest format is something like this:

URL : http://www.htmlvalidator.com/page1.html
URL : http://www.htmlvalidator.com/page2.html
URL : http://www.htmlvalidator.com/page3.html
URL : http://www.htmlvalidator.com/page4.html
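
If the list of pages is long, a target list in the format above could be generated with a short script instead of typed by hand. The following Python sketch assumes only the `URL : <address>` line format shown above; the function names and the output filename are hypothetical, and the file extension the Batch Wizard expects should be taken from the documentation linked above.

```python
from pathlib import Path


def format_target_list(urls):
    """Return one 'URL : <address>' line per unique URL, in first-seen order."""
    seen = set()
    lines = []
    for url in urls:
        url = url.strip()
        # Skip blank entries and exact repeats.
        if url and url not in seen:
            seen.add(url)
            lines.append(f"URL : {url}")
    return "".join(f"{line}\n" for line in lines)


def write_target_list(urls, path):
    """Write the formatted list to a file (the path is chosen by the caller)."""
    Path(path).write_text(format_target_list(urls), encoding="utf-8")


if __name__ == "__main__":
    pages = [
        "http://www.htmlvalidator.com/page1.html",
        "http://www.htmlvalidator.com/page2.html",
    ]
    # "targets.txt" is a placeholder name, not a documented Batch Wizard filename.
    write_target_list(pages, "targets.txt")
```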
Albert Wiersch
