Faster Batch validation, yet again.

roedygr
Rank V - Professional
Posts: 367
Joined: Fri Feb 17, 2006 5:22 am
Location: Victoria BC Canada

Post by roedygr »

I have been massively speeding up some of my own Java code by using threads. However, I understand that threading is much harder to do in C++, which I believe is the language HTMLValidator is written in.

I thought you might do it this way, which avoids the threading complications.

Write a single-threaded batch validator that takes a list of files and writes its results to a file in a computer-friendly format.

Then you can configure N, the number of instances to run. The master program that kicks off the batch process collects the list of files to process. It "deals" them into N piles, much the way you would deal cards. Then it launches N independent copies of the batch validator, each handed 1/Nth of the files to process. These processes should terminate at roughly the same time. Then the master program collects the output files, sorts them, formats them into HTML, and prepares a list of files that have problems, which can be used to load them into the editor.
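To make the "dealing" idea concrete, here is a rough sketch in Python. It assumes a hypothetical command-line batch validator that takes a list file and an output file as its two arguments; no such command is known to exist in HTMLValidator, so `validator_cmd` is only a placeholder, as are the function names.

```python
import subprocess
import tempfile
from pathlib import Path


def deal(files, n):
    """Deal files into n piles round-robin, the way you would deal cards."""
    piles = [[] for _ in range(n)]
    for i, name in enumerate(files):
        piles[i % n].append(name)
    return piles


def run_batch(files, n, validator_cmd):
    """Launch one worker process per non-empty pile and merge the result lines.

    validator_cmd is a hypothetical command prefix (a list of strings).
    Each worker is assumed to accept: <list file> <output file>.
    """
    workdir = Path(tempfile.mkdtemp())
    jobs = []
    for k, pile in enumerate(deal(files, n)):
        if not pile:
            continue
        list_file = workdir / f"pile{k}.txt"
        out_file = workdir / f"result{k}.txt"
        list_file.write_text("\n".join(pile) + "\n")
        # Start the worker without waiting, so all piles run at the same time.
        proc = subprocess.Popen(validator_cmd + [str(list_file), str(out_file)])
        jobs.append((proc, out_file))

    lines = []
    for proc, out_file in jobs:
        proc.wait()
        if out_file.exists():
            lines.extend(out_file.read_text().splitlines())
    return sorted(lines)
```

Round-robin dealing keeps the piles within one file of each other in size, which is why the workers should finish at roughly the same time when the files are similar in size.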

Another wrinkle would be to use an SQL database in which you record the problems discovered. That way you can bypass processing if the file is unchanged since the date recorded in the database.
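A minimal sketch of that database idea, using SQLite from Python. The table layout and the function names are my own assumptions, and it compares the file's modification time and size rather than only a date.

```python
import os
import sqlite3


def open_cache(db_path=":memory:"):
    """Open (or create) the results cache. The schema is illustrative only."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "path TEXT PRIMARY KEY, mtime_ns INTEGER, size INTEGER, problems TEXT)"
    )
    return conn


def is_unchanged(conn, path):
    """True if the file's mtime and size match what was recorded last time."""
    st = os.stat(path)
    row = conn.execute(
        "SELECT mtime_ns, size FROM results WHERE path = ?", (path,)
    ).fetchone()
    return row == (st.st_mtime_ns, st.st_size)


def record(conn, path, problems):
    """Store the problems found for a file along with its current mtime and size."""
    st = os.stat(path)
    conn.execute(
        "INSERT OR REPLACE INTO results VALUES (?, ?, ?, ?)",
        (path, st.st_mtime_ns, st.st_size, problems),
    )
    conn.commit()
```

The batch run would call `is_unchanged` for each file, reuse the stored problems when it returns true, and validate and `record` the file otherwise.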

I keep pushing you so that I can mindlessly run "validate everything" periodically and have it finish in a few seconds if I have not changed all that much.
Albert Wiersch
Site Admin
Posts: 3785
Joined: Sat Dec 11, 2004 9:23 am
Location: Near Dallas, TX

Re: Faster Batch validation, yet again.

Post by Albert Wiersch »

Yes, there would definitely be "complications" to resolve and lots of testing to make the DLL engine multi-threaded. It is something that I'd like to do, but other things have had higher priority because the quality of syntax checking is more important than Batch Wizard speed.

Your suggestion might work for local file and folder targets, but many users enter a URL target and spider the site. The pages need to be validated to extract the links to follow and check, so I don't believe it would work in that case, because the number of documents to check is unknown until the job is completed.
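To illustrate why the list cannot be dealt out in advance when spidering, here is a simplified sketch of a crawl loop in Python. This is not how the Batch Wizard is actually implemented; `fetch_links` is a hypothetical stand-in for "validate the page and return the links found in it".

```python
from collections import deque


def crawl(start, fetch_links):
    """Breadth-first spider: validating a page is what reveals the next pages."""
    seen = {start}
    frontier = deque([start])
    order = []
    while frontier:
        url = frontier.popleft()
        order.append(url)              # the page would be validated here
        for link in fetch_links(url):  # links are only known after that step
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order
```

The full set of documents exists only once the loop has finished, so there is no complete list to split into N piles at the start of the job.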

I'm going to see if I can do anything about this issue in v12, but only after other features that I consider higher priority have been added. I do hope a solution can be found to help speed up your Batch Wizard jobs.

What about the 'Limit to documents modified within this number of days' option (for folder targets)? Do you use that option?
Albert Wiersch, CSS HTML Validator Developer