Hello,
It takes more than 18 hours for batch wizard to crawl my website.
Is there a feature I can disable to speed up the crawl?
Thank you,
Alex.
Batch wizard performance
-
- Rank II - Novice
- Posts: 34
- Joined: Wed May 05, 2010 4:41 pm
Conversions and Calculations
https://www.aqua-calc.com
-
- Site Admin
- Posts: 3785
- Joined: Sat Dec 11, 2004 9:23 am
- Location: Near Dallas, TX
Re: Batch wizard performance
Hi Alex,
Great question. I have mulled over some possibilities for increasing the Batch Wizard speed for large jobs like yours.
Here is the documentation page that I'm working on:
https://www.htmlvalidator.com/2020/docs ... peedup.htm
There are several things to try. Do you know what is causing the most delay? Is a CPU core stuck at 100% validating documents? Is the computer doing a lot of waiting for the HTTP requests to finish before it can validate? Is the link checking causing a long delay?
If there were a way to validate only the URLs that have changed since the last validation, would that be acceptable? That could significantly speed things up by greatly reducing the number of documents that need to be checked.
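To illustrate the changed-URLs idea in general terms, here is a minimal sketch in Python. It is not a Batch Wizard feature or API; the function names `content_hash` and `select_changed`, and the choice of comparing SHA-256 digests of page bodies, are assumptions made purely for illustration. It assumes the page bodies have already been downloaded, so it would save validation time but not HTTP time.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return a SHA-256 digest of a page body (hypothetical helper)."""
    return hashlib.sha256(body).hexdigest()


def select_changed(pages: dict[str, bytes], previous: dict[str, str]):
    """Return (urls_to_validate, new_hashes).

    pages    maps each URL to the body fetched in this run.
    previous maps each URL to the digest stored after the last run.
    A URL is selected when it is new or its digest differs.
    """
    current = {url: content_hash(body) for url, body in pages.items()}
    changed = [url for url, digest in current.items()
               if previous.get(url) != digest]
    return changed, current


# First run: nothing stored yet, so every URL is selected.
first_run, hashes = select_changed(
    {"/a": b"<p>1</p>", "/b": b"<p>2</p>"}, {})

# Second run: only "/b" has a different body, so only it is selected.
second_run, hashes = select_changed(
    {"/a": b"<p>1</p>", "/b": b"<p>changed</p>"}, hashes)
```

In this sketch the stored digests would need to be saved between runs (for example in a file), and pages with content that changes on every request, such as timestamps, would always be selected.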
Albert Wiersch, CSS HTML Validator Developer • Download CSS HTML Validator FREE Trial
-
- Rank II - Novice
- Posts: 34
- Joined: Wed May 05, 2010 4:41 pm
Re: Batch wizard performance
Hello,
I've disabled the following checks, and the elapsed time decreased from 18 to 14 hours:
JSHint, JSLint, PHP Checker, Security Messages, Search Engine Messages, the keyword density message, and spell checking.
That's a very good improvement for me.
-- Thank you, Albert, for the published "wizard_speedup.htm" document!
Albert, how expensive are the checks for duplicate HTML titles and meta descriptions?
Thank you!
Alex
-
- Site Admin
- Posts: 3785
- Joined: Sat Dec 11, 2004 9:23 am
- Location: Near Dallas, TX
Re: Batch wizard performance
Hi Alex,
That's great. That's a 22% improvement if my calculations are correct.
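For anyone who wants to check that figure, the calculation can be sketched as follows. The function name `percent_reduction` is only illustrative; the 18 and 14 hour values come from the posts above.

```python
def percent_reduction(before_hours: float, after_hours: float) -> float:
    """Percentage by which the elapsed time dropped relative to the original."""
    return (before_hours - after_hours) / before_hours * 100


# (18 - 14) / 18 * 100 = 22.2..., which rounds to the 22% quoted above.
print(f"{percent_reduction(18, 14):.1f}%")  # prints "22.2%"
```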
I have not done any tests to figure out how expensive the duplicate page title and duplicate meta description tests are. You process a large number of documents so they could be "expensive". If you want to find out though, then I can probably add some timing information to the Batch Wizard progress window that will spit out some processing times for these reports. It would have to be for the next major release of CSS HTML Validator which will be 2020/v20. If you'd like to test this out and help test a BETA version of the upcoming new major release then please let me know and I will put it on my to-do list.
Albert Wiersch, CSS HTML Validator Developer • Download CSS HTML Validator FREE Trial
-
- Rank II - Novice
- Posts: 34
- Joined: Wed May 05, 2010 4:41 pm
Re: Batch wizard performance
Hi Albert,
Yes, by all means. I'd like timing information:-)
Alex.
-
- Site Admin
- Posts: 3785
- Joined: Sat Dec 11, 2004 9:23 am
- Location: Near Dallas, TX
Re: Batch wizard performance
Hi Alex,
Great. I will work on that in the next week or two and get a BETA to you for testing when ready.
Albert Wiersch, CSS HTML Validator Developer • Download CSS HTML Validator FREE Trial
-
- Site Admin
- Posts: 3785
- Joined: Sat Dec 11, 2004 9:23 am
- Location: Near Dallas, TX
Re: Batch wizard performance
Hi Alex,
I hope to get a BETA to you by the end of this week.
I also wanted to let you know that I forgot to put HTML Tidy checking on that page. You're probably not using that, but if you are, then disabling HTML Tidy checking could save a significant amount of time. The next version of the documentation page will include this.
Albert Wiersch, CSS HTML Validator Developer • Download CSS HTML Validator FREE Trial
-
- Rank II - Novice
- Posts: 34
- Joined: Wed May 05, 2010 4:41 pm
Re: Batch wizard performance
Hi Albert,
No, I wasn't using HTML Tidy checking.
I'm looking forward to testing the beta release.
Best,
Alex.
-
- Site Admin
- Posts: 3785
- Joined: Sat Dec 11, 2004 9:23 am
- Location: Near Dallas, TX
Re: Batch wizard performance
In case anyone following this topic is interested: after adding the timing information for the duplicate title and meta description reports and doing some test runs, the results did not show that those reports were taking an inordinate amount of time.
Albert Wiersch, CSS HTML Validator Developer • Download CSS HTML Validator FREE Trial