Spell check eternal exceptions

For topics about current BETA or future releases, including feature requests.
roedygr
Rank V - Professional
Posts: 370
Joined: Fri Feb 17, 2006 5:22 am
Location: Victoria BC Canada

Spell check eternal exceptions

Post by roedygr » Thu Apr 24, 2014 2:39 am

In spell checking, when it finds a word not in the dictionary, I have two choices:

1. add a new word to the dictionary
2. ignore the problem.

If I add it to the dictionary, this word becomes legit in all other contexts, something often inappropriate.

If I ignore it, the next time I spell check, it will pester me again and again.

I would like a third alternative: to mark the word as permanently ignored. Htmvalidator would insert some markup into my text around the offending word to suppress future spell checking, e.g. <span lang="x-none">... </span>

If I am computer generating text, I can put these tags in myself around material it would make no sense to spell check.
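
A minimal sketch of how such markup could be honoured, written in Python rather than against the validator itself. It assumes the lang="x-none" convention proposed above (it is not an existing CSE HTML Validator feature), and the names SpellTextExtractor and text_to_check are made up for illustration. It collects only the text a spell checker should see and lets a nested lang attribute switch checking back on.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SpellTextExtractor(HTMLParser):
    """Collect text for spell checking, skipping elements marked lang="x-none"."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []    # (tag, suppressed) for every open element
        self.chunks = []   # text fragments that should be spell checked

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        parent_suppressed = self.stack[-1][1] if self.stack else False
        lang = dict(attrs).get("lang")
        # An explicit lang attribute overrides the inherited state.
        suppressed = parent_suppressed if lang is None else lang == "x-none"
        self.stack.append((tag, suppressed))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerate sloppy markup.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if self.stack and self.stack[-1][1]:
            self.chunks.append(" ")   # placeholder so neighbouring words do not fuse
        else:
            self.chunks.append(data)


def text_to_check(html_source):
    """Return the text of html_source with suppressed spans blanked out."""
    parser = SpellTextExtractor()
    parser.feed(html_source)
    parser.close()
    return "".join(parser.chunks)
```

For example, text_to_check('<p>Hello <span lang="x-none">xyzzy</span> world</p>') keeps "Hello" and "world" and drops "xyzzy". Computer-generated pages could emit the same span around material that makes no sense to check.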

Albert Wiersch
Site Admin
Posts: 3221
Joined: Sat Dec 11, 2004 9:23 am
Location: Near Dallas, TX

Re: Spell check eternal exceptions

Post by Albert Wiersch » Thu Apr 24, 2014 6:59 am

What about a way to add a special HTML comment that contains a list of extra words to ignore from the point of the comment to the end of the document? Note that this would only work for the validator spell checking message and not for an 'F7' spell check in the editor.
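
To make the idea concrete, here is a rough Python sketch of the "ignore from this point to the end of the document" behaviour. The comment syntax <!-- spell-ignore: word1 word2 --> is purely hypothetical (nothing in the validator defines it), the dictionary is a plain set of lowercase words, and the tag stripping is a crude regular expression rather than a real HTML parser.

```python
import re

# Hypothetical comment syntax; assumed for illustration only.
IGNORE_COMMENT = re.compile(r"<!--\s*spell-ignore:(.*?)-->", re.DOTALL)
TAG = re.compile(r"<[^>]*>")
# Words may contain inner dots or apostrophes (e.g. SpamCop.net, don't).
WORD = re.compile(r"[A-Za-z][A-Za-z'.]*[A-Za-z]|[A-Za-z]")


def _check(fragment, dictionary, ignored):
    """Return words in fragment that are neither in the dictionary nor ignored."""
    text = TAG.sub(" ", fragment)
    return [w for w in WORD.findall(text)
            if w.lower() not in dictionary and w.lower() not in ignored]


def misspellings(html_source, dictionary):
    """Report unknown words; each ignore comment applies only to text after it."""
    ignored = set()
    found = []
    pos = 0
    for match in IGNORE_COMMENT.finditer(html_source):
        # Check the text before this comment with the ignore list built so far.
        found += _check(html_source[pos:match.start()], dictionary, ignored)
        ignored |= {w.lower() for w in match.group(1).split()}
        pos = match.end()
    found += _check(html_source[pos:], dictionary, ignored)
    return found


page = ("<p>phpBB rocks</p>"
        "<!-- spell-ignore: phpBB Riverdale -->"
        "<p>phpBB in Riverdale, teh end</p>")
print(misspellings(page, {"rocks", "in", "end"}))
```

In this sketch the first "phpBB" is still reported because it appears before the comment, while the later "phpBB" and "Riverdale" are not. Putting the comment at the top of the document would cover everything.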
Albert Wiersch

Post by roedygr » Fri Apr 25, 2014 6:31 am

Albert Wiersch wrote: What about a way to add a special HTML comment that contains a list of extra words to ignore from the point of the comment to the end of the document? Note that this would only work for the validator spell checking message and not for an 'F7' spell check in the editor.
I don't use the batch spell validator. There are just too many legit spelling errors. I also need to see context. I don't think that would help, unfortunately.

Post by Albert Wiersch » Fri Apr 25, 2014 8:10 am

Then I assume you are referring to the spell checker in the editor (F7) and not the validator misspelled word message that the validator generates when a document is validated.

Unfortunately the editor spell checker is currently a 'dumb' spell-check. It may be improved in the future but it's not designed to parse HTML and handle the "lang" attribute like the validator engine is for the spelling message it generates.

The only work-around I can think of right now is to activate the dictionaries for all the languages you think you need, as well as all the ignore words, to remove as many false positives from the spell checking as possible. If you don't want to ignore a misspelled word for all documents, then you can use the 'Ignore Once' option.
Albert Wiersch

Lou
Rank V - Professional
Posts: 246
Joined: Fri Jul 29, 2005 5:55 pm
Location: CO

Post by Lou » Fri Apr 25, 2014 9:15 am

Albert Wiersch wrote: What about a way to add a special HTML comment that contains a list of extra words to ignore from the point of the comment to the end of the document? Note that this would only work for the validator spell checking message and not for an 'F7' spell check in the editor.
Albert, I like it.
I am a horrid speller and use both spell checkers all the time. Adding an HTML comment that contains a list of words to ignore would help me.

Unless I am missing a setting somewhere, your suggestion could remove errors like

knob.com (1x), phpBB (1x), Riverdale (1x), SpamCop.net (1x), Ver (1x) 
that I get all the time and must sort through to find real errors.
Lou
Say what you will about Sisyphus. He always has work.

Post by Albert Wiersch » Fri Apr 25, 2014 9:25 am

Hi Lou,

The point of the HTML comment to add extra words to ignore would only be to add them for the document they are in.

In your case, it sounds like the better option might be to copy and paste the misspelled word validator message into the editor, then do a spell check (F7) and add the words to the dictionary. That should then work for all your documents.

Or do you have a need for specifying ignore words on a per-document basis?
Albert Wiersch

Post by Lou » Fri Apr 25, 2014 10:41 am

Lou wrote:Unless I am missing a setting somewhere,
Guess I hadn't noticed that the validator uses the {F7} dictionaries. Thanks.

Having learned something today, I can now go back to bed. :D
Lou
Say what you will about Sisyphus. He always has work.

Post by Albert Wiersch » Sat Apr 26, 2014 9:29 am

Yep, the same dictionaries are used in both the editor spell check (F7) and validator spelling message. :D

If you ignore a word in an editor spell-check (F7) then it should also be ignored for the validator spelling message.
Albert Wiersch

Post by roedygr » Mon Apr 28, 2014 6:32 pm

Albert Wiersch wrote: Unfortunately the editor spell checker is currently a 'dumb' spell-check. It may be improved in the future but it's not designed to parse HTML and handle the "lang" attribute like the validator engine is for the spelling message it generates.
The trick then is to pre-massage what it sees. You may have to create a completely fake document.
You might break it into pieces, de-entify/re-entify it, remove pieces of it that should not be checked, and hide bits already known to be good.

I don't know what your spell API looks like, so I can't be more specific.
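
Roughly what I have in mind, as a Python sketch rather than anything tied to the real editor. The function name premassage, the list of skipped elements and the known_good parameter are my own assumptions. It builds the "fake document": comments, script/style/pre/code blocks and tags are dropped, entities are decoded, and words already known to be good are hidden. Mapping results back to positions in the original (the re-entify step) is not covered here.

```python
import html
import re

# Blocks whose content makes no sense to spell check (an assumed list).
SKIP_BLOCKS = re.compile(r"<(script|style|pre|code)\b.*?</\1\s*>",
                         re.DOTALL | re.IGNORECASE)
COMMENTS = re.compile(r"<!--.*?-->", re.DOTALL)
TAGS = re.compile(r"<[^>]+>")


def premassage(source, known_good=()):
    """Return plain text that a tag-unaware spell checker can safely be fed."""
    text = COMMENTS.sub(" ", source)
    text = SKIP_BLOCKS.sub(" ", text)
    text = TAGS.sub(" ", text)
    # De-entify only after the tags are gone, so &lt; is not mistaken for markup.
    text = html.unescape(text)
    for word in known_good:
        text = re.sub(rf"\b{re.escape(word)}\b", " ", text)
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


sample = "<p>caf&eacute; &amp; <code>xyzzy()</code> phpBB forum</p>"
print(premassage(sample, known_good=["phpBB"]))
```

The dumb checker then only ever sees ordinary prose, and the entity-encoded word arrives as "café" instead of "caf" plus noise.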

Post by roedygr » Mon Apr 28, 2014 6:47 pm

Lou wrote:Guess I hadn't noticed that the validator uses the {F7} dictionaries.
What languages are available for the dictionary engine? Is there any way to put something in the documents to tell it which language to use, on a per-page or per-span basis?

What is the engine called? Have you looked at any alternative engines?

see http://mindprod.com/jgloss/spellchecker.html

Back in the days of the punch card I invented a batch way of doing spell correction and data correction. It worked like this:

Let's say I want to clean up the spellings of cities. I dump out a punch card with each variant spelling of a city with the number of times it was used.
I then order the punch cards with the desired spelling at the front (usually the most popular) followed by all the variants. This creates a lookup table for correcting the data to canonical form. You can clean up an entire database in about 10 minutes total work. In the DOS days, when the mouse was an experimental device, I wrote a program to simulate the process on screen.
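
In modern terms the punch card process can be sketched in a few lines of Python. This is only an illustration, not the original program, and the function names and city data are made up: count each variant, let a person arrange the variants into groups with the desired spelling first, then turn the groups into a lookup table.

```python
from collections import Counter


def variant_counts(values):
    """Step 1: one 'card' per distinct spelling, with its usage count."""
    return Counter(values)


def build_lookup(groups):
    """Step 2: each group lists the canonical spelling first, then its variants."""
    lookup = {}
    for canonical, *variants in groups:
        lookup[canonical] = canonical
        for variant in variants:
            lookup[variant] = canonical
    return lookup


def canonicalize(values, lookup):
    """Step 3: rewrite every value to canonical form; unknown values pass through."""
    return [lookup.get(value, value) for value in values]


cities = ["Victoria", "Victoira", "Victoria", "Vancouver", "Vancuver"]
counts = variant_counts(cities)

# The manual step: a person orders the cards, desired spelling at the front.
groups = [["Victoria", "Victoira"], ["Vancouver", "Vancuver"]]
lookup = build_lookup(groups)
print(canonicalize(cities, lookup))
```

The only human effort is the ordering step, which is why the whole cleanup goes so quickly.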

In terms of time, it takes far longer to spell check than to validate and correct markup syntax. So spell checking is becoming my priority. I have much technical stuff on my site: company names, products, programs, names of people, specifications. These are all spelling exceptions, but not really words that belong with true vocabulary.

Post by Albert Wiersch » Mon Apr 28, 2014 7:58 pm

Dictionaries are here:
http://www.htmlvalidator.com/downloaddictionaries.html

This is the dictionary component used in CSE HTML Validator:
http://www.addictivesoftware.com/index.htm

I have not looked at alternative engines.

When you add a word to the dictionary, it adds it to a user specific dictionary. I suspect it would make sense for you to add a lot of the technical stuff on your site to your user dictionary.

There's also a dictionary tool listed on the above page, so you should be able to make your own custom dictionaries with the words for your site, and enable and disable each one as if it were a language dictionary.

If you use the validator engine to validate or spell check the document (instead of F7), then you have a lot more control. You can write user functions to ignore certain misspelled words and you can ignore different words based on what document is being checked (if you write the user functions to do that).

But if you're using F7 and just the editor spell checker, then user functions won't matter.

Typically though, one would just add the words you wish to ignore to your user dictionary. While this might not be a perfect solution, I think it would be a big help in weeding out the false positives.
Albert Wiersch
