cseignore for spelling

For topics about current or future BETA releases, including feature requests.

Postby roedygr » Sun Nov 06, 2011 4:04 am

I spend hours every day spell checking, or more accurately RE-spell checking. I am checking to make sure none of the edits introduced errors.

I have to keep hitting ignore over and over and over for several reasons:

    acronyms with embedded tags.
    words with accented letters done with entities.
    dialect or precise quoting.
    foreign language (usually French)
    I have already thoroughly checked this section, and I have made no changes to it.

In regular English you can say "sic" to indicate that you know this is wrong, but that's what the original said.

It would be nice if there were something analogous to <cseignore> called <csesic></csesic> that you could wrap around text you don't want spell checked.

I suppose another way to do it would be to use a magic css class instead of <cseignore> and <csesic>, then other tools would not complain about the strange tag.
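
To make the CSS class idea concrete, here is a minimal sketch in Python of how a checker could gather only the text it should look at. It assumes a hypothetical class name "sic"; that name, SicAwareExtractor and checkable_text are made up for illustration and are not part of CSE HTML Validator.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class SicAwareExtractor(HTMLParser):
    """Collect text for spell checking, skipping elements marked with a class."""

    def __init__(self, skip_class="sic"):  # "sic" is a hypothetical class name
        super().__init__(convert_charrefs=True)
        self.skip_class = skip_class
        self.skip_depth = 0   # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        # Once inside a skipped element, count every nested open tag too.
        if self.skip_depth or self.skip_class in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def checkable_text(html_source):
    """Return the text outside any element carrying the skip class."""
    parser = SicAwareExtractor()
    parser.feed(html_source)
    parser.close()
    return "".join(parser.chunks)
```

For example, checkable_text('<p>He wrote <span class="sic">recieve</span> twice.</p>') leaves out "recieve" and keeps the rest. The sketch assumes well-formed markup; a mismatched closing tag would throw off the depth count.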
roedygr
Rank V - Professional
 
Posts: 242
Joined: Fri Feb 17, 2006 6:22 am
Location: Victoria BC Canada

Re: cseignore for spelling

Postby Albert Wiersch » Mon Nov 07, 2011 11:13 am

Hi Roedy,

I assume you are referring to spell-checking in the editor (F7) since you are hitting an "Ignore" button.

Hopefully some of these issues will be resolved in the future, but there are some things you may want to try:

1. Instead of hitting 'Ignore', use 'Add to Dictionary'.
2. Add dictionaries to support foreign languages. You can download them here: http://www.htmlvalidator.com/downloaddictionaries.html
3. Select the part of the document that you want to spell-check, then hit F7 and check only the selected part.

I hope this is helpful. :D
Albert Wiersch
Albert Wiersch
Site Admin
 
Posts: 2361
Joined: Sat Dec 11, 2004 10:23 am
Location: Near Dallas, TX

Re: cseignore for spelling

Postby roedygr » Thu Nov 10, 2011 6:42 am

Albert Wiersch wrote:1. Instead of hitting 'Ignore', use 'Add to Dictionary'.
2. Add dictionaries to support foreign languages. You can download them here: http://www.htmlvalidator.com/downloaddictionaries.html
3. Select the part of the document that you want to spell-check, then hit F7 and check only the selected part.

roedygr wrote:I have to keep hitting ignore over and over and over for several reasons:

Another problem is includes. I have snippets of text included in many files. I wrote my own macro generator that does the including. The problem with the current spell checker is that I end up validating the same text over and over, since it is inline in the uploaded HTML. And if I am not on my toes, I accidentally correct one of the broadcast copies, not the master. I would like a way to turn off spell check of the copies and only turn it on for the master.



Re: cseignore for spelling

Postby roedygr » Thu Nov 10, 2011 7:00 am

Albert Wiersch wrote:Hopefully some of these issues will be resolved in the future, but there are some things you may want to try:

1. Instead of hitting 'Ignore', use 'Add to Dictionary'.
2. Add dictionaries to support foreign languages. You can download them here: http://www.htmlvalidator.com/downloaddictionaries.html
3. Select the part of the document that you want to spell-check, then hit F7 and check only the selected part.


1. I am reluctant to do this since these partial words are not generally valid. I don't want nimals in my dictionary in case I use it accidentally as a typo outside the context of a generated acronym expansion. I have some of them in, despite the violence it does to my anally retentive nature.

2. I have English, French and Esperanto, and a touch of German and even a few words of a dozen other languages on my site. The problem is only with accented letters. I currently do these with entities, mainly because my trusty macro-editor Slick-Edit cannot handle UTF-8 and the new version that can is well outside my price point. The problem is entities, not isolating blocks of text by language. I wonder what people who write bi-lingual web pages do to keep the languages spell-checked separately.

3. That would work for a freshly written document undergoing rounds of composition and validation. What gives me most trouble is catching stray stuff in files I did not even notice I had changed, something too tiny to bother with a validation at the time. I want to validate everything at least once a day, preferably once an hour before every upload, to make sure nothing slips by. I am very picky about spelling and grammar in others, so it would be hypocritical if my website had any spelling errors. This is why I am so big on a cache. One side effect is that it would let me run gigantic batches unattended, since most of it would do nothing.
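
The cache idea could be sketched roughly like this in Python: remember a hash of each file as it was when it last passed, and only re-check files whose hash has changed. This is an illustration under assumptions, not how CSE HTML Validator works; the JSON cache file and the names file_digest, files_needing_check and mark_clean are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """SHA-256 of the file's bytes; any edit, however tiny, changes it."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def _load(cache_file):
    """Read the cache (path -> digest), or start empty if it does not exist."""
    cache_path = Path(cache_file)
    return json.loads(cache_path.read_text()) if cache_path.exists() else {}


def files_needing_check(paths, cache_file):
    """Return only the files that are new or changed since they last passed."""
    seen = _load(cache_file)
    return [p for p in paths if seen.get(str(p)) != file_digest(p)]


def mark_clean(path, cache_file):
    """Record that a file passed spell check in its current state."""
    seen = _load(cache_file)
    seen[str(path)] = file_digest(path)
    Path(cache_file).write_text(json.dumps(seen, indent=2))
```

A batch run would call files_needing_check on the whole site, check only what comes back, and call mark_clean on each file that passes. On a mostly unchanged site, nearly everything is skipped.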

Re: cseignore for spelling

Postby roedygr » Thu Nov 10, 2011 7:11 am

Another approach that may be simpler to integrate with a third party spell checker is just to consider & and ; as letters of the alphabet, and feed words containing them to your spell checker add-on. Its guesses at alternates would be silly, but at least it could avoid crying wolf on every accented letter.
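
A different route to the same goal, shown here only as a sketch, is to decode the entities before the text reaches the spell checker, so the checker sees ordinary accented words. The Python below uses the standard library's html.unescape; the function name words_for_spellcheck and the word pattern are assumptions for illustration, not anything in CSE HTML Validator.

```python
import html
import re

# A word is a run of Unicode letters, optionally joined by apostrophes
# (so French elisions such as l'école stay in one piece).
WORD = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")


def words_for_spellcheck(text):
    """Decode entities such as &eacute; and split the result into words."""
    return WORD.findall(html.unescape(text))
```

With this, "caf&eacute; au lait" is handed to the checker as the words café, au and lait, so a French dictionary can match them. The cost is that the checker's suggestions come back as accented letters, which would have to be turned back into entities for anyone who needs the source to stay plain ASCII.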

I assume you are living in the USA, so the problem of accents is not in-your-face. But it certainly is for anyone in Europe who might want to use HTMLValidator.

You have a chicken and egg problem.
    Europeans won't use your product because it can't handle accents.
    You can't invest resources in handling accents because you don't have European customers.
My point is, this is a more important problem than you realise. It is also a bigger opportunity. You have already sold to most of the Americans who want your product. You have to go after new markets to get new sales.

Re: cseignore for spelling

Postby roedygr » Thu Nov 10, 2011 7:44 am

There are two closely related spell checking problems:
1. handling entities
2. handling words with embedded style information, e.g. William Sarg<b>a</b>nt, or acronyms:

<span class="acronym2">SPCA</span> (<span class="means"><span class="ac2">S</span>ociety for the <span class="ac2">P</span>revention of <span class="ac2">C</span>ruelty to <span class="ac2">A</span>nimals</span>)

The first is the more pressing for economic opportunities. The second is easier to solve since it does not require a change to the dictionary structure.
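
As a rough sketch of the second problem, the Python below drops inline tags so the pieces of a word rejoin, while treating a few block-level tags as word breaks. The BLOCK set is a deliberately short assumption, and TextJoiner and visible_text are hypothetical names for illustration only.

```python
from html.parser import HTMLParser

# Tags assumed to separate words; everything else is treated as inline.
BLOCK = {"p", "div", "br", "li", "tr", "td", "th",
         "h1", "h2", "h3", "h4", "h5", "h6"}


class TextJoiner(HTMLParser):
    """Rebuild the visible text, gluing together words split by inline tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # also decodes entities
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(source):
    """Return the text with tags removed and runs of whitespace collapsed."""
    joiner = TextJoiner()
    joiner.feed(source)
    joiner.close()
    return " ".join("".join(joiner.parts).split())
```

Run on the two examples above, this turns William Sarg<b>a</b>nt into "William Sargant" and the SPCA markup into "SPCA (Society for the Prevention of Cruelty to Animals)", so the checker never sees fragments like nimals. A real checker would still need to map each word back to its position in the source to offer corrections.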

You won't get any sales to the Canadian government without handling accents, including &Eacute; (upper case E acute), peculiar to Canada. Canada is an extremely bi-lingual country. You won't get an important job in government without fluency in both languages. And of course every government and major business website and document is bi-lingual. If you do get a sale, you will likely sell thousands of copies at a pop. Canadian bureaucrats would love HTMLValidator. They would even mandate the config files and make it mandatory for every employee.

It is obviously more work to sell to a government, but once you do, other governments may buy without much effort. Further, you get guaranteed updates forever, and a massive initial sale. I have little experience in sales, and even less in making money. Please just talk these ideas over with your investors.

Yet another approach is to find a replacement spell check engine plug-in that handles accents, entities and tags with ease.
Roedy Green

