After a welcome drop in workload two days ago I've now had a chance to look at the Beta. My observations follow.
My Situation
I simplify my markup and retain backward compatibility for sites that are already built. New sites probably get less backward compatibility.
To get this to work well for me I can either ignore some warnings (when I read the output) or customise the messages. My focus with these checks has been to revisit customising the validation. (In the past I abandoned that because I didn't like customising it all over again for each new version. This time I hope that programexport.xml, source control and diff tools can, together, make it practical.)
Things I noticed
1) The export programs feature is a lifesaver. I don't know all the variables available at a given point in the code; knowing them would be valuable when modifying the programs. (Sometimes message control is limited to a flag that controls multiple messages; in that case programexport is your way to get finer control.)
2) Is there an alternative to string coding like '+"'"+' in the programs? Named entity encoding like &apos; might be a convenient alternative.
3) I need to enhance some tests for messages. Two examples: I use both id and name, and I use both type and language (in script). I want the message test to be "name is present, id is not present" before generating the message. The message currently suggests I might use only id or id + name, which is redundant.
4) I run the manual spell check and get nothing found. When I run validate a misspelling is found. (In this case I was using a UK English spelling; maybe not all dictionaries are in use during the validate version of spell check?) In other words I get misspelling messages and I can't kill them the obvious way.
5) Where are the accessibility messages programmed? I didn't find them in programexport.xml.
6) I'm getting some accessibility messages repeated. (Identical messages.)
7) With utf-8 encoding I get strange characters at the start of the editing panel. Shouldn't these be hidden?
I ran the program over an RSS feed. It gave a lot of red ink. It would be cool to check out RSS feeds.
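The test asked for in point 3 ("name is present, id is not present") can be sketched outside the validator. The snippet below is only an illustration in Python using the standard `html.parser` module; it is not the validator's own program language, and the class and function names are made up for this example. It also makes no attempt to limit the check to particular tags, so elements such as `meta`, where `name` without `id` is normal, would be reported too.

```python
from html.parser import HTMLParser


class NameWithoutIdFinder(HTMLParser):
    """Collect start tags that carry a name attribute but no id attribute."""

    def __init__(self):
        super().__init__()
        self.findings = []  # list of (tag, value of name attribute)

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Report only when name is present and id is absent;
        # elements that carry both attributes stay silent.
        if "name" in attr_map and "id" not in attr_map:
            self.findings.append((tag, attr_map["name"]))


def find_name_without_id(html_text):
    """Return the (tag, name) pairs that would trigger the message."""
    finder = NameWithoutIdFinder()
    finder.feed(html_text)
    finder.close()
    return finder.findings


sample = '<a name="top" id="top"></a>\n<a name="old"></a>\n<img id="x">'
print(find_name_without_id(sample))
```

With this condition, an anchor that has both `id` and `name` produces no message, which is the behaviour requested above.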
First Look at Beta 5
-
- Rank VI - Professional
- Posts: 726
- Joined: Mon Dec 13, 2004 1:50 pm
- Location: Tannhauser Gate
-
- Site Admin
- Posts: 3785
- Joined: Sat Dec 11, 2004 9:23 am
- Location: Near Dallas, TX
Hi Mike,
Thanks for the feedback and taking a look at the soon-to-be-released BETA!
1) Can you be more specific as to what you'd like to see?
I know that it can be confusing when getting that involved with the program configuration.
2) No, there's not a better way right now. I have also thought about this and would like to address it one of these days. There aren't very many people who even look at the tag name programs, so it's pretty low on my list.
3) Can you send me a sample HTML document and detail exactly what you'd like CSS HTML Validator to do given the sample HTML document?
4) The validator spell check works a little differently, checking spelling in comments and certain attribute values that just doing a spell check in the editor might not find. You may want to uncheck the Ignore markup languages option in the Options->Spelling Options and then do a spell check in the editor. That may make it easier to add the misspelled words because you will be prompted to ignore or add them. Otherwise you can add new dictionary words by using Options->Spelling Options, clicking on the Dictionaries button, and then editing the appropriate dictionary by adding ignore words.
5) Some are in programexport.xml but some are in the DLL and are called by runProgram() in the tag name programs. See the documentation for runProgram() to see if a certain accessibility message might be generated via a runProgram() call.
6) Can you send a sample document that reproduces the problem (to support@htmlvalidator.com)? There may be some similar messages, but they should not be identical -- having the same text, same category, and referring to the same problem location in the document.
7) Unfortunately, the editor component currently used does not support UTF-8. I plan to switch to a new editor component in a future version. I suspect that's the cause of the strange characters you see.
Albert Wiersch, CSS HTML Validator Developer • Download CSS HTML Validator FREE Trial
-
- Rank VI - Professional
- Posts: 726
- Joined: Mon Dec 13, 2004 1:50 pm
- Location: Tannhauser Gate
Thanks very much.
1) Programming programexport. I have had another look. With that page and the code I can do what I want. (I was hoping for a lazy man's way to see what variables are in scope at some point in code.)
3) See email.
5) I was looking for some accessibility messages which (I guess) are inside a runProgram.
6) See email.
7) Characters in editor with utf-8 file. Yep that will be the cause. utf-8 slips in 2 bytes (I think) at the start of a file. (The rest of that file might be identical to conventional encoding.)
-
- Site Admin
- Posts: 3785
- Joined: Sat Dec 11, 2004 9:23 am
- Location: Near Dallas, TX
Hi Mike,
You're welcome!
1) Sorry, there's no way to see what variables are in scope... but I can say that all variables are global variables so "everything" should be in scope.
3 & 6) I received your emails and will respond via email later in the day.
Albert Wiersch, CSS HTML Validator Developer • Download CSS HTML Validator FREE Trial
-
- Rank 0 - Newcomer
- Posts: 1
- Joined: Thu Nov 10, 2005 7:14 am
Sorry to bump this, just wanted to elaborate on this topic.
The bytes slipped into the document are called the BOM (byte order mark). In UTF-16 and UTF-32 the BOM determines the endianness of the file; in UTF-8 it is three bytes (EF BB BF) rather than two and acts only as an encoding signature. It is optional in UTF-8, but its absence may cause problems when you cross platforms.
The rest of the file will be identical only if you use lower ASCII characters (e.g. from the English codepage). If the document contains higher characters such as umlauts or Cyrillic, those characters will appear garbled as well in the current editor, which lacks UTF-8 support.
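For anyone who wants to check a file by hand, here is a minimal Python sketch of the behaviour described above. In UTF-8 the marker is the three-byte sequence EF BB BF, which Python exposes as `codecs.BOM_UTF8`. The function names are made up for this example, and the use of Windows-1252 to imitate an editor without UTF-8 support is an assumption, not a statement about what the validator's editor component actually does.

```python
import codecs


def split_utf8_bom(data: bytes):
    """Return (had_bom, payload) for the raw bytes of a file."""
    if data.startswith(codecs.BOM_UTF8):  # b'\xef\xbb\xbf'
        return True, data[len(codecs.BOM_UTF8):]
    return False, data


def decode_utf8(data: bytes) -> str:
    # 'utf-8-sig' drops a leading BOM if present, otherwise acts like 'utf-8'.
    return data.decode("utf-8-sig")


def as_seen_without_utf8_support(data: bytes) -> str:
    # Assumption: the editor reads every byte as Windows-1252.
    return data.decode("cp1252")


raw = codecs.BOM_UTF8 + "Grüße".encode("utf-8")
print(split_utf8_bom(raw)[0])             # whether a BOM was found
print(decode_utf8(raw))                   # text with the BOM removed
print(as_seen_without_utf8_support(raw))  # BOM and umlauts shown as stray characters
```

Plain ASCII text encodes to the same bytes either way, which is why only the start of the file and the "higher" characters look wrong.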
-
- Rank VI - Professional
- Posts: 726
- Joined: Mon Dec 13, 2004 1:50 pm
- Location: Tannhauser Gate
Yep Justine, it's the BOM.
As you say you need to use the common part of the ASCII set. This tends to mean numeric entities are safer in many cases. Passing the material through different editors and CMS packages gives interesting results!!
I put up a little article about some experiences at:
http://www.decisionz.com/document/CMS/C ... nXHTML.htm
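The numeric entity approach mentioned above can be sketched in a few lines of Python. This is only an illustration with a made-up function name, and it is unrelated to how CSS HTML Validator or any particular CMS handles text: every non-ASCII character is replaced by a decimal character reference, so the result is pure ASCII and should survive editors that mishandle UTF-8.

```python
import html


def to_numeric_entities(text: str) -> str:
    # The 'xmlcharrefreplace' error handler turns each character that
    # ASCII cannot represent into a decimal reference of the form &#NNN;
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


encoded = to_numeric_entities("Müller")
print(encoded)                 # ASCII-only form
print(html.unescape(encoded))  # converted back to the original text
```

Markup characters such as `<` and `&` are ASCII and pass through untouched, so this is not a substitute for HTML escaping.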