Google and Bing and poor HTML

Albert Wiersch
Site Admin
Posts: 3414
Joined: Sat Dec 11, 2004 9:23 am
Location: Near Dallas, TX

Post by Albert Wiersch » Thu Jun 18, 2009 12:23 pm

I often wonder why big tech companies like Google and Microsoft (bing.com) don't care about having better HTML.

Google:
http://onlinewebcheck.com/check.php?url ... oogle.com/

Google doesn't properly quote certain attribute values or use "&amp;" for '&' in URLs. They also don't even have end tags for "body" and "html" (as of this writing).
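
To illustrate the ampersand point, here is a minimal Python sketch. It assumes a plain regular-expression check on raw attribute text rather than a real validator, and the names `BARE_AMP`, `has_bare_ampersand`, and `escape_bare_ampersands`, as well as the sample URL, are made up for the example.

```python
import re

# An ampersand counts as "bare" unless it begins a named or numeric
# character reference such as &amp; or &#38; or &#x26;.
BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def has_bare_ampersand(text: str) -> bool:
    """Return True if the text contains an ampersand that is not escaped."""
    return BARE_AMP.search(text) is not None


def escape_bare_ampersands(text: str) -> str:
    """Replace each bare ampersand with &amp;, leaving existing references alone."""
    return BARE_AMP.sub("&amp;", text)


if __name__ == "__main__":
    url = "/search?q=html&hl=en"  # hypothetical URL, as it might appear in an href
    print(has_bare_ampersand(url))
    print(escape_bare_ampersands(url))
```

Because already-escaped references are skipped, running the escape function twice gives the same result as running it once.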

On first glance, the new search engine Bing seems to do a better job:
http://onlinewebcheck.com/check.php?url ... .bing.com/

But if you look closer, the std/pro edition of CSE HTML Validator finds issues that the lite edition does not. Among them are "p" and "div" tags placed inside an "a" element.
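
A rough way to spot that kind of nesting yourself is sketched below in Python. It assumes the HTML 4.01 / XHTML 1.0 rule that "a" may only contain inline content, uses the standard library's `html.parser`, and is in no way how CSE HTML Validator works internally. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Block-level tags to flag when they appear inside an open "a" element.
BLOCK_TAGS = {"p", "div"}


class BlockInAnchorFinder(HTMLParser):
    """Collect p/div start tags that occur while an "a" element is open."""

    def __init__(self):
        super().__init__()
        self.anchor_depth = 0
        self.problems = []  # list of (tag, line, column) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1
        elif tag in BLOCK_TAGS and self.anchor_depth > 0:
            line, column = self.getpos()
            self.problems.append((tag, line, column))

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth > 0:
            self.anchor_depth -= 1


def find_block_in_anchor(markup: str):
    """Return the flagged (tag, line, column) tuples for a markup string."""
    finder = BlockInAnchorFinder()
    finder.feed(markup)
    finder.close()
    return finder.problems


if __name__ == "__main__":
    sample = '<a href="#"><div><p>Result title</p></div></a>'
    for tag, line, column in find_block_in_anchor(sample):
        print(f'"{tag}" inside "a" at line {line}, column {column}')
```

The sketch only tracks explicit "a" start and end tags, so markup that leaves an anchor unclosed would be over-reported.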

Would it really be that much harder to use better, more structured, more valid, HTML? :D
Albert Wiersch

MikeGale
Rank VI - Professional
Posts: 709
Joined: Mon Dec 13, 2004 1:50 pm
Location: Tannhauser Gate

Re: Google and Bing and poor HTML

Post by MikeGale » Thu Jun 18, 2009 4:45 pm

Yes, I noticed that.

It's also a mystery to me.

I tend to think that the people who started many of these programs that write HTML simply didn't have any idea of how to do it right. Let's face it: you can get away with a lot because the browsers are so forgiving. Then the code gets stuck.

On top of that the design of HTML has kinda ground to a halt for years, so the frustration at the things missing from it is growing. That might lead to a cavalier attitude to doing things right, even when you do know how.

For example, in Bing I see a "u" attribute on div tags. (It looks similar to u="8|76220799722437|fcc5e23b,776236cd".) It seems to tie into mapping against some sort of cache, but I haven't yet taken the time to figure it out. I can see that this isn't a legal ID (an ID must start with a letter), so I can understand that they use another attribute name. (In my view, not allowing purely numeric IDs is a serious design error in X/HTML.) In fact I'm not sure there's anything else a competent designer could do.
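
To make the ID rule concrete, here is a small Python sketch. It assumes the HTML 4.01 definition of ID tokens (a letter first, then letters, digits, hyphens, underscores, colons, or periods). The function name `is_html4_id` is hypothetical and the second sample value is made up.

```python
import re

# HTML 4.01 ID tokens: a letter, then any number of letters, digits,
# hyphens, underscores, colons, and periods.
HTML4_ID = re.compile(r"[A-Za-z][A-Za-z0-9_:.\-]*")


def is_html4_id(value: str) -> bool:
    """Return True if the value would be a legal HTML 4.01 ID token."""
    return HTML4_ID.fullmatch(value) is not None


if __name__ == "__main__":
    # The first value is the "u" example quoted above; it starts with a digit
    # and also contains "|" and ",", so it fails on more than one count.
    for candidate in ("8|76220799722437|fcc5e23b,776236cd", "r8-76220799722437"):
        print(candidate, is_html4_id(candidate))
```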

Bing also contains spurious spans and divs that carry no attributes. That's just "lack of attention to detail" from the programmer/s, in my view.

I haven't got notes on Google. It's just too full of bad markup (last time I checked). It's too overwhelming (= time consuming) to analyse much.

I sometimes wonder whether Google started out as a case of ignorance but then retained the mess for other reasons. For example, they don't like people to use their results programmatically. Maybe having a markup mess is considered to be a protection against screen scraping. I don't know. (I certainly hope there are people in Google who are deeply embarrassed by the mess they flood into our browsers. They have no excuse for ignorance now that they make a browser.)

Side Note: When a large part of Internet volume is such bad markup it contributes to slow progress. Browsers need to be able to handle this stuff.

Bing is still technically in beta, so there might be a chance that they fix it.

I see no way to influence this. I guess that if there were a sufficiently noticeable public outcry against The Revolting Incompetence of Search Engines then we might see action. Who knows.
