I'm interested in using web pages for more than eye candy, text and pictures.
Maybe others here are too.
Seems that some Germans have been looking into it.
http://j.mp/Hp8pWq
They've looked at a lot of pages (from Common Crawl) and extracted the embedded structured data wherever it exists.
Of the dozen or so formats they looked at:
* hCalendar Microformat (details of events) seems to have been most common
* Followed by XFN (XHTML Friends Network), RDFa and increasingly html-microdata.
The big story is that RDFa is growing robustly, html-microdata is also growing, and everything else is either growing slightly or in decline. (The full detail is more nuanced than that!)
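For anyone wondering what "having such data" means in practice, here is a rough sketch of how a page could be checked for the formats mentioned above. It is my own illustration, not the method the researchers used: the class name `StructuredDataSniffer`, the helper `sniff`, and the short list of XFN `rel` values are all assumptions, and it only looks at tell-tale attributes rather than parsing the data properly.

```python
from html.parser import HTMLParser

# Assumption: a small subset of XFN rel values, enough for illustration.
XFN_RELS = {"friend", "acquaintance", "contact", "met", "colleague"}


class StructuredDataSniffer(HTMLParser):
    """Records which embedded-data syntaxes a page appears to use."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)  # boolean attributes such as itemscope map to None
        classes = (attr.get("class") or "").split()
        rels = (attr.get("rel") or "").split()

        # html-microdata marks items with itemscope / itemprop attributes.
        if "itemscope" in attr or "itemprop" in attr:
            self.found.add("html-microdata")
        # RDFa uses attributes such as typeof and property.
        if "typeof" in attr or "property" in attr:
            self.found.add("RDFa")
        # hCalendar events are wrapped in an element with class "vevent".
        if "vevent" in classes:
            self.found.add("hCalendar")
        # XFN describes relationships through rel values on links.
        if tag == "a" and XFN_RELS & set(rels):
            self.found.add("XFN")


def sniff(html):
    """Return the set of format names hinted at by the given HTML string."""
    parser = StructuredDataSniffer()
    parser.feed(html)
    parser.close()
    return parser.found


if __name__ == "__main__":
    page = '<div itemscope><span itemprop="name">Concert</span></div>'
    print(sniff(page))
```

A real extractor would go on to read out the actual values; this only answers "is there anything here at all?".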
In the batch they processed there were 2,565,741,671 URLs, of which 251,855,917 had such data. That's 9.8%. One heck of a lot more than I expected.
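The percentage is easy to check from the two figures quoted above. A quick sketch (the variable names are mine):

```python
# Figures as quoted in the post.
total_urls = 2_565_741_671
urls_with_data = 251_855_917

# Share of crawled URLs that carried embedded structured data.
share = urls_with_data / total_urls
print(f"{share:.1%}")  # prints 9.8%
```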
Apparently the work cost all of something like USD 600. That is astonishing!
If anybody here is using these things, I suggest checking out what RDFa and html-microdata have going for them.
If anybody has insight into these, I'd appreciate a heads up.

