Wednesday, January 9, 2008

Semantic Coding - Whose Job Is It?

Pete Warden covered Ratchet-X on his blog this week. While Ratchet-X was prominently featured, the real thrust of his post is the semantic coding of content and screen scraping. I think Pete’s most salient point is the following:

“The promise of the semantic web is that it will allow your computer to understand the data on a web page, so you can search, analyze and display it in different forms. The top-down approach is to ask web-site creators to add information about the data on a page. I can't see this ever working, it just takes too much time for almost no reward to the publisher.”

I strongly agree with this point. Publishers will only code their content and services with metadata if there’s something in it for them. Unfortunately for most publishers, the rewards are not commensurate with the effort. And for those willing to put in the effort, what semantic schemes should they use?

At RatchetSoft, we break the semantic issue down into two basic components: access and meaning. Ratchet-X goes a long way toward solving the “access” issue by giving users a way to reach data sources through established accessibility standards (MSAA and UIA). These methods are much more reliable and stable than traditional screen scraping techniques.

The “meaning” issue is a bit more challenging. On that front, we shift the semantic coding responsibility to the entity that actually reaps the benefit of supplying the semantic metadata. So, if you’re a user who wants to add new features to existing application screens, you have a vested interest in supplying metadata about those screens and data sources so they can be processed by external services. If you are a publisher who has a financial interest in exposing data in new ways to increase its consumption, you have a strong motivation to semantically code your information.

While I’d love to see broad adoption of one semantic scheme, I don’t see this happening any time soon. This is why our Ratchet-X product allows not only plug-in authors to supply metadata about their content (via xmodels), but also end users to supply metadata about the data sources they frequently use. Allowing both publishers and consumers to supply metadata about the same data sources does introduce some risk of conflict and duplication, but it also lets us shift the responsibility of coding to the party that receives the benefit.
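
To make the conflict and duplication risk concrete, here is a minimal sketch in Python of what merging publisher-supplied and user-supplied metadata could look like. It is not how Ratchet-X or xmodels actually work; the `FieldMeaning` type, the `merge_metadata` function, the field names, and the rule that the user's label wins on a conflict are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMeaning:
    """One piece of semantic metadata about a screen field (hypothetical shape)."""
    field: str    # identifier of the on-screen field
    meaning: str  # semantic label attached to it
    source: str   # who supplied the label: "publisher" or "user"


def merge_metadata(publisher, user):
    """Merge two metadata lists and report duplicates and conflicts.

    Returns (merged, conflicts, duplicates), where merged maps each field
    to the label that was kept.
    """
    merged = {m.field: m for m in publisher}
    conflicts, duplicates = [], []
    for m in user:
        existing = merged.get(m.field)
        if existing is None:
            merged[m.field] = m  # only the user described this field
        elif existing.meaning == m.meaning:
            duplicates.append(m.field)  # both parties said the same thing
        else:
            conflicts.append((m.field, existing.meaning, m.meaning))
            merged[m.field] = m  # assumption: the user's label wins
    return merged, conflicts, duplicates


publisher = [
    FieldMeaning("txtZip", "postal_code", "publisher"),
    FieldMeaning("txtName", "full_name", "publisher"),
]
user = [
    FieldMeaning("txtZip", "postal_code", "user"),
    FieldMeaning("txtName", "company_name", "user"),
    FieldMeaning("txtPhone", "phone_number", "user"),
]

merged, conflicts, duplicates = merge_metadata(publisher, user)
```

In this sketch the duplicate is harmless, while the conflict has to be resolved by some policy. Letting the user's label win is just one option, chosen here because the user is the party receiving the benefit in that case.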