David Sifry on ENT.
Easy News Topics. Last week, Paolo Valdemarin and Matt Mower released their specification of Easy News Topics 1.0 (ENT), which is designed as an RSS 2.0 module that can add topic and categorization information to an RSS feed. I committed to get back to them (and others) with a review and some commentary on the approach.
The good news: As a format, ENT is easy to understand, easy for application developers to implement, and pretty easy to parse. Kudos to Matt and Paolo for coming up with a design that is simple but extensible.
Now the bad news: I'm worried about two issues. First is the problem of self-categorization. ENT presupposes that authors can successfully create microcontent with the following properties:
- It can be placed in one or more categories
- The author is qualified to categorize the content correctly
- The author's categories have meaning to the reader
In addition, we then run into a larger problem with self-categorization, which is the question of categorization across feeds. In other words, we have a problem of definitions - one person's rebel is another person's revolutionary.
Even with ENT's inclusion of clouds, which are (potentially) external topic maps that create self-consistent maps of the world, we still have the problem of intentional or unintentional misunderstanding and misreading of metadata like categories, which leads me to think that the entire concept of self-categorization is extremely difficult to work on a large scale. A good example of this failure to scale is the history of web page... [Sifry's Alerts]
David presents a good analysis of ENT 1.0 and, as you would expect from David, some of the wider issues around self-categorization of data. In particular he compares a future of users adding topics to their RSS feeds to the abuse of META tags in HTML. It's a point worth discussing:
Off the cuff I can see two potential arguments to suggest that this won't be a big problem.
- ENT is not designed only for self-categorization of feeds. Yes, that's how we intend to use it, and certainly I think a lot of people will use it that way. But ENT could just as easily be used by a categorizer bot that sucked in feeds and annotated them (using heuristics) with topics from its own cloud. This thought has led me to wonder if there is some need for authorizing the use of a cloud. Would you trust Googlebot to add topics to RSS feeds? Or Feedster bot?
- As David points out, the solution to the META problem, as wrought by Google, was to bring an element of the social into the mix. He rightly, I think, indicates that a solution to the eventual problems of metadata in RSS will probably be social also. There has been a lot of talk recently about identity and reputation systems. Blogging tends to be far more personable than web publishing ever was before. I read sites because they are meaningful to me. If your categorization isn't, I probably won't read you for long.
[Curiouser and curiouser!]
My own response, which I have already sent to David:
So what is a solution for this problem? How can technology solve this? Isn't this the complexity, conundrum and on-going challenge - that we'll all face - trying to build a semantic web?
Your tomato is my toMAHtoe. I don't see how any technology can solve this. If this is one of the things the meta metas in RDF try to solve - well then, no wonder it's so complex!
One thing we've learned from RSS, OPML and XML-RPC - even HTML and HTTP - is KISS.
Keep it simple - stupid. So ENT to me seems to be JUST the right level of flexibility versus simplicity.