September 16, 2005

Open Tag, Close Tag

I'm not sure if it's missing entirely, or if I just missed it, but the whole tagging/folksonomy discussion seems to be lacking any significant exploration of the difference between various types of tags. One difference in particular strikes me as important: who can author and edit the tags given to any particular item. Are the tags open, as in open to the public to create and edit, or closed, for private creation only? This is not a dichotomy; there is plenty of gradation, and there are also approaches that lie somewhat off this spectrum of open to closed. But it is a decent guideline for what is happening.

The best example of closed tags is Technorati, which allows web page authors, and no one else, to add tags to their pages. How these tags differ from the "metatags" of old school HTML is beyond me, and it's clear from the past that letting people describe their own pages is not exactly the most reliable way to produce metadata. Technorati compensates for this by using as much tag data as they can snag from outside sources: del.icio.us, Furl, Flickr, etc. Still, searching their index reminds me of the spam-filled searches that predate Google. The closed tag has a deep flaw in that the people with the most incentive to use it extensively are often not the ones that people searching for information have the most interest in.

Flickr's tags are not open, but they are a big move in that direction. Flickr lets anyone inside your social network (as defined inside their system) add tags to your images. This is a simple interface change with potentially radical implications. Suddenly it is no longer necessary for all users in the system to tag in order to get a relatively even distribution of tagged data. Instead, a dedicated core of "taggers" can tag up the data of all the lazy people like me who just don't care... Flickr's system works well in part because it is not fully open, but rather bounded. By limiting the taggers to one's social circle, they eliminate anonymity and reduce the potential for malicious tagging.
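To make the "bounded" idea concrete, here is a rough sketch of that kind of permission check. It is not Flickr's actual code or API; the names `contacts`, `can_tag` and `add_tag` are made up for illustration, and the only assumption is that the system knows who is in whose network.

```python
# Hypothetical contact lists: photo owner -> users inside their network.
contacts = {"abe": {"dana", "eli"}}


def can_tag(owner, tagger, contacts):
    """The owner can always tag; others only if they are in the owner's network."""
    return tagger == owner or tagger in contacts.get(owner, set())


def add_tag(photo, tagger, tag, contacts):
    """Attach a tag to a photo, recording who added it (no anonymous tags)."""
    if not can_tag(photo["owner"], tagger, contacts):
        raise PermissionError(f"{tagger} is not in {photo['owner']}'s network")
    photo["tags"].append((tagger, tag))


photo = {"owner": "abe", "tags": []}
add_tag(photo, "dana", "sunset", contacts)  # allowed: dana is a contact

try:
    add_tag(photo, "stranger", "phentermine", contacts)  # rejected: outsider
except PermissionError as err:
    print(err)
```

Storing the tagger alongside each tag is what removes the anonymity: every tag can be traced back to someone the owner already knows.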

del.icio.us takes a different approach, one that is simultaneously open and closed. If you are using del.icio.us to search and mark your own account, the tags are essentially private. No one is allowed to go into your del.icio.us bookmarks and add tags to your collection. But because del.icio.us is a bookmarking site, and because each bookmark references a unique identifier, a URL, or more accurately a URI (Uniform Resource Identifier), del.icio.us is able to collate the numerous private tags and bundle them into public tags for any given URI. del.icio.us controls this infrastructure, of course, so this should never be confused with a true public service, but as companies/services go del.icio.us appears to be amongst the most transparent.
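A minimal sketch of that collation step, assuming nothing about how del.icio.us actually stores its data: each user keeps a private mapping of URI to tags, and the public view is just a count of those tags grouped by URI. The users, URLs and the `collate_public_tags` function are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical private bookmark collections: user -> {uri: set of tags}.
private_bookmarks = {
    "alice": {"http://example.com/nomads": {"nomadic", "travel"}},
    "bob": {"http://example.com/nomads": {"nomadic", "anthropology"}},
    "carol": {"http://example.com/nomads": {"nomadism"}},
}


def collate_public_tags(bookmarks):
    """Merge every user's private tags into one public tag count per URI."""
    public = defaultdict(Counter)
    for user_tags in bookmarks.values():
        for uri, tags in user_tags.items():
            public[uri].update(tags)
    return public


public_tags = collate_public_tags(private_bookmarks)
print(public_tags["http://example.com/nomads"].most_common())
```

No user ever writes to another user's collection here; the public tags exist only because the shared URI gives the service a key to group on.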

What gets market fetishizers' hearts all aflutter about tagging is embedded in the open tag: the possibility that a large group of people might be able to produce more useful metadata than a small set of librarians and catalogers. Whether this will happen is an open question, although at the moment del.icio.us sure seems to lead to better results than Technorati. What happens as tags scale in use is a big unknown, though. It's not even clear that they will scale; maybe most people still don't care about producing metadata? Certainly open tags open up the potential for a metadata elite of sorts to emerge. I might not care about tagging my crap, but maybe there are people out there that do, and maybe they'll be willing to take the time to tag my stuff. I've already seen it happen on Flickr, and I'd be willing to bet it will happen in any successful semi-open tag system. The freaks that want things categorized will go ahead and do it, while the rest of us are happy with our personal ad hoc processes. Tagging might not eliminate the hardcore classifiers, but instead let them multiply by lowering the threshold for entry. You don't even need to go to library school to enter the classification battles now!

The other side of open tags worth looking at, though, is just how they work. What's really important here is not tagging itself, but algorithmic search. Tags are just an interface that makes it easy to generate metadata. With closed tags this essentially creates locally structured data and a small set of hooks for algorithmic search. With open tags, though, the hooks for algorithmic search multiply. One tag saying "nomadic" is pretty meaningless to an algorithm; it needs a means of verification. But in an open tag system there might be 150 tags all saying "nomadic" and another 30 saying "nomadism". This is the sort of information that is extremely useful to an algorithmic search. Of course there will probably be another dozen tags saying "phentermine", or whatever else it is the spammers want to splatter, but one hopes the algorithms of tomorrow will stay a step ahead of the spam... What's important to realize, though, is that it is not the collection of open tags themselves that produces the relevant search result, but the algorithm that is using the tags (and most likely other information as well). Tags themselves are merely ticks in a database; it's only through the execution of code, or through the navigation of a database structure, that the information becomes useful and interesting.
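Here is one toy way an algorithm might use those numbers as verification. This is only a sketch, not how any real search engine works: it counts distinct taggers per tag and keeps the tags enough people agree on. The `min_users` threshold, the user names, and the assumption that the dozen "phentermine" tags all come from a single spammer account are mine.

```python
from collections import defaultdict


def corroborated_tags(tag_events, min_users=5):
    """Return {tag: distinct tagger count} for tags used by at least min_users people."""
    users_by_tag = defaultdict(set)
    for user, tag in tag_events:
        users_by_tag[tag].add(user)
    return {
        tag: len(users)
        for tag, users in users_by_tag.items()
        if len(users) >= min_users
    }


# 150 people say "nomadic", 30 others say "nomadism",
# and one (assumed) spammer repeats "phentermine" a dozen times.
events = [(f"user{i}", "nomadic") for i in range(150)]
events += [(f"user{i}", "nomadism") for i in range(150, 180)]
events += [("spammer", "phentermine")] * 12

print(corroborated_tags(events))
# prints {'nomadic': 150, 'nomadism': 30}
```

Counting people rather than raw tags is the whole trick, and it is easy to defeat: a spammer with a dozen accounts gets past this particular filter, which is why the tags alone are never enough.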

Posted by Abe at September 16, 2005 03:54 PM