December 15, 2004 00:14 | WebBlogging / WebTech

Tags... no! Semantics! No! Ontologies!

Or something like that...

I've been ranting about tags, Ado spoke of APIs, Leonard digs in with facets and triplets, and surely there are looooads of other people talking about all of this, and have been for a long time.

Right now I want to point y'all to this Google Translation of Karl's recent answer to my tags rant. Charmingly enough, Google's French is about on par with Karl's English hehehehehe... (Sorry man, couldn't resist. You know I'm kidding! ;)
Please please please, if you are interested in this topic at all, read it and make the effort to get through it. The translation is rough but passable.

So, Exhibit A:
Joi Ito's del.icio.us tags (on the right). What a mess. ;)
More to the point, he has 141 "tags" in his self-maintained taxonomy. Before I go on...

Exhibit B:
Flickr's top 150 tags. Looks cooler, but that's not the real difference. Nor does it solve the problem Joi has, which the rest of us will soon have too...

Let's look closer, first at Joi's tags. You'd assume that a taxonomy maintained by one person would stay reasonably redundancy-free. Not quite:

5 cc, 2 creative_commons
2 conference, 1 conferences
1 gadget, 1 gadgets
1 health, 1 heath <- Typo!!!
4 japan, 2 Japanese, 6 japanese_culture
2 movie, 1 movies
1 photography, 5 photos
1 society, 3 sociology
1 stupid, 1 stupidity
1 tech, 1 technology
1 terror, 1 terrorism
9 politics, 3 us_policy, 16 us_politics

And my favorite alphabetical grouping:
5 sex, 5 sharing_economy

This is problematic on many levels. First of all, Joi has to type each tag every time he wants to attach one. That increases the chance of typos and of slightly different "ways of tagging" (e.g. CC vs. creative commons). This in turn increases the number of tags in the soup and reduces the chance of finding a match when searching later (e.g. "I would have sworn I tagged that CC... why isn't it here?" 5 minutes later... "Oh! I tagged it creative commons!" C'mon folks, the computer ain't helping us that much if we still have to think... ;). It also reduces the chance of "matching up" with other people's tags. And that's also part of the point of all this, isn't it?
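To make the typo problem concrete, here is a minimal sketch of how a tagging tool could nudge you toward a tag you already have before it adds a new one to the soup. It uses plain string similarity from Python's standard difflib module. The function name suggest_existing, the 0.8 cutoff, and the sample list are my own assumptions for illustration, not anything del.icio.us actually does.

```python
from difflib import get_close_matches

# A few of the tags from the list above, as they would already exist
# in a personal tag collection (sample data for illustration).
existing_tags = ["cc", "creative_commons", "conference", "gadget", "health",
                 "japan", "movie", "photography", "politics", "technology"]

def suggest_existing(new_tag, known_tags, cutoff=0.8):
    """Return up to three known tags that closely resemble the typed tag."""
    # Normalize case and spacing so "Creative Commons" and
    # "creative_commons" compare as the same string.
    candidate = new_tag.strip().lower().replace(" ", "_")
    return get_close_matches(candidate, known_tags, n=3, cutoff=cutoff)

print(suggest_existing("heath", existing_tags))
print(suggest_existing("conferences", existing_tags))
print(suggest_existing("Creative Commons", existing_tags))
```

String similarity only catches typos, plurals and spacing. It would not connect an abbreviation like cc to creative_commons, because the two strings look nothing alike; that takes knowledge about what the tags mean.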

Now, back to Flickr's top 150. "Oh mighty Flickr! Show me all pictures in New York City, New York, U.S.A!" "Boooonk"... NYC? New York? NuYawk? What about York? Manhattan?
(Some intrepid souls are tackling this right now. Bless em.)

animal? or animals?
autumn? or fall?
cat, or cats? (funny, the dog doesn't have this problem...)
I could go on... the redundancy is three to four deep for some terms.
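
One blunt way to attack this kind of redundancy is an alias table that folds every variant into a single canonical tag before searching. The sketch below is only that, a sketch: the ALIASES table, the canonical and search helpers, and the photo data are all made up for illustration, and Flickr does nothing of the sort as far as I know.

```python
# Hypothetical alias table: each variant maps to one canonical tag.
ALIASES = {
    "animals": "animal",
    "cats": "cat",
    "fall": "autumn",
    "nyc": "newyork",
    "new york": "newyork",
    "new york city": "newyork",
}

def canonical(tag):
    """Lowercase the tag, collapse whitespace, then apply the alias table."""
    key = " ".join(tag.lower().split())
    return ALIASES.get(key, key)

def search(photos, query):
    """Return the names of photos carrying a tag equivalent to the query."""
    wanted = canonical(query)
    return [name for name, tags in photos.items()
            if wanted in {canonical(t) for t in tags}]

# Made-up photo collection for the example.
photos = {
    "a.jpg": ["NYC", "cats"],
    "b.jpg": ["New York", "autumn"],
    "c.jpg": ["york"],
}

print(search(photos, "new york city"))
print(search(photos, "fall"))
```

Note what the table cannot do: it has no idea whether "fall" means the season or someone tripping over the cat. A flat list of aliases carries no context, which is exactly the problem.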

Ontologies, relationships between information, context. The exponents of flat hierarchies would have you believe that "a rose is a rose is a rose" and that "a rose by another name is still a rose". What they forget is that you don't *know* what a rose *IS* until you've seen it, touched it, smelled it; heard that when given to a lady, especially in bunches, it is a romantic gesture; felt the crunch of a dozen roses on your wallet; had a thorn rip into your lip... now THAT's a rose.

One does not learn in a vacuum. One requires context and one must have something one can establish relationships with. Children learn by establishing relationships between information: "mommy says these orange things are called carrots and if I eat them my eyes will be healthy".

"The New York Times is reporting on Fallujah, which looks like this according to the moblogged pictures from there on Flickr."
Cough and cough.

(One more cough... *someone* familiar with XMLHttpRequest-DOM stuff should look into how Livesearch is being implemented here and there, and perhaps figure out how folks like, oh I don't know, say Flickr, could implement it to ease tag entry. All it takes is a little array of the top 300 to 400 tags...)
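
The lookup behind such a tag livesearch is tiny. Here is a rough sketch of just the matching part, written in Python as a stand-in: the browser wiring with XMLHttpRequest and the DOM is left out, and TOP_TAGS, complete, and the limit of 5 are names and values I made up. With a sorted array of a few hundred tags, a binary search finds the first candidate and the rest follow in order.

```python
from bisect import bisect_left

# Stand-in for the "little array" of popular tags; a real one would
# hold a few hundred entries. It must be sorted for bisect to work.
TOP_TAGS = sorted(["animal", "animals", "autumn", "cat", "cats",
                   "dog", "fall", "newyork", "nyc"])

def complete(prefix, tags=TOP_TAGS, limit=5):
    """Return up to `limit` tags that start with what has been typed so far."""
    prefix = prefix.lower()
    # Jump to the first tag that sorts at or after the prefix.
    start = bisect_left(tags, prefix)
    matches = []
    for tag in tags[start:]:
        # Tags sharing a prefix are contiguous in a sorted list,
        # so the first mismatch ends the scan.
        if not tag.startswith(prefix) or len(matches) == limit:
            break
        matches.append(tag)
    return matches

print(complete("an"))
print(complete("ca"))
```

Offering "animal" the moment someone types "an" is what keeps "animals" from being typed by accident.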

I was gonna say something like "Tags are the alphabet's last-ditch effort to bring the electric nature of cyberspace under its linear perspective thumb! We must not allow this to happen!"... but then you'd most certainly take me for a nutcase. ;)