Web 2.0, blogging, and tags all go hand in hand. However, while RPC standards exist for blogs and the pinheads boggle over the true definition of a “blog,” no one has a cast-iron standard for tags. Depending on where you go and whom you ask, tags are implemented differently, and even defined differently. More importantly, tags were meant to be universal and compatible: a medium for sharing and conveying information across the internet, the very embodiment of a semantic web. Unfortunately, they’re not. Far from it: tags create more discord and confusion than they resolve.
To Space or Not to Space, that is the Question
This one is probably the most obvious obstacle and the most destructive when it comes to tallying tag popularity or making those pretty tag clouds: Can tags have spaces in them or not?! If tags don’t/shouldn’t have spaces, then what do you do with multi-word tags that you just can’t shorten? Do you replace the spaces with underscores, dashes, or just take ‘em out? Does it matter?
Yesterday we were discussing how best to implement the tagging feature in the upcoming blogging engine, Habari, and this topic caused quite a lot of confusion. It’s an important question, because it means a single tag can end up split across 4 or 5 different tags for no good reason. If you thought having www. and no www. in a domain name made things confusing, you should probably sit down now. Take, for instance, the tags “Software, Windows Vista, Microsoft.” Depending on which site you enter them on, they’ll take quite a few different forms, each of which means something different to another website!
- Del.icio.us: WindowsVista Software Microsoft or Windows-Vista… or Windows_Vista
- Technorati: “Windows Vista” Software Microsoft
- UTW/WP: Windows Vista, Software, Microsoft
That’s assuming you already know what form the site accepts and what it filters out. Suppose you were used to Del.icio.us and just found Technorati’s tagging feature: do you put “Windows Vista” in quotes or do you type it as WindowsVista? Or do you use underscores instead? Talk about a semantic web!
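To make the mismatch concrete, here is a rough sketch of how the three conventions listed above might be parsed. The function names are made up for illustration, and none of this is the actual code behind Del.icio.us, Technorati, or UTW; it only assumes the delimiter rules described in this post, with plain straight quotes standing in for whatever quoting a real site accepts.

```python
import shlex


def parse_space_delimited(raw):
    """Del.icio.us-style (assumed): every space ends a tag."""
    return raw.split()


def parse_quoted(raw):
    """Technorati-style (assumed): spaces delimit, double quotes keep a phrase whole."""
    return shlex.split(raw)


def parse_comma_delimited(raw):
    """UTW/WP-style (assumed): commas delimit, so spaces inside a tag survive."""
    return [tag.strip() for tag in raw.split(",") if tag.strip()]


if __name__ == "__main__":
    # The same three tags, each typed the way its "home" site expects.
    print(parse_space_delimited("WindowsVista Software Microsoft"))
    print(parse_quoted('"Windows Vista" Software Microsoft'))
    print(parse_comma_delimited("Windows Vista, Software, Microsoft"))

    # Habit from one site carried to another: the comma form fed to a
    # space-delimited parser splits the tag and leaves stray commas behind.
    print(parse_space_delimited("Windows Vista, Software, Microsoft"))
```

The last call is the interesting one: the input is perfectly valid on one site and turns into four slightly broken tags on another.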
Obviously, spaces in tags matter. Whether it’s “Semantic Web” or “Ford Interceptor” that you need to tag, the whole phrase means something rather different from “Windows AND Vista” or “Ford AND Interceptor,” and it gets worse if you have a search engine that places OR in there instead of AND. Much worse. The big question is, why doesn’t such a standard already exist? It’s obvious that Web 2.0 is all about connecting ideas and bringing articles, content, and readers together. But anyone looking at the tagging process would immediately assume it’s about the exact opposite: splitting up content, making things difficult to find, and purposely making bloggers’ lives miserable.
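The difference between a phrase, an AND, and an OR is easy to see with a toy example. The posts, tag lists, and function names below are invented for illustration, and real search engines are far more involved; this is only a sketch of the three matching rules.

```python
# Hypothetical posts and their tags, invented for this example.
POSTS = {
    "vista-review": ["Windows Vista", "Software", "Microsoft"],
    "room-with-a-view": ["Windows", "Vista", "Architecture"],
    "scenic-drive": ["Vista", "Travel"],
}


def tag_words(tags):
    """All individual words that appear in a post's tags, lowercased."""
    return {word.lower() for tag in tags for word in tag.split()}


def match_phrase(tags, query):
    """The whole query must be one of the post's tags."""
    return query.lower() in {tag.lower() for tag in tags}


def match_and(tags, query):
    """Every word of the query must appear somewhere in the tags."""
    return set(query.lower().split()) <= tag_words(tags)


def match_or(tags, query):
    """At least one word of the query must appear somewhere in the tags."""
    return bool(set(query.lower().split()) & tag_words(tags))


def search(query, matcher):
    """Names of all posts the given matcher accepts, sorted for stable output."""
    return sorted(name for name, tags in POSTS.items() if matcher(tags, query))


if __name__ == "__main__":
    for matcher in (match_phrase, match_and, match_or):
        print(matcher.__name__, search("Windows Vista", matcher))
```

Only the phrase match keeps the architecture and travel posts out of a search for the operating system.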
With Habari, so far we’ve gone through all the forms, and at the moment we’re at number 3, the comma-separated form used by UTW/WP, for compatibility and familiarity’s sake. But that may change, hence the need for a visible, tangible tagging standard. The only problem is, tagging isn’t some new concept. A tagging standard isn’t something that we can just whip up and serve on a platter.
What about the noun/verb argument? Look at the tags for this post: “Blogs, Blogging … Tags, Tagging.” We just don’t know what people will search for, so we try to cover all the bases. But then you have so many possibilities! Code, Coding; Design, Designing; Research, Researching. For every pair, one word is more likely than the other. But people like to have all the bases covered, hence all the clutter. Tagging is fun, but only if done the right way.
Something this prevalent and widespread needs years of discussion, negotiation, and failure between the big companies before they can come to a conclusion. It’s going to be something that del.icio.us and Technorati and all the other major players agree on – which is practically impossible.
Del.icio.us is arguably the “tagging leader” in Web 2.0, but their budget is far smaller than that of commercial competitors like Technorati, and their ideas are also much older, even outdated, given that they were the original players in the game. Spaces are important; maybe they can agree on that. But what about delimiters? UTW uses commas as delimiters, while Technorati and Del.icio.us use spaces. But if spaces are part of a tag, then you have to enclose the tag in quotes. And what if your tags require quotes? ((We can’t conceivably think of a tag that would actually require quotes, but you never know what might happen. What if C# is replaced with C”? No one considered the octothorpe a viable tag element. Then again, it’s not a real octothorpe but a sharp.))
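For what it’s worth, the “what if a tag needs quotes?” problem is the same one CSV files solved long ago by doubling the quote character. Here is a minimal sketch that borrows Python’s csv module to write and read a comma-delimited tag string. This is purely an illustration of one possible escaping rule, not what any of the sites above do, and the helper names are hypothetical.

```python
import csv
import io


def join_tags(tags):
    """Serialize tags as one comma-delimited line, quoting only where needed."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(tags)
    # The writer appends a line terminator; drop it to get a bare tag string.
    return buffer.getvalue().rstrip("\r\n")


def split_tags(raw):
    """Parse a tag string produced by join_tags back into a list of tags."""
    return next(csv.reader([raw]))


if __name__ == "__main__":
    # A tag with a quote, a tag with a space, and a tag with commas in it.
    tags = ['C"', "Windows Vista", "Rock, Paper, Scissors"]
    stored = join_tags(tags)
    print(stored)
    print(split_tags(stored))
```

The catch, of course, is that a rule like this only helps if every site agrees to use it.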
Basically, it’s too late for a tagging standard that will be used unanimously throughout the web. A truly semantic web most certainly won’t ever exist because of the reluctance to change and the unwillingness to compromise and accept defeat. A semantic web requires objective analysis of methods and data, culminating in honestly evaluated options, and immediate acceptance of the outcome. But that’s never going to happen.