Microformats: The Quiet Revolution
Oct 21, 2008 In Web Culture By John Allsopp
Imagine a browser that could automatically detect locations, addresses, people, or events, and let us easily add them to our address books or calendars. This vision is quietly becoming a reality. By adopting microformats, not only do you get the practical benefits of a set of well developed conventions for marking up common data, but you'll be helping to fuel the next generation of browser and search engine innovation.
Web 2.0? More like Web 1.0.
As many suggest the end of Web 2.0, and others announce the coming of 3.0, if we look at the heart of the web experience for most users, has much really happened since 1995?
Despite the rise of CSS, Ajax, and Rich Internet Applications, two key aspects of our web experience are essentially unchanged since the days of Lycos and Mosaic (ask your grandparents if those names aren’t familiar to you!).
Let’s take a look at how this really is the case and how things could be different, then see how microformats can change, and already are changing, the web landscape, and how you can use them to your advantage in your own projects.
In 1995, what could you do with a web browser? Well, you could visit web pages, print them out, bookmark them. And that’s about all. In 2008, we can still visit web pages, print them out, bookmark them. And that’s still about all. The role of browsers, and our experience as users, is just about identical.
If you think that’s a stretch, here’s a screenshot of the first widely used browser, Mosaic.
And here’s its great-great-grandchild, Firefox.
With the exception of the search field, it’s more or less all there on Mosaic.
The other central aspect of the web experience, one that just about any web user will do several times a day, is search.
Of course, since 1995, several companies, culminating in Google, have dominated search, but the search experience is more or less identical now to the search experience in 1995. Here’s the search experience then and now:
- We visit a search engine…
- We decide on some keywords for the kinds of pages we want to find…
- The search engine returns a list of sites that match our criteria…
- We choose a link and click it…
- We visit the page.
Now, you might argue that since the browser and search engine as they are have served us well for the last decade or more, then there’s no need to go changing them – “if it ain’t broke then don’t fix it”.
But both browsers and search engines could provide much more functionality to their users.
What if, when we visited a web page, our browser recognized locations – addresses, landmarks and so on – and easily allowed us to map them on Google Maps, Virtual Earth or another mapping site? What if the browser recognized people, and let us add them to our address book? Or events, and allowed us to easily add them to our online or desktop calendars?
Similarly, what if a search engine returned results about a movie or restaurant we searched for with an average rating from reviews across the web, no matter where they are published – in newspapers, on blogs, in forums, or on any other type of site? What if results which contained information about location were displayed on a map, or results that contained information about dates and times were displayed on a calendar?
Search engines could also enable user actions other than simply visiting a site – they could, like the browser examples earlier, let us add events to our own calendars, or contact details to our address books, with no need to visit the page itself to get those details.
We do see some limited examples of this kind of functionality at search sites, but on the whole, we are still very much in the “Search 1.0” paradigm.
Why is it so?
So, why is it that we’ve seen so little innovation in browsers and search? For one thing, it’s easier said than done. Software is not particularly good at “Natural Language Processing” or NLP (which is basically a fancy way of saying “understanding the written word”). If you think about how most web pages are coded, we don’t mark up addresses, or locations, or reviews, or events with any specific HTML to help software more easily extract that information from the page. But if we did have a way of marking up this kind of data, then browsers and search engines could very easily extract it, and provide the sort of functionality we just described.
And hopefully by now you’ve guessed the role of microformats – to provide a way in which we can mark up web pages so that common types of information can be more easily extracted from them by software.
So, now we have an idea of what microformats are for, the next question is how do they work? The good news is that if you are a reasonably experienced web developer, familiar with aspects of markup like the class attribute, the abbreviation element, and a handful of other quite commonly used features of HTML, you’ll have very little to learn. Even if you have just a basic understanding of HTML, there’s really not a lot to it. Let’s take a look at an example to demonstrate.
Let’s suppose a site has an address – something like this
<p>1164 Morning Glory Circle</p>
<p>Westport</p>
<p>Connecticut</p>
<p>06880</p>
<p>USA</p>
Let’s now use a very common microformat, ADR, to mark this up. We’ll discuss ADR, and where it comes from, after we’ve looked at the example.
First, we’ll want to identify the whole construct as being an address. So, let’s wrap it in a containing div (it’s very likely the address would already be marked up in such a way):
<div>
  <p>1164 Morning Glory Circle</p>
  <p>Westport</p>
  <p>Connecticut</p>
  <p>06880</p>
  <p>USA</p>
</div>
Now, we’ll add the magic microformats dust – a class value of “adr” to this div
<div class="adr">
  <p>1164 Morning Glory Circle</p>
  <p>Westport</p>
  <p>Connecticut</p>
  <p>06880</p>
  <p>USA</p>
</div>
This technique of using the class attribute of an element to describe what the element is is one of the key aspects of microformats, and you’ll likely find yourself using it most of the time when working with them.
We’ll now similarly mark up the components of the address using the ADR microformat.
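We’ll identify the street address…
<p class="street-address">1164 Morning Glory Circle</p>
… the locality…
<p class="locality">Westport</p>
… the region…
<p class="region">Connecticut</p>
… the postal code…
<p class="postal-code">06880</p>
… and the country.
<p class="country-name">USA</p>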
And, altogether we have:
<div class="adr">
  <p class="street-address">1164 Morning Glory Circle</p>
  <p class="locality">Westport</p>
  <p class="region">Connecticut</p>
  <p class="postal-code">06880</p>
  <p class="country-name">USA</p>
</div>
Now, I’m sure many of you are asking, “where did these terms like ‘region’, or ‘postal-code’ come from?” To cut a long story short, microformats attempt as much as possible to reuse existing standards, rather than invent new ones. That way, we increase data interchange, and take advantage of the considerable effort that has gone into developing those standards.
The ADR microformat is a subset of the hCard microformat, the microformat for contact details, which is based on vCard, the more or less universal format for contact details in address book applications. The terms like ‘region’, and ‘country-name’ all come from vCard.
All microformats work in more or less the same way as this example – they use existing aspects of HTML, like the class attribute, in ways that conform with the HTML standard. And by far the most commonly used aspect of HTML in microformats is the class attribute, a feature of HTML just about any web developer will be familiar with.
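To make the "easily extracted by software" claim a little more concrete, here is a minimal sketch in Python of how a program might pull the address fields out of the adr markup above. It uses only the standard library's html.parser module. The names AdrParser and extract_addresses are made up for this illustration, and the sketch assumes reasonably well-formed markup (every non-void element is closed). It is not how any particular browser or search engine does it.

```python
from html.parser import HTMLParser

# Property class names used by the adr microformat.
ADR_PROPERTIES = {
    "post-office-box", "extended-address", "street-address",
    "locality", "region", "postal-code", "country-name",
}
# Elements that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta"}


class AdrParser(HTMLParser):
    """Hypothetical helper: collects adr properties from class="adr" elements."""

    def __init__(self):
        super().__init__()
        self.addresses = []  # one dict per adr block found
        self._stack = []     # role of each open element: "adr", a property, or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        role = None
        if "adr" in classes:
            role = "adr"
            self.addresses.append({})
        elif "adr" in self._stack:
            # Property names only mean something inside an adr block.
            role = next((c for c in classes if c in ADR_PROPERTIES), None)
        self._stack.append(role)

    def handle_endtag(self, tag):
        # Assumes well-formed markup: each end tag closes the innermost element.
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # The innermost enclosing element that has a role decides where text goes.
        role = next((r for r in reversed(self._stack) if r), None)
        if text and role in ADR_PROPERTIES:
            address = self.addresses[-1]
            address[role] = (address.get(role, "") + " " + text).strip()


def extract_addresses(html):
    """Return a list of dicts, one per adr block in the given HTML string."""
    parser = AdrParser()
    parser.feed(html)
    parser.close()
    return parser.addresses


page = """
<div class="adr">
  <p class="street-address">1164 Morning Glory Circle</p>
  <p class="locality">Westport</p>
  <p class="region">Connecticut</p>
  <p class="postal-code">06880</p>
  <p class="country-name">USA</p>
</div>
"""

for address in extract_addresses(page):
    print(address)
```

Notice that the code never has to guess what "Westport" or "06880" means. The class names tell it, which is exactly the work that natural language processing would otherwise have to do.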
Microformats in the real world
Ok, so even if microformats don’t take much effort to learn or implement, is there really any practical value to them? I think there are a couple of very valuable reasons why you should consider microformats.
First, your pages need to be marked up anyway. By choosing an existing format like ADR, you don’t have to do any work developing your own markup conventions, and your code will be much more maintainable and readable by virtue of using common markup conventions. If every instance of an address on your sites is marked up in the same way, changing their appearance using CSS will be far simpler.
But beyond this very practical reason, you’ll also be part of a quiet revolution in browsers and search that is already underway.
The examples of what a browser could do with the data in a page we looked at earlier aren’t simply great concepts – they are actually being implemented in browsers today.
There’s an add-on for Internet Explorer that will recognize microformatted content, like contacts, events and locations, in any web page, and display them for the user. Here, for example, is my personal site, which lists things like my speaking engagements in the hCalendar microformat, contact details using hCard, and location using the GEO microformat.
The microformats add-on gives me the option of adding an event to one of several online and offline calendars, or contact details to an address book. You can also show any locations in the page on a map.
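To give a feel for what "adding contact details to an address book" might involve behind the scenes, here is a minimal sketch in Python that turns adr-style properties into a vCard 3.0 string, the format most address book applications can import. The function names, the placeholder name "Jane Doe", and the choice to emit only the FN, N and ADR properties are assumptions made for this illustration. It does not describe how any particular add-on is implemented.

```python
# The seven adr property names, in the order vCard expects them
# inside a single semicolon-separated ADR value.
ADR_COMPONENTS = [
    "post-office-box", "extended-address", "street-address",
    "locality", "region", "postal-code", "country-name",
]


def escape_vcard(value):
    """Escape the characters that have special meaning in vCard text values."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def adr_to_vcard(given_name, family_name, adr):
    """Build a minimal vCard 3.0 string from a name and a dict of adr properties.

    Hypothetical helper: `adr` maps adr class names (such as "locality")
    to their text; missing properties become empty ADR components.
    """
    adr_value = ";".join(escape_vcard(adr.get(name, "")) for name in ADR_COMPONENTS)
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        "FN:" + escape_vcard(given_name + " " + family_name),
        # N is family;given;additional;prefix;suffix
        "N:" + escape_vcard(family_name) + ";" + escape_vcard(given_name) + ";;;",
        "ADR:" + adr_value,
        "END:VCARD",
    ]
    # vCard lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


address = {
    "street-address": "1164 Morning Glory Circle",
    "locality": "Westport",
    "region": "Connecticut",
    "postal-code": "06880",
    "country-name": "USA",
}

# "Jane Doe" is a placeholder name, not taken from the page.
print(adr_to_vcard("Jane", "Doe", address))
```

Because the adr class names were borrowed from vCard in the first place, the conversion is little more than putting the values in the right order.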
There’s a similar microformats extension for Firefox that provides a set of actions you can perform on microformatted content on any web page you visit with that browser. This extension, called Operator and created by Michael Kaply, has really led the way in bringing microformats to a wider audience.
On the search front, Yahoo! SearchMonkey now indexes microformatted content – and provides an API so developers can build their own applications on top of this data. Hopefully this and related innovations from other search engine developers will drive a new wave of search experiences, on top of structured data like microformats.
By adopting microformats, not only do you get the practical benefits of a set of well developed conventions for marking up common data, but you’ll be helping to fuel the next generation of browser and search engine innovation.