I spent this weekend at the rather smart Guardian offices at Kings Place near King’s Cross for WarbleCamp, and I thought I’d write up some of what I learned.

WarbleCamp is a free unconference-style event for the UK/Euro Twitter developer community, held at the offices of the Guardian. I was lucky enough to see the announcement before the first set of tickets vanished in a very short space of time. We arrived to find Guardian staff still working through the weekend (apparently there is some sort of big news unfolding or something?) and watched @jot and @kalv give the opening introduction to the event.

Session boards at WarbleCamp (Photo from @raffi)

The day was broken up into Barcamp style sessions where anyone could propose a session simply by writing it onto a notecard/post-it note and finding a free timeslot and room on the wall board.

After several introductions and conversations over cups of coffee and a strawberry and banana oatmeal breakfast courtesy of moma foods, we selected our first session. I was a little disappointed that quite a few people seemed to have registered for tickets and then simply never turned up.

Aral Balkan (@aral) – An Introduction to the Streaming API.

I felt very sorry for Aral, as he’d bravely volunteered to move his session to one of the first slots of the day, which meant that delays in starting were inevitable while people found rooms etc. It also amazes me how you can put 40-50 geeks into a room and then find that no one can work a projector.

After changing rooms we finally got onto the subject of the new Twitter Streaming API. The Streaming API is the third Twitter API subset, and it allows you to access various slices of public tweets. The key differentiator is that, unlike the other two Twitter API services, it actively pushes tweets to you in (sort of) real time so that you don’t have to constantly poll Twitter’s servers to check for new information.

The Streaming API is an invaluable addition to the API services and also hints at how many APIs built around this kind of data should be modelled. The session was cut rather short, but one point that stuck with me is that JSON is now the preferred response type and they’re considering XML responses for deprecation.
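
To make the push model a little more concrete, here is a minimal Python sketch of the consuming side. It assumes the stream arrives as one JSON object per line with blank keep-alive lines in between, which is my assumption rather than something covered in the session. The function name and the sample data are made up for illustration, and no real Twitter endpoint is contacted.

```python
import json
from typing import Iterable, Iterator


def iter_stream_messages(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Yield one decoded JSON object per complete line in a byte stream.

    Assumption: messages are newline-delimited JSON and blank lines are
    keep-alives. `chunks` stands in for whatever an HTTP client hands back
    as data arrives; chunk boundaries need not line up with messages.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Process every complete line currently sitting in the buffer.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            line = line.strip()
            if line:  # skip blank keep-alive lines
                yield json.loads(line.decode("utf-8"))


# Simulated network chunks: the second message is split across two chunks.
sample_chunks = [
    b'{"id": 1, "text": "hello"}\r\n\r\n{"id": 2, ',
    b'"text": "world"}\r\n',
]
messages = list(iter_stream_messages(sample_chunks))
```

The point of the buffer is that the client never asks "anything new?"; it simply handles each message as soon as a full line has arrived on the open connection.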

Paul Mison (@blech) – Annotations vs Machine Tags

This session was a good discussion around the uses and power of Machine Tags and Annotations which focussed around Flickr Machine Tag usage as at this point in time Twitter doesn’t support adding metadata to Tweets (little did we know what was to come later in the day).

The basic concept is to decorate a resource with metadata in some way, usually following a simple model of the form namespace:key=value. For example, on Flickr one might tag a photo with Car:Manufacturer=Ferrari to indicate that it is a photo of a Ferrari car. This makes the data about our photo more specific, as a photo simply tagged “F357” might be any number of things: a room, a camera, a car.
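
As a rough illustration of the namespace:key=value shape, here is a small Python sketch that splits a tag into its three parts. The naming rule in the regular expression is a simplified assumption of mine, not Flickr's exact rule, and the function and class names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    key: str
    value: str


# Assumed, simplified rule: namespace and key start with a letter and
# continue with word characters; the value is everything after the "=".
_MACHINE_TAG = re.compile(r"^([A-Za-z]\w*):([A-Za-z]\w*)=(.*)$")


def parse_machine_tag(tag: str) -> Optional[MachineTag]:
    """Return the parts of a namespace:key=value tag, or None for a plain tag."""
    match = _MACHINE_TAG.match(tag)
    if match is None:
        return None
    namespace, key, value = match.groups()
    # Strip optional surrounding double quotes from the value.
    if len(value) >= 2 and value[0] == value[-1] == '"':
        value = value[1:-1]
    return MachineTag(namespace, key, value)


ferrari = parse_machine_tag("Car:Manufacturer=Ferrari")
plain = parse_machine_tag("F357")
```

Here `ferrari` carries the namespace, key and value separately, while the plain tag "F357" comes back as `None` because it says nothing about what kind of thing it describes.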

@aral started a good discussion around how you might “standardise” the namespaces of these tags and the hierarchy as it’s entirely possible that people may have different words for the same descriptor. There was popular consensus for an organic community driven approach.

@tommorris raised a good point: even with a hierarchy of key-value pairs, is this really enough to express how a person relates to an object? The example he gave, in the context of annotated tweets, is that you can tag a tweet with book:title="Treasure Island", but how can you tell whether this refers to a book I’ve read, a book I bought or a book I wrote?

Nick Halstead (@nickhalstead), CEO of Tweetmeme – #NoSQL

Nick gave us a brief overview of Tweetmeme’s platform growth. They started on the well-trodden track of a LAMP stack.

Tweetmeme handles a fairly eye-watering amount of work: they store 4TB of data per day, index 500 million URLs and serve 10,000 requests per second from about 25 servers. MySQL worked fine, following the usual master/slave pattern, until they got above a certain number of rows, after which deletions started to become a pain point.

They have now moved to a custom PHP job queuing system, Nginx and HAProxy, and use a tiered system of graceful degradation (similar to Twitter’s own system) which allows them to service requests differently according to the load on individual machines.
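
Nick didn't go into implementation detail, so the following Python sketch is only my guess at what "tiered" might mean in practice: pick a cheaper way of answering a request as the load on a machine climbs. The tier names and thresholds are invented for illustration and are not Tweetmeme's.

```python
from enum import Enum


class ServiceTier(Enum):
    FULL = "full"        # serve the complete, freshly computed response
    CACHED = "cached"    # serve a possibly stale cached response
    MINIMAL = "minimal"  # serve a stripped-down response or shed the request


def choose_tier(load_per_core: float,
                cached_above: float = 0.7,
                minimal_above: float = 1.0) -> ServiceTier:
    """Pick a service tier from the machine's load per CPU core.

    The thresholds are hypothetical defaults; a real system would tune them.
    """
    if load_per_core > minimal_above:
        return ServiceTier.MINIMAL
    if load_per_core > cached_above:
        return ServiceTier.CACHED
    return ServiceTier.FULL


# Three machines under light, moderate and heavy load.
tiers = [choose_tier(load) for load in (0.2, 0.85, 1.6)]
```

On a Unix machine the input could come from something like `os.getloadavg()[0] / os.cpu_count()`, so each box makes the decision from its own load rather than from a global switch.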

They first evaluated Tokyo and concluded that, whilst it was very, very fast, it was going to prove difficult as it would mean carefully sharding data themselves.
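
For anyone who hasn't had to do it, "sharding data themselves" means the application decides which server holds each record. A minimal Python sketch of one common approach, hashing the key and taking it modulo the number of shards, is below. This is a generic illustration with made-up names and URLs, not a description of what Tweetmeme tried.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % shard_count


urls = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]
placement = {url: shard_for(url, 4) for url in urls}
```

The "carefully" part is that changing `shard_count` moves most keys to a different shard, so adding servers means migrating data by hand.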

They then evaluated Redis, and were very impressed with its clever features for creating lists and sets of data.

Finally they settled on Cassandra which provides eventual consistency and easy scaling across commodity hardware (a key advantage for a startup like Tweetmeme).

Christian Heilmann (@codepo8) – Geo Platforms

I’ve been a follower of Christian on Twitter for quite a while so I’m used to his enthusiasm for the Yahoo YQL platform but it was great to actually see his enthusiasm in person.

Whilst we were waiting for people to find their way to the room, Christian revealed that some people are so into Geo that they apparently walk around London on a particular trail in order to trace a phallic route; ergo gpsc**ks.com (conceivably NSFW) was born.

He first made the point that it is no longer novel to be able to display our GPS location on our mobiles, and that we now need to do something useful with it. He covered how, with a few lines of JavaScript, we can use the HTML5 Geolocation API to locate a user (with varying degrees of accuracy). He pointed out that working with Geo locations is considerably more complicated than simply obtaining the Lat/Long of the current user. To help with this he suggested several resources:

  • GeoPlanet Explorer – A Yahoo web service that allows you to submit a WOE (Where on Earth) location and get back data about the neighbourhood and its neighbouring areas.
  • YQL Geo Library – A JS library which uses YQL to provide services like geolocation, reverse geocoding and content analysis, all in one easy library.

He then gave us a live demo of the YQL service, which is truly very powerful. It provides structured, consistent access to a whole load of different web services and datasets. This reminded me of what Anders Hejlsberg said in his Technet 2010 keynote about programming for the what and not the how.

An example of a YQL query might be: select * from geo.places where text="san francisco, ca"

As someone who is used to working with web services the hard way, I can see that YQL frees you from some of the effort of authentication, gives you a consistent output format and does some caching on the server side.
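
To show how little client code a query like the geo.places one needs, here is a short Python sketch that turns a YQL statement into a request URL. The endpoint and the `q` and `format` parameter names are what I understand the public YQL service to use, so treat them as assumptions and check Yahoo's documentation. The sketch only builds the URL and makes no network request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed public endpoint; verify against Yahoo's documentation.
YQL_ENDPOINT = "https://query.yahooapis.com/v1/public/yql"


def build_yql_url(query: str, response_format: str = "json") -> str:
    """Return a GET URL carrying the YQL statement as a query parameter."""
    params = {"q": query, "format": response_format}
    return YQL_ENDPOINT + "?" + urlencode(params)


url = build_yql_url('select * from geo.places where text="san francisco, ca"')

# Round-trip check: decoding the URL gives back the original statement.
decoded = parse_qs(urlsplit(url).query)
```

Fetching `url` with any HTTP client would then return the result in the requested format, which is the "consistent output" part that makes YQL pleasant to work with.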

Raffi Krikorian (@raffi) – Twitter Annotations

@raffi is Twitter’s Dev Lead for the API Platform (aside from being a generally nice chap) and had got permission, just before he flew out to the UK, to give us the first public look at the draft of the upcoming annotations feature. You can check out his slides on the Twitter API Annotations and even watch the recording of the talk over on Vimeo.

The basic idea is that you will shortly be able to decorate tweets with up to 512 bytes of data (excluding formatting and data classification structures). This could be used to attach data about a book that the tweet refers to, a movie it is talking about or any number of other things, including pure machine-to-machine communication.

This adds context to a tweet, something which Twitter started doing when it added geolocated tweets and intends to do more of in the future.

He stressed that this is still a very early draft, but that annotations may look something like the following:

[{type => {attribute => value,
           attribute => value}}]

And as an example:

[{"tv episode" => {"episode" => "The Vampires of Venice",
                   "series" => "Dr Who",
                   "air date" => "8 May 2010"}}]
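
Here is how I imagine building and size-checking that example in Python. It is only a sketch against an early draft: I am assuming the 512-byte budget counts the UTF-8 bytes of the attribute values alone, which is my reading of "excluding formatting and data classification structures", and the function name is made up.

```python
import json

MAX_ANNOTATION_BYTES = 512  # limit mentioned in the talk


def annotation_payload_size(annotations: list) -> int:
    """Total UTF-8 size of all attribute values across all annotations.

    Assumption: type names, attribute names and punctuation do not count.
    """
    return sum(
        len(str(value).encode("utf-8"))
        for annotation in annotations
        for attributes in annotation.values()
        for value in attributes.values()
    )


# A list of {type: {attribute: value}} entries, mirroring the draft shape.
annotations = [
    {"tv episode": {
        "episode": "The Vampires of Venice",
        "series": "Dr Who",
        "air date": "8 May 2010",
    }}
]

size = annotation_payload_size(annotations)
fits = size <= MAX_ANNOTATION_BYTES
encoded = json.dumps(annotations)  # one possible wire format, also an assumption
```

Under that counting rule the Dr Who example uses only a small fraction of the budget, which leaves plenty of room for several annotations on one tweet.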

The names of types and attributes will be allowed to grow organically from the community although they will have a broad set in use at the time of launch with support from a few partners. Attribute types and key name usage will be tracked and stats made available on dev.twitter.com to allow developers to browse existing usage implementations.

@codepo8 asked whether they had looked at RDF and microformats to see if they could adopt an existing standard for metadata. While those wouldn’t have been directly applicable, there is still a chance that they could reuse some of the taxonomy.

Lots of people asked if it would be possible to have metadata-only tweets so that you could filter out all those annoying Foursquare check-in tweets and those “I’m listening to” tweets. This should now be easily doable with the support of third-party client providers.

As with nearly all the previous new features released by Twitter, this may eventually be built into the twitter.com interface but it is initially planned as an API Platform feature.

Annotations will be made available to application developers first, as a preview, sometime within the next month.

Others

There were several more sessions that I particularly enjoyed, including Remy Sharp’s (@rem) presentation on a conference dashboard he built for the Chirp conference that works entirely with JavaScript. Aral Balkan (@aral) also chaired what turned out to be an introduction to and discussion of Twitter’s xAuth process (an adapted version of OAuth for desktop applications), which was a penny-drop moment for me regarding the OAuth workflow for desktop applications.

I’m also keen to follow up on @ketan’s proposal for a twitter:// protocol handler and to have a play around with the User Streaming API.

Thanks to the organisers, speakers and sponsors of WarbleCamp; I think it’s safe to say it was a well-deserved great success. A special thanks to the primary sponsors, Guardian Open Platform, oneforty, Yahoo Developer Network, PayPal X and O2 Litmus, and to the micro-sponsors for supporting us with food, drink and entertainment.
