David Singleton

Aug 31

Scrobbling Heatmap Calendar

underpangs

Created by Martin D., one of my ex-colleagues from Last.fm, this is an amazing visualisation of scrobbling activity by year, month, day of the week and even time of day.

They’re manually generated at the moment, with a set for Last.fm staff past and present. The slideshow version is particularly interesting for comparing different people’s listening habits.

Each year of data is arranged in a row and horizontally grouped into 12 blocks, one for each month… Months are organised by an inner grid, where data is arranged in seven columns for the days of week and 24 rows for the hours of day. Weekdays are aggregated so that e.g. all Mondays of a particular month end up in the same column.

Colour is a measure of relative intensity: grey → green → yellow → red. A light grey strip highlights the most active hours of the day across the entire period.
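The layout described above can be sketched in a few lines of Python. This is only an illustration of the bucketing idea, not Martin’s actual code: the function names and the sample timestamps are made up, and the real graphs may normalise intensity differently.

```python
from collections import Counter
from datetime import datetime


def bucket_scrobbles(timestamps):
    """Count scrobbles per (year, month, weekday, hour) cell.

    All Mondays of a given month share one weekday column, so they
    land in the same cells and their counts add up.
    """
    counts = Counter()
    for ts in timestamps:
        # weekday(): Monday is 0 ... Sunday is 6
        counts[(ts.year, ts.month, ts.weekday(), ts.hour)] += 1
    return counts


def relative_intensity(counts):
    """Scale every cell to 0..1 relative to the busiest cell."""
    peak = max(counts.values(), default=0)
    if peak == 0:
        return {}
    return {cell: n / peak for cell, n in counts.items()}


# Hypothetical sample data: two Monday-morning plays and one Saturday night.
scrobbles = [
    datetime(2011, 8, 1, 9, 15),
    datetime(2011, 8, 8, 9, 40),
    datetime(2011, 8, 6, 22, 5),
]
cells = bucket_scrobbles(scrobbles)
intensity = relative_intensity(cells)
```

Each intensity value could then be mapped onto the grey → green → yellow → red scale when drawing the grid.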

Despite the depth of information it’s stunningly readable and rather interesting. You can pick out real patterns quite easily, as Martin notes:

Frequently people’s graphs are detailed enough to provide a fairly good summary of big life changes. New jobs, busy weekends, holidays, the month when they bought an iPod, or picked up running again, or moved to a different timezone, … I found that showing these graphs to the people portrayed often stimulated interesting conversation about their habits and their choices.

From my graph I was able to spot these behaviours:

Aug 28

Last.fm Now-Playing information radiator

This is a little project I started a while back but only finished/cleaned up recently. It’s a simple information radiator that shows you what a Last.fm user is listening to right now.

Now playing screen

It was designed to let one person (or many people) know what’s playing at a glance. I’ve been using it while listening to Last.fm radio while getting ready for work, for shared office playback (it works really well here) and even on top of Spotify playlists at parties. You can stalk along to someone else’s listening too, for the curious.

Dev/ops have made heavy use of radiators/glanceables for a while (that’s a whole ‘nother post), but it’s nice to see more social and experimental uses being applied. With “second screens” getting more popular I’m excited to see what else will get made soon.

Try the hosted demo version or fork the project on Github.

Responsiveness

DSCF0009

It’s not strictly a responsive design (no media queries in sight) but it does work really well across multiple devices, only needing some font size adjustments on larger screens. Matthew Sheret has been running this on an old iPod touch sat on his desk for a few weeks, quite happily.

How it works

There’s no server-side requirement: everything is HTML/CSS/JS, with no PHP or Python dependency, which means it’s super easy to use locally. You can clone it from GitHub and use it immediately, with no configuration. It includes a tiny Last.fm API client written in JavaScript.

Now-playing is part of the Last.fm scrobbling API and is supported by most clients. It’s sent at the beginning of a track rather than at the end (as a scrobble is). Because there’s no realtime API I’m polling with a tiny JavaScript API client. By default it polls on a 5 second interval, which feels pretty snappy and sits well within the TOS (~1 req/sec max).
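The polling approach can be sketched roughly as below. The project itself is JavaScript, so this Python version is only an illustration. It assumes the data comes from a recent-tracks style response in which the current track carries a nowplaying flag; that response shape, along with the function names and the injected fetch callable, is an assumption to check against the Last.fm API documentation rather than something taken from the project source.

```python
import time


def extract_now_playing(payload):
    """Return (artist, track) for the track flagged as now playing, else None.

    Assumes a payload shaped like a recent-tracks response (an assumption).
    """
    tracks = payload.get("recenttracks", {}).get("track", [])
    if isinstance(tracks, dict):
        # Treat a single track object the same as a one-item list.
        tracks = [tracks]
    for track in tracks:
        if track.get("@attr", {}).get("nowplaying") == "true":
            return track["artist"]["#text"], track["name"]
    return None


def poll_now_playing(fetch, on_change, interval=5.0, max_polls=None,
                     sleep=time.sleep):
    """Call fetch() every `interval` seconds and report changes.

    fetch is a hypothetical callable that performs the API request and
    returns the decoded JSON; on_change receives the new (artist, track)
    tuple, or None when playback stops.
    """
    last = None
    polls = 0
    while max_polls is None or polls < max_polls:
        current = extract_now_playing(fetch())
        if current != last:
            on_change(current)
            last = current
        polls += 1
        sleep(interval)
    return last
```

Passing the fetch function in keeps the loop independent of any particular HTTP client, and the 5 second default mirrors the interval mentioned above.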

Check out the source on GitHub and have a play.

Aug 11

Leaving Last.fm

Brand Ambassadors 06

8 years of scrobbling, 55,000 tracks and 4 years with the incredible Last.fm team, it’s been a blast, but it’s time to move on and do something new.

It’s been a really difficult decision to leave. Last.fm was quite literally a dream job for me and at times a family. I’m proud to have worked on a brilliant product with so many great people. I feel like I’ve grown up with Last.fm. It’s been, and will always be, a big and very happy part of my life.

I really can’t describe how much I’ve learnt and how much fun I’ve had, but some of the highlights include the ballpit, roof BBQs, Nerf wars, big releases, office music, stats and challenges, and that’s still just a tiny fraction.

What next? Well, a long, and probably wet, weekend camping in Wales. Chilling out a bit and relaxing.

Jul 03

Facebook photo facial recognition

[These are old, but I’m clearing my drafts queue]

Last Autumn a group of us visited Berlin, renting a big apartment, exploring the city and taking lots of photos. I don’t upload photos to Facebook, I’m more of a Flickr guy, but it was interesting to watch Becky upload hers and compare the process.

The most striking, impressive and creepy difference was that Facebook seems to do facial detection: it identifies faces, then tries to match them to your friends. I can see why they do it, more tagging = more engagement, but the unintended matching of drawn faces (and more so to real people) is pretty weird.

Berlin Wall mural of faces

Berlin wall face mural

Mural faces matched against a real person

Fiona matched with a wall face

More detected, but unmatched faces

Matched faces

I wonder how far this effect would stretch? Could Facebook start suffering from Pareidolia?

Jun 06

Artificial Scarcity and Entitlement

kapowaz:

The dawn of ubiquitous computing and communications has had a disruptive effect on numerous previously-stable business models. Those affected most are usually those which relied upon the scarcity of a resource, be that news, music or television programs.

Each one in turn has found itself confronted with a paradigm shift leading away from scarcity, and typically has failed to adapt to the new order that develops. What began with Napster in the late ‘90s has led to everything from news pay-walls to Digital Rights Management on movies and music. When these measures inevitably fail new legislation is lobbied for to plug the gaps, but by that point it’s too late to change perceptions: the resource is manifestly no longer scarce.

Once this becomes apparent to the general populace, the value of that resource tends to drop rapidly. Music once had an intrinsic value, but clearly it was tied to the scarcity of the medium — you could lend a CD to a friend and until they returned it you didn’t have that CD to play yourself.

Once the physical medium no longer played a part in that transaction and one could make a perfect copy at no cost it was only natural that people would share these copies with one another. Big media like to portray this as theft, but in reality, just as with the Replicators in science fiction, the scarcity has been replaced with abundance and with it the business models that rely upon it have become obsolete.

This sentiment isn’t the same as entitlement: it’s a natural reaction to artificial scarcity, and those that attack this reaction are simply struggling to rationalise what is in effect a significant transitionary point in civilisation. As Clay Shirky wrote in Cognitive Surplus, “When a resource is scarce, the people who manage it often regard it as valuable in itself, without stopping to consider how much of its value is tied to its scarcity.” There is no law written that states that every business model is entitled to exist in perpetuity.

In part, I disagree. The intrinsic value of music is in flux and is trying to find a steady state. It’s safe to say that value will be greater than 0, but what that value will be and how we get there I don’t know.

Anyway, I disagree that “[the] sentiment isn’t the same as entitlement”. In the case of Spotify it very much is. The scarcity, the limits they have recently imposed, will directly relate to the inherent expense of being a music streaming service. Setting aside royalties for a moment, the costs will be bandwidth, hardware, hosting & staff, which is still significant. Royalty costs, something outside their control, are very much higher still and will make up the lion’s share of Spotify’s expenses.

Fundamentally it’s a question of cost vs revenue: if the former is unsustainably higher then Spotify will cease to exist. They chose to reduce costs (the alternative being increasing revenue, though they’ve done this by proxy) by limiting the loss from their loss leader, ad-supported radio. The funny thing about “ad-supported” is that it rarely is: offsetting internet advertising (even heavier media ads) against streaming costs rarely turns a profit, but for Spotify it lets people try the service before buying a subscription. I know nothing about the internal policy of Spotify but I imagine that too many people didn’t transition to the subscription model and “free” was being used in an unsustainable way.

Fundamentally I can’t understand the “it should be free” mentality. Spotify are clearly reacting to market pressure.

Perhaps the common online model of loss-leading to gain users is to blame, setting a poor expectation. A common difficulty for sites/applications looking to charge is that tacking a subscription package on to an existing product is hard. In some ways it’s better to charge from the start, better managing expectations and generally avoiding the “everything must be free” crowd altogether. Side note, this has been the theme of a number of discussions I’ve seen around pricing: don’t be afraid to value what you’ve built.

Ultimately why should you pay for Spotify? You can go download torrents easily (actually, most people can’t but we’ll ignore that here), you can listen to songs on YouTube, getting music isn’t that hard. But Spotify make it much easier; the service they’re providing is utility and ease of use. At a party you don’t want to be dicking about with a torrent client, or searching YouTube to play a track, you want to have a nice interface to all the music ever and treat it like a jukebox. The hard work Spotify are doing is making that experience good, that’s what you should be paying for.

The true unreasonableness of the entitled group is that they don’t have to use Spotify. Don’t want to pay? Go pirate it instead, it’s cool, have fun. But some people feel entitled to the ease of use provided by Spotify. The scarcity you are paying for is not the music itself, but the way it is made available.

As a side note, that last point is more dangerous to Spotify (or any other streaming service) than anything else: you risk becoming a commodity, a service layer that is swapped as easily as a real world utility provider. That makes it possible for someone else to come along and do it better, leaving little incentive for your users to stay with you. But ultimately this will happen, I fear. The entire online music industry has been an act of leap-frogging for a decade; each new company makes a little more progress than the last (which usually suffers a sad fate).

Case in point, the royalty deals made with Spotify will have been an improvement on those made with Last.fm, and whatever follows Spotify will be better again. This ties in with the value of music approaching a steady state, but until we get there the change is going to be slow going, with many small steps - short of a revolution in which the major labels (and large indies, and aggregators) all crumble, which is a popular idea, but an unlikely one.

As a second side note, I couldn’t help but laugh at some of the article’s comments. The ignorance is unsurprising but still enraging. One sensible comment points out that royalty rates have increased significantly over time, but then makes this utter blunder:

….then they could have negotiated for higher ad fees

I mean really. The solution is to ask advertising companies for more money? You don’t think perhaps ad revenue is already at market rate, that there is no room to say “can we have a bit more please”?

Apr 28

Migrating scrobbling authentication to 2.0

I received this email a few months ago, and it made me smile.

Anthony from The Hype Machine here. (http://hypem.com) Good news! We’ve just connected your Hype Machine account to Last.fm using a new secure method recommended by their team (it’s similar to OAuth, if you want to get technical)! To do this, we used the Last.fm username and a scrambled version of the password you’ve given us before. We’ve now deleted the scrambled version of your password from our database for extra security - with this new authentication method, we no longer need it! Have a great week!

Hype Machine is one of the many music sites that allow you to scrobble your listening there to Last.fm. This is done through our Scrobbling protocol, a part of our API, in fact the same way the official Last.fm clients (and even the website) record what you listen to. We’re proud to eat our own dogfood.

Scrobbling is core to Last.fm, each track is a chunk of attention data. It tells us more about what you like, building better recommendations of music, events, etc. The more you scrobble the better it gets, so the more 3rd parties scrobbling the better. “If it doesn’t scrobble it doesn’t count”, as Stefan says.

Having slowly evolved over 8 years the original scrobbling protocol was always a little complex and not very developer friendly. That’s something we’ve tried to improve with Scrobbling 2.0, which is a revolution, rather than evolution of the protocol.

Short history of scrobbling aside, I’m going to talk about how a 3rd party developer can migrate user authentication from the old scrobbling protocol to the new. The two systems use different authentication mechanisms: Submissions 1.3 requires a username + password, while Scrobbling 2.0 uses the same OAuth mechanism as our API.

Having a 3rd party store your Last.fm credentials is not great. Despite the protocol requiring the password to be hashed (rather than plaintext) it still increases the risk of your account being compromised, by a malicious 3rd party or just a careless one that accidentally exposes data (which happens more than anyone would like).

An advantage of OAuth style authentication is that it’s token based, meaning that a 3rd party will never need to ask for a user’s password. To get a token they direct the user to our OAuth endpoint on Last.fm where the user chooses to allow or deny the 3rd party. If allowed then a token is sent back to the 3rd party giving them access+write data for that user. Another advantage is that the user (or Last.fm) can easily revoke access for the 3rd party by deleting that token.

So, all in all OAuth is more secure and offers more control, a “Good Thing”. However, there is a problem for users of the old scrobbling protocol. In order to use Scrobbling 2.0 you need an OAuth token for a user, which is not something they’ll already have. One option would be to send all of their users through the OAuth authentication process and collect each token, but this sucks for everyone. It’s a big pain for the 3rd party developers (writing a new auth flow, maintaining two scrobbling protocols), the users (getting asked to allow something they’d already allowed) and for Last.fm (probably resulting in fewer people scrobbling because of the fuss). Not ideal.

A few months back one of our partners asked about this problem and we were able to come up with a novel solution that sidestepped that complexity. It relies on two things, 1/ Old-school scrobblers already store usernames and (hashed) passwords, 2/ We offer a second kind of API authentication, which exchanges a hashed username and password for an OAuth token, without direct user interaction.

So, having just said how much better OAuth based authentication is, why do we offer an auth flow that circumvents it? Basically, user experience. The OAuth flow on mobile generally sucks: often a user won’t be logged in on a mobile device, or a 3rd party mobile app will have difficulty capturing the auth token. Sometimes a level of security has to be sacrificed to make something more usable. It’s also not all bad: unlike the old scrobbling protocol, the 3rd party doesn’t need to store the username and password, only use them once to get an auth token, which is still revocable by the user. It’s not ideal, but it’s an improvement.

Anyway, that authentication method requires the 3rd party to provide the username and password (in a hashed form), which older scrobbling clients already have - no need to ask the user. They can use the existing user credentials to get an OAuth token via mobile authentication. After that they can store the new token and purge the old and insecure credentials, which is exactly what Anthony described in the opening quote. No fuss for the user, some work for the 3rd party developer, but all of it automatable (and extractable in to a publishable, reusable library).

I’m not going to go in to any actual code, in part because the post is already very long, but also because with the mobile authentication documentation it’s quite straightforward. But here’s the gist of it:

// This is the token format mobile auth expects:
md5(username + md5(password))

// How to generate it from Submissions 1.x auth details
username = scrob1_username
password_hash = scrob1_password_hash

// Generate the token and hit up the mobile auth API method
mobile_auth_token = md5(username + password_hash)

Bosh.
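For anyone who wants something runnable, here is the same gist as a minimal Python sketch. The helper names and the example credentials are made up for illustration, and it only covers building the token, not the API call that exchanges it for a session.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 digest of a string, encoded as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def mobile_auth_token(username, password_hash):
    """Build md5(username + md5(password)) from stored Submissions 1.x details.

    password_hash is the already-hashed password the old scrobbler kept,
    so the plaintext password is never needed.
    """
    return md5_hex(username + password_hash)


# Hypothetical stored credentials from an old-style scrobbler.
stored_username = "example_user"
stored_password_hash = md5_hex("example password")

token = mobile_auth_token(stored_username, stored_password_hash)
```

Once the token has been exchanged for a session via the mobile authentication method, the stored password hash can be deleted.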

Jan 06

Erlang: For an absolute beginner

Last night I spent 30 minutes having a play with a new (to me) programming language, Erlang.

It’s a language that will stretch my brain a bit, teach me to think about programming in a different way, and I know some folks using it (mostly Smarkets and IRCCloud) whose brains I can pick in the pub.

These are my quick notes from an Erlang newbie, hopefully useful to someone else getting started who just wants to install it, write Hello World and go from there.

Installing Erlang on OSX

This is incredibly easy with Homebrew (a package manager for OSX). As simple as:

brew install erlang

I expect there’ll be some cases where I need more than that stock install, but it’s good enough to get started. I haven’t tried it on my server, but it looks like a simple apt-get should do the trick, rather than the ball-ache of compiling from source.

Writing ‘Hello World’

The Erlang FAQ has the simplest example:

-module(hello).
-export([hello_world/0]).

hello_world() -> io:fwrite("hello, world\n").

Save that as hello.erl (the name of the file needs to match the module name), run the Erlang shell (run erl) and compile it like so:

$ erl
1> c(hello).

This will generate an Erlang bytecode file called hello.beam in your working directory. To run it, switch back to the Erlang shell and call the exported function on your module.

2> hello:hello_world().
hello, world
ok
3>

Simple. Though I’d also recommend looking at a more complicated ‘Hello World’ that is a more idiomatic and Erlang-y version.

What next?

I asked on Twitter for some good Erlang resources, books, blogs, tutorials etc:

I guess my next step is to pick a small project to build in Erlang. It always seems better to have a simple project to help pick up a language.

At the moment I think I’m going to work through some of the Project Euler problems and get to grips with the basics. Then perhaps build a toy web server, which has the added bonus of grokking more of the HTTP spec.

And finally…

It wouldn’t be a beginning Erlang blog post without the obligatory “Hello Joe” from the brilliant Erlang: The Movie

Dec 21

Tech Hub misrepresents Silicon Roundabout

Yesterday I had a brief conversation on Twitter with Elizabeth Varley of Tech Hub about their claims of 700% startup growth in the Old Street area. I’d suggest reading the original article and the Twitter conversation first. I was convinced to write up some broader thoughts by an excellent post by James Darling, so that is worth reading too.

It may seem pedantic to quibble over a number, but it is important when the number is in the headline of your article, and more importantly gets repeated by others and even Government. Whilst this may be a common tactic in PR it’s still fundamentally misleading and irresponsible on their part.

To get this % increase they claim two sources:

  1. A “crowd sourced” map thrown together by Dopplr shortly after the (joke) term ‘silicon roundabout’ was coined.
  2. Further research by Tech Hub based on a detailed ethnographic survey of the area, done over a period of months, long after the term had stuck and more companies were eager to be associated.

Now, they don’t claim the research behind the number is scientific (which is fair) but you shouldn’t use precise figures that appear to be so. Use appropriately unscientific terms like “significant growth” or some other hand-wavy term instead.

That’s not to say the area hasn’t seen growth, but 700% is an entirely misrepresentative number, to the extent that using it is essentially making up a number to suit a purpose.

Most of all, despite being very careful not to explicitly say it, Tech Hub aligns themselves with this growth, suggesting they are in some way associated with or responsible for it. Something I personally don’t believe to be true.

Recent events make me worry that Tech Hub have become the de-facto voice of “silicon roundabout”, despite being so new to the area, and rather different to the many existing co-working spaces. To me at least, this seems to be because they are entrepreneur and business focused, they’re not developers. Of course they’re going to be more willing to, and better at, speaking to the press and promoting themselves.

James does a much better job explaining that last point than me. I could happily write a few more hundred words on that alone, but I won’t tonight. I do feel compelled to make one point:

Tech Hub does not represent me, or the whole of the area. It worries me that they appear to do so.

Nov 06

The importance of virtualised development environments

Gareth Rushgrove just wrote a great blog post entitled Why You Should Be Using Virtualisation, which puts forward a strong argument for using virtualised development environments.

The crux of the argument is that you should be testing and running your code in an environment as close as possible to production, and that virtualisation is the easiest way to do that. There are other benefits, but this is the most important by far.

To quote Gareth:

But if you’re running those tests against code executing on different hardware, on a different operating system, with different low level libraries or a different web server version or a different database server then you are not going to catch all the problems. If you take this to an extreme then you can only get rid of all of these problems by giving each developer a full production stack of their very own.

A short example of this: A few months back I spent the better part of a day tracking a bug that was happening in production but was utterly unreproducible in our development environment. The cause was a difference in the Thrift setup between the two - one was running a C extension while the other was a native PHP extension - and one had a bug.

It’s not important what the bug was, but that if the two had been in sync then the problem would have been caught much quicker and more easily.

Current work setup

At Last.fm each web developer has their own hosted and internally routed Virtual Machine, all based on the same VM image, which matches 99% of our production stack. There are still the occasional niggling 1% problems and mismatches, but each time we encounter one it gets fixed in the original VM image, so it’s a step closer to being fixed forever.

In contrast, there was a period where we were running differing versions of Apache and PHP compared to live (and different again to some non-production, non-development internal servers). Believe me when I say: this causes no end of problems, avoid it at all costs. It will sap developer time and enthusiasm faster than you’d think. As happy as many developers are playing at sys-admin, the majority would much rather be writing code and making stuff, not fighting workflow.

Restore and rollback

Another benefit of virtualisation is that you can get a new version of your VM running very quickly. This saves an awful lot of time for a new hire, and if you lose your VM for any reason you can generate a new VM or restore an archived version incredibly easily.

Due to a disk corruption I lost my development VM a few months ago. Frustrating as it was, I was able to get working again in under an hour. A new production-parity VM was generated and I pulled in code from our source repository, restarted Apache and away I went. All I lost was a (small) amount of uncommitted code and a few shell scripts, which did teach me to treat a VM as a throwaway environment: commit or back up anything you would mind losing.

Downsides

I can understand hesitation about setting up a production-like environment for dev, especially for the awkward medium sized projects (bigger than small, but not a serious endeavour), but it can give you some pretty big wins. The earlier you encounter bugs, the quicker you fix them, and if disaster strikes you can resurrect your environment easily. I think it also forces you to be more controlled when adding dependencies in your production environment.

Of course, it’s not a silver bullet: you’ll still encounter weird production vs dev bugs, and there’ll always be differences. It’s not just load and the real world; it’s also likely you’ll be missing some configuration or software, routing access or even domain variance that only production has.

To test against production as much as possible we use a few production web servers, pulled from the main pool, which are configured to run candidate releases of our code base. Developers and testers can set a cookie that our load balancer will use to route them to the QA servers. This allows us to test an unreleased version of the site using the exact hardware/software we use in production, that’s hosted in the same data centers and addressed by the real domain.

You can’t get any closer than that without putting the code live.
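The cookie-based routing described above boils down to a very small decision, sketched here in Python. This is not our load balancer configuration: the cookie name, its value and the pool names are all hypothetical, and a real load balancer would express the same rule in its own config language.

```python
PRODUCTION_POOL = "production"
QA_POOL = "qa"
QA_COOKIE = "use_qa_servers"  # hypothetical cookie name


def parse_cookie_header(header):
    """Turn a raw Cookie header into a dict of name -> value."""
    cookies = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:
            cookies[name] = value
    return cookies


def choose_pool(cookie_header):
    """Route to the QA servers only when the opt-in cookie is set."""
    cookies = parse_cookie_header(cookie_header or "")
    if cookies.get(QA_COOKIE) == "1":
        return QA_POOL
    return PRODUCTION_POOL
```

Everyone without the cookie keeps hitting the normal pool, so testers can opt in per browser without affecting real traffic.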

Sep 28

New Twitter favouritism

Twitter have been slowly rolling out access to its new web interface, to much discussion, arguing and complaining - not to mention recent security issues.

Putting all that to one side the feature of #newtwitter I’m really fond of is showing your most recently favourited tweet.

New Twitter

Favouriting is an underrated feature on Twitter, though it’s slightly at odds with the concept of a Retweet and requires the use of external tools to get the most out of it. It’s not surprising it’s used so infrequently.

Displaying your most recent favourite gives you an incentive to favourite more. Primarily it reminds you the feature exists and that you’ve used it. The temporal nature of a tweet also reminds you of the time since you last favourited something. That’s good motivation to favourite something new.

It’s a minor detail amongst a massive UI overhaul, but I think it’s a clever and subtle way to encourage more use of the feature.