David Singleton
JSON, Octal Numbers & Validation

Now it’s launched in London, I’ve been playing around with the Foursquare API. While it’s not the best API I’ve come across, it gives you reasonable access to their data, so I’ve been pretty happily building some small tools with it.

I hit a rather unusual bug with JSON parsing for a Foursquare venue, in this case the Bricklayers Arms (the home of Pub Standards). The JSON response looks something like this:

{
    "venue": {
        "id": 145975,
        "name": "Bricklayers Arms",
        "address": "31 Gresse St",
        "city": "Camden Town",
        "state": "Greater London",
        "zip": "W1T 1",
        "phone": 02076365593,
        "geolat": 51.5176421,
        "geolong": -0.1334817,
        "stats": {
            "checkins": 0 
        }
    }
}

Looks pretty reasonable, right? But my JSON parser choked on this and called it invalid. Running it through JSON Lint gives a bit more information, but is still a bit vague:

syntax error, unexpected TNUMBER, expecting ‘}’ at line 9

Let’s see, line 9 is the phone number key/value pair. That seems like it should be valid. I mean, the leading zero is odd but… oh. You may or may not remember that the standard notation for octal numbers is a leading 0; it’s not something you run into very often.

So why is an octal number invalid? Because the JSON spec doesn’t support octal numbers: its number grammar doesn’t allow a leading zero in front of other digits, and parsers aren’t compelled to accept one either. Some may, but this is probably more luck than judgement - a side effect of loose typing.
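As a quick illustration of that strictness, here is a minimal sketch using Python’s standard json module (not the parser I was actually using). The helper name is_valid_json is made up for this example, and the sample strings are cut down from the response above.

```python
import json


def is_valid_json(text):
    """Return True if the standard library parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Cut-down versions of the venue response: one with the bare
# leading-zero number, one with the phone number quoted as a string.
bad = '{"phone": 02076365593}'
good = '{"phone": "02076365593"}'

print(is_valid_json(bad))
print(is_valid_json(good))
```

The first string is rejected because the parser reads the 0 as a complete number and then finds more digits where it expects a comma or closing brace, which is much the same complaint JSON Lint made.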

I’ve reported the bug to the Foursquare team and it sounds like it’ll be fixed shortly (as part of an API rewrite). In the meantime I’ll be using a dirty regex to quote the offending number before parsing.
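For the curious, a workaround along those lines might look something like the sketch below. It is written in Python for illustration and is not the exact regex I used; the pattern and the function name quote_leading_zero_numbers are assumptions. It really is dirty: it will also rewrite anything inside a string value that happens to look like a colon followed by a leading-zero number.

```python
import json
import re

# Hypothetical pattern: a colon, optional whitespace, then a run of
# digits starting with 0, followed by a comma or closing bracket.
# A lone 0 and decimals such as 0.5 are left untouched.
LEADING_ZERO_NUMBER = re.compile(r'(:\s*)(0\d+)(?=\s*[,}\]])')


def quote_leading_zero_numbers(text):
    """Wrap bare leading-zero numbers in quotes so they parse as strings."""
    return LEADING_ZERO_NUMBER.sub(r'\1"\2"', text)


raw = (
    '{"venue": {"id": 145975, "phone": 02076365593, '
    '"geolat": 51.5176421, "geolong": -0.1334817, '
    '"stats": {"checkins": 0}}}'
)

venue = json.loads(quote_leading_zero_numbers(raw))["venue"]
print(venue["phone"])
```

Quoting the value rather than stripping the zero keeps the phone number intact, which is what you want for something that was never really a number in the first place.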

If there’s one thing to take from this it’s how important validation is, even if it does nothing more than prove it’s not your bug. Looking at the original response and trying to work out why it didn’t parse without validation would have been very painful. It’s not a bug easily caught by a human; you need rigorous machine testing.

Testing the Tumblr API

Just testing the Tumblr API. I’ve been pulling content into my personal site (http://dsingleton.co.uk) via the JSON API using the search parameter.

Looking up by slug for each article works for all except Open Tech 2009. For some reason it doesn’t seem to be in the search index, as it’s not even picked up when searching for the term “open”.

Possible explanations

  1. Large articles aren’t indexed for the search API - My Open Tech post was ~3,500 words.
  2. That article is missing from the index due to a bug, possibly a caching error.

The former sounds a little unlikely. While Tumblr is intended for short, snappy posts, I’m sure there are much longer posts, and ~3,500 words is not that big.

Equally, I’m not sure it’s a caching bug, as I’ve edited it a couple of times in an attempt to invalidate the cache. No joy there.

I’ve contacted one of the Tumblr developers, hoping they can shed some light on the problem - or at least explain the search nuances I may be missing.

I also asked for a new API parameter, getBySlug. I wouldn’t need to be mucking around with search if there was a simple method to look up an article by its public slug (not the semi-private internal post ID).

A short-term fix for me will be caching a map of slug => id for the last 50 articles I added to my tumblelog (the most you can get from the API in one page) and using that to do the API lookup. It feels pretty kludgey to require stateful information between requests to do this (or worse, to make two requests per page).
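The map itself is about as simple as it sounds. Here is a rough sketch in Python, assuming the posts have already been fetched and reduced to dictionaries with "id" and "slug" keys. Those key names, the function names and the sample IDs and slugs are all made up for illustration and are not taken from the Tumblr API.

```python
def build_slug_map(posts):
    """Build a slug => id lookup from a page of already-fetched posts."""
    return {post["slug"]: post["id"] for post in posts}


def post_id_for_slug(slug_map, slug):
    """Return the cached post ID for a slug, or None if it isn't cached."""
    return slug_map.get(slug)


# Made-up sample data standing in for one page of recent posts.
recent_posts = [
    {"id": 101, "slug": "open-tech-2009"},
    {"id": 102, "slug": "json-octal-numbers-validation"},
]

slug_map = build_slug_map(recent_posts)
print(post_id_for_slug(slug_map, "open-tech-2009"))
```

A miss returns None, at which point the page would have to fall back to refreshing the map, which is exactly the second request I would rather avoid.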