UKGovcamp 2012 – The Ofsted Project

It’s UKGovcamp time again and this year is a little different. It runs over 2 days, with the Friday being the traditional unconference. The Saturday event is a hackday of sorts. And the organisers are looking for suggestions of projects to be developed. And I have a good idea.

Loads of open data has been published in the schools arena in recent years, with the initial Edubase data release being a key part of the launch. And last year the DfE released a new school comparison site (together with all the comparison data !!!) that does a really good job.

This means we now have 3 government-sponsored schools data sites: Edubase, the comparison site and the DirectGov site. And there is one thing that’s missing from all of them: the judgement of the Ofsted inspectors during their last visit. The reason why that is missing is worth discussing, but not right now.

Suffice to say I think it would be useful for prospective parents (and others) to see at a glance how Ofsted view each school especially in relation to its neighbours. So I propose that we build one on Saturday.

And it’s not as if the information isn’t available. All of the sites mentioned above provide links from each school’s page to an Ofsted home page for that school, listing the inspections of that school. But to find out the judgement of the inspectors – a very important piece of metadata – you need to open a PDF file and read through the report. If you are not familiar with Ofsted reports this is not an easy task. Ofsted do actually store this metadata somewhere in their internal databases, but they don’t expose it on their website. Instead, it is published in a series of Excel files which Ofsted release on a regular basis and have pointed to in response to a number of FoI requests.

The problems with these spreadsheets are twofold:

  1. the spreadsheets are poorly structured for data access
  2. they have a termly lag-time i.e. they are published termly in arrears

And this leads to the full proposal ….

The Pitch

I want us to build a prototype web service that will allow 3rd party sites (including sites) to grab some very useful Ofsted information in format(s) suitable for web use and display it on their site. The information to include:

  • the date of the last Ofsted inspection
  • the overall judgement of the inspector(s) on the school at that date
  • a link to the school’s Ofsted homepage, in order to provide context to the user if they want/need it

The second part of this project is for the non-geeks. It’s a policy/engagement issue. I’d really like to get some talking heads to put together some ideas for how we can engage with Ofsted and persuade them to do some or all of the following:

  • take over the hosting and publishing of the service
  • reduce the latency of the published data by including ALL recent inspections in the service
  • publish the source data in more open and accessible formats rather than the current cumbersome Excel files

There’s something there for everyone – developers, data bods, policy wonks, as well as the persuaders. I look forward to seeing you there.


I’ve hacked the spreadsheets previously and cobbled together a combined datasheet of all the relevant inspections from 2000-2009 and uploaded it to my public dropbox folder. That can be used as the source data for a prototype.
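
As a rough illustration of how a combined datasheet could feed the prototype, the sketch below picks the most recent inspection for each school. The column names (urn, inspection_date, overall_judgement) and the sample rows are assumptions made for the example, not the layout of the actual spreadsheet.

```python
import csv
import io
from datetime import date

# Hypothetical sample rows; the real combined datasheet may use different
# column names and date formats.
SAMPLE_CSV = """urn,inspection_date,overall_judgement
123456,2004-03-10,Satisfactory
123456,2008-11-21,Good
654321,2006-05-02,Outstanding
"""


def latest_inspections(csv_text):
    """Return a dict mapping each URN to its most recent inspection row."""
    latest = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Parse the ISO date so that rows compare chronologically.
        row["inspection_date"] = date.fromisoformat(row["inspection_date"])
        current = latest.get(row["urn"])
        if current is None or row["inspection_date"] > current["inspection_date"]:
            latest[row["urn"]] = row
    return latest


if __name__ == "__main__":
    for urn, row in latest_inspections(SAMPLE_CSV).items():
        print(urn, row["inspection_date"], row["overall_judgement"])
```

Keeping only the latest row per URN matches the pitch (the date and judgement of the last inspection), but a real service might also want to keep the history.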

I’ve also had a couple of thoughts on data structure, and I’m thinking that the returned data should probably be available in XML, JSON and perhaps (X)HTML.

An XML snippet might be something like:

<inspection urn="123456">
     <comment>This information should be interpreted in the context of the full report which is available from the Ofsted page below</comment>
</inspection>
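
To make the idea of several output formats a bit more concrete, here is a minimal sketch that serialises one inspection record to both JSON and XML. The inspection element, its urn attribute and the comment text come from the snippet above; the other field names (date, judgement, link), the sample values and the example.org URL are placeholders I have made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

COMMENT = (
    "This information should be interpreted in the context of the full "
    "report which is available from the Ofsted page below"
)

# Hypothetical record: field names, values and URL are placeholders only.
EXAMPLE = {
    "urn": "123456",
    "date": "2008-11-21",
    "judgement": "Good",
    "link": "https://example.org/ofsted/123456",
}


def to_json(record):
    """Serialise a record, plus the standing comment, as a JSON object."""
    return json.dumps({**record, "comment": COMMENT}, indent=2)


def to_xml(record):
    """Serialise a record as an <inspection> element keyed by its URN."""
    root = ET.Element("inspection", urn=record["urn"])
    ET.SubElement(root, "comment").text = COMMENT
    for field in ("date", "judgement", "link"):
        ET.SubElement(root, field).text = record[field]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_json(EXAMPLE))
    print(to_xml(EXAMPLE))
```

Building both formats from the same record should keep them in step, and an (X)HTML view could be added as a third function in the same way.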

The recent launch of has shown that there is much demand for raw data out there. But despite there being a fascinating array of educational data released, there are a couple of missing datasets.

A recent FoI request of mine revealed that Ofsted have for many years been publishing the data from their inspections. This information seems not to be widely known, but is very welcome. The published format (Excel files) and presentation are not necessarily to my liking, but at least it’s available.

The second missing dataset is an odd one. The School Revenue Balances data is published annually by the DCSF and purports to show how much money is being hoarded by naughty headteachers and school governors, instead of being spent on the pupils. It doesn’t, at least not in the way presented. But that argument is for another post. It’s still a very useful dataset, and a bit of reverse engineering can give data such as individual school budgets going back to the millennium.

There are a couple of small problems with this data as presented. Firstly, schools are not identified by their URN, a primary key used in most other educational datasets. Instead the data uses a 2-field key of LocalAuthority-EstablishmentNumber. This is the way schools were previously identified, but it has been superseded by the URN for quite some time now. The other issue is that the data is not normalised, a database design term relating to how data is structured. But this is inherent in the data format (Excel) and the presentational requirements.
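
One way to bridge the two identifiers is a lookup table from the 2-field key to the URN, sketched below. The codes are invented, the table would in practice have to be built from a register that carries both identifiers, and the zero-padding of establishment numbers to four digits is an assumption about how spreadsheets tend to mangle them.

```python
# Hypothetical lookup from (local authority code, establishment number)
# to URN. All codes here are made up for illustration.
URN_BY_LA_ESTAB = {
    ("999", "1234"): "123456",
    ("999", "5678"): "654321",
    ("999", "0042"): "111111",
}


def urn_for(la_code, estab_number):
    """Return the URN for a LocalAuthority-EstablishmentNumber key, or None."""
    # Spreadsheets often store codes as numbers, so convert to text and
    # restore any leading zeros (assumed four-digit establishment numbers).
    key = (str(la_code).strip(), str(estab_number).strip().zfill(4))
    return URN_BY_LA_ESTAB.get(key)


if __name__ == "__main__":
    print(urn_for(999, 42))
    print(urn_for("999", "5678"))
```

Returning None for unknown keys leaves it to the caller to decide how to handle schools that cannot be matched.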

But that does mean that any meaningful queries of the data need to be preceded by some in-depth, hardcore data manipulation.
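
As a small taste of that manipulation, the sketch below reshapes a wide, presentation-style table into one row per school per year, which is a step toward a normalised layout. It assumes, purely for illustration, that the sheet holds one balance column per financial year; the column names and figures are invented and the real files may be laid out quite differently.

```python
# Hypothetical wide rows: one balance column per financial year.
WIDE_ROWS = [
    {"la": "999", "estab": "1234", "balance_2007": 15000, "balance_2008": 22000},
    {"la": "999", "estab": "5678", "balance_2007": -3000, "balance_2008": 1000},
]


def normalise(rows, prefix="balance_"):
    """Turn one wide row per school into one long row per school per year."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column.startswith(prefix):
                long_rows.append({
                    "la": row["la"],
                    "estab": row["estab"],
                    # The year is taken from the column name suffix.
                    "year": int(column[len(prefix):]),
                    "balance": value,
                })
    return long_rows


if __name__ == "__main__":
    for item in normalise(WIDE_ROWS):
        print(item)
```

With the data in this long form, questions such as “what was this school’s balance in a given year” become simple filters rather than column hunts.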

Both datasets are valuable and deserve inclusion on I’m working on improving the format and presentation and I’ll blog that and make it available here when I’m done.

