Just a spacing test

This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers. This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers. This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers. This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers. This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers. This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers.

This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers. This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers. This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers. This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers. This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers. This is just a short test to see the effect that double spacing after a full stop, aka the JRM style guide, has visually in modern browsers.

Installing nifi on a macbookpro

I had a recent recommendation to try out nifi, and having a quick peruse of the project, I decided the simplest thing for testing would be to whack it on my MBP and work from there.
So I went to the install docs and noted the recommendation to use brew on the mac. I had brew installed for testing something previously but I hadn’t used it in a while. But it was worth a try.
My brew installation was out of date, so it did a couple of updates, before it was happy to
brew install nifi
The response I got was odd in that it didn’t seem to think that java was installed. I say odd because, although it’s not something I use regularly, I have used it in the past and so I believed it was installed. But the MBP was newish and the system had been ported from my previous MBP, so it was possible that java didn’t survive that move, especially with Oracle’s current attitude to java.
Brew suggested I install adoptopenjdk, which I did, and then brew was happy to install nifi.
But nifi didn’t start, and the startup wording in the documentation wasn’t helpful:
“For Linux and OS X users, use a Terminal window to navigate to the directory where NiFi was installed.”
As I used Brew, I hadn’t the foggiest notion where Brew had put the software. I could guess. I rooted around in /opt and /var without success.
Going back to the documentation, I tried just downloading the tarball, extracting that and running it, but it didn’t work. So I joined the mailing list. The initial advice was to the point. The dependency is on Java 8. The default adoptopenjdk was v12. So nifi was not going to work. Additionally, the $JAVA_HOME variable was not set.
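For anyone wanting to check this before installing, here is a minimal sketch in Python of the version test I ended up doing by eye. It assumes you already have the version string to hand (the quoted part of what `java -version` prints). The function names are my own invention, and the default of 8 simply reflects the advice I was given at the time.

```python
import re


def java_major_version(version_string):
    """Return the major Java version from a string like '1.8.0_212' or '12.0.1'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognised Java version: {version_string!r}")
    first = int(match.group(1))
    second = match.group(2)
    # Java 8 and earlier report themselves as 1.x; later releases lead with the major number.
    if first == 1 and second is not None:
        return int(second)
    return first


def nifi_compatible(version_string, required_major=8):
    """True if the JDK matches the major version nifi wanted (8, per the mailing list advice)."""
    return java_major_version(version_string) == required_major
```

With this, a string such as '12.0.1' is flagged as incompatible, which is exactly the situation brew left me in.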
The recommendation was to install jenv to manage multiple versions of the jdk. This worked, although the documentation for installing jdk8 was a little tricky to find, but this was helpful.
The final hurdle was the instruction to navigate to the nifi directory. This wasn’t necessary as nifi was on the path, so simply executing
nifi start
worked, though it took me a while to check that it was working. I was fixated on fixing the JAVA_HOME error and actually wasn’t testing http://localhost:8080/nifi. Once I did (Doh!) the main nifi screen was visible.
  1. The java dependencies need to be clearer and need to be mentioned or linked to from the ‘Getting Started’ section
  2. The bin/nifi.sh start reference seems to be superseded by nifi start
Installation and packaging:
  1. The java dependency in brew needs to be addressed. If brew offers to install an incompatible jdk, that is a real issue.

Information Commissioner’s Office – List of Awareness Guidance documents

I made this FoI request to the ICO having come across a numbered awareness guidance document on the ICO’s site.

Their response was a little confused, I thought, and I decided to clarify it here in my own words.

Firstly, they say the series has been discontinued, although some of them are still in play. In other words, they decided for some reason to replace some of them and not call the replacements “Awareness Guidance”. Fine.

Secondly, there used to be an index and they pointed me to it in the National Archives. But the old index only points to archived versions of the guidance. So I copied the relevant portions here below, and have edited it so that the guidance that is still relevant is linked from here while the no longer relevant stuff points to the archive. Hopefully I will be able to find the replacement documents and link them from here, which might make it all a whole lot clearer.

So in the list below, all the links point at the National Archives EXCEPT where it says “ICO” after the name. With this latter set, the link points at the current latest version on the ICO site.

Detailed specialist guides



Links from FoI Request

Awareness Guidance 5
Awareness Guidance 9
Awareness Guidance 10
Awareness Guidance 12
Awareness Guidance 13
Awareness Guidance 14
Awareness Guidance 19
Awareness Guidance 26
Awareness Guidance 27

Guide to Port Entry – innovation in print

Guide to Port Entry IS Innovation of the Year 2013


I have been meaning to blog more often and really haven’t managed it. Good intentions and all that. But today I have a ‘real’ reason to blog so here goes. Last night at the PPA Digital Publishing Awards, our flagship publication Guide to Port Entry won the Innovation of the Year Award for 2013. Amid all the apps and websites and datafeeds of the UK digital publishing industry, a book was the most innovative entry. Sounds slightly mad, doesn’t it?

But it’s well deserved. The Guide has always been innovative, since our founder Colin Pielow put together the first edition from scraps of notes about ports he had been collecting. Back in 1971, no one was sure there would be a demand for such a book, but Colin’s hunch proved correct. The 2nd Edition in 1973 added port plans and in 1975, the 3rd contained detailed information about Soviet-era Russian ports, the first western publication to do so.

Over the years, we at Shipping Guides have further expanded the Guide, adding hundreds of new ports and terminals to each subsequent edition, and broadening the scope of the information published. The shipping industry has recognised the quality and depth of the data by continuing to buy each new edition as it appears on the shelves.

And in mid-June 2012, as we prepared to put the 2013-2014 edition into production, we made a decision which has proved that there is still life in print media, despite the gloom and doom that has been spread by so called experts. We created thousands of QRCodes linked to our findaport.com digital platform and added one to each port in Guide to Port Entry. And by doing so, we enabled every reader of our flagship book to have immediate access to the latest information we hold about any port of their choice, just by scanning a QRCode on their phone.

QRCode to link to findaport.com

QRCode scanners are available for most modern mobile phones, free of charge in most cases. They are simple and easy to use. QRCodes can have a variety of information embedded within them including website links, emails, contact details, and more. More importantly, they are a proven technology and they work.

By embedding QRCodes in each port entry, we have given new life to a vintage edition. In this 21st century, our best selling book can continue to be used as a top quality reference source without the fear of being somehow dated. The latest information is always just a quick scan away.

And last evening, the UK publishing industry agreed with us. They named Guide to Port Entry – QRCodes as the 2013 PPA Digital Innovation of the Year at a packed dinner and awards ceremony in the City of London. And I was proud to accept the award. It’s been 40+ years in the making but the old girl deserves the recognition.

UKGovcamp 2012 – The Ofsted Project

It’s UKGovcamp time again and this year is a little different. It runs over 2 days, with the Friday being the traditional unconference. The Saturday event is a hackday of sorts. And the organisers are looking for suggestions of projects to be developed. And I have a good idea.

There has been loads of opendata published in the schools arena in recent years with the initial Edubase data release being a key part of the data.gov.uk launch. And last year the DfE released a new school comparison site (together with all the comparison data !!!) that does a really good job.

This means we now have 3 government sponsored schools data sites, Edubase, the comparison site and the DirectGov site. And there is one thing that’s missing from all of them. The judgement of the Ofsted inspectors during their last visit. The reason why that is missing is worth discussing but not right now.

Suffice to say I think it would be useful for prospective parents (and others) to see at a glance how Ofsted view each school especially in relation to its neighbours. So I propose that we build one on Saturday.

And it’s not as if the information isn’t available. All of the sites mentioned above provide links from each school’s page to an Ofsted home page for that school, listing the inspections of that school. But to find out the judgement of the inspectors – a very important piece of metadata – you need to open a pdf file and read through the report. If you are not familiar with Ofsted reports this is not an easy task. Ofsted do actually store this metadata somewhere in their internal databases, but they don’t expose it on their website. Instead it is published in a series of Excel files which Ofsted release on a regular basis and have pointed to in response to a number of FoI requests.

The problem with these spreadsheets is twofold:

  1. the spreadsheets are poorly structured for data access
  2. they have a termly lag-time i.e. they are published termly in arrears

And this leads to the full proposal ….

The Pitch

I want us to build a prototype web service that will allow 3rd party sites (including .gov.uk sites) to grab some very useful Ofsted information in format(s) suitable for web use and display it on their site. The information to include:

  • the date of the last Ofsted inspection
  • the overall judgement of the inspector(s) on the school at that date
  • a link to the school’s Ofsted homepage, in order to provide context to the user if they want/need it
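To make the idea concrete, here is a rough Python sketch of how one inspection record with those three fields might be served in two formats. Everything in it is an assumption for illustration: the field names, the example URN, date, judgement and link are all made up, and a real service would need to agree these with Ofsted.

```python
import json
import xml.etree.ElementTree as ET


def inspection_to_json(record):
    """Serialise one inspection record (a plain dict) as a JSON string."""
    return json.dumps(record, sort_keys=True)


def inspection_to_xml(record):
    """Serialise the same record as XML, with the URN as an attribute."""
    root = ET.Element("inspection", urn=record["urn"])
    for field in ("date", "judgement", "link"):
        ET.SubElement(root, field).text = record[field]
    return ET.tostring(root, encoding="unicode")


# Hypothetical record: none of these values come from real Ofsted data.
example = {
    "urn": "123456",
    "date": "2009-03-12",
    "judgement": "Good",
    "link": "http://example.org/ofsted/123456",
}
```

A third party site could then request whichever format suits it and drop the date, judgement and link straight into a school’s page.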

The second part of this project is for the non-geeks. It’s a policy/engagement issue. I’d really like to get some talking heads to put together some ideas for how we can engage with Ofsted and persuade them to do some or all of the following:

  • take over the hosting and publishing of the service
  • reduce the latency of the published data by including ALL recent inspections in the service
  • publish the source data in more open and accessible formats rather than the current cumbersome Excel files

There’s something there for everyone – developers, data bods, policy wonks, as well as the persuaders. I look forward to seeing you there.


I’ve hacked the spreadsheets previously and cobbled together a combined datasheet of all the relevant inspections from 2000-2009 and uploaded them to my public dropbox folder. That can be used as the source data for a prototype.

I’ve also had a couple of thoughts on data structure and I guess I’m thinking that the returned data should probably be available in xml, json and perhaps (x)html.

An XML snippet might be something like:

<inspection urn="123456">
     <comment>This information should be interpreted in the context of the full report which is available from the Ofsted page below</comment>


Tech development in school

It is generally agreed that there is a real need to move forward with the coding4kids agenda now. We’ve done the talking, now is the time for action. So my action pledge is to develop an idea for structural change at the top of the educational foodchain.

My idea is to require that in every English school (the devolved governments can make their own structural change), one of the performance management (PM) targets for the headteacher will have a technological focus. Every year.

The governing body of each maintained school appoints 2 or 3 of its members as the PM committee whose role is

  • to set 3 PM targets for the Head for the current year
  • monitor and evaluate the Head’s progress towards those targets
  • make recommendations to the GB at the end of the year about salary uplift for the Head

If we can get government to recognise the tech deficit, then surely an overall programme of tech awareness, understanding and adeptness will begin to pay dividends. And our schools can be the cradle for this innovation, just like they were when that first BBC Micro arrived at the school reception 30 years ago. But we will need some subtle pressure and oversight to ensure it happens. So the Head’s PM rules need to be tweaked to give our headteachers one of the lead roles in driving the country forward.

The tweaking needs to ensure that at least one of the PM targets for every Head will have a tech development focus. But that doesn’t require us to turn every Headteacher into a geek. On the contrary, many schools will probably need to look at their own tech infrastructure and resources – including human resources – rather than at the tech curriculum. At least in the first instance.

So let’s outline some examples of possible targets as a starting point:

  • To undertake a tech skills and resources baseline assessment of the entire school community (pupils, staff, parents, local community, at school, at home, in public buildings) and to publish the results back to the whole community in a 21st century format within the current academic year
  • To partner with local businesses and community to start a Computer or Coding Club and ensure it appeals to a wide cross section of pupils, parents and staff before the spring half-term
  • To take school website maintenance in-house and reduce the cost of maintaining a web presence by 70% in the next financial year

Each of these examples is SMART. I’m sure the dev and activist communities can produce another 10 example SMART targets within 24 hours.

If the government can commit to making this change to the PM rules, then the quid pro quo from us would be to create a free support and advisory infrastructure for Heads and Governing Bodies to enable them to make this work. A starting point would be some simple wiki pages, but that will need to expand into a database of local geeks able and willing to lend a hand in devising appropriate targets and in measuring success. But the will exists to do that now, so let’s harness it now.

Expectations vs Offerings

Conversation overheard on NHS paediatric ward last evening:
15 year old boy: what’s the wifi password on this ward?
Nurse: what?
15yob: you know, the wifi? I can see it, but when I try to connect it’s asking me for a password. What’s the password?
Nurse: (slightly flustered) oh that! That’s not connected. You can’t use it. I’ll go and ask.
(and she scampers off – 15yob returns to phone game and then swaps over to his iPad)
(10 minutes later nursey returns looking triumphant )
Nurse: Well I have talked to the staff nurse, and she says that is only for the doctors, it’s not connected, and if it was you’d have to pay for it. So that’s that.
(and she turns and waltzes away)

15yob returns to his game.

A quick look around the ward shows that it is teeming with hi-tech kit. A PC, a PS2 and a PS3, a Wii, an Xbox (of incredible vintage), flat screen tv, old CRT for PS2, Blu-ray, and an array of remote controls. I can understand why a 15yob might expect to be able to connect to the wifi that he can see.

Let’s look a little more closely at his expectations of the NHS and compare it to the actual offerings.

15 year old boy: what’s the wifi password on this ward?

Expectation 1: there is a wifi connection – I should be able to use it
Expectation 2: The nurse will know the password and will probably give it to me

Wow! I don’t know about you, but I find it extremely uplifting that our teenagers (and future electorate) feel that if there is a wifi signal in a public place they should be able to connect to it. That tells me where society is moving to and that we will eventually have a tech enabled society.

That 15yob would expect the nurse to know the password is a little strange to me. I wouldn’t expect that. But I think the implied expectation, that ‘those in the know’ would know the password, is one that again uplifts me and gives me a sense of hope.

Offering 1: There is a wifi connection available in the ward. It is password protected. Not sure if anyone/anything is actually using it. One thing is certain. Patients currently can’t use it.


So where does that leave us? The NHS is not living up to the expectations of our kids, because it doesn’t understand what our kids want or expect.

So politicians, please don’t make grandiose plans for the future of the NHS when the organisation doesn’t understand the basics.

Semantic Hackday Notes

This blogpost is liable to change rapidly. It should be treated as a work-in-progress

I’m at the Bright Lemon/Kasabi Open Government hackday today and thought that perhaps a blogpost might prove to be a useful record of the day. After a set of round-the-table introductions, Leigh Dodds did a brief run through on Kasabi. This introduced the latest NHS datasets loaded onto Kasabi and their functionality.

Kasabi Default APIs

One of the most difficult things for me to get my head around was the 5 default APIs available for every Kasabi-hosted dataset. I have linked to the documentation pages for each of these:

I also will discuss each of them below as I start to use and understand them.

SPARQL Endpoint

The SPARQL Endpoint for each dataset will provide access to a SPARQL API Tester or Experimental API Explorer. It also links to any example SPARQL queries that have been defined for a dataset, e.g. List all UK Primary Care Trusts on some of the NHS data.
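Outside the browser-based explorer, a SPARQL endpoint is normally queried over plain HTTP, with the query text passed in a `query` parameter. Here is a small Python sketch of building such a request URL. The endpoint address is a placeholder of my own, not a real Kasabi URL, and `extra_params` is only there in case a service wants something like an API key alongside the query (an assumption on my part, not something confirmed here).

```python
from urllib.parse import urlencode


def sparql_get_url(endpoint, query, extra_params=None):
    """Build a GET URL that sends `query` to a SPARQL endpoint."""
    params = {"query": query}
    if extra_params:
        # Hypothetical: any additional parameters the hosting service requires.
        params.update(extra_params)
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint, for illustration only.
QUERY = "SELECT * WHERE { ?s ?p ?o . } LIMIT 10"
url = sparql_get_url("http://example.org/sparql", QUERY)
```

The resulting URL can be pasted into a browser or fetched with any HTTP client, which makes it easy to reuse a query once it works in the explorer.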

Dataset homepages

The homepage for each dataset shows a descriptive section above and, below, three tabs –

  • API – lists the APIs (including the default 5 above) available for a dataset
  • Explore – displays a list of ways that the data can be viewed (including the default “As linked data” layout – the linked data void description default URL)
  • Attribution – shows ways in which data or output can be attributed to Kasabi.

Getting started with SPARQL on Kasabi

In order to get started with a dataset you can edit/play with some of the sample queries for that dataset or, if there are no sample queries, try the following procedure (this procedure uses the CIA World Factbook dataset):
  1. Open the dataset homepage
  2. Open the Explore tab and click on the “Browse as Linked Data” link in a new tab
  3. This opens the ‘void description’ page for the dataset which is the default url for the linked data of the dataset and gives some very basic info about the dataset
  4. Go back to the API tab and click on the SPARQL Endpoint for the dataset
  5. Open the experimental API explorer for that dataset
  6. This allows writing some example queries

SPARQL Examples

On the ‘void description’ page, look for any definitions of vocabularies used. These will allow querying of the ‘types’ within the data. There are a couple of ways of doing this. Try this one – go to the experimental API explorer for the CIA dataset and type this in

PREFIX ns: <http://www4.wiwiss.fu-berlin.de/factbook/ns#>

SELECT ?s ?p ?o WHERE {
  ?p a ns:Country .
}

Running this will give a list of countries in the dataset. And this one will give the first 10 (‘LIMIT 10’) triples in the dataset

PREFIX ns: <http://www4.wiwiss.fu-berlin.de/factbook/ns#>

SELECT * WHERE {
  ?s ?o ?p .
} LIMIT 10

These queries both specify a single vocabulary – PREFIX ns: <http://www4.wiwiss.fu-berlin.de/factbook/ns#> – then define the data to be returned (‘?s ?p ?o’ and ‘*’) and follow that with the pattern that the data must match. Another useful SPARQL query would be the describe query. Try this:

PREFIX ns: <http://www4.wiwiss.fu-berlin.de/factbook/ns#>

DESCRIBE <http://www4.wiwiss.fu-berlin.de/factbook/resource/Ireland>

This query should be returning the equivalent of this CIA page about Ireland, or this page on the German source of the CIA data

The vocab used by Kasabi with this dataset can be seen on the Ireland page (http://www4.wiwiss.fu-berlin.de/factbook/page/Ireland). This allows us to see a list of all the instances of a particular triple object, such as this list of factbookcodes (these are in fact the FIPS country codes used by the US Federal government):

PREFIX ns: <http://www4.wiwiss.fu-berlin.de/factbook/ns#>

SELECT * WHERE {
  ?s ns:factbookcode ?p .
}

Another query might be to find what objects are defined in the dataset. This one does that:

PREFIX ns: <http://www4.wiwiss.fu-berlin.de/factbook/ns#>

SELECT DISTINCT ?o WHERE {
  ?s ?o ?p .
}

So, one of the next things you might like to do is to have that list as a separate URI. We can do that by creating an API on Kasabi.

Useful links

Note – GitHub:

There is a GitHub user called kasabi (github.com/kasabi/kasabi-xsl) whose repository can be used to add xsl files for transforming outputs in new APIs.

The “Michael Kay / Jeni Tennison / XML Summer School” Top XSLT Performance Tips

Michael Kay led a final afternoon Application Development Workshop at XML Summer School 2010, with Jeni Tennison in the front row.

The delegates were treated to a series of performance improvement tips from two of our leading XSLT practitioners.

I thought they were worth sharing more widely.

1. Use Keys

– No performance problem has ever been solved by NOT using keys.
– Learn to use <xsl:key> and key()

2. Don’t use preceding:: when you mean preceding-sibling::

– And don’t forget an index position i.e. preceding-sibling::p[1]

3. <xsl:variable/> – select vs value-of

– If you use <xsl:variable name="a"><xsl:value-of select="node-a"/></xsl:variable> then stop. This is a very expensive and unnecessary node creation operation.
– Use <xsl:variable name="a" select="node-a"/> instead.

4. Arithmetic

– In XSLT 2.0 think about using DOUBLE arithmetic, instead of DECIMAL arithmetic, especially on joins.

5. On the fly stylesheets

– If generating on-the-fly stylesheets, e.g. in XProc pipelines, consider compile-time performance issues, which in other situations are probably not an issue.

6. Small Changes x Multiple Iterations = Poor Performance

– XSLT can have poor performance in a situation where multiple transforms, or iterations of the same transform, are changing small parts of a large source document. The overhead of multiple copying of large quantities of unchanged nodes may mean it would be preferable to choose a different technology. XQuery Update might be more suitable.

7. Profiling in XSLT (1)

– XSLT profiling is available in Saxon and is used by tools like Oxygen.

8. Profiling in XSLT (2)

– Subtractive Measurement can be used in profiling. If you are concerned about a particular part of your transform, you can measure the time cost by removing the operation and measuring again.
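The subtractive idea is not specific to XSLT, so here is a rough Python sketch of it. It assumes you can wrap “the transform with the suspect operation” and “the transform without it” as two callables. Both function names and the best-of-N timing approach are my own choices, not something from the workshop.

```python
import time


def time_call(fn, repeats=5, clock=time.perf_counter):
    """Return the best (smallest) elapsed time over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = clock()
        fn()
        best = min(best, clock() - start)
    return best


def subtractive_cost(full, reduced, repeats=5, clock=time.perf_counter):
    """Estimate the cost of the removed operation: time(full) minus time(reduced)."""
    return time_call(full, repeats, clock) - time_call(reduced, repeats, clock)
```

In practice `full` and `reduced` would each run your transform through whatever processor you use. If the difference is small, the operation you were worried about probably isn’t the problem.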

9. Benefits of typing

– Where possible you should use data typing in your schemas. It speeds up validation no end.

10.  minOccurs/maxOccurs

– If you use non-zero values in minOccurs/maxOccurs in your schemas, the larger the values then the slower the validation will be, because the parser will need to count the number of elements.
– So it’s much, much quicker to validate minOccurs=1 than it is to validate minOccurs=101.

Linked data – there’s more … (as Jimmy Cricket used to say)

Ingrid Koehler recently posted a nice blog on Linked Data. Most of it I was aware of and subscribed to but there was one point which had never struck me. As I read it I had a DOH! moment thinking “… that’s so obvious, why hadn’t I thought of it before!”.

It was her point 5:

Linked data does not have to be open data. Public services would benefit tremendously from using linked data formats. It means that we could stop spending resources on data aggregation and start spending it on analysis and action. Linked data can be used in secure settings to help partners share personal, sensitive or commercial information on performance and resources and help better target those in need or areas for improvement.

I just wonder if we can create tools to make it easier to convert data TO linked data formats, whether we would find more people publishing in those formats? You still need to be a bit of a geek to get data into Linked Data formats.
