Having been in New Zealand for the last two years, I’ve been a bit out of the loop with Gov 2.0 in Australia, but I have spent the last two months since moving back catching up. Attending today’s Gov 2.0 Lunch was one of the ways of finding out what’s been going on, not just here but also overseas.
The venue was rather classy, hosted in one of the Senate rooms at Parliament House courtesy of Senator Kate Lundy, but the event was lo-fi with no projector – just the presenters addressing the reasonably sized group without props or slide decks.
Stuart Coleman is the Commercial Director at the Open Data Institute in the UK. His message was that exposing data for free public consumption is not enough: it’s not enough to publish some data and put it out there, to tick that box and be done with it. It’s not about transparency for transparency’s sake. Sure, there’ll be some developers who might be able to code something up and make use of that data, but government needs to share the right data and it needs to ensure it is useful.
That’s a point that Keitha Booth from the New Zealand Government also made; in fact, government departments and agencies in NZ are required to identify “high value” data for opening up. Most importantly, they are not permitted to make the call as to what is “high value” themselves but rather must consult with the potential consumers of that data to identify the economic, social and other benefits of opening that data.
Stuart highlighted some of the cases where businesses have grown around open public data as part of the open data initiatives in the UK, like Mastodon C and the Prescribing Analytics portal which has saved (or is expected to save) the NHS around £200 million by highlighting the under-adoption of generic medications.
I enjoyed both presentations, and it was great to have international guests come and share their stories and experiences. It was also good to get an update from Pia Waugh, Director for Gov 2.0 in the Technology and Procurement Division of Finance, on Data.gov.au, which will soon be getting a tech refresh with an update to CKAN – you can read more in Pia’s recent blog post about the roadmap.
Transcript of Stuart Coleman’s presentation
Listen to the audio of Stuart’s presentation on Soundcloud *
Thanks Pia and thank you everyone for giving up your lunch time – it’s not quite lunch time here – but to listen to me for 20 minutes. I’d very much like questions throughout the 20 minutes; if I can’t answer them and I need a bit more time to think on my feet I may push it back and answer it at the end.
Speaking to Pia before this visit I think there were really a number of questions that kept coming up as to what she would like me to talk through. The Open Data Institute is a very new initiative from the UK and I think there was a lot of questions about what are we actually here to do, so hopefully I can answer that today.
I’ve tried to get up to date with some of the activity going on down here; there was a paper published in February around the open data strategy in Australia by your Information Commissioner. I’ve had a great chat with Keitha this morning, beforehand, and I think what I’d like to do is summarise quite succinctly what the Open Data Institute is (going) to do, how we were set up and the learnings we’ve had so far.
We were initially launched last year with government commitment, government funding in the UK. We’re really the brainchild of Sir Tim Berners-Lee and Professor Nigel Shadbolt. I’m sure most of you know who Tim Berners-Lee is; for those of you who don’t know Professor Nigel Shadbolt he is a luminary, really, in the data movement in the UK; he’s co-chaired with Tim down in Southampton University, he’s done a lot of great, early ground-breaking work with open data with successive governments.
I think it’s important to say that the initiative is not just a UK conservative government initiative, it’s spawned from successive government activity. The most notable piece that originally got the attention of government was forcing UK hospitals to publish all their data on MRSA about seven or eight years ago, which was particularly problematic in the UK. That got the rates of infection down massively and actually wiped off £18 million from the NHS’ budget (having) to deal with that, so it saved a lot of money as well as improved (patient) health care.
Audience: What did you say they published?
They published their performance rates on MRSA – it’s a particular challenge around maintaining cleanliness in hospitals, and infections that spread …
That was the first piece, the next piece was really around postcode address data which is still a challenge in the UK – we just had a recent pushback on data in that area – but just to give you a quick summary on the Open Data Institute as an organisation: we’re set up as a not-for-profit company. We have £10 million of government investment over five years, so that’s £2 million a year. We have a commitment to match that investment with private sector investment, and we’ve been going six months. We officially launched on the first of October. We’ve already secured just under a million pounds of private sector investment; some internationally but quite a lot from European corporates who are seeing open data as both an opportunity for them to make use of new information, but also as potentially a chance for them to start opening up some of their data.
Just a short summary on sort of the core activities that we’re focused on: the first is what we term “Unlocking data supply”, so we don’t think it’s enough for government to just publish data. We’re not of the view that if they publish it people will come and use it. There are examples of where people do and there’s some great leadership there often from the developer communities where good things get done, but we need more of that and we need a focus on selecting the right data, so we’ve seen particular success stories in certain sectors; transport is one, healthcare is another one. We’re starting to see more around land use information which from what I’m seeing in this part of the world is potentially a very compelling area particularly given the geographical size of Australia; I think it’s a fantastic use case for opportunities.
So unlocking data supply is a part of our mission. It doesn’t mean we’re there to do the work for government but where we see an opportunity to help government accelerate opening up their supply our goal is to step in and help them accelerate the opportunity and enable them, so we’re not wanting to become a huge organisation doing loads of professional services or consultant work; our objective is to enable others and help them develop the skills.
The second piece is really around unlocking the demand, so how do we connect the different data sets that are available; I think you’ve got just over 1,100 datasets here in Australia at the moment, how do you know which ones are the most useful? And (when) I see on your front page you’ve got some statistics around what is being used, but have you actually drilled down and identified what that use translates to? So, connecting individuals, companies, sectors, leadership (in) innovation to the use of that data is critical, and that’s another area where we’ve struggled.
Our engagement with the startup community in the UK has been great; there’s been a great burgeoning growth of startup businesses around an initiative generally known as “Tech City” centred out of London but also with satellite representation in places like Manchester and Bristol and other cities, where there have been a number of new startup businesses starting to emerge and use in large part or in some part open data to bolster their business models, and I can talk some more of examples in a second.
The final pillar – so after we’ve improved and helped a lot more supply and connected that to demand – is around communicating value. So, I’m repeatedly asked “Hang on, how do you actually prove that this open data stuff is making a difference? Is it just driving a transparency agenda for government?” Well our fundamental belief is no, it is not. A part of our mission is to incubate and demonstrate the value of open data by working with innovative young companies who do things with that data that is valued.
What I’ll quickly do is just talk about a particular example that we’ve worked on in our first six weeks; if you are online here at the moment and you can go to prescribinganalytics.com you will see an interactive website that shows you a map of the UK and in particular it hones down on regions of England and parts of Wales called Primary Care Trusts. These are geographical regions controlled by general practitioners in the UK, so doctors providing public services back to the community. One of the challenges to government for some time that no one’s been really able to prove in the UK has been the issue of lack of knowledge and bad practice around prescribing certain drugs to patients in the UK. One particular area is around the drug class called ‘statins’ which are traditionally prescribed for cardiovascular conditions of cholesterol and those types of things. There’s this cluster I think of three or four different brand name drugs you can have prescribed for such a condition, made by all the usual suspects – Glaxo, Pfizer – the cost of those brand name drugs is sometimes twenty or thirty times the price of the generic alternative and the drugs are identical in their delivery agents. So, we knew, we felt, we believed there was an opportunity to use some sets of open data to try and highlight opportunities for savings in this area and the case where the malpractice was happening. So working with one of our startups, a company called Mastodon C – a slightly quirky name, named themselves after an extinct animal – but they are experts on working with big data using open source technologies like Hadoop and using rigorous statistical and data science techniques to drill into that data, they took a bunch of datasets primarily what’s called the Prescriptions Dataset UK.
So, each month the government publishes every single patient prescription written and it’s an interesting set of data because the public at large often jump on (statements) like that and say “Well, hang on, isn’t that our personal data?” and the answer is well yes, but it’s an asset that you can derive value from; it’s published as anonymised data so we don’t see that I, Stuart Coleman took a particular type of drug. What we do see is what was prescribed, where it was prescribed and which Primary Care Trust it fell into. If you look at that map on the page that I’ve brought you to, you can actually drill in and see, by Primary Care Trust, the trends around prescribing brand name and generic alternatives. And rolling this data up and with engaging specific medical expertise we identified £200 million of savings for the NHS, and the National Health Service in the UK has a target to save a billion and this is just for one drug class remember; this data’s released every month and is updated.
So we have a very small company work with us to achieve this. They were pre-revenue – they had no revenue, these people – they’d left their day jobs at Google and a bank to set up this company and this piece of work not only helped highlight to a huge audience – they got publications in the Economist, the Financial Times and in various government press the opportunity around open data – it also showcased their skills. And as a result in the last three months they’ve secured over £300,000 of contracts and have grown their team. And they’re working with other open datasets; they’ve done some work in the energy space.
So what I really wanted to do was talk about one of the businesses we work with, because that’s part of our mission at the Open Data Institute, is to stimulate economic growth through the application and demonstration of the value in open data. And there’s a great example of one company; there’s actually five other companies in our incubation program, one of them has done some great stuff – a company called Placr – with transport data. They’ve now secured half a million pounds of venture capital. But that’s really in the ‘communicating value’ pillar of what we do.
If we look back into the demand and supply side we see a fascinating opportunity and challenge around enabling people to actually work with data and maximise the opportunity of open data. And if you look at most institutes around the world, that is where I’d say our business focus is, so our mission is to become self-sustaining and not depend on government funding as soon as we can. To that end we are engaging both private sector and public sector in training courses and activities, so addressing data literacy which is a theme we see as being a real challenge. So, how do we support senior civil servants to, for example, in the UK we’re (–) engage the procurement community in government to help work with them to embed open data into procurement principles so that when they write a contract with a supplier they can be guaranteed the data that’s going to result from those activities is something they retain as an asset (to) government and it doesn’t get owned by a supplier who can then charge an inordinate amount of money to change a particular asset.
So we are developing some specific training materials to that end. We have also been running some training courses around open data licensing and the (law) and that’s had a significant uptake in the private sector. So we’re very much listening to the community both in the developer side of the world; we’ve got a great leadership team, so very diverse team. Our CEO is an ex-entrepreneur whose last business was based on open data. He raised over £10 million in VC for that business. My background is partly in the dark side, so proprietary technology companies who’ve existed for a while, but also more recently with (the) venture capital community and open data.
And I guess the real practitioner in our leadership team is Jeni Tennison and she is a leadership figure in the open data community, has been behind legislation.gov.uk which is a significantly successful initiative to take the whole of the UK’s law and digitise it as open data. And that’s a big, big feat. The data and documentation dates back to 1267. Jeni’s architected that, and with a small team has achieved great things. Not only has she helped save a lot of money for government; she’s helped make legal information available to a much broader audience and we’re seeing some early signs of businesses using that data to create things like student-configurable texts on open data, and Jeni architected that over the last four years.
We’re never planning to be a large massive organisation; we’re around 15 people. Most of the people are actually delivering things. We have a membership model for the private sector and just to mention that, why are the private sector interested? Well, we’ve got a company like Telefonica – which I guess would be similar to Telstra over here – they are starting to use open data, to combine it with some of their own proprietary data; they’re starting to sell new services based on network events, to (have them) measure the footfall of people in urban areas based on one of (our) network activity and combine that with census open data and then sell that to industry. Basically that’s an opportunity.
We’re working with Virgin Media who are the second largest Internet Service Provider. They’re looking to publish open data on their users’ behaviour which is potentially quite controversial. We’re going to anonymise it, but they’re going to help create what they see as hyperlocal communities of Internet users in areas that are perhaps disjointed across the UK and help share activity on browsing the web (to) provide new services and improve customer services. So they’re experimenting, and we’re providing a (breeding) ground for that experimentation.
I don’t really have a lot more to add – I’d quite like to open up for some questions. Just to give you a bit more of our international flavour, again I’ve been asked by a number of people “What are the ODI’s international ambitions?” and first of all we do have them; we’re tempering our international appetite with the requirement to execute in the UK all the funding we’ve been given in the first year or so. We would like to set up what we’re seeing as ‘nodes’, so specific focussed communities around open data and supporting our mission and the summary of our mission is to capitalise on open data culture. I’m using the word ‘culture’ quite carefully.
We see a requirement to build on some of the fantastic work done by the likes of the Open Knowledge Foundation, MySociety, some of these leading, pioneering organisations who are trying to take open data to the masses. When we launched, we launched with a data-driven art exhibition, which really engaged a much broader community than would have been possible purely from (–). And I’ve had a great time wandering around (–) exhibitions you’ve got here; I’ve only seen a slight portion of, but I see you’ve got a piece of the Magna Carta here, which I think is quite inspiring, and ahead of ANZAC Day on the 25th, it’s been a real pleasure to be here.
I hope that slightly congealed message was of interest. I do have some content here I would happily make available so I will give that to Pia and she can stick that up online and you’re very welcome to use it and redistribute it and do whatever you want with it.
* Unfortunately Keitha spoke too softly and I couldn’t even transcribe the audio recording, so sorry folks and my apologies to Keitha for not getting her talk online – I tried.