Data is open when “anyone is free to use, reuse, and redistribute it”, as opposed to closed data, which is restricted by licensing and often requires paying a usage fee. For hundreds of years open data has been a fundamental tenet of science.
In 2009 Gordon Brown met Tim Berners-Lee. The conversation reportedly began with Brown asking “How should the UK make the best use of the internet?” Berners-Lee replied “Just put all the government’s data on it.” Brown simply said “OK, let’s do it”. This conversation reputedly led directly to OS OpenData, the Open Data Institute and data.gov.uk.
Tim Berners-Lee invented the World Wide Web 33 years ago. The web, simply a set of linked documents, has been changing our world at an ever-increasing rate since. What will happen when we, and critically all our machines, can link disparate data in the same way the web lets us link documents? Open data is foundational to this idea: without it the ‘cool stuff’ can’t happen, because closed data licensing simply doesn’t cope with most of the new use cases.
Advantages and Reservations of Open Data
A whole range of benefits arise from open data: less corruption, higher economic growth, more innovation, a more engaged citizenry, the prevention of cartels and monopolies, and evidence-based health policy, to name a few. Perhaps the greatest benefit, though, is that it has the potential to let everyone lead a smoother, simpler, less drab and more rewarding life.
Ten years ago open data was most often thought of as a way for the public sector to share that which the citizenry had already paid for. This has changed. Businesses, like Geolytix, and individuals are increasingly publishing their own open data. The two principal reasons many businesses hold off publishing open data are:
- Personal privacy and data protection concerns. If data can be used to identify someone it is personal and therefore sensitive and cannot be made open.
- Data as a source of competitive advantage. Many companies believe all the data they hold is what drives their competitive advantage.
On the first point, it remains unclear why companies and the government are able to collect and sell personal data, yet neither they nor other organisations can carry out exactly the same collection and make the data open. On the second, data really is the source of competitive advantage in some cases, but not many. Booksellers used to view their inventory and price lists as confidential sources of competitive advantage; Amazon changed their view on this. John Lewis customers shop with them because they are a damn fine retailer, not because John Lewis know how big their stores are and how much money they take.
Data as a resource has very different properties to physical resources. It has a close to zero cost of redistribution, so ‘tragedy of the commons’ problems do not arise. In fact, the more people use and exploit it, the better it gets.
Geolytix publish Open Data
We have our own modest example. Geolytix maintain and publish an Open Supermarket Retail Points dataset. It includes over ten thousand stores, each with ‘roof-top’ co-ordinates, retailer, fascia and address. It has been downloaded directly over 1,000 times. By whom? We have no idea. What they’ve done with it? Not a clue. Who they’ve shared it with? Pass. But we do know it is being used in many mobile apps, is redistributed through many other platforms, is used to improve multiple paid-for products, and has found its way onto some of the world’s most trafficked websites. The public can now be confident of finding up-to-date, definitive, accurate and complete locations of supermarkets. As for Geolytix, we become better known, grow our consulting business, receive more requests to license our closed data, and also get the warm glow that comes from helping others.
Questions we get asked
The exam questions Geolytix get asked aren’t always about which data to license. They are questions like: “How many restaurants can this brand open?”, “Which properties should we show home buyers searching for ‘Islington’?”, “How much money will this new supermarket take?”, “Which stores should get Click and Collect modules?”, “If we close this branch what will pensioners do?”
To answer these questions we need data, lots of it, but also a lot more. The data alone only takes us so far. I view open data’s primary transformative effect as the removal of a barrier. In the bad old good old days, assembling the data you needed to tackle our exam questions was a six or seven figure undertaking. A 1991 Census data and boundary pack was about £200,000 (a year!). Now it takes a couple of downloads and off you go.
Open data is brilliant for us. It has shifted the point in the supply chain where the bulk of the value lies, so we primarily compete and charge for the creative thinking bit. It makes customers pick partners not on the amount of access they have to expensive restricted data, but on the depth and brilliance of their use of that data; exactly the area where we aim to win.
We don’t make all the data we create open. The stuff that helps big-time with some of our analyses, you’re going to have to pay to use that. But we are more than happy to make open some of the foundational bits that might otherwise stop or hinder analytical projects.
The Benefits and Downside of Open Data
Publishing open data helps us in a number of ways. First, and perhaps most importantly, it is the right thing to do. Geolytix benefit enormously from government-published open data, so it feels correct that we, in turn, give some of our data back. Second, it helps grow our corporate profile and reputation. We completed sixty projects during our first three years, and every single piece of business came from someone ringing us, not from us contacting them. Third, it differentiates us; there aren’t many businesses genuinely giving stuff away with no strings attached. Fourth, it is the ultimate try-before-you-buy, which is why we make our open datasets as accurate, well-documented, and user-friendly as we can.
There are potential downsides: we know competitors use our open data to improve their own data, we know some simply take our open data and sell it as closed data, and competitors can download it and find the odd error. But these downsides aren’t really negatives at all. They don’t touch the selling proposition around answering our exam questions. Neither do they touch any of the other positives listed above; in fact, they emphasise them.
A scale-up business publishing valuable open data isn’t nuts. We know this because we look at our active projects every morning, breathe, and get ready for another hectic, creative, productive day.
Open Data Institute and the Data Decade
This year the Open Data Institute (ODI) celebrate 10 years. It was founded in 2012 by Tim Berners-Lee and Nigel Shadbolt to “connect, equip and inspire people around the world to innovate with data.” On 8 November 2022 they will mark this milestone at their annual ODI Summit; tickets to join virtually are available.
Back in 2014 we were delighted to be the Business award winner at the ODI Open Data Awards. This is still one of our most cherished awards eight years on.
Blair Freebairn, CEO and Louise Cross, Data Product Owner at GEOLYTIX