In a movement like open data, animated by the passion and skills of people who work tirelessly towards worthy ends, we are grateful that some of them — somehow — find the time to share what they think, and not just what they build.
Leigh Dodds’s writing is candid, motivating, constructive. He speaks from years of hands-on experience, but never takes an eye off the horizon. In this way, Leigh creates a circuit between the reality and promise of open data.
This interview contains some of Leigh’s thoughts on data literacy, data curation, local use of open data, working with code versus working with data, and more.
What’s the most important thing you’ve learned about how to work with “data people” vs. people who are less data literate?
The more time you spend working with data, the easier it is to understand any new dataset: to get a sense for how it’s structured, what it might tell you, or some of the interesting ways in which to explore it. Those are quite abstract skills. So the thing I’ve learned when working with people who are less data literate is that it’s important to quickly get to a shared understanding of the data and its potential. Visualisations and other ways to summarise and explore a dataset are a fantastic way to develop that understanding and to help demonstrate to people the ways in which the data could be applied to specific problems.
What is your favorite open dataset?
Can I have two?! OpenStreetMap and MusicBrainz. They’re my favourites for essentially the same reasons: they both exist because of a community who were intent on curating data that they’re passionate about. These are two of the earliest and most successful communities contributing towards the open data commons. As such they’ve had to mobilise around not just collecting and managing data, but also building their own tools and processes to get the job done.
The next phase of development around open and linked data ought to be on making it easier for any community to do the same thing.
Maps and music also happen to be two things that I love so I also get a kick out of knowing that so many services I use are powered by open data :)
What is the most exciting thing happening today with open data?
I think the most exciting thing is seeing such a wide range of different communities working to understand how open data can be the basis around which they can collaborate and innovate to solve problems. I’m particularly excited by work that is going on in local communities, e.g. in cities and local areas.
You’re a consultant who works on data projects. What are the biggest differences between working with data and working with code?
I think the main difference is in the maturity of the tooling. I started my career as a software developer in 1997 and there’s been a massive amount of change in development practices over that period. With GitHub, and the variety of other services that sit around it, we have a fantastically productive environment and set of tools to work with, particularly when you’re working in the open. We’re a long way from that with publishing, linking and using data.
Who are the unsung heroes in either the open data or Linked Data movements — people or groups that deserve more credit for their contributions?
That’s a difficult one. I meet, and work with, a lot of people who are passionate about not just championing open data but also getting the hard work done to get data published and used. I’ve met so many people working in national and local government, as well as the private sector, for whom open data is more than just a day job task. It’s always rewarding to get the opportunity to work with people who are so passionate about what they do.
If I were pressed to name a couple of people, then I’d mention Libby Miller and Dan Brickley. Libby and Dan created the Friend of a Friend project, which was the first introduction to the semantic web for many people, myself included. I’ve always admired the effort they both put into building some of those early communities and their pragmatic approach to getting things done. Libby is now doing great work at the BBC and Dan is helping support the community around Schema.org.
You’ve been blogging since 1999. What is the most important piece you’ve written?
I’m not sure I’m the best judge of that! It’s not a blog post, but I’ve had some really nice feedback on the Linked Data Patterns book. I find patterns a useful way to codify things that I’ve learned. At the time I was doing a variety of Linked Data projects, including some training courses. The same questions kept coming up, so I wanted to try to produce a useful catalogue for people working with Linked Data, one that would help capture some of the design patterns around working with graph-based data. I’m pretty pleased with the outcome, although it’s a few years old now so definitely due for an update.
In my blog posts, I try to explore simple questions that might give people useful insight into how best to publish open and linked data. For example, I’ve been working on a series of basic questions about data which are motivated by discussions that come up in projects that I’ve been working on.
The web is about sharing what we know, so I like to do my bit :)
Have you seen private sector attitudes to open data change over the course of your work in the space?
Yes, definitely. There’s been an ever increasing level of interest and engagement from the private sector. Initially that focus was primarily around how they might use open data, particularly data from government. But the conversation is now shifting towards how the private sector can be a contributor to, and not just a beneficiary of, that ecosystem.
Has anything really surprised you about the way the open data movement has proceeded in the last few years?
It surprises me that we’re still so often debating some of the basic concepts, for example the importance of open licences. I suspect that’s a factor of the spread of open data into new communities and the need to educate people about what it takes to really build an open data commons.
I’ve also been a bit dismayed that, despite all the successes we’ve had here in the UK, so much of our national data infrastructure is still behind closed or complex licensing arrangements, addresses being the prime example. I often envy those of you working in the US with so much geographic data freely available!
On a more positive note, I’m always surprised and excited at the speed with which an organisation or a community can not only buy into the idea of open data, but really go all out in making it a reality. The recent work of the Department for Environment, Food and Rural Affairs to launch its Open Defra programme and then release 8,000 datasets in a year is amazing.