#CODS15: Trends and attitudes in open data

[Summary: sharing slides from talk at Canadian Open Data Summit]

The lovely folks at Open North were kind enough to invite me to give some opening remarks at the Canadian Open Data Summit in Ottawa today. The subject I was set was ‘trends and attitudes in the global open data community’ – and so I tried to pick up on five themes I’ve been observing and reflecting on recently. The slides from my talk are below (or here), and I’ve jotted down a few fragmentary notes that go along with them (and represent some of what I said, and some of what I meant to say [check against delivery etc.]). There’s also a great take on some of the themes I explored, and that developed in the subsequent panel, in the Open Government Podcast recap here.

(These notes are numbered for each of the key frames in the slide deck. You can move horizontally through the deck with the right arrow, or through each section with the down arrow. Hit escape when viewing the deck to get an overview. Or just hit space bar to go through as I did when presenting…)

(1) I’m Tim. I’ve been following the open data field as both a practitioner and a social researcher over the last five years. Much of this work has been carried out as part of my PhD studies, and through my time as a fellow and affiliate at the Berkman Centre.

(2) First, let’s get out of the way the ‘trends’ that often get talked about somewhat breathlessly: the rapid growth of open data from a niche idea to part of the policy mainstream. I want to look instead at five more critical trends, emerging now, and to look at their future.

(3) First trend: the move from engagement with open data to solve problems, to a focus on infrastructure building – and the need to complete a cyclical move back again. Most people I know got interested in open data because of a practical issue, often a political issue, where they wanted data. The data wasn’t there, so they took action to make it available. This can cycle into ongoing work on building the infrastructure of data needed to solve a problem – but there is a risk that the original problems get lost, and energy goes into infrastructure alone. There is a growing discourse about reconnecting to action. The key is to recognise problem solving with data, and data infrastructure building, as two distinct forms of open data action: complementary, but also in creative tension.

(4) Second trend: there are many forms of open data initiative, and growing data divides. For more on this, see the Open Data Barometer 2015 report, and this comparison of policies across six countries. Canada was up one place in the rankings between the first and second editions of the ODB – but the Barometer mainly looks at a standard model of doing open data. Too often we’re exporting an idea of open data based on ‘Data Portal + License + Developers & Apps = Open Data Initiative’. We need to recognise that there are many different ways to grow an open data initiative, and to open up space for a new wave of innovation, rather than embedding the results of our first years’ experimentation as best practice.

(5) Third trend: the Open Data Barometer hints that impact is strongest where there are local initiatives. Urban initiatives? How do we ensure that we’re not designing initiatives that can only achieve impact where there is a critical mass of developers, community activists and supporting infrastructure?

(6) Fourth trend: there is a growing focus on data standards. We’ve moved beyond ‘Raw Data Now’ to see data publishers thinking about standards on everything from public budgets, to public transit, public contracts and public toilets. But when we recognise that our data is being sliced, diced and cooked, are we thinking about who it is being prepared for? Who is included, and who is excluded? (Remember, Raw Data is an Oxymoron.) Even some of the basics of how to do diverse open data are not well resolved right now. How do we do multilingual data, for example? Or how do we find measurement standards to assess open data in federal systems? Canada, as a well-resourced multilingual country, has a role to play in finding good solutions here.
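To make the multilingual question concrete, here is a minimal sketch of one possible pattern: carrying parallel values under BCP 47 language tags, with a helper that falls back gracefully. The record shape and field names are invented for illustration, not drawn from any existing standard.

```python
# A sketch of language-tagged fields, using BCP 47 language codes.
# The record structure here is hypothetical, not an existing standard.
dataset_record = {
    "title": {
        "en": "Public toilet locations",
        "fr": "Emplacements des toilettes publiques",
    },
    "publisher": "Example City",
}

def pick_language(field, preferred, default="en"):
    """Return the value in the preferred language, falling back gracefully."""
    for lang in (preferred, default):
        if lang in field:
            return field[lang]
    # Last resort: return any available value rather than failing.
    return next(iter(field.values()))

print(pick_language(dataset_record["title"], "fr"))
# -> Emplacements des toilettes publiques
```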

(7) Fifth trend: there are bigger agendas on the policy scene right now than open data. But open data is still a big idea. Open data has been overtaken in many settings by talk of big data, smart cities, data revolutions and the possibility of data-driven governance. In the recent African Data Consensus process, 15 different ‘data communities’ were identified, from land data and geo-data communities, to health data and conflict data communities. Open data was framed as just another ‘data community’. Should we be seeing it this way? Or as an ethic and approach to be brought into all these different thematic areas: a different way of doing data – not another data domain. We need to look to the ideas of commons, and the power to create and collaborate that treating our data as a common resource can unlock. We need to reclaim the politics of open data as an idea that challenges secrecy, and that promotes a foundation for transparency, collaboration and participation. Only with this can we critique these bigger trends with the open data idea – and struggle for a context in which we are not database objects in the systems of the state, but are collaborating, self-determining, sovereign citizens.

(8) Recap & take-aways:

  • Embed open data in wider change
  • Innovate and experiment with different open data practices
  • Build community to unlock the impact of open data
  • Include users in shaping open data standards
  • Combine problem solving and infrastructure building

Slow down with the standards talk: it’s interoperability & information quality we should focus on

[Summary: cross-posting a contribution to the discussions on the International Open Data Conference blog]

There is a lot of focus on standards in the run-up to the International Open Data Conference in Ottawa next week. Two of the Action Area workshops on Friday are framed in terms of standards – at the level of data publication best practices, and of collaboration between the projects working on thematic content standards at the global level.

It’s also a conversation of great relevance to local initiatives, with CTIC writing on the increasing tendency of national open data regulations to focus on specific datasets that should be published, and to prescribe the data standards to be used. This trend is mirrored in the UK Local Government Transparency Code, accompanied by schema guidance from the Local Government Association. And even where governments are not mandating standards, community efforts have emerged in the US and Australia to develop common schemas for the publication of local data – covering topics from budgets to public toilet locations.

But – is all this work on standards heading in the right direction? In his inimitable style, Friedrich Lindenberg has offered a powerful provocation, challenging those working on standards to consider whether the lofty goal of creating common ways of describing the world so that all our tools just seamlessly work together is really a coherent or sensible one to be aiming for.

As Friedrich notes, there are many different meanings of the word ‘standard’, and often multiple senses of the word are in play in our discussions and our actions. Data standards like the General Transit Feed Specification, the International Aid Transparency Initiative Schema, or the Open Contracting Data Standard are not just technical descriptions of how to publish data: they are also rhetorical and disciplinary interventions, setting out priorities about what should be published, and how it should be represented. The long history of (failed) attempts to find general logical languages to describe the world across different contexts should tell us that data standards will always encode all sorts of social and cultural assumptions – and that the complexity of our real-world relationships, and all that we want to know about the different overlapping institutional domains that affect our lives, will never be easily rendered into a single set of schemas.

This is not to say we should not pursue standardisation: standards are an important tool. But I want to suggest that we should embed our talk of standards within a wider discussion about interoperability, and information quality.

An interop approach

I had the chance to take a few minutes out of IODC preparations last week to catch up with Urs Gasser, co-author of Interop: The Promise and Perils of Highly Interconnected Systems, and one of the leaders of the ongoing interop research effort. As Urs explained, an interoperability lens provides another way of thinking about the problem standards are working to address.

Where a standards focus pushes us towards getting all data represented in a common format, and towards using technical specifications to pursue policy goals, an interoperability focus allows us to draw on a wider range of strategies: from allowing translation and brokering layers between different datasets, to tackling policy problems directly to secure the collection and disclosure of important information.

And even more importantly, an interop approach allows us to discuss what the right level of interoperability to aim for is in any given situation: recognising, for example, that as standards become embedded, and sunk into our information infrastructures, they can shift from being a platform for innovation to a source of inertia and a constraint on progress. Getting the interoperability level right in global standards also matters from a power perspective: too much interoperability can constrain the ability of countries and localities to adapt how they express data to meet their own needs.

For example, looked at through a standards lens, the existence of different data schemas for describing the location of public toilets in Sydney, Chennai and London is a problem. From the standards perspective we want everyone to converge on the same schema and to use the same file formats. For that we’re going to need a committee to manage a global standard, and an in-depth process of enrolling people in the standard. And the result will almost undoubtedly be just one more standard out there, rather than one standard to rule them all, as the obligatory XKCD cartoon contends.

But through an interoperability lens, the first question is: what level of interoperability do we really need? And what are the consequences of the level we are striving for? It invites us to think about the different users of data, and how interoperability affects them. For example, a common data schema used by all cities might allow a firm providing a loo-location app in Ottawa to use the same technical framework in Chennai – but is this really the ideal outcome? The consequence could be to crowd out local developers who could build something much more culturally contextualised. And there is generally nothing to stop the Ottawa firm from building a translation layer between the schema used in their app and the data disclosed in other cities – as long as the disclosures of data in each context include certain key elements, and are internally consistent.
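To make the translation-layer idea concrete, here is a minimal sketch. All of the field names and record shapes below are invented for the example; neither city actually publishes data in these forms.

```python
# Toy translation layer: map two (invented) city schemas onto the minimal
# set of fields a loo-location app needs. Each city keeps its own schema;
# interoperability is achieved at the edge, by the data user.

def from_ottawa(record):
    # Hypothetical Ottawa-style record:
    # {"name": ..., "lat": ..., "lng": ..., "accessible": True/False}
    return {
        "name": record["name"],
        "latitude": record["lat"],
        "longitude": record["lng"],
        "wheelchair_accessible": record["accessible"],
    }

def from_chennai(record):
    # Hypothetical Chennai-style record:
    # {"facility_name": ..., "location": [lat, lon], "access": "yes"/"no"}
    return {
        "name": record["facility_name"],
        "latitude": record["location"][0],
        "longitude": record["location"][1],
        "wheelchair_accessible": record["access"] == "yes",
    }

# As long as each city's data is internally consistent and includes these
# key elements, the app can consume both without a single global schema.
toilets = [
    from_ottawa({"name": "City Hall", "lat": 45.42, "lng": -75.69, "accessible": True}),
    from_chennai({"facility_name": "Marina Beach", "location": [13.05, 80.28], "access": "yes"}),
]
```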

Secondly, an interoperability lens encourages us to consider a whole range of strategies: from regulations that call for consistent disclosure of certain information without going as far as prescribing schemas, to programmes to develop common identification infrastructures, the development and co-funding of tools that bridge between data captured in different countries and contexts, and the fostering of collaborations between organisations to work together on aggregating heterogeneous data.

As conversations develop around how to enable collaboration between groups working on open aid data, public contracts, budgets, extractives and so on, it is important to keep the full range of tools on the table for how we might enable users to find connections between data, and how the interoperability of different data sources might be secured: from building tools and platforms, to working together on identifiers and the small building blocks of common infrastructure, to advocating for specific disclosure policies and, of course, discussing standards.

Information quality

When it comes down to it – for many initiatives, standards and interoperability are only a means to another end. The International Aid Transparency Initiative cares about giving aid-receiving governments a clear picture of the resources available to them. The Open Contracting Partnership want citizens to have the data they need to be more engaged in contracting, and for corruption in procurement to be identified and stopped. And the architects of public loo data standards don’t want you to get caught short.

Yet often our information quality goals can get lost as we focus on assessing and measuring the compliance of data with schema specs. Interoperability and quality are distinct concepts, although they are closely linked: having standardised, or at least interoperable, data makes it easier to build tools that go some of the way towards assessing information quality, for example.
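As a rough sketch of that distinction: a record can pass an interoperability check (schema compliance) while failing the quality tests that matter for a particular use case. The field names and thresholds below are purely illustrative.

```python
from datetime import date

# Illustrative only: a record can be schema-valid yet low quality for a use case.

def schema_valid(record):
    """Interoperability check: are the expected fields present with the right types?"""
    return isinstance(record.get("amount"), (int, float)) and "supplier" in record

def quality_for_monitoring(record, today=date(2015, 5, 20)):
    """Use-case check: is the record complete and fresh enough to follow the money?"""
    issues = []
    if not record.get("supplier"):
        issues.append("supplier missing: cannot link the contract to a company")
    if "published" not in record or (today - record["published"]).days > 180:
        issues.append("stale or undated: too old for active monitoring")
    return issues

record = {"amount": 50000.0, "supplier": "", "published": date(2014, 1, 1)}
print(schema_valid(record))            # True: interoperable on paper
print(quality_for_monitoring(record))  # but two quality failures for this use case
```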

[Figure: interop and quality]

But assessing information quality goes beyond this. Assessments need to take place from the perspective of real use-cases. Whilst standardisation often aims at abstraction, our work on promoting the quality, relevance and utility of data sharing – at both the local and global levels – has to be rooted in very grounded problems and projects. Some of the work Johanna Walker and Mark Frank have started on user-centred methods for open data assessment, and Global Integrity’s bottom-up Follow The Money work, start us down this path – but we’ve much more work to do to make sure our discussions of data quality are substantive as well as technical.

Thinking about information quality as distinct from interoperability can also help us to critically analyse the interoperability ecosystems that are being developed. We can look at whether an interoperability approach is delivering information quality for a suitably diverse range of stakeholders, or whether the costs of getting information to the quality required for use are falling disproportionately on one group rather than another, or are leading to certain use-cases for data being left unrealised.

Re-framing the debate

I’m not calling for us to abandon a focus on standards. Indeed, much of the work I’m committed to in the coming year is very much involved in rolling out data standards. But I do want to invite us to think about framing our work on standards within a broader debate on interoperability and information quality (and ideally to embed this conversation within the even broader context of thinking on Information Justice, and an awareness of critical information infrastructure studies, and work on humanistic approaches to data).

Exactly what shape that debate takes: I don’t know yet… but I’m keen to see where it could take us…