Weeknotes – 17th June

[Cross-posted from Connected by Data blog]

It’s been a week of planning & strategising, sandwiched between two conference- and panel-heavy weeks (last week and next). On that note, do join me for the State of Open Data panel on AI next Wednesday (1pm BST), and at Open Future’s first salon looking at Business to Government Data Sharing on Thursday. Plus, I’m hoping to make it along to some of the online components of the Data Power conference.

Iterating on the case database

It looks like we’re getting into a good pattern of Monday and Wednesday team meetings, which offers a mix of focus on what we need to deliver (Monday meetings with a work planning spreadsheet) and a space to reflect on what we’re learning through the week (Wednesday meetings, where I experimented this week with bringing a sketch of the case database development for team feedback).

I’ve been getting a bit stuck with working out how to move forward the work I’ve been doing to build a dataset of cases of participatory data governance, particularly working out how to align this with our wider advocacy and practice work. So, picking up on the suggestion that it is sometimes easier to brainstorm in slides than in a prose document, I pulled together a short deck outlining where I’ve got to, and providing some rough mock ups of possible ways to expose the case study research on the Connected by Data website.

Mock up image containing the text: Hundreds of organisations have already been engaging communities in data governance > Explore case studies. From public dialogues to citizens’ panels, tried and tested models exist that can put collective data governance into practice > Explore methods. You don’t need to go it alone > Connect with other practitioners. For participation to build trust in data governance, it needs the right fit > Find the approach that will work for you.

Caption: Rough mock up of Connected by Data website with four ‘calls to action’ that build on the case database work.

Feedback from Jeni and Jonathan pointed to a number of useful areas to explore more, including thinking about how far we editorialise cases to highlight our opinions on what best practice is, how we might work with partners to provide a long-term home to any case and method library resource we create, and how, when allowing users to browse by methods, we clearly communicate that effective participatory governance often requires a mix of methods.

In the deck I shared a few experiments that try to get at this latter point – visually presenting the ‘structure’ of the different cases I’ve surveyed to highlight that they involve multiple related components. I had initially thought that it might be possible to generate a ‘graph’ of relationships between components, but experimenting with mermaid.js graphs (and its nifty text-to-graph syntax) quickly revealed that it was going to be tricky to generate elegant presentations this way. Instead, I turned to a more linear approach to showing the structure of an example case, using icons from the Noun Project to start to pull out relevant facts about each component of a participatory data governance case, such as whether engagement activities are one-off, repeated, or ongoing, and whether they involved a single group over time, or multiple groups.

Image showing network graph, and linear graph, of case components: Rapid review; Dialogues (weighted sample); participant-led research, specifically impact group sessions; and analysis and report.
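For a flavour of the mermaid.js text-to-graph syntax mentioned above, a sketch of this example case’s components might look something like the following (component names are taken from the image description above; the actual experiments were rather more involved than this):

```mermaid
graph LR
  A[Rapid review] --> B[Dialogues, weighted sample]
  B --> C[Participant-led research]
  C --> D[Impact group sessions]
  D --> E[Analysis and report]
```

The appeal of this approach is that the graph lives as plain text alongside the case data; the difficulty, as noted above, was getting elegant layouts out of it once cases had more than a handful of components.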

I’m going to do some work in the coming weeks to explore engaging with a designer on a next iteration of this, helping to firm up some of the key concepts we want to communicate about getting the practice of participatory governance right.

Sector selection

As Jeni has explored in her weeknotes, we spent some time this week looking at selecting a small number of sectors in which to focus our work over the next year, settling on a shortlist of debt, education and housing. I’ve started writing up a scoping document for our sectoral focus on debt (incorporating consumer finance and gambling) to sketch out some of the key data governance issues, key stakeholders, and potential policy influence opportunities related to data governance. At this stage, the focus is on rapid research to validate whether or not this should be a focus sector for us, and to develop our shared understanding of the scope of the sector.

Campaign strategy

I also spent a bit of time this week talking with Jonathan about our next steps of campaign planning, and how to facilitate our next stage of work on the Data Rights Bill. More on that in the coming weeks.

Other notes

Workshop on Governing Knowledge Commons

On Monday I dropped into an online session of the Workshop on Governing Knowledge Commons set up for discussions of ‘half baked research ideas’ linked to smart cities and knowledge commons. There were a couple of really useful insights from the discussions, including tips from Brett Frischmann on making sense of complex phenomena (like adoption of smart city technology, or, indeed, collective governance of data) by analysing different action arenas, from macro (i.e. how is the city as a whole adopting a collective approach to data governance?), to meso (how is work in the housing sector in the city adopting collective data governance?), to micro (how is a particular project making use of a collective approach to data governance?).

Katherine Strandburg pointed to the particular features of the Governing Knowledge Commons (GKC) framework, as opposed to Ostrom’s commons governance work, in dealing with the fact that “knowledge commons are especially likely to have impact (positive or negative) beyond the community obviously involved in creating the knowledge”, such as in cases of patients involved in rare disease research.

In response to some of my musing on how we can use our Connected by Data case research to understand the kinds of governance appropriate to different situations, Brett offered the concept of externalities as one tool to use. Depending on the data and context in play, there may be different positive or negative externalities from data collection and use to worry about, and different kinds of governance institutions may be more or less effective at managing these.

Indigenous Data Sovereignty

Thanks to Jeff Doctor for sparing the time to chat through some of the ways Indigenous tech firm Animikii are thinking about data governance, and about some of the data (and wider) issues facing Indigenous communities. We touched on the challenge of identifying the legitimate collectives that have a role in governing data, particularly in cases where the claim of states to jurisdiction over territory and peoples remains contested. We also discussed the need to recognise the ongoing struggle that many Indigenous people face to find security, and to avoid being criminalised or marginalised through data-driven forms of surveillance and control. This brings into relief some of the challenges of designing participatory data governance approaches that engage those most affected by data use, whilst respecting that the point at which individuals and communities most experience data-based harms may be the point at which they have least capacity to engage in wider governance debates.

It was also insightful, amidst the talk that sometimes comes up around the Datasphere initiative of navigating data governance in a post-Westphalian order, to be reminded of the many Indigenous nations’ claims on land, which have long challenged the settled international boundaries taken for granted in so much work. Jeff pointed me in particular to the Land Rights statement of the Council of Chiefs of the Haudenosaunee, making a connection between data rights and land rights.

RightsCon

Building on last week’s weeknotes, I added a few bits into our write up of RightsCon which you can find here.

Visualising processes

On Wednesday I had a catch up with Mel Flanagan of Nook Studios, whose work seeks to make complex processes much more accessible through careful information design. Mel shared updates on the work they have been doing to join the dots between different open government initiatives and data silos, but we also talked briefly about ways the process-visualisations developed for this could be applied in data governance dialogue processes.

Legitimate interests research

Lastly, I ended the week by firing off a few ‘test requests’ for Legitimate Interest balancing tests from a selection of companies whose privacy policies invite users to request these.

Using the Princeton-Leuven Longitudinal Privacy Policy Dataset I’ve searched for “balancing test” and then identified a number of large websites that have text in their current privacy policies to the effect that: they process certain data on the basis of legitimate interests; they have carried out balancing tests; these balancing tests can be requested by emailing them.
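As a rough illustration of that search step (the site names, policy snippets and data layout below are illustrative stand-ins, not the actual structure of the Princeton-Leuven dataset), the filtering logic amounts to a case-insensitive substring scan across policy texts:

```python
# Sketch: scan a collection of privacy-policy texts for a phrase such as
# "balancing test", and collect the sites whose current policy mentions it.
# The `policies` mapping here is a hypothetical stand-in for texts loaded
# from the Princeton-Leuven Longitudinal Privacy Policy Dataset.

def sites_mentioning(phrase, policies):
    """Return sorted site names whose policy text contains `phrase` (case-insensitive)."""
    needle = phrase.lower()
    return sorted(site for site, text in policies.items() if needle in text.lower())

policies = {
    "example-shop.com": (
        "We process certain data on the basis of legitimate interests. "
        "We have carried out a balancing test, which you can request by emailing us."
    ),
    "example-news.com": "We rely on consent for all processing of personal data.",
}

print(sites_mentioning("balancing test", policies))
# → ['example-shop.com']
```

In practice the identified sites would then be filtered further by hand, to confirm the policy really does invite users to request the balancing test by email.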

Conscious of the controversy around the Princeton-Radboud Study on Privacy Law Implementation, which sent simulated messages requesting GDPR-related implementation information to a large number of websites, triggering significant work by in-house legal teams, I’ve taken care to clearly identify in the outgoing messages that this is part of Connected by Data research work, and is not strictly a customer request. I will be interested to see what replies, if any, we get.

This will help shape any future research work into how balancing tests are currently used, which is particularly relevant given the upcoming details of the Data Reform Bill.

(Mid-)weeknotes – 1st June 2022

[Cross-posted from Connected by Data blog]

These weeknotes are falling a little late, but as the Jubilee double bank holiday in the UK means it’s now a really short week, I’ll cover two weeks for the price of one, while trying to focus in on just a few themes from the last 12 days of CONNECTED BY DATA work.

Narrative, Policy & Practice

A big focus of last week was our first team day. Besides being fantastic to properly meet my new colleague Jonathan, and have the day working face-to-face with Jeni and Jonathan, we were able to take a deep-dive into some of our Theory of Change, and to think about what it means to frame CONNECTED BY DATA as a campaign.

One of the key insights for me from the day was to more clearly articulate our position as a ‘bridge’. It’s not that there is a shortage of great thinking and experimentation out there about taking more collective approaches to data governance, nor that there is an absence of policy appetite for addressing data governance challenges. The gap to be addressed is arguably around testing, translating and communicating how data can be done differently. As Jeni has explored in her weeknotes, this involves being able to better articulate both the manifest harms from current data practices, and less tangible impacts that come from the data traces of our lives being treated as a natural resource for commercial exploitation.

I have the feeling that focussing on this bridging role will be really important to scope our future research work, and identify the kinds of activities that fit well within CONNECTED BY DATA, and those that we might support and work on with partners, but not directly lead on.

Framing governance & decision making

In a couple of conversations in the last two weeks, including a research design session with the team at Research ICT Africa, I’ve been feeling the need for a clearer conceptual map of where data governance decisions are made. In short, I’m looking to have a clearer articulation of the activities that data governance addresses (data collection, design, analysis, use, sharing, etc.), and the particular tools of governance, such as setting principles and policies, operational decision making, oversight, scrutiny and evaluation.

In thinking about what it means to make decisions more participatory, I’ve spent a bit of time looking back over my past work on youth participation: in particular a 2008 Open University Study Guide that accompanied a chapter in Leading Work with Young People. One of the study activities we included there was to ask youth workers to document every decision made as part of a recent event or session, and then to work out which were the decisions that mattered. The point was to emphasise that empowerment does not come from simply sharing every decision: but involves working out which decisions need to be shared, and what kind of participatory process there needs to be around them.

I was also struck reading Katya Abazajian’s critique of the use of Open 311 data (which, incidentally, hints at many broader issues of collective data governance) by the discussion of how the way questions are framed significantly shapes the outcomes.

We also touched on this point a little at our team day when Aidan Peppin from the Ada Lovelace Institute joined and shared insights from the public dialogues and citizens’ juries that he has been involved in. While some dialogues have led to participants adopting quite collective language for thinking about their data, others have taken a more individualistic tone. As Jonathan has explored, some of this seems to be to do with how people feel about the institutions behind the data. It also appears related to the framing of the subject matter being considered (health data vs. location data for example). However, I’m curious as to whether there are particular ways to sensitively offer collective language into public dialogues on data.

I had a go at thinking about this in providing some asynchronous feedback into the stakeholder group for a Data Stewardship Dialogue being run by the Open Data Institute for the NHS AI Lab, where I tried to draw out the distinction between ‘collective decision making’, in terms of decisions made by a group (but where participants may still be making their decisions based on individualism and self-interest, and the outcome might be based on simple majority voting for example), and ‘decision making for collective benefit’, where the process encourages greater thinking about our interdependence.

Building on all of this, in the next few weeks I’ll be seeking out, or sketching out, some sort of small methodological tools that might help with better mapping out and describing the detail of data governance decision making, to sharpen up how we both research existing practice, and how we frame our vision of what future policy and practice should be.

Other things

  • I was left scrambling a bit on Tuesday when my main work computer, just about to be used to webcast a community meeting hosted at the lovely Stroud Brewery, had a run-in with a pint of ale. It’s at the repair shop, hopefully drying out – but thank goodness for backups (apart, frustratingly, from five hours’ worth of data governance literature review write-up).
  • I’ve submitted our proposal for a session on Collective Data Governance at the Internet Governance Forum (thanks to everyone who contributed!), and have been in conversations about a few other convening opportunities around research and policy, including chats with Christian Perone from ITS Rio, and Preeti Raghunath from Monash University.
  • It was great to connect with other Datasphere Initiative fellows for our monthly meeting on Friday – where we were also hearing from Martin Pompéry of SINE Foundation on some of their work deploying both technical and organisational approaches to govern data sharing for carbon emission reporting across supply chains.
  • I’ve been listening to this interview between Divya Siddarth and Douglas Rushkoff on Team Human, which offers some great insights into how tech communities are drawing on concepts of co-ops, commons and the pluriverse, including weaving them into the Declaration on the Interdependence of Cyberspace.
  • I’m looking forward to being a delegate at RightsCon next week, and have been starting to put together a list of sessions to tune into over the week, as well as planning to keep my diary a bit more open for ad-hoc remote-conferencing connections.

Weeknotes – 20th May 2022

[Cross-posted from Connected by Data blog]

It’s been a week of both thinking big about how we might, in the coming months and years, shift public and policy narratives about collective data governance, and focussing in on the details of what it might mean in practice to make data governance more participatory. In the iterative process of building out our case study database, and reviewing reports from a number of public dialogues, citizens’ juries and participatory processes linked to data governance, I’ve been reflecting on three themes, outlined under inevitably alliterative headings below.

Defaults and decisions

Skimming the Ada Lovelace review on UK Public Attitudes to regulating data and data-driven technologies reveals a fairly clear picture, and one backed up by many of the data dialogue reports I’ve read this week, that the public want to see more and better regulation of data, and expect innovative uses of data to be aligned with the public good.

At the same time, the Ada Lovelace review finds that “more research is needed on what the public expects ‘better’ regulation to look like” and “determining what constitutes ‘public benefit’ from data requires ongoing engagement with the public”.

Having noticed that a lot of the cases I’ve gathered so far of public dialogue around data governance tend to inform, or design, fairly general principles or recommendations on how data should be handled, I’m finding it useful to think about participatory data governance on two levels:

1) How should public input shape the defaults that are in place for any uses of data?

There may be different defaults for different sectors, or potentially for different user groups (although Afsaneh Rigot’s recent report on Design from the Margins would suggest there are strong advantages in setting the overall default based on the needs of those most affected by a technology).

Not every organisation with data to govern will necessarily need to run its own engagement process to identify the right defaults: in many cases, desk research might identify clear public-backed principles to work with.

The legitimacy of any defaults may be affected by the extent to which they are derived from considering the particular impacts that a category or type of data may have, and the extent to which populations and communities affected by those impacts were part of developing the defaults.

2) What are the appropriate mechanisms for public engagement in the specific decisions that put those defaults into practice?

Where default setting might be periodic or one-off, there are many aspects of data governance which require day-to-day engagement. Where broad public engagement might be important for setting defaults, making decisions might require more focussed approaches, potentially with participants who have more background, training or an ongoing role. In the coming weeks I’m particularly interested in trying to explore different models being applied to data governance here: whether focussed on shaping decisions, sharing decisions, or providing scrutiny of decisions made.

The purpose of participation

I’ve been thinking a lot this week about the distinction between different institutional designs for data governance (including novel proposals for trusts, commons and co-ops), and questions of how decisions actually get made whatever the institutional structure. I was particularly struck by this critical piece from Rachel Adams at Research ICT Africa on the problems of reaching for a data trusts model in an African context.

I’ve found it helpful (this week at least!) to break down my thinking as follows:

  • Fundamentally, participation aims to align the outcomes of any process with the interests of those affected by it.

    (This is compatible with recognising that some interests, and the way people or communities understand or articulate them, are not fixed, and may be refined or revised through participatory process).

  • In the context of democratically governed public authorities, competitive markets, and/or a balance of power between actors, participatory processes can help to align the interests of data powers and communities; but
  • In conditions of vastly unequal power, other institutional mechanisms are required to create conditions in which the interests of authorities or firms and communities will end up aligned.

    Mechanisms, for example, like trusts, commons or co-ops that seek to change where decisions are made, and what the backstops are to protect against individual, private or external interests being put above community or collective interests.

Thinking about the purpose of participation in terms of aligning actions or outcomes from data governance with the interests of the populations affected also brings into relief the points that (a) different communities may have interests that are not aligned, or are even entirely incompatible; and (b) a lot hinges on how the community whose interests are to be explored is defined.

Components, connections and configurations

I’ve been starting to reflect on how to present the relationship between different parts of the participatory data governance cases I’ve been gathering. The case study schema has each case with multiple components that feed into each other, so that, for example, a citizens’ jury case might be broken down into the initial desk review and design, feeding into a set of roundtables that design materials, which are then used by main jury events, and which feed into an analysis process that produces a report. I’ve been thinking about whether it is useful to present this schematically (essentially a graph of the component relationships), and how this might also start to show where the participation process interfaces with organisational decisions. I might manage to get a quick prototype sorted soon.
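A quick prototype along these lines could start from nothing more than an adjacency mapping of which components feed into which. The component names below follow the citizens’ jury example, but both they and the structure are illustrative rather than taken from the actual schema:

```python
# Sketch: a case as a graph of components, where each component may feed
# into later ones. Component names are illustrative.

# component -> list of components it feeds into
feeds_into = {
    "desk review and design": ["design roundtables"],
    "design roundtables": ["citizens' jury events"],
    "citizens' jury events": ["analysis and report"],
    "analysis and report": [],
}

def downstream(component, graph):
    """Return every component ultimately fed by `component`."""
    seen, stack = set(), list(graph[component])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph[current])
    return seen

print(sorted(downstream("desk review and design", feeds_into)))
# → ['analysis and report', "citizens' jury events", 'design roundtables']
```

A structure like this could also carry an extra flag per component marking where it interfaces with an organisational decision, which is the part I’m most keen to surface visually.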

Other stuff

I’ve also been:

Weeknotes – May 13th 2022

[Cross-posted from Connected by Data blog]

It’s been a busy week, not least because on Wednesday the project I was working on just before joining the Connected by data team, the Global Data Barometer, had its launch event. Alongside sharing, celebrating and reflecting on the Barometer, I’ve been digging deep into the development of a schema for our work to gather examples of collective data governance, and I’ve been thinking about potential events and convenings Connected by data might be part of in the coming year. Plus writing up some thoughts on what the implementation of social audits for India’s rural employment scheme in Andhra Pradesh can teach us about data governance, an initial bit of work on fundraising, and sending the first of many emails out to set up conversations with researchers I’m hoping to get some input from.

Cases, components and templates

I’ve been going through daily iterations to develop the structure for our case study database, taking a couple of different approaches to explore the right prompts, categories and structures that might create a useful library of collective data governance examples. I’ve been exploring:

  • adding the classification categories, and library of methods, from Participedia as lookup tables, and using them as a starting point, but adding/expanding categories when needed. This has been particularly useful when it comes to methods: when I’m considering coding a case against a given method I can review the detailed Participedia description to check whether the code is appropriate. I’ve also set up one-click searching so I can see if the way I initially think a category should be used matches how it has been used by Participedia contributors.

Case database draft schema

  • coding up one case each day, and, if needed, modifying the case database schema to capture it better, before going back to re-code existing cases with the modifications. This has led to a ‘case’, ‘component’ and ‘method’ separation, so that any case of collective data governance might involve multiple components (e.g. design workshops; citizens’ jury & opinion polling), and these are each treated as particular cases of applying one or more participatory methods.
  • drafting user stories (‘As an X I need Y so that Z’) to get a clearer sense of who the case database is for, and why. Thinking about the categories and data that users might actually want provides a useful counterbalance to the temptation to keep adding fields and more nuance when starting from reading diverse individual cases.
  • writing out a templated summary paragraph for a case, and then working out the different variables needed to populate it. I’ve found it particularly useful to then frame the prompts in the case database around these sentence components, making it easier to think about how each category will be used when the case database is made available to others.

Screenshot of templated summary paragraph
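To illustrate the templated-summary idea: each variable in the sentence becomes a prompt in the case database, so filling in the fields produces a readable paragraph. The template wording, field names and example case below are hypothetical, not the actual schema:

```python
# Sketch: a templated summary paragraph populated from structured case
# fields. Every slot in the template corresponds to a database prompt.

TEMPLATE = (
    "In {year}, {organiser} ran {process} engaging {participants} "
    "to {purpose}, resulting in {outcome}."
)

# Hypothetical example case
case = {
    "year": 2021,
    "organiser": "a city health authority",
    "process": "a citizens' jury",
    "participants": "30 randomly selected residents",
    "purpose": "shape defaults for sharing health records",
    "outcome": "a set of public recommendations",
}

print(TEMPLATE.format(**case))
```

Working backwards from the sentence in this way keeps the schema honest: any field that never appears in a summary anyone would read is a candidate for dropping.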

Next week I’ll be trying to get a few more cases documented, and then to start exploring strategies to make sure we cover a wide range of kinds of examples. Right now, the examples I’ve coded up, and those in the pipeline, have a strong leaning towards participatory processes that generate quite general recommendations, rather than processes that directly shape or make specific data governance decisions.

Events and outreach

I started the week by registering for RightsCon, which is taking place online from 6th to 10th June. It’s a long time since I’ve been able to focus on attending a conference, rather than juggling one or two virtual sessions between other work, so I’m looking forward to that.

On our discord I raised the possibility of proposing a workshop to the 2022 Internet Governance Forum on Collective Data Governance, and I’ll be sketching that out more next week and looking to see if we have potential collaborators.

We’ve also been starting to think about other potential events or outreach activities we might want to plan for, and I’ve done some initial work on research fundraising strategy – although I’ve realised I have quite a bit of work to do in order to identify how best to track research funding opportunities that are well-aligned with what we want to do.

Participation, pluralism and the public good

I put out a thread of reflections on the launch of the Global Data Barometer, but want to pick up in particular on one in these notes. The Barometer study is framed around “data for the public good”. One of the big conceptual challenges for the design of the project, which took place in 2019/2020, was balancing demands for cross-country comparison with an openness to diversity of data governance, provision and use practices.

In the introduction to the report we wrote:

“Fundamentally, our approach to the public good recognizes that the construction of public good is an ongoing, unfinished and contested process.”

And

“There are many publics, many different visions of how society should be organized, and there are many views on the goals we should individually and collectively work towards.”

But, particularly after starting to read a copy of Pluriverse: A Post-Development Dictionary, which arrived last week, I’m not sure the more pluralist ambitions of the project were fully realised (understandably so). With hindsight, and the framings of Connected by data, I suspect some of that might have been addressed by giving greater prominence to questions of participation in metrics on data governance.

However, the Barometer method also required each metric to make reference to globally agreed norms or principles that would support country assessment on that point. This raises the interesting question of which global norms can already ground a collective and participatory model of data governance, and where there are significant policy gaps that might need addressing to put communities at the heart of data governance.

Other reading this week

  • Bussu, S. et al. (2022) ‘Embedding participatory governance’, Critical Policy Studies – a compelling case to talk about embedding, rather than (or in addition to) institutionalising participatory governance, considering temporal (sustained over time), spatial (including presence of participation in different decision making spaces) and practice (habitual recourse to participatory process) dimensions of embeddedness.
  • Van de Velde, L. (2022) Gender and Beneficial Ownership Transparency – paper from Open Ownership that explores some of the tensions in designing datasets, particularly when it comes to the potential for data collected for one task (beneficial ownership transparency), to be used for other public goods (e.g. promoting greater gender equity in enterprise). I’m curious how a more collective and participatory data governance lens might help address some of the issues the paper explores. But – ran out of time to explore that in depth.

New role & weeknotes: we are connected by data

[Summary: new role focussing on participatory data governance, and starting to write weeknotes]

Last week I started a new role as Research Director for Connected by data, a new non-profit established by Jeni Tennison to focus on shifting narratives and practice around data governance. It’s a dream job for me, not least for the opportunity to work with Jeni, but also because it brings together two strands that have been woven throughout my work, but that I’ve rarely been able to bring together so clearly: governance of technology and participatory practice.

You can find the Connected by data strategic vision and roadmap here describing our mission to “put community at the centre of data narratives, practices and policies”, and our goals to work on challenging individual frameworks of data ownership, whilst showing how collective models offer a clearer way forward. We’ll also be developing practical guidance that helps organisations to adopt collective and participatory decision making practice, and a key focus for the first few weeks of my work is on building a library of potential case studies to learn from in identifying what works in the design of more participatory data governance.

Jeni’s organisational designs for Connected by data include a strong commitment to working in the open, and one of the practices we’re going to be exploring is having all team members produce public ‘weeknotes’ summarising activities and, most importantly, learning from the week. You can find the full set of weeknotes over here, but in the interests of trying to capture my learning here too (and inviting any feedback from anyone still following this blog), I’ll try and remember to cross-post.

Last week’s weeknotes (6th May)

Hello! It’s the end of my first week as Research Director (and with the May day holiday in the UK, it’s been a short week too). I’ve been getting stuck into the research strand of the roadmap, as well as checking off some of the more logistical tasks like getting different calendars to talk to each other (calmcalendar to the rescue), posting my bio on the website here, and setting up new systems. On that note, thanks to Jeni for the tip on logseq which seems to be working really nicely for me so far as both a knowledge-base, and a journal for keeping track of what’s happened each week to make writing up weeknotes easier.

The week has been bookended by scoping out how we’ll develop case studies of where organisations have adopted participatory approaches in data governance. I’ve started an AirTable dataset of potential case leads, and have been looking at if/how we could align some of our data collection with the data model used by Participedia (an open wiki of participation cases and methods). Over the next few weeks I’m anticipating an iterative process of working out the questions we need to ask about each case, and the kinds of classifications of cases we want to apply.

The middle of the week was focussed on responding to a new publication from the Global Partnership on Sustainable Development Data’s Data Values Project: a white paper on Reimagining Data and Power. The paper adopts a focus on collective engagement with data, and on participatory approaches to data design, collection, governance and use, very much aligned with the Connected by data agenda. Not only was the paper a source of a number of potential case study examples, but it also prompted a number of useful questions I’m hoping to explore more in coming weeks around the importance/role of data literacy in participatory data governance, and the interaction of what the paper terms ‘informal’ participatory models, with formal models of regulation and governance. Some of those thoughts are captured in this twitter thread about the report, and this draft response to the Data Values Project consultation call for feedback.

I also spent some time reviewing Jeni’s paper on ‘What food regulation teaches us about data governance’, and reflecting in particular on how the food analogy works in the context of international trade, and cross-border flows.

Finally, I’ve been helping the Global Data Barometer team put some finishing touches to the first edition report, which will (finally!) launch next week. Although I handed over the reins on the Global Data Barometer project to Silvana Fumega in the middle of last year, I’ve been back working on the final report since December: both on the data analysis and writing, and on trying (not always successfully) to have a reproducible workflow from data to report. Data governance is one of the key pillars of the report, although in the first edition there is relatively little said about *participatory* approaches, at least on the data creation and governance side. I’ll aim to write a bit more about that next week, and to explore whether there are missing global metrics that might help us understand how far a more collective approach to data is adopted or enacted around the world.
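For anyone curious what ‘a reproducible workflow from data to report’ can mean in practice: at its simplest, a script regenerates every number quoted in the report text directly from the published data, so nothing is pasted in by hand. A minimal sketch (the file name and the `governance_score` column are invented for illustration, not the Barometer’s actual data layout):

```python
import csv
import statistics

def summarise(path):
    """Read a CSV of one row per country and compute headline statistics."""
    # Hypothetical input: a 0-100 score per country in a 'governance_score'
    # column. The column name is illustrative, not the Barometer's schema.
    with open(path, newline="") as f:
        scores = [float(row["governance_score"]) for row in csv.DictReader(f)]
    return {
        "countries": len(scores),
        "mean": round(statistics.mean(scores), 1),
        "median": round(statistics.median(scores), 1),
    }

def report_fragment(stats):
    # Report prose is generated from the data, so re-running the script
    # after a data correction updates every quoted figure automatically.
    return (f"Across {stats['countries']} countries the mean governance "
            f"score was {stats['mean']} (median {stats['median']}).")
```

The pay-off is that a late correction to the underlying data costs one re-run rather than a hunt through the manuscript for stale figures.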

Global Data Barometer – First Edition published

[Summary: Data and analysis on ‘data for the public good’ across 109 countries.] 

On Wednesday the first edition of the Global Data Barometer was published. You can find the full report here, and all the data from the study is available for download here.

I was involved in setting up the Barometer project back in 2019/2020, and had the privilege of coming back into the project in December to work on the final report.  

I’ve written up a bit more background and reflection in this twitter thread:

It’s already encouraging to see all the places the Barometer findings and data are being picked up, and whilst getting the report out feels like the finish line for what has been both a marathon and a sprint for the team, having the data out there for further analysis also feels like the starting line for lots of deeper research and exploration.

In particular, it feels like debates about ‘data for the public good’ have been developing at pace in parallel to the Barometer’s data collection, and I’m keen to see both how the Barometer data can contribute to those debates, and what future editions of the project might need to learn from the way in which data governance debates are shaping up in 2022.

A look at the UK Open Government Partnership 2021-23 National Action Plan

[Summary: Critical reflections and comments on the context and content of the UK’s 2021-23 Open Government Partnership National Action Plan]

Screenshot of https://www.gov.uk/government/publications/uk-national-action-plan-for-open-government-2021-2023/uk-national-action-plan-for-open-government-2021-2023

If you’re in the UK, you might be excused for paying more attention to the other report released today, but around the same time as Sue Gray’s report on rule-breaking lockdown parties at Number 10 Downing Street was published, the UK’s 2021-23 Open Government Partnership National Action Plan (NAP) also surfaced on gov.uk.

I was involved in civil society aspects of developing the UK’s 2nd and 3rd NAPs, and have written critiques of the others, so, although I’ve had minimal involvement in this NAP (I attended a few of the online engagement sessions, mainly on procurement transparency commitments, before they appeared to peter out), I thought I should try and unpack this one in the same kind of way.

By way of context, it’s a very tough time to be trying to advance the open government agenda in the UK. With Sue Gray’s report, and Prime Ministerial responses to it today, confirming the lack of integrity and the culture of dishonesty at the very centre of Number 10; just over a week after a ministerial resignation at the despatch box over government failures to manage billions of pounds of fraud during the COVID response; and on the day that government promised to pursue a post-Brexit deregulatory agenda; we have rarely faced a greater need, yet a less hospitable environment, for reforms that can strengthen checks and balances on government power, reduce space for corrupt behaviour, and bring citizens into dialogue about solving pressing environmental, social and political problems. As a key Cabinet Office civil servant notes, it’s a credit to all involved from the civil service and civil society that the NAP was published at all in such difficult circumstances. But, although the plan’s publication shows that embers of open government are still there in Whitehall, the absence of a ministerial foreword, the lack of ambition in the plan, and the apparent lack of departmental ownership for the commitments it does contain (past plans listed the responsible stakeholder for commitments; this one does not), suggest that the status of open government in the UK, and the political will to take action on government reform within the international framework of the OGP, has fallen even further than in 2019.

When I wrote about the 2019 plan, I concluded that “The Global OGP process is doing very little to spur on UK action”. Since then, the UK has been called out and placed under review by the OGP Criteria & Standards Subcommittee in 2021 for missing action plan deadlines, and falling short of minimum requirements for public involvement in NAP development. Today’s published plan appears to admit that not enough has yet been done to rectify this, noting that:

In order to meet this criteria the government will amend and develop the initial commitment areas in NAP5 with civil society over the course of 2022.

Notably, past promises to civil society of adding to commitments to the NAP after the OGP deadline were not met (in part, if I recall correctly, because of issues with how this would interact with the OGP’s Independent Review Mechanism process), and so, with this line, civil society have a tactical choice to make: whether to engage in seeking to secure updates to the plan with assurance these will be taken forward, or whether to focus on ‘outsider’ strategies to put pressure on future UK OGP engagement. As Gavin Freeguard writes, we may be running up against the limits of “a one-size-fits-all international process that can’t possibly fit into the rhythms and rituals of UK government”. If this is so, then there is a significant challenge ahead to find any other drivers that can help secure meaningful open governance reforms in the UK: recognising that the coming years may be as much about the work of shoring up, and repair, as about securing grand new commitments.

A look at NAP5 commitments

Given the wider context, it hardly seems worth offering a critique of the individual commitments (but, erm, I ended up writing one anyway…). It’s certainly difficult to extract any sense of a SMART (Specific, Measurable, Achievable, Realistic, Time-bound) milestone from any of them, and those that do appear to have some sort of measurable target betray a woeful lack of ambition*.

Take for example “publishing 90% of ‘above threshold’ contract awards within 90 calendar days [of award, presumably]”. Not only does that leave absolutely massive loopholes (any contract that it would be convenient not to publish could fall into the 10%; and 90 days takes disclosure of information on awards far beyond the period during which economic operators who lose out on a bid could be able to challenge a decision), but this is more or less a commitment rolled over from the last National Action Plan. Surely, with the learning from the last few years of procurement scandals, and learning from the fact that Open Contracting commitments from the past have been poorly implemented, a credible National Action Plan would be calling for wholesale reform of procurement publication, following other OGP members who make award publication a binding part of a contract being enforceable, or invoices against it payable?

(*To be clear: I believe the vast majority of the fault for this lies with Ministers, not with the other individuals inside and outside government who have engaged in the NAP process in good faith).

Other milestones are almost comical in their framing. I’m not sure I’ve seen a sentence squeeze in quite as many caveats as the ‘commitment’ to build on the interesting but limited foundations of a draft Algorithmic Transparency ‘Data’ ‘Standard’, by working:

with internal and external stakeholders to gauge the feasibility of conducting a scoping exercise focused on mapping existing legal requirements for appeal mechanisms, for example due to administrative law, data protection law, or domain-specific legislation; with a view to sharing this information with the public. [my emphasis]

If I’m reading this right that could well be: a conversation with unspecified stakeholders to gauge whether it’s even possible to work out the scope of a mapping that then may or may not take place, may or may not be comprehensive, and may or may not result in outputs shared with the public. Even read more charitably (let’s assume the scoping exercise involves the mapping, not just scopes it!), surely the point of the National Action Plan development process is to have the conversations with internal and external stakeholders to ‘gauge the feasibility’ of an open government action taking place?

Others have commented on the backsliding in commitments to Open Justice, and I’ll leave it to those more involved at present in combatting the UK’s role in Illicit Financial Flows to comment on the limited new commitments there. However, I do want to pick up two comments on the health section in the NAP. Firstly, while inclusion of health within the NAP, as a topic much more legible in many people’s daily lives (and not only in the last two years) than topics like procurement or stolen asset recovery, is broadly welcome, the health section betrays a worrying lack of distinction between:

• Patient data;

• Health system data;

The State of Open Data: Histories and Horizons’ chapter on Health offers a useful model for thinking about this. In general, Open Government should be concerned with planning and operational data, service information, and research outputs. Where open government and personal data meet, it should be about the protection of individuals’ data rights: recognising elements of citizen privacy as foundational for open government.

Appropriate openness/transparency of health data based on type and intended use (Source: State of Open Data – Mark Irura)

In practice, when we talk of transparency, we need to be very clear to distinguish transparency about how (personal) health data is used (generally a good thing), and transparency of (personal) health data (usually a sign that something has gone profoundly wrong with data protection!). To talk about transparency of health data without qualifiers risks messy policy making, and undermining trust in both open government and health data practices. After reading it over a few times, I *think* ‘Objective 1: accountability and transparency’ under the health heading is about being transparent and accountable about how data is used, but there is little room for sloppy drafting in these matters. The elision of agendas to create large health datasets (with mixed public and private-sector users) with the open government agenda has been something civil society have had to be consistently watchful of in the history of UK NAPs, and it appears this time around is no different.

Secondly, and perhaps related, it’s not at all clear to me why a “standards and interoperability strategy for adoption across health and adult social care” (under Health ‘Objective 2: Data standards and interoperability’) belongs in an Open Government National Action Plan. Sure, the UK health system could benefit from greater interoperability of clinical systems, and this might have an impact on patient welfare. But the drivers for this are not open government: they are patient care. And an OGP National Action Plan is going to do little to move the needle on a challenge that the health sector has been tackling for decades (I recall conversations around the dining room table with my Dad, then an NHS manager, twenty years ago, about the latest initiatives then to move towards standardised models for interoperable patient data and referrals).

It might seem hair-splitting to say that certain reforms to government fall outside the scope of open government, but for the concept to be meaningful it can’t mean all and any reform of government systems. If we were talking about ways of engaging citizens in the design process for interoperability standards, and thinking critically about the political and social impact that categorisations within health records have, we might have something worthy of an open government plan, but we don’t. Instead, we have an uncritical focus on centralising data, and a development approach that will only involve “vendors, suppliers, digital technologists, app developers and the open source community”, but not actual care-service users, or people affected by the system design*.

(*I know that in practice there are many fantastic service and technology designers around the NHS who are both critically aware of the cost and benefit tradeoffs of health system interoperability, and have a personal/professional commitment to work with service users in all design work; but the absence of service-users from the text of the NAP commitment is notable.)

Lastly, the plan includes a placeholder for forthcoming commitments on “Local transparency”, to be furnished by the Department for Levelling Up, Housing and Communities (DLUHC) sometime in 2022. In past rounds of the NAP, civil society published a clear vision for the commitments they would like to see under certain headings, and the NAP has named the civil society partners working to develop and monitor commitments. Not this time around it seems. Whilst OGP colleagues in Northern Ireland have been running workshops to talk about open local government, I can’t find evidence of any conversations that might show what might fall under this heading when, or if, the current Westminster NAP evolves.

Still looking for a way forward…

As I wrote in 2019, I generally prefer my blogging (and engagement) to be constructive: but that’s not been easy with recent Open Government processes in the UK. At the same time, I did leave a recent session on ‘The (Absolute) State of Open Government’ at the latest UKOpenGovCamp unconference feeling surprisingly optimistic. Whilst any political will from the Conservative government for meaningful open government is, at least at present, sorely lacking, open working cultures within some pockets of government seem to have been remarkably resilient, and even appear to have deepened over the course of the pandemic. The people of open government are still there, even if the political leadership and policies are missing in action.

All the ambitious, necessary, practical and SMART commitment ideas that didn’t make it into this NAP need to be implementation-ready for any openings for reform that may come in the volatile near-future of UK politics. Just as civil society successfully used the UK’s Chairmanship and hosting of the OGP Summit back in 2012/13 to lock in stronger open beneficial ownership data commitments, civil society needs to be ready with ideas that, while they may get no traction right now, might find an audience, moment and leverage in future – at least if we manage to protect and renew our currently fragile democratic system.

I’ve long said that the OGP should be a space for the UK to learn from other countries: forgoing ideas of UK exceptionalism, and recognising that polities across the world have grappled with the kinds of problems of populist and unaccountable leadership we’re currently facing. As I work on finalising the Global Data Barometer Report, I’ll be personally paying particular attention to the ideas and examples from colleagues across OGP civil society that are particularly relevant to learn from.

And if you are in any way interested in open government in the UK, even though the process right now feels rather stuck and futile, you can sign up to the UK civil society Open Government Network mailing list to be ready to get involved in revitalising open government action in the UK when the opportunity is there (or, perhaps, when we collectively make it arise).

A data portal deep dive

Over the last week I’ve been sharing a short series of articles exploring the past, present and future of (open) data portals. This comes as part of a piece of work I’m doing for the Open Data Institute on ‘Data Platforms and Citizen Engagement’.

The work starts from the premise that data portals have been an integral part of the open data movement. Indeed, for many (myself included) the open data movement was crystallised with, or first discovered through, the launch of platforms like Data.gov and Data.gov.uk. However, we are going on to ask whether, a decade on, portals still have a role to play, and if so, what that role might most usefully be. Ultimately, we’re asking whether, and how, portals might be (re-)shaped as effective platforms to support ongoing ambitions for open data to support meaningful citizen participation in all its forms.

Over the course of a short rapid research sprint I’ve been pulling at a couple of threads that might contribute to that inquiry. The goal has been to carry out some groundwork to support the next stage of the project, which we are hoping will take the form of some sort of design exercises, accompanied by a number of deeper conversations and possibly further research. I overshot my initial plan of spending five days ‘catching up’ with what’s been happening in the portal landscape since I last looked, not least because the simple answer is: a lot’s been happening. And, at the same time, if you compare a portal from 2012 with the same one today, the answer to the question ‘What’s changed?’ often also seems to be: not very much. The breadth and depth of work constructing and critiquing portals across the world is both impressive, and oppressive. It seems that, collectively, we know there are problems with portals, but there is much less consensus on the way forward.

Each post in this series has tried to look at ‘the portals problem’ from one specific perspective, aiming to provide some shared context that might assist in future conversations. The posts are all over on PubPub, where they’re open to comment (free sign-up needed):

Terminology: When is a portal not a portal?

Technology: A genealogy of data portals

Research: The pressure on portals: an hourglass approach

Academia: Evidence and insights: other findings from research

Experiments: Selected examples of data portals

Organisational: The people and processes behind the portals

Engagement: Portals and participation

Speculation: Focussed futures: the portal as…

If, after exploring some of these, you think you might be interested in joining some of the open design sprint work we’re planning for next year to build on this exploration – and on parallel strands of research that have been taking place (likely involving some online or in-person full and half-day sessions in early Feb) – do drop me a line via twitter or (for this project only) my ODI e-mail address: tim.davies@theodi.org, and I can share more info as plans firm up.

Data portals and citizen engagement: participation in context

I’m cross-posting this from a deep-dive series of working drafts I’ve been developing for The Open Data Institute, providing groundwork for exploring potential future developments that could support data portals and platforms to function better as tools of civic participation. It provides a general history of the development of citizen participation, primarily in the UK context, that I hope may be of interest to a wide range of readers of this blog, as well as setting this in the context of data portals as participation tools (possibly more of a niche interest…). You can find the full series of posts, which talk a lot more about data portals, here.

A key cause of data portal dissatisfaction is the apparent failure of portals to provide effective platforms for citizen participation in government and governance. The supposed promise of portals to act as participatory platforms can be read into the 2009 Obama Open Government Memo on transparent, participatory and collaborative government, and the launch of data.gov.uk amongst the hackathons and experiments with online engagement that surrounded the Power of Information report and taskforce. Popular portal maturity models have envisioned them evolving to become participatory platforms [1] [2] and whilst some work has acknowledged that there are different forms of participatory engagement with the state, ranging from monitorial democracy, to the co-production of public services [3], the mechanisms by which portals can help drive participation, and the forms of participation in focus, have been frequently under-theorised.

In the current policy landscape, there is a renewed interest in some forms of participatory engagement. Citizens assemblies, deliberative fora, and other forms of mini-public are being widely adopted as ways to find or legitimate ways forward on thorny and complex issues. Amidst concerns about public trust, democratic control, and embedded biases, there are calls for participatory processes to surround the design and deployment of algorithmic systems in particular [4], creating new pressure on participatory methods to engage effectively with data. However, public participation has a long history, and these latest trends represent just one facet of the kinds of processes and modes of engagement we need to have in mind when considering the role of data portals in supporting citizen engagement. In this short piece I want to briefly survey the history of public participation, and to identify potential insights for the development of data portals as a support for participatory processes. My focus here is primarily on the UK landscape, although I’ll try and draw upon wider global examples where relevant.

A short history of citizen participation

In the blog post ‘A brief history of participation’, historian Jo Guldi explores the roots of participatory governance ideas, tracing them as far back as the early mediaeval church, and articulating ideas of participatory governance as a reaction to the centralised bureaucracies of the modern nation state. Guldi points to the emergence of “a holistic political theory of self-rule applicable to urban planning and administration of everyday life” emerging in the 1960s, driven by mass youth movements, mass media, and new more inclusive notions of citizenship in an era of emerging civil rights. In essence, as the franchise, and education, expanded, default models of ‘elite governance’ came to be challenged by the idea that the public should have a greater voice in day to day decision making, if not greater direct ownership and control of public authority.

In Guldi’s global narrative, the emphasis of the 1970s and 80s was then on applying participatory ideas within the field of International Development, particularly participatory mapping – in which marginalised citizens are empowered to construct their own maps of territory: in a sense creating counter-data to secure land rights, and protect customary resources from logging or other incursions. Guldi points in particular to the role of institutions such as the World Bank in promoting participatory development practices, a theme also found in Leal’s ‘Participation: the ascendancy of a buzzword in the neo-liberal era’[5]. Leal highlights how, although participatory methods have their roots in the emancipatory pedagogy of Paulo Freire and in Participatory Action Research, which aims at a transformation of individual capabilities alongside wider cultural, political and economic structures, the adoption of participation as a tool in development can act in practice as a tool of co-option: depoliticising critical decisions and offering participants only the option to modify, rather than fundamentally challenge, directions of development. Sherry Arnstein’s seminal ‘A ladder of citizen participation’ article [6], published in 1969 in an urban planning journal, has provided a reliable lens for asking whether participation in practice constitutes decoration, tokenism, or genuine citizen power.

Illustration of the ladder of participation from Arnstein’s original article, showing eight rungs, and three categories of participation, from ‘nonparticipation’, to ‘degrees of tokenism’ and up to ‘degrees of citizen power’.

In the UK, whilst radical participatory theory influenced grassroots community development work throughout the 1980s, it was with the election of the New Labour Government in 1997 that participation gained significant profile in mainstream policy-making: with major initiatives around devolution, the ‘duty to consult’, and an explosion of interest in participatory methods and initiatives. Fenwick and McMillan describe participation for New Labour as ‘something at the heart of the guiding philosophy of government’, framed in part as a reaction to the consumer-oriented marketised approach to public management of the Thatcher era.  Yet, they also highlight a tension between an ideological commitment to participation, and a managerial approach to policy that sought to also ‘manage’ participation and its outcomes. Over this period, a particular emphasis was placed on participation in local governance, leading top-down participation agendas to meet with grassroots communities and community development practices that had been forged through, and often in opposition to, recent decades of Conservative rule. At its best, this connection of participatory skill with space to apply it provided space for more radical experiments with community power. At its worst, and increasingly over time, it led to co-option of independent community actors within state-directed participation: leading ultimately to a significant loss of both state-managed and community-driven participatory practice when the ‘era of austerity’ arrived in 2010.

The 2000s saw a proliferation of guides, handbooks and resources (e.g.) outlining different methods for citizen participation: from consultation, to participatory budgeting, citizens panels, appreciative inquiries, participatory research, and youth fora. Digital tools were initially seen broadly as another ‘method’ of participation, although over time understanding (albeit still relatively limited) has developed of how to integrate digital platforms as part of wider participatory processes – and as digital development has become more central in policy making, user-involvement methodologies from software development have to be critically considered as part of the citizen participation toolbox. Concepts of co-production, co-design and user-involvement in service design have also increasingly provided a link-point between trends in digital development and citizen participation.

Looking at the citizen participation landscape in 2021, two related models appear to be particularly prominent: deliberative dialogues, and citizens assemblies. Both are predicated on bringing together broadly representative groups of citizens, and providing them with ‘expert input’, generally through workshop-based processes, and encouraging deliberation to inform policy, or to generate recommendations from an assembly. Notably, deliberative methods have been adopted particularly in relation to science and technology, seen as a way to secure public trust in emerging scientific or technological practice, including data sharing, AI and use of algorithmic systems. Whilst deliberative workshops and citizens assemblies are by no means the only participatory methods in use in 2021, they are notable for their reliance on expert input: although the extent to which direct access to data features in any of these processes is perhaps a topic for further research.

By right, or by results

Before I turn to look specifically at the intersection of data and participation, it is useful to briefly remark on two distinct lines of argument for participation: values- or rights-based, vs. results-based.

The rights-based approach can be found both in theories of participatory democracy that argue democratic mandate is not passed periodically from voters to representatives, but is constantly renewed through participatory activities engaging broad groups of citizens, and in human-rights frameworks, including notably the UN Convention on the Rights of the Child (UNCRC), which establishes children’s rights to appropriate participation in all decisions that affect them. Guidance on realising participation rights adopted in 2018 by the UN Human Rights Council explicitly makes a link with access to information rights, including proactive disclosure of information, efforts to make this accessible to marginalised groups, and independent oversight mechanisms.

A results-based approach to citizen participation is based on the idea that citizen engagement leads to better outcomes: including supporting more efficient and effective delivery of public services, securing greater citizen trust in the decisions that are made, or reducing the likelihood of decisions being challenged. Whilst some user and human-centred design methodologies may make reference to rights-based justifications for inclusion of often marginalised stakeholders, in general, these approaches are rooted more in a results-based than a rights-based framework: in short, many firms and government agencies have discovered that projects have a greater chance of success when they adopt consultative and participatory design approaches.

Participation, technology and data

Although there have been experiments with online participation since the earliest days of computer mediated communication, the rise of Web 2.0 brought with it substantial new interest in online platforms as tools of citizen engagement: both enabling insights to be gathered from existing online social spaces and digital traces, and supporting more emergent, ad-hoc or streamlined modes of co-creation, co-production, or simply communication with the state (as, for example, in MySociety’s online tools to write to public representatives, or report street scene issues in need of repair). There was also a shift to cast the private sector as a third stakeholder group within participatory processes – primarily framed as originator of ideas, but also potentially as the target of participation-derived messages. As the Open Government Partnership’s declaration puts it, states would “commit to creating mechanisms to enable greater collaboration between governments and civil society organizations and businesses.”

With rising interest in open data, a number of new modes and theories of participation came to the fore: the hackathon [7][8][9], the idea of the armchair auditor [10], and the idea of ‘government as a platform’ [11][12] each invoke particular visions of citizen-state and private-sector engagement.

A focus in some areas of government on bringing in greater service-design approaches, and rhetoric, if not realities, of data-driven decision making have also created new spaces for particular forms of participatory process, albeit state-initiated, rather than citizen created. And recent discussions around data portals and citizen participation have often centred on the question of how to get citizens to engage more with data, rather than how data can support existing or potential topic-focussed public participation.

In my 2010 MSc thesis on ‘Open Data, Democracy & Public Sector reform: open government data use from data.gov.uk’ I developed an initial typology of civic Open Government Data uses, based on a distinction between formal political participation (representative democracy), collaborative/community based participation (i.e. participatory democracy or utility-based engagement), and market participation (i.e. citizen as consumer). In this model, the role data plays, and the mechanisms it works through, vary substantially: from data being used through media to inform citizen scrutiny of government, and ultimately discipline political action through voting; to data enabling citizens to collaborate in service design, or independent problem solving beyond the state; and to the consumer-citizen driving change through better informed choices of access to public services. In other words, greater access to data theoretically enables a host of different genres of participation (albeit there’s a normative question over how meaningful or equitable each of these different forms of participation are) – and many of these do not rely on the state hosting or convening the participation process.

What is notable about each of these ‘mechanisms of change’ is that data accessed from a portal is just one component of a wider process: be that the electoral process in its entirety, a co-design initiative at the community level, or some national market-mechanism supported by intermediaries translating ‘raw data’ into more accessible information that can drive decisions over which hospital to use, or which school to choose for a child. However, whilst many participatory initiatives have suffered in an era of austerity, and enthusiasm for the web as an open agora for public debate has waned in light of a more hostile social media environment, portals have persisted as a primary expression of the ‘open government’ era: leaving considerable pressure on the portal to deliver not only transparency, but participation and collaboration too.

Citizen participation and data portals

What can we take from this brief survey of citizen participation when it comes to thinking about the role of data portals?

Firstly, the idea that portals as technical platforms can meaningfully ‘host’ participation in its entirety appears more or less a dead-end. Participation takes many varied forms, and whilst portals might be designed (and organisationally supported) in ways that position them as part of participatory democracy, they should not be the destination.

Secondly, different methods of citizen participation have different needs. Some require access to simple granular ‘facts’ to equalise the balance of power between citizen and state. Others look for access to data that can support deep research to understand problems, or experimental prototyping to develop solutions. Whilst in the former case, quick search and discovery of individual data-points is likely to be the priority, in these latter cases, greater understanding of the context of a dataset is likely to be particularly valuable, as would, in many cases, the ability to be in contact with a dataset’s steward.

Third, the current deliberative wave appears as likely to have data as its subject (or at least, the use of data in AI, algorithmic systems or other policy tools), as it is to use open data as an input to deliberation. This raises interesting possibilities for portals to surface and support greater deliberation around how data is collected and used, as a precursor to supporting more effective use of that data to drive policy making.

Fourth, citizen participation has rarely been a ‘mass’ phenomenon. Various studies suggest that at any time less than 10% of the population are engaged in any meaningful form of civic participation, and only a fraction of these are likely to be involved in forms of engagement that are particularly likely to benefit from data. Portals should not carry the burden of solving a participation deficit, but there may be avenues to design them such that they connect with a wider group of active citizens than their current data-focussed constituency.

Fifth, and finally, citizen participation was not invented with the portal – and we need to be conscious of both the long history, and contested conceptualisations, of citizen participation. The government portal that seeks to add participatory features is unlikely to be able to escape the charge that it is seeking to ‘manage’ participation processes: although independently created or curated portals may be able to align with more bottom-up community participation action and operate within a more emancipatory, Freirean notion. Both data, and participation, are, after all, about power. And given power is almost always contested, the configuration of portals as a participatory tool may be similarly so.

Citations

  1. Alexopoulos, C., Diamantopoulou, V., & Charalabidis, Y. (2017). Tracking the Evolution of OGD Portals: A Maturity Model. In Lecture Notes in Computer Science (pp. 287–300). Springer International Publishing. https://doi.org/10.1007/978-3-319-64677-0_24

  2. Zhu, X., & Freeman, M. A. (2018). An evaluation of U.S. municipal open data portals: A user interaction framework. Journal of the Association for Information Science and Technology, 70(1), 27–37. https://doi.org/10.1002/asi.24081

  3. Ruijer, E., Grimmelikhuijsen, S., & Meijer, A. (2017). Open data for democracy: Developing a theoretical framework for open data use. Government Information Quarterly, 34(1), 45–52. https://doi.org/10.1016/j.giq.2017.01.001

  4. Wilson, C. (2021). Public engagement and AI: A values analysis of national strategies. Government Information Quarterly, 101652. https://doi.org/10.1016/j.giq.2021.101652

  5. Leal, P. A. (2007). Participation: The Ascendancy of a Buzzword in the Neo-Liberal Era. Development in Practice, 17(4/5), 539–548.

  6. Arnstein, S. R. (1969). A Ladder Of Citizen Participation. Journal of the American Institute of Planners, 35(4), 216–224. https://doi.org/10.1080/01944366908977225

  7. Johnson, P., & Robinson, P. (2014). Civic Hackathons: Innovation, Procurement, or Civic Engagement? Review of Policy Research, 31(4), 349–357. https://doi.org/10.1111/ropr.12074

  8. Sieber, R. E., & Johnson, P. A. (2015). Civic open data at a crossroads: Dominant models and current challenges. Government Information Quarterly, 32(3), 308–315. https://doi.org/10.1016/j.giq.2015.05.003

  9. Perng, S.-Y. (2019). Hackathons and the Practices and Possibilities of Participation. In The Right to the Smart City (pp. 135–149). Emerald Publishing Limited. https://doi.org/10.1108/978-1-78769-139-120191010

  10. O’Leary, D. E. (2015). Armchair Auditors: Crowdsourcing Analysis of Government Expenditures. Journal of Emerging Technologies in Accounting, 12(1), 71–91. https://doi.org/10.2308/jeta-51225

  11. O’Reilly, T. (2011). Government as a Platform. Innovations: Technology, Governance, Globalization, 6(1), 13–40. https://doi.org/10.1162/inov_a_00056

  12. The OECD digital government policy framework. (2020, October 7). OECD Public Governance Policy Papers. Organisation for Economic Co-operation and Development (OECD). https://doi.org/10.1787/f64fed2a-en

Fostering open ecosystems around data: The role of data standards, infrastructure and institutions

[Summary: an introduction to data standards, their role in development projects, and critical perspectives for thinking about effective standardisation and its social impacts]

I was recently invited to give a presentation as part of a GIZ ICT for Agriculture talk series, focussing on the topic of data standards. It was a useful prompt to try and pull together various threads I’ve been working on around concepts of standardisation, infrastructure, ecosystem and institution – particularly building on recent collaboration with the Open Data Institute. As I wrote out the talk fairly verbatim, I’ve reproduced it in blog form here, with images from the slides. The slides with speaker notes are also shared here. Thanks to Lars Kahnert for the invite and the opportunity to share these thoughts.

Introduction

In this talk I will explore some of the ways in which development programmes can think about shaping the role of data and digital platforms as tools of economic, social and political change. In particular, I want to draw attention to the often dry-sounding world of data standards, and to highlight the importance of engaging with open standardisation in order to avoid investing in new data silos, to tackle the increasing capture and enclosure of data of public value, and to make sure social and developmental needs are represented in modern data infrastructures.

Mind map of Tim's work, and logos of organisations worked with, including: Open Data Services, Open Contracting Partnership, Open Ownership, IATI, 360 Giving, Open Data in Development Countries, Open Data Barometer, Land Portal and UK Open Government Civil Society Network.

By way of introduction to myself: I’ve worked in various parts of data standard development and adoption – from looking at the political and organisational policies and commitments that generate demand for standards in the first place, through designing technical schema and digging into the minutiae of how particular data fields should be defined and represented, to supporting standard adoption and use – including supporting the creation of developer and user ecosystems around standardised data. 

I also approach this from a background in civic participation, and with a significant debt to work in Information Infrastructure Studies, and currently unfolding work on Data Feminism, Indigenous Data Sovereignty, and other critical responses to the role of data in society.

This talk also draws particularly on some work in progress, developed through a residency at the Rockefeller Bellagio Centre, looking at the intersection of standards and Artificial Intelligence. I won’t labour that point – as I fear a focus on ‘AI’ (in inverted commas) can distract us from looking at many more ‘mundane’ (also in inverted commas) uses of data. But I will say at this point that when we think about the datasets and data infrastructures our work might create, we need to keep in mind that these will likely end up being used to feed machine learning models in the future. What gets encoded in, and what gets excluded from, shared datasets is a powerful driver of bias before we even get to the scope of the training sets, or the design of the AI algorithms themselves.

Standards work is AI work.

Enough preamble. To give a brief outline: I’m going to start with a brief introduction to what I’m talking about when I say ‘data standards’, before turning to look at the twin ideas of data ecosystems and data infrastructures. We’ll then touch on the important role of data institutions, before asking why we often struggle to build open infrastructures for data.

An introduction to standards

Each line in the image above is a date. In fact – they are all the same date. Or, at least, they could be. 

Showing that 12/10/2021 can be read in the US as 10th December, or in Europe as 12th October.

Whilst you might be able to conclusively work out most of them are the same date, and we could even write computer rules to convert them, because the way we write dates in the world is so varied, some remain ambiguous.

Fortunately, this is (more or less) a solved problem. We have the ISO 8601 standard for representing dates. Generally, developers present ‘ISO dates’ in a string like this:

2021-10-12T00:00:00+00:00

This has some useful properties. You can use simple sorting to get things in date order, you can include the time or leave it out, you can provide timezone offsets for different countries, and so-on. 
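The sorting property is worth seeing in action: because ISO 8601 orders fields from most to least significant (year, month, day, time), a plain text sort is also a chronological sort. A quick Python sketch, using made-up timestamps:

```python
# ISO 8601 timestamps sort chronologically under a plain string sort,
# because fields run from most to least significant.
# Caveat: this only holds when every timestamp shares the same timezone offset.
stamps = [
    "2021-10-12T09:30:00+00:00",
    "2019-01-05T23:59:59+00:00",
    "2021-02-28T00:00:00+00:00",
]

for s in sorted(stamps):  # no date parsing needed
    print(s)
```

A plain `sorted()` call puts the 2019 timestamp first and the October 2021 timestamp last, without any date library involved.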

If everyone exchanging data converts their dates into this standard form, the risk of confusion is reduced, and a lot less time has to be spent cleaning up the data for analysis.

It’s also a good example of a building block of standardisation for a few other reasons:

  • The ISO in the name stands for ‘International Organization for Standardization’: the vast international governance and maintenance effort behind this apparently simple standard, which was first released in 1988, and last revised just two years ago.
  • The ‘8601’ is the standard number. There are a lot of standards (though not all ISO standards are data standards)
  • Use of this one standard relies on lots of other standards: such as the way the individual numbers and characters are encoded when sent over the Internet or other media, and even standards for the reliable measurement of time itself.
  • And, like many standards, ISO 8601 is, in practice, rarely fully implemented. For example, whilst developers talk of using the ISO standard, what they actually implement often comes from RFC 3339, which leaves out parts of the ISO standard, such as imprecise dates. As a rule of thumb: people copy implementations rather than read specifications.

Diagram showing ISO8601 as an interchange standard

ISO 8601 is an interchange standard – that is, most systems don’t store or process data internally in ISO 8601; instead, everyone writes dates out in the standard form at the point of exchange. The standard sits in the middle: limiting the need to understand the specific quirks of each origin of data, and allowing receivers to streamline the import of data into their own models and methods.
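As a rough sketch of that ‘standard in the middle’ pattern (in Python; the list of local formats is made up for illustration), each source’s quirky date format is converted to the ISO form once, at the point of exchange:

```python
from datetime import datetime

# Hypothetical formats that different data providers use internally.
# The order of this list decides how an ambiguous string like "12/10/2021"
# is read - exactly the ambiguity the interchange standard removes.
LOCAL_FORMATS = ["%d/%m/%Y", "%m-%d-%Y", "%d %B %Y"]

def to_iso(raw: str) -> str:
    """Try each known local format and emit an ISO 8601 date string."""
    for fmt in LOCAL_FORMATS:
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            continue  # not this format; try the next
    raise ValueError(f"Unrecognised date: {raw!r}")

print(to_iso("12/10/2021"))       # read here as day/month -> "2021-10-12"
print(to_iso("12 October 2021"))  # -> "2021-10-12"
```

Receivers then only ever need to parse one format, however many sources feed the exchange.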

And to introduce the first critical issue of standardisation – as actually implemented – it constrains what can be expressed: sometimes for good, and sometimes problematically.

Worked example: (a) The event took place in early November ; (b) 2021-11 fails validation; (c) To enter data, user adds arbitrary day: 2021-11-01; (d) Data from multiple sources can be analysed (all dates standardised) but the data might mislead: “The 1st of the Month is the best day to run events.”

For example, RFC 3339 omits imprecise dates. That is, if you know that something happened in November 2021, but not on which day, your data will fail validation if you leave out the day. So to exchange data using the standard you are forced to make up a day – often the 1st of the month. A paper form would have no such constraint: users would just leave the ‘day’ box blank. The impact may be nothing; or, if you are trying to exchange data from places where, for legacy reasons, the day of the month of an event is not easily known, that data could end up distorting later analysis.
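The effect is easy to reproduce. In this Python sketch, a strict day-level format check stands in for an RFC 3339-style date validator:

```python
from datetime import datetime

def validate_date(value: str) -> bool:
    """Accept only complete YYYY-MM-DD dates, as a strict exchange format would."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False

print(validate_date("2021-11"))     # False: month-precision date is rejected
print(validate_date("2021-11-01"))  # True: the made-up day '01' passes
```

The genuinely known value (`2021-11`) fails, while the fabricated-but-complete value passes – the validator cannot tell honest precision from invented precision.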

So – even these small building blocks of standardisation can have significant uses – and significant implications. But, when we think of standardisation in, for example, a data project to better understand supply chains, we might also be considering standards at the level of schema – the agreed way to combine lots of small atoms of data to build up a picture of the actions or phenomena we care about.

Diagrams showing table-based and tree-structured data.

A standardised data schema can take many forms. In data exchange, we’re often not dealing with relational database schemas, but with schemas that allow us to share an extract from a database or system.

That extract might take the form of tabular data, where our schema can be thought of as the common header row in a spreadsheet accompanied by definitions for what should go in each column.

Or it might be a schema for JSON or XML data: where the schema may also describe hierarchical relationships between, say a company, its products, and their locations. 

At a very simplified and practical level, a schema usually needs three things:

  • Field names (or paths)
  • Definition
  • Validation rules

Empty table with column headings: site_code, commodity_name, commodity_code, quantity and unit

For example, we might agree that whenever we are recording the commodities produced at a given site we use the column names site_code, commodity_name, commodity_code, quantity and unit.

We then need human readable definitions for what counts as each. E.g:

  • site_code - (string) a unique identifier for the site.
  • commodity_name - (string) the given name for a primary agricultural product that can be bought and sold. Note: this is used for labelling only, and might be overridden by values taken from the commodity_code reference list.
  • commodity_code - (string; enum) a value from the approved codelist that uniquely identifies the specific commodity.
  • quantity - (number) the number of units of the commodity.
  • unit - (string; enum) the unit the quantity is measured in, from the list: kg for kilograms, or tonne for metric tonne. If quantities were collected in an unlisted unit, they should be converted before storage.

And we need validation rules to say that if we find, for example, a non-number character in the quantity column – the data should be rejected – as it will be tricky for systems to process.

Note that three of these potential columns point us to another aspect of standardisation: codelists.

A codelist is a restricted set of reference values. You can see a number of varieties here:

The commodity code is a conceptual codelist. In designing this hypothetical standard we need to decide about how to be sure apples are apples, and oranges are oranges. We could invent our own list of commodities, but often we would look to find an existing source.

We could, for example, use ‘080810’, taken from the World Customs Organization’s Harmonized System codes.

Or we could use c_541, taken from AGROVOC – the FAO’s Agricultural Vocabulary.

The choice has a big impact: aligning us with export applications of the data standard, or perhaps more with agricultural science uses of the data.

site_code, by contrast, is not about concepts – but about agreed identifiers for real-world entities, locations or institutions. 

Without agreement across different datasets on how to refer to a farm, factory or other site, integrating and exchanging data can be complex: but maintaining these reference lists is also a big job, a potential point of centralisation and power, and an often-neglected piece of data infrastructure.

For example, the Open Apparel Registry has developed unique production site identifiers by combining different existing datasets.
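Pulling the field definitions, codelists and validation rules of the worked example together, a minimal row validator might look like the following Python sketch (the codelist contents are illustrative only – a real standard would commit to one code source):

```python
# Illustrative codelists from the worked example.
COMMODITY_CODES = {"080810", "c_541"}  # e.g. an HS code and an AGROVOC code
UNITS = {"kg", "tonne"}

def validate_row(row: dict) -> list[str]:
    """Return a list of validation errors for one data row (empty list = valid)."""
    errors = []
    if not isinstance(row.get("site_code"), str) or not row["site_code"]:
        errors.append("site_code: required string")
    if row.get("commodity_code") not in COMMODITY_CODES:
        errors.append("commodity_code: not on the approved codelist")
    if not isinstance(row.get("quantity"), (int, float)):
        errors.append("quantity: must be a number")
    if row.get("unit") not in UNITS:
        errors.append("unit: must be one of kg or tonne")
    return errors

ok = {"site_code": "S-001", "commodity_name": "Apples",
      "commodity_code": "080810", "quantity": 120, "unit": "kg"}
bad = {"site_code": "S-002", "commodity_code": "bananas?",
       "quantity": "lots", "unit": "boxes"}

print(validate_row(ok))   # []
print(validate_row(bad))  # three errors: codelist, quantity and unit
```

Even this toy version shows how the schema, the codelists and the validation rules each constrain what data can flow through the standard.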

Now – I could spend the rest of this talk digging deeper into just this small example – but let’s surface to make the broader points.

  1. Data standards are technical specifications of how to exchange data

The data should be collected in the way that makes the most sense at a local level (subject to its ability to then fit into a standard) – and should be presented in the way that meets users’ needs. But when data is coming from many sources, and going to many destinations, a standard is a key tool of collaboration.

The standard is not the form.

The standard is not the report.

But, well designed, the standard is the powerful bit in the middle that ties together many different forms and sources of data, with many different reports, applications and data uses.

  2. Designing standards is a technical task

It needs an understanding of the systems you need to interface with.

  3. Designing standards goes beyond the technical task

It needs an awareness of the impact of each choice – each alignment – each definition and each validation rule.

There are a couple of critical dimensions to this: thinking about whose knowledge is captured and shared through a standard, and considering who gains positions of power by being selected as the source of definitions and codelists.

At a more practical level, I often use the following simple economic model to consider who bears the costs of making data interoperable:

Diagram showing 'Creator-->Intermediary-->User'

In any data exchange there is a creator, there may be an intermediary, and there is a data user.

The real value of standards comes when you have multiple creators, multiple users, and potentially, multiple intermediaries.

If data isn’t standardised, either an intermediary needs to do lots of work cleaning up the data…

Diagram showing different 'colours' of data from three creators, made interoperable by the work of an intermediary, to support three users.

…or each user has all that work to do before they can spend any time on data use and analysis.

The design choices made in a standard distribute the labour of making data interoperable across the parties in a data ecosystem. 

There may still be some work for users and intermediaries to do even after standardisation. 

For example, if you make it too burdensome for creators to map their data to a standard, they may stop providing it altogether. 

Diagram showing three creators, two of which have decided not to provide standardised data.

Or if you rely too much on intermediaries to clean up data, they may end up creating paywalls that limit use.

Or, in some cases, you may be relying on an exploitative ‘click-work’ market to clean data that could have been better standardised at source.

So – to work out where the labour of making data interoperable should and can be located involves more than technical design.

You need to think about the political incentives and levers to motivate data creators to go to the effort of standardising their data.

You need to think about the business or funding model of intermediaries.

And you need to understand the wide range of potential data users: considering carefully whose needs are served simply, and who will end up still having to carry out lots of additional data collection and cleaning before their use-cases are realised. 

But, hoping I’ve not put you off with some of the complexity here, my fourth and final general point on standards is that: 

  4. Data standards could be a powerful tool for the projects you are working on

And to talk about this, let’s turn to the twin concepts of ecosystem and infrastructure.

Data ecosystems

Diagram showing: Decentralised and centralised networks. Supporting text: Standards can support decentralisation and innovation.

When the Internet and World Wide Web emerged, they were designed as distributed systems. Over recent decades, we’ve come to experience a web that is much more based around dominant platforms and applications. 

This ‘application thinking’ can also pervade development programmes – where, when thinking about problems that would benefit from better data exchange, funders might leap to building a centralised platform or application to gather data.

Diagram showing closed and open networks. Supporting text: Approaches open to decentralisation can support greater generativity, freedom and resilience

But – this ‘build it and they will come’ approach has some significant problems: not least that it only scales so far, it creates single points of failure, and risks commercial capture of data flows. 

By contrast, an open standard can describe the data we want to see better shared, and then encourage autonomous parties to create applications, tools, support and analysis on top of the standard. 

It can also give data creators and users much greater freedom to build data processes around their needs, rather than being constrained by the features of a central platform.

In early open data thinking, this was referred to as the model of ‘let many flowers bloom’, and as ‘publish once, use everywhere’ – but over time we’ve seen that, just like natural ecosystems – creating a thriving data ecosystem can require more intentional and ongoing action.

Diagrams showing natural and human-built 'ecosystems'.

Just like their natural counterparts, data ecosystems are complex and dynamic, and equilibrium can be hard to reach and fragile. Keystone species can support an ecosystem’s growth; whilst a local resource drought harming some key actors could lead to a cascading ecosystem collapse.

To give a more concrete example – a data ecosystem around the International Aid Transparency Initiative has grown up over the last decade – with hundreds of aid donors big and small sharing data on their projects in a common data format: the IATI Standard. ‘Keystone species’ such as D-Portal, which visualises the collected data, have helped create an instant feedback loop for data publishers to ‘see’ their data, whilst behind the scenes, a Datastore and API layer feeds all sorts of specific research and analysis projects, and operational systems, that on their own would have had little leverage to convince data publishers to send them standardised data.

However, elements of this ecosystem are fragile: much data enters through a single tool, AidStream, which over time has come to be the tool of choice for many NGOs, and which, if unavailable, would diminish the freshness of data. Many users accessing data rely on ‘the datastore’, which aggregates published IATI files and acts as an intermediary so users don’t need to download data from hundreds of different publishers. If the datastore is down, many applications built on it may fail. Recently, when new versions of the official datastore hit technical trouble, an older open source version was initially brought back online by volunteers.

Ultimately, this data ecosystem is more resilient than it would otherwise be because it’s built on an open standard. Even if the data entry tool or datastore become inaccessible, new tools can be rapidly plugged in. But that can’t just be taken for granted: data ecosystems need careful management, just as natural ones do.

Support standards over apps

The biggest point I want to highlight here is a design one. Instead of designing platforms and applications, it’s possible to design for, and work towards, thriving data ecosystems. 

That can require a different approach: building partnerships with all those who might have a shared interest in the kind of data you are dealing with, building political commitments to share data, investing in the technical work of standard development, and fostering ecosystem players through grants, engagement and support.

Building data ecosystems through standardisation can crowd in investment: having a significant multiplier effect.

Screenshot of Open Contracting Partnership worldwide map from https://www.open-contracting.org

For example, most of the implementations of the Open Contracting Data Standard have not been funded by the Open Contracting Partnership, which stewards its design – yet they incorporate particular ideas the standard encodes, such as providing linked information on tender and contract award, and providing clear identifiers for the companies involved in procurement.

For the low millions of dollars invested in maintaining OCDS since its first half-million-dollar, year-long development cycle, many millions more from a myriad of sources have gone into building bespoke and re-usable tools, supported by for-profit and non-profit business models right across the world.

And innovation has come from the edges, such as the adoption of the standard in Ukraine’s corruption-busting e-procurement system, or the creation of tools using the standard to analyse paper-based procurement systems in Nigeria.

As a side note here – I’m not necessarily saying ‘build a standard’: often, the standards you need might almost be there already. Investing in standardisation can be a process of adaptation and engagement to improve what already exists.

And whilst I’m about to talk a bit more about some of the technical components that make standards for open data work well, my own experience helping to develop the ecosystem around public procurement transparency with the Open Contracting Data Standard has underscored for me the vitally important human, community-building element of data ecosystem building. This includes supporting governments and building their confidence to map their data into a common standard: walking the narrow line between making data interoperable at a global level, and responding to the diverse situations that different countries found themselves in, in terms of legacy data systems, relevant regulation and political will.

Infrastructures

Icon for concept of Infrastructure

I said earlier that it is productive to pair the concept of an ecosystem with that of an infrastructure. If ecosystems contain components adapted to each niche, an infrastructure, in data terms, is something shared across all sites of action. We’re familiar with physical infrastructures like the road and rail networks, or the energy grid, and these can provide useful analogies for thinking about data infrastructure. Well managed, data infrastructures are the underlying public goods that enable ecosystems to thrive.

Some components of data infrastructure: Schema & documentation; Validation tools; Reference implementations & code; Reference data; Data registries; Aggregators and APIs

In practice, the data infrastructure of a standard can involve a number of components:

  • There’s the schema itself and its documentation.
  • There might be validation tools that tell you if a dataset conforms with the standard or not.
  • There might be reference implementations and examples to work from.
  • And there might be data registries, or data aggregators and APIs that make it easier to get a corpus of standardised data.

Just like our physical infrastructure, there are choices to make in a data infrastructure over whose needs it will be designed around, how it will be funded, and how it will be owned and maintained.

For example, if the data ecosystem you are working with involves sensitive data, you may find you need to pair an open standard with a more centralised infrastructure for data exchange, in which standardised data is available through a clearing-house that manages who does and doesn’t have access to the data, or which assures that anonymisation and privacy-protecting practices have taken place.

By contrast, a data standard to support bilateral data exchange along a supply chain may need a good data validator tool to be maintained and provided for public access, but may have little need for a central datastore.

There’s a lot more written on the concept of data infrastructures: both drawing on technology literatures, and some rich work on the economics of infrastructure. 

But before sharing some closing thoughts, I want to turn briefly to thinking about ‘data institutions’ – the organisational arrangements that can make infrastructures and ecosystems more stable and effective – and that can support cooperation where cooperation works best, and create the foundations for competition where competition is beneficial.

Institutions

Standards adoption requires trust. Ownership, stewardship and institutions matter

A data standard is only a standard if it achieves a level of adoption and use. And securing that requires trust.

It requires those who might build systems that will work with the standard to be able to trust that the standard is well managed, and has robust governance. It requires users of schema and codelists to trust that they will remain open and free for use – rather than starting open and later being enclosed, like the bait-and-switch we’ve seen with many online platforms. And it requires leadership committing to adopting a standard to trust that the promises made for what it will do can be delivered.

Behind many standards and their associated infrastructures – you will find carefully designed organisational structures and institutions.

Image showing governance structure of the Global Legal Entity Identifier Foundation

For example, the Global Legal Entity Identifier – designed to identify counterparties in financial transactions and help avoid the kind of contagion seen in the 2008 financial crash – has layers of international governance to support a core infrastructure and policies for data standards and management, paired with licensed ‘Local Operating Units’ who can take a more entrepreneurial approach to registering companies for identifiers and verifying their identities.

The LEI standard itself has been taken through ISO committees to deliver a standard that is likely to secure adoption from enterprise users. 

Image showing the revision process described at https://standard.open-contracting.org/latest/en/governance/#revision-process

By contrast, the Open Contracting Data Standard I mentioned earlier is stewarded by an independent non-profit.

OCDS – seeking, as I would argue it should, to disrupt some of the current procurement technology landscape – has not taken a formal standards body route, where there is a risk that well-resourced incumbents would water down the vision within the standard. Instead, the OCDS team have developed a set of open governance processes for changes to the standard that aim to ensure it retains trust from government, civil society and private sector stakeholders, whilst also retaining some degree of agility.

We’ve seen over the last decade that standards and software tools alone are not enough: they need institutional homes, and multi-disciplinary teams who can combine ongoing technical maintenance work with stakeholder engagement, and a strategic focus on the kinds of change the standards were developed to deliver.

If you are sponsoring data standards development work that’s aiming for scale, are you thinking about the long-term institutional home that will sustain it?

If you’re not directly developing standards, but the data that matters to you is shaped by existing data standardisation, I’d also encourage you to ask: who is standing up for the public interest and the needs of our stakeholders in the standardisation process?

For example, over the last few weeks we’ve heard how a key mechanism to meet the goals of the COP26 Climate Conference will be in the actions of finance and investment – and a raft of new standards, many likely backed by technical data standards, are emerging for corporate Environmental, Social and Governance reporting.

There are relatively few groups out there like ECOS, the NGO network engaging in technical standards committees to champion sustainability interests. I’ve struggled to locate any working specifically on data standards. Yet, in the current global system of ‘voluntary standardisation’, standards get defined by those who can afford to follow the meetings and discussions and turn up. Too often, that restricts those shaping standards to corporate and developed-country government interests alone.

If we are to have a world of technical and data standards that supports social change, we need more support for the social change voices in the room.

Closing reflections & challenges

As I was preparing this talk, I looked back at The State of Open Data – the book I worked on with IDRC a few years ago to review how open data ecosystems had developed across a wide range of sectors. One of the things that struck me when editing the collection was the significant difference in how far different sectors have developed effective ecosystems for standardised, generative, and solution-focussed data sharing.

Whilst there are some great examples out there of data standards impacting policy, supporting smart data sharing and analysis, and fostering innovation – there are many places where we see dominant private platforms and data ecosystems developing that do not seem to serve the public interest – including, I’d suggest (although this is not my field of expertise) in a number of areas of agriculture.

So I asked myself: why? What stops us building effective open standards? I have six closing observations – and, within them, challenges – to share.

(1) Underinvestment

Data standards are not, in the scheme of things, expensive to develop: but neither is the cost zero – particularly if we want to take an inclusive path to standard development.

We consistently underinvest in public infrastructure, digital public infrastructure even more so. Supporting standards doesn’t deliver shiny apps straight away – but it can prepare the ground for them to flourish.

(2) Stopping short of version 2

How many of you are using Version 1.0 of a software package? Chances are most of the software you are using today has been almost entirely rewritten many times: each time building on the learning from the last version, and introducing new ideas to make it fit better with a changing digital world. But, many standards get stuck at 1.0. Funders and policy champions are reluctant to invest in iterations beyond a standard’s launch.

Managing versioning of data schema can involve some added complexity over versioning software – but so many opportunities are lost by a funding tendency to see standards work as done when the first version is released, rather than to plan for the ‘second curve’ of development.
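One of the things that added complexity buys you is the ability to keep old data usable as the schema evolves. A sketch of one common pattern, assuming invented version numbers and field names: records declare which schema version they follow, and small migration functions upgrade older records step by step to the current shape.

```python
# Sketch of handling versioned schema data. The versions ("1.0", "1.1")
# and the field change (splitting "name" into two parts) are invented
# purely to illustrate the pattern.

CURRENT_VERSION = "1.1"

def upgrade_1_0_to_1_1(record):
    # Suppose version 1.1 split a single "name" field into two parts.
    record = dict(record)
    first, _, last = record.pop("name", "").partition(" ")
    record.update({"given_name": first, "family_name": last})
    record["schema_version"] = "1.1"
    return record

MIGRATIONS = {"1.0": upgrade_1_0_to_1_1}

def normalise(record):
    """Upgrade a record, one version at a time, to the current schema."""
    while record.get("schema_version") != CURRENT_VERSION:
        record = MIGRATIONS[record["schema_version"]](record)
    return record

print(normalise({"schema_version": "1.0", "name": "Ada Lovelace"}))
```

Planning for this ‘second curve’ from the start – even just by requiring records to carry a version field – is far cheaper than retrofitting it after version 1.0 has spread.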

(3) Tailoring to the dominant use case and (4) Trying to meet all use cases

Standards are often developed because their author or sponsor has a particular problem or use case in mind. For GTFS, the General Transit Feed Specification that drives many of the public transport directions you find in apps like Google Maps, that problem was ‘finding the next bus’ in Portland, Oregon. That might be the question that 90% of data users come to the data with; but there are also people asking: “Is the bus stop and bus accessible to me as a wheelchair user?” or “Can we use this standard to describe the informal Matatu routes in Nairobi where we don’t have fixed bus stop locations?”

A brittle standard that only focusses on the dominant use case will likely crowd out space for these other questions to be answered. But a standard that was designed to try and cater for every single user need would likely collapse under its own weight. In the GTFS case, this has been handled by an open governance process that has allowed disability information to become part of the standard over time, and an openness to local extensions.
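The way GTFS accommodated the accessibility question shows what this looks like in practice: stops.txt gained an optional `wheelchair_boarding` column (0 = unknown, 1 = accessible, 2 = not accessible), so older feeds that never adopted the field still parse cleanly. A small sketch, using illustrative stop data:

```python
# Illustrative reading of a GTFS-style stops.txt, where the optional
# wheelchair_boarding column (0 = unknown, 1 = accessible,
# 2 = not accessible) may be absent or empty in older feeds.
import csv
import io

STOPS_TXT = """stop_id,stop_name,wheelchair_boarding
S1,Main St,1
S2,Oak Ave,2
S3,Hill Rd,
"""

def accessible_stops(feed_text):
    reader = csv.DictReader(io.StringIO(feed_text))
    for row in reader:
        # Treat a missing or empty value as "unknown" (0), so feeds
        # that never adopted the field still work with this consumer.
        flag = row.get("wheelchair_boarding") or "0"
        if flag == "1":
            yield row["stop_id"]

print(list(accessible_stops(STOPS_TXT)))  # ['S1']
```

The design choice – optional columns with well-defined ‘unknown’ defaults – is one of the mechanisms that lets a standard grow beyond its dominant use case without breaking existing adopters.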

There is an art and an ethics of standardisation here – and it needs interdisciplinary teams. Which brings me to recap my final learning on where things can go wrong, and what we need to do to design standards well.

(5) Treating standards as a technical problem and (6) Neglecting the technical details

I suspect many here would not self-identify as ‘data experts’: yet everyone here could have a role to play in data standard development and adoption. Data standards are, at their core, a solution to coordination and collaboration problems, and making sure they work as such requires all sorts of skills, from policy, programme and institution design, to stakeholder engagement and communications. 

But – at the same time – data standards face real technical constraints, and require creative technical problem solving.

Showing that 12/10/2021 can be read in the US as 10th December, or in Europe as 12th October.

Without technical expertise in the room, after all, we may well end up turning up a month late.

Coda

I called this talk ‘Fostering open ecosystems’ because we face some very real risks that our digital futures will not be open and generative without work to create open standards. As the perceived value of data becomes ever higher, Silicon Valley capital may seek to capture the data spaces essential to solving development challenges. Or we may simply end up with development data locked in silos created by our failures to coordinate and collaborate. A focus on data standards is no magic bullet, but it is part of the pathway to create the future we need.