Join the Open Data Services Co-operative team…


[Summary: Developer and analyst jobs with a workers co-operative committed to using open data for social change]

Over the last year I’ve had the immense pleasure of working with a fantastic group of colleagues to create ‘Open Data Services Co-operative‘. It was founded in recognition of the fact that creating and using distributed open data requires ongoing labour: developing robust platforms, supporting publishers to create quality data, and helping users access data in the forms they need.

Over the last year we’ve set up ongoing support systems for the Open Contracting Data Standard and 360Giving, and have worked on projects with NCVO, NRGI and the Financial Transparency Coalition, amongst others – focussing on places where open data can make a real difference to governance, accountability and participation. We’ve been doing that with a multidisciplinary team, combining the capacity to build and maintain technical tools, such as CoVE, which drives an accessible validation and data conversion tool, with a responsive analysis team – able to give bespoke support to data publishers and users.

And we’ve done this as a workers co-operative, meaning that staff who joined the team back in October last year are now co-owners of the company, sharing in setting its direction and in decision making over how we use co-op resources to provide a good working environment, and further our social goals. A few weeks back we were able to vote on our first profit distributions, committing to become corporate sponsors of a number of software projects and social causes we support.

The difference that being organised as a co-op makes was particularly brought home to me at a recent reunion of my MSc course, where it seemed many others had graduated into a start-up economy which is all about burning through staff, with people spending months rather than years in jobs, and constantly dealing with stressful workloads. Operating as a workers co-op challenges us to create good and sustainable jobs.

And that’s what we’re trying to do again now: recruiting two new people to join the team.

We’re looking for a developer to join us, particularly someone with experience of managing technical roadmaps for projects; and we’re looking for someone to work with us as an analyst – combining a focus on policy and technology, and ready to work on outreach and engagement with potential users of the open data standards we support.

You can find more details of both roles over on the Open Data Services website, and applications are open until 14th March.

Practical Participation – 2016 update

Although this year my primary focus is on PhD write-up, I’m still keeping active with the two companies I’ve co-founded. So, a couple of updates – firstly, the annual Practical Participation newsletter, compiled by Jennie Fleming.

Practical Participation 2016 – looking back and looking ahead

We wanted to get in contact with you with our annual update of what we are doing at Practical Participation. Tim, Bill and Jennie are a team with complementary skills, backgrounds and interests, and have extensive experience in a range of areas. If you are interested in working with any of the team, do please contact them personally to discuss how we can work together.

Over the last year, Tim has been working on incubating and spinning out a couple of open data and engagement projects. 2015 started with Practical Participation acting as host to newly formed technical help-desk services for the Open Contracting Data Standard, and 360 Giving standard for philanthropy data. Those are now transferred over to a new workers co-operative, Open Data Services, where a growing team is supporting work to open up data for public good across the world. Tim also spent a large part of 2015 working on the International Open Data Conference (IODC) in Ottawa. His main role was facilitating an ‘action track’ – running a participatory process to bring together threads of discussion at the conference into a global roadmap for open data collaboration. The result is available online here. He’s continued to support the Global Open Data for Agriculture and Nutrition  (GODAN) network, working with the team on inputs to the Open Government Partnership (OGP) Summit in Mexico last October, and on a range of other research projects. 

Also at the OGP Summit, Tim co-hosted a workshop on the development resources to support the implementation of the recently launched International Open Data Charter. Over 2016 he’ll be working with the Open Data Charter network to support the creation of ‘Sector Packages’, showing key ways open data can make a difference in anti-corruption, amongst other places. You can contact Tim at tim@practicalparticipation.co.uk.

Jennie’s been continuing her evaluation work with the Children’s Society Young Carers in Focus project and Enthusiasm youth projects. She also undertook a review of the work of the Youth Team at Trafford Housing Trust. The review considered the activity and impact of the youth team to learn from the previous years’ work and to inform proposals for the future. With Practical Participation associate Sarah Hargreaves and young advisor Ruth Taylor she undertook research for Heritage Lottery Fund about youth involvement in decision making about a new grant programme they are establishing. The report reviewed current good practice in the area and set out models for how young people could be meaningfully involved in the decision making processes for the grants.

Jennie is also providing non-line management support to the Youth and Community team at Valley House and the youth worker at The Nottingham Refugee Forum. Following CRAE’s merger with Just for Kids Law she is now a Trustee of Just for Kids Law and the Chair of the Policy and Strategic Litigation sub-committee. If you think Jennie’s skills and expertise could be useful to you – do get in contact with her at jennie@practicalparticipation.co.uk.

Bill’s main focus is supporting four local communities as part of Big Local, the resident-led Lottery-funded programme providing £1m over ten years to each of 150 neighbourhoods in England. Each area has built a dynamic community conversation as the foundation for their plans. Each is seeing great outcomes for residents across a range of priorities they themselves have set. It’s an exciting and replicable model of community empowerment and control.

Work relating to Children and Young People’s Improving Access to Psychological Therapies has taken Bill back to Rotherham, with a focused piece of work scoping children and young people’s voice and influence in mental health services and offering a practical model to help map and plan improvement.

Bill remains involved with a number of youth services, and especially with youth work within the housing sector, facilitated by Joe Rich of Affinity Sutton. Youth services continue their freefall, with occasional glimmers of hope – as in Brighton, where the worst of the cuts were averted, in part, we hope, through our support in getting young people’s voices heard.

Work with young carers has continued through partnership with The Children’s Society and Carers Trust in the Making a Step Change programme. Working across a number of local authorities is a reminder of the power of the voice of experience, coupled with vital leadership and management.

And finally, Bill has continued as a practice educator, working with three social work students this last year, helping retain the vital focus on the quality of direct inter-personal practice.

A workshop on open data for anti-corruption

Last autumn the International Open Data Charter was launched, putting forward six key principles for governments to adopt to pursue an ‘open by default’ approach to key data.

However, for the Charter to have the greatest impact requires more than just high-level principles. As the International Open Data Conference explored last year, we need to focus on the application of open data to particular sectors. That’s why a stream of work has been emerging to develop ‘Sector Packages’ as companion resources to the International Open Data Charter.

The first of these is focussing on anti-corruption. I’ve been supporting the Technical Working Group of the Charter to sketch a possible outline for this in this consultation document, which was shared at the G20 meeting last year. 

To build on that we’ve just launched a call for a consultant to act as co-ordinating author for the package (closing date 28th Jan – please do share!), and a few weeks back I had the chance to drop into a mini-workshop at DFID to share an update on the Charter, and talk with staff from across the organisation about potential areas that the anti-corruption package should focus on. 

Slides from the talk are below, and I’ve jotted down some brief notes from the discussions as well. 

Datasets of interest

In the session we posed the question: “What one dataset would you like to see countries publish as open data to address corruption?”

The answers highlight a range of key areas for exploration as the anti-corruption sector package is developed further. 

1) Repository of registered NGOs and their downstream partners – including details of their bank accounts, board, constitution and rules etc.

This kind of data is clearly useful to a donor wanting to understand who they are working with, or considering whether to work with potential partners. But it is also a very challenging dataset to collate and open. Firstly, many countries either lack comprehensive systems of NGO registration, or have thresholds that mean many community-level groups will be non-constituted community associations rather than formally registered organisations. Secondly, there can be risks associated with NGO registration, particularly in countries with shrinking civil society space, and where lists of organisations could be used to increase political control or restrictions on NGO activity. 

Working these issues through will require thought about where to draw the lines between open and shared data, and how organisations can pool their self-collected intelligence about partner organisations, whilst avoiding harms, and avoiding the creation of error-prone datasets where funding isn’t approved because ‘computer says no’.

2) Data on the whole contracting chain – particularly for large infrastructure projects.

Whilst isolated pockets of data on public contracts often exist, effort is needed to join these up, giving a view of the whole contracting chain. The Open Contracting Data Standard has been developing the technical foundations for this to happen, and work is now beginning to explore how it might be used to track the implementation of infrastructure projects. In the UK, civil society are calling for the next Open Government National Action Plan to include a commitment to model contract clauses that encourage contractors to disclose key information on subcontracting arrangements, implementation milestones and the company’s beneficial owners.

3) Identifying organisations and the people involved

The challenge of identifying the organisations who are counterparty to a funding transaction or a contract is not limited to NGOs. Identifying government agencies, departments, and the key actors within them, is also important. 

Government entity identifiers are a challenge the International Aid Transparency Initiative has been grappling with for a few years now. Could the Open Data Charter process finally move forward some agreement on the core data infrastructure describing the state that is needed as a foundation for accountability and anti-corruption open data action?

4) Beneficial ownership

Beneficial ownership data reveals who is ultimately in control of, and reaping the profits from, a company. The UK is due to publish an open beneficial ownership register for the first time later this year – but there is still much to do to develop common standards for joined-up data on beneficial ownership. For example, the UK register will capture ownership information in bands at 25%, 50% and 75%, where other countries are exploring either detailed ownership percentage publication, or publication using other, non-overlapping bands. Without co-ordination on interoperability, potential impacts of beneficial ownership open data may be much harder to secure.
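To make the interoperability problem concrete, here’s a minimal sketch in Python, with illustrative band thresholds rather than the exact rules of any real register, showing how the same ownership stake becomes two non-comparable statements once each country applies its own bands:

```python
# Illustrative only: band thresholds are simplified, not any register's actual rules.
UK_STYLE_BANDS = [(25, 50, "over 25% up to 50%"),
                  (50, 75, "over 50% up to 75%"),
                  (75, 100, "75% and over")]
OTHER_BANDS = [(10, 33, "10-33%"), (33, 66, "33-66%"), (66, 100, "66-100%")]

def to_band(percentage, bands):
    """Map an exact ownership percentage to the label of the band containing it."""
    for low, high, label in bands:
        if low < percentage <= high:
            return label
    return None  # below the disclosure threshold

# The same 40% stake produces statements that cannot be reconciled automatically:
print(to_band(40, UK_STYLE_BANDS))  # 'over 25% up to 50%'
print(to_band(40, OTHER_BANDS))     # '33-66%' - straddles two of the UK-style bands
```

Joining up registers then requires either lossy mappings between band schemes, or agreement on publishing exact percentages.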

5) Localised datasets and public expenditure tracking data

In thinking about the ‘national datasets’ that governments could publish as part of a sector package for anti-corruption, it is also important to not lose sight of data being generated and shared at the local level. There are lots of lessons to learn from existing work on Public Expenditure Tracking which traces the disbursement of funds from national budgets, through layers of administration, down to local services like schools. With the funding flows posted on posters on the side of school buildings there is a clearer answer to the question: “What does this mean to me?”, and data is more clearly connected with local citizen empowerment. 

Where next

Look out for updates about the anti-corruption sector package on the Open Data Charter website over the first part of 2016.

Following the money: preliminary remarks on IATI Traceability

[Summary: Exploring the social and technical dynamics of aid traceability: let’s learn what we can from distributed ledgers, without thinking that all the solutions are to be found in the blockchain.]

My colleagues at Open Data Services are working at the moment on a project for UN Habitat around traceability of aid flows. With an increasing number of organisations publishing data using the International Aid Transparency Initiative data standard, and increasing amounts of government contracting and spending data available online, the theory is that it should be possible to track funding flows.

In this blog post I’ll try and think aloud about some of the opportunities and challenges for traceability.

Why follow funds?

I can envisage a number of hypothetical use cases for traceability of aid.

Firstly, donors want to be able to understand where their money has gone. This is important for at least three reasons:

  1. Effectiveness & impact: knowing which projects and programmes have been the most effective;
  2. Understanding and communication: being able to see more information about the projects funded, and to present information on projects and their impacts to the public to build support for development;
  3. Addressing fraud and corruption: identifying leakage and mis-use of funds.

Traceability is important because the relationship between donor and delivery is often indirect. A grant may pass through a number of intermediary organisations before it reaches the ultimate beneficiaries. For example, a country donor may fund a multi-lateral fund, which in turn commissions an international organisation to deliver a programme, and they in turn contract with country partners, who in turn buy in provision from local providers.

Secondly, communities where projects are funded, or where funds should have been received, may want to trace funding upwards: understanding the actors and policy agendas affecting their communities, and identifying when funds they are entitled to have not arrived (see the investigative work of Follow The Money Nigeria for a good example of this latter use case).

Short-circuiting social systems

It is important to consider the ways in which work on the traceability of funds potentially bypasses, ‘routes around’ or disrupts* (*choose your own framing) existing funding and reporting relationships – allowing donors or communities to reach beyond intermediaries to exert such authority and power over outcomes as they can exercise.

Take the example given above. We can represent the funding flows in a diagram as below:

[Diagram: funding flowing downwards through the chain]

But there are more than one-way flows going on here. Most of the parties involved will have some sort of reporting responsibility to those giving them funds, and so we also have a chain of reporting flowing back up:

[Diagram: reporting flowing upwards through the chain]

By the time reporting gets to the donor, it is unlikely to include much detail on the work of the local partners or providers (indeed, the multilateral, for example, may not report specifically on this project, just on the development co-operation in general). The INGO may even have very limited information about what happens just a few steps down the chain on the ground, having to trust intermediary reports.

In cases where there isn’t complete trust in this network of reporting, and clear mechanisms to ensure each party is exercising its responsibility to ensure the most effective, and corruption-free, use of resources by the next party down, the ability to see through this chain – tracing funds and having the direct ability to assess impacts and risks – is clearly desirable.

Yet it also needs to be approached carefully. Each of the relationships in this funding chain is about more than just passing on some clearly defined packet of money. Each party may bring specific contextual knowledge, skills and experience. Enabling those at the top of a funding chain to leap over intermediaries doesn’t inevitably have a positive impact: particularly given what the history of development co-operation has to teach about how power dynamics and the imposition of top-down solutions can lead to substantial harms.

None of this is a case against traceability – but it is a call for consideration of the social dynamics of traceability infrastructures, and of how to ensure contextual knowledge is kept accessible when it becomes possible to traverse the links of a funding chain.

The co-ordination challenge of traceability

Right now, the IATI data standard has support for traceability at the project and transaction level.

  • At the project level the related-activity field can be used to indicate parent, child and co-funded activities.
  • At the transaction level, data on incoming funds can specify the activity-id used by the upstream organisation to identify the project the funds come from, and data on outgoing funds can specify the activity-id used by the downstream organisation.

This supports both upwards and downwards linking (e.g. a funder can publish the identifier of the funded project, or a recipient can publish the identifier of the donor project that is providing funds), but is based on explicit co-ordination and the capture of additional data.
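As a rough sketch of what this looks like in practice, the snippet below walks an IATI 2.x activity file and prints the project-level and transaction-level links described above. The element and attribute names follow the IATI standard; the file name is hypothetical:

```python
import xml.etree.ElementTree as ET

tree = ET.parse("iati-activities.xml")  # hypothetical local copy of a publisher's file

for activity in tree.findall("iati-activity"):
    activity_id = activity.findtext("iati-identifier")
    # Project-level links: related-activity (@type: 1=parent, 2=child, 4=co-funded)
    for related in activity.findall("related-activity"):
        print(activity_id, "related to", related.get("ref"), "type", related.get("type"))
    # Transaction-level links: upstream and downstream activity identifiers
    for transaction in activity.findall("transaction"):
        provider = transaction.find("provider-org")
        receiver = transaction.find("receiver-org")
        if provider is not None and provider.get("provider-activity-id"):
            print(activity_id, "received funds from", provider.get("provider-activity-id"))
        if receiver is not None and receiver.get("receiver-activity-id"):
            print(activity_id, "disbursed funds to", receiver.get("receiver-activity-id"))
```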

As a distributed approach to the publication of open data, there are no consistency checks in IATI to ensure that providers and recipients agree on identifiers, and often there can be practical challenges to capture this data, not least that:

  • A) Many of the accounting systems in which transaction data is captured have no fields for upstream or downstream project identifier, nor any way of conceptually linking transactions to these externally defined projects;
  • B) Some parties in the funding chain may not publish IATI data, or may do so in forms that do not support traceability, breaking the chain;
  • C) The identifier of a downstream project may not be created at the time an upstream project assigns funds – exchanging identifiers can create a substantial administrative burden;

At the last IATI TAG meeting in Ottawa, this led to some discussion of other technologies that might be explored to address issues of traceability.

Technical utopias and practical traceability

Let’s start with a number of assorted observations:

  • UPS can track a package right around the world, giving me regular updates on where it is. The package has a barcode on, and is being transferred by a single company.
  • I can make a faster-payments bank transfer in the UK with a reference number that appears in both my bank statement and the recipient’s statement, travelling between banks in seconds. Banks leverage their trust, and use centralised third-party providers as part of data exchange and reconciling funding transfers.
  • When making some international transfers, the money has effectively disappeared from view for quite a while, with lots of time spent on the phone to sender, recipient and intermediary banks to track down the funds. Trust, digital systems and reconciliation services function less well across international borders.
  • Transactions on the BitCoin Blockchain are, to some extent, traceable. BitCoin is a distributed system. (Given any BitCoin ‘address’ it’s possible to go back into the public ledger and see which addresses have transferred an amount of bitcoins there, and to follow the chain onwards. If you can match an address to an identity, the currency, far from being anonymous, is fairly transparent. This is the reason for BitCoin mixer services, designed to remove the trackability of coins.)
  • There are reported experiments with using BlockChain technologies in a range of different settings, including for land registries.
  • There’s a lot of investment going into FinTech right now – exploring ways to update financial services.

All of this can lead to some excitement about the potential of new technologies to render funding flows traceable. If we can trace parcels and BitCoins, the argument goes, why can’t we have traceability of public funds and development assistance?

Although I think such an argument falls down in a number of key areas (which I’ll get to in a moment), it does point towards a key component missing from the current aid transparency landscape – in the form of a shared ledger.

One of the reasons IATI is based on a distributed data publishing model, without any internal consistency checks between publishers, is prior experience in the sector of submitting data to centralised aid databases. However, peer-to-peer and blockchain-like technologies now offer a way to separate out co-ordination, and the creation of consensus on the state of the world, from the centralisation of data in a single database.

It is at least theoretically possible to imagine a world in which the data a government publishes about its transactions is only considered part of the story, and in which the recipient needs to confirm receipt in a public ledger to complete the transactional record. Transactions ultimately have two parts (sending and receipt), and open (distributed) ledger systems could offer the ability to layer an auditable record on top of the actual transfer of funds.
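A toy sketch of that idea – in no way a real distributed ledger, just the data structure it implies – might look like the following, with a transfer only counted as complete once both parties have attested to it (the organisation identifiers are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class TransferRecord:
    sender: str
    recipient: str
    amount: float
    sender_confirmed: bool = False
    recipient_confirmed: bool = False

    def confirm(self, party: str) -> None:
        """Record an attestation from one side of the transaction."""
        if party == self.sender:
            self.sender_confirmed = True
        elif party == self.recipient:
            self.recipient_confirmed = True

    @property
    def complete(self) -> bool:
        # The record is auditable only once both sending and receipt are confirmed
        return self.sender_confirmed and self.recipient_confirmed

record = TransferRecord("donor-gov", "recipient-ngo", 500000.0)
record.confirm("donor-gov")      # the government publishes its outgoing transaction
record.confirm("recipient-ngo")  # the recipient confirms receipt in the public ledger
assert record.complete
```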

However (as I said, there are some serious limitations here), such a system is only an account of the funding flows, not the flows themselves (unlike BitCoin), which still leaves space for corruption through maintaining false information in the ledger. Although trusted financial intermediaries (banks and others) could be brought into the picture as parties responsible for confirming transactions, it’s hard to envisage how adoption of such a system could be brought about over the short and medium term (particularly globally). Secondly, although transactions between organisations might be made more visible and traceable in this way, the transactions inside an organisation remain opaque. Working out which funds relate to which internal and external projects is still a matter of the internal business processes of the organisations involved in the aid delivery chain.

There may be other traceability systems we should be exploring as inspirations for making aid and public money traceable. What my brief look at BitCoin leads me to reflect on is the potential role, over the short term, of reconciliation services that can, at the very least, report on the extent to which different IATI publishers are mutually confirming each other’s information. Over the long term, a move towards more real-time transparency infrastructures, rather than periodic data publication, might open up new opportunities – although with all sorts of associated challenges.
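As a hedged illustration of what such a reconciliation service might report – with invented data structures, not any real service’s API – the sketch below scores how far one publisher’s outgoing transactions are confirmed by matching incoming transactions at the other end:

```python
def confirmation_rate(outgoing, incoming):
    """Share of outgoing transactions confirmed by a matching incoming record.

    Both arguments are lists of (from_activity, to_activity, amount) tuples.
    """
    incoming_set = set(incoming)
    confirmed = sum(1 for tx in outgoing if tx in incoming_set)
    return confirmed / len(outgoing) if outgoing else 0.0

# Invented example data: the donor reports two disbursements, but the recipient
# publisher only confirms one of them as incoming funds.
donor_reports = [("GB-1-101", "NL-KVK-202", 1000), ("GB-1-101", "NL-KVK-203", 500)]
recipient_reports = [("GB-1-101", "NL-KVK-202", 1000)]

print(f"{confirmation_rate(donor_reports, recipient_reports):.0%} mutually confirmed")
```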

Ultimately – creating traceable aid still requires labour to generate shared conceptual understandings of how particular transactions and projects relate.

How much is enough?

Let’s loop back round. In this post (as in many of the conversations I’ve had about traceability), we started with some use cases for traceability; we saw some of the challenges; we got briefly excited about what new technologies could do to provide traceability; we saw the opportunities, but also the many limitations. Where do we end up then?

I think the important thing is to loop back to our use cases, and to consider how technology can help with, but not completely solve, the problems set out. Knowing which provider organisations might have been funded through a particular donor’s money could be enough to help target investigations in cases of fraud. Or knowing all the funders who have a stake in projects in a particular country, sector and locality can be enough for communities on the ground to do further research to identify the funders they need to talk to.

Rather than searching after a traceability data panopticon, can we focus traceability-enabling practices on breaking down the barriers to specific investigatory processes?

Ultimately, in the IATI case, getting traceability to work at the project level alone could be a big boost. But doing this will require a lot of social coordination, as much as technical innovation. As we think about tools for traceability, thinking about tools that support this social process may be an important area to focus on.

Where next

Steven Flower and the rest of the Open Data Services team will be working in the coming weeks on a deeper investigation of traceability issues – with the goal of producing a report and toolkit later this year. They’ve already been digging into IATI data to look for the links that exist so far, and building on past work testing the concept of traceability against real data.

Drop in comments below, or drop Steven a line, if you have ideas to share.

Three cross-cutting issues that UK data sharing proposals should address

[Summary: an extended discussion of issues arising from today’s UK data sharing open policymaking discussions]

I spend a lot of time thinking and writing about open data. But, as has often been said, not all of the data that government holds should be published as open data.

Certain registers and datasets managed by the state may contain, or be used to reveal, personally identifying and private information – justifying strong restrictions on how they are accessed and used. Many of the datasets governments collect, from tax records to detailed survey data collected for policy making and monitoring, fall into this category. However, the principle that data collected for one purpose might have a legitimate use in another context still applies to this data: one government department may be able to pursue its public task with data from another, and there are cases where public benefit is to be found from sharing data with academic and private sector researchers and innovators.

However, in the UK, the picture of which departments, agencies and levels of government can share which data with others (or outside of the state) is complex to say the least. When it comes to sharing personally identifying datasets, agencies need to rely on specific ‘legal gateways’, with certain major data holders such as HM Revenue and Customs bound by restrictive rules that may require explicit legislation to pass through parliament before specific data shares are permitted.

That’s ostensibly why the UK Government has been working for a number of years now on bringing forward new data sharing proposals – creating ‘permissive powers’ for cross-departmental and cross-agency data sharing, increasing the ease of data flows between national and local government, whilst increasing the clarity of safeguards against data mis-use. Up until just before the last election, an Open Policy Making process, modelled broadly on the UK Open Government Partnership process was taking place – resulting in a refined set of potential proposals relating to identifiable data sharing, data sharing for fraud reduction, and use of data for targeted public services. Today that process was re-started, with a view to a public consultation on updated proposals in the coming months.

However, although much progress has been made in refining proposals based on private sector and civil society feedback, from the range of specific and somewhat disjointed proposals presented for new arrangements in today’s workshop, it appears the process is a way off from providing the kinds of clarification of the current regime that might be desirable. Missing from today’s discussions were clear cross-cutting mechanisms to build trust in government data sharing, and establish the kind of secure data infrastructures that are needed for handling personal data sharing.

I want to suggest three areas that need to be more clearly addressed – all of which were raised in the 2014/15 Open Policymaking process, but which have been somewhat lost in the latest iterations of discussion.

1. Maximising impact, minimising the data shared

One of the most compelling cases for data sharing presented in today’s workshop was work to address fuel poverty by automatically giving low-income pensioners rebates on their fuel bills. Discussions suggested that since the automatic rebate was introduced, 50% more eligible recipients are getting the rebates – with the biggest beneficiaries being the most vulnerable, who were far less likely to apply for the rebates they were entitled to. With every degree drop in the temperature of a pensioner’s home correlating to increased hospital admissions, the argument for allowing the data share – and indeed establishing the framework for current arrangements to be extended to others in fuel poverty (the current powers are specific to pensioners’ data in some way) – is clear.

However, this case is also one where the impact is accompanied by a process that results in minimal data actually being shared from government to the private companies who apply the rebates to individuals’ energy bills. All that is shared, in response to energy companies’ queries for each candidate on their customer list, is a flag for whether the individual is eligible for the rebate or not.

This kind of approach does not require the sharing of a bulk dataset of personally identifying information – it requires a transactional service that can provide the minimum certification required to indicate, with some reasonable level of confidence, that an individual has some relevant credentials. The idea of privacy protecting identity services which operate in this way is not new – yet the framing of the current data sharing discussion has tended to focus on ‘sharing datasets’ instead of constructing processes and technical systems which can be well governed, and still meet the vast majority of use-cases where data shares may be required.
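A minimal sketch of such a transactional service, assuming a hypothetical government-held eligibility register, might be no more than the following – the caller learns a yes/no flag, and the underlying dataset never leaves the service:

```python
# Hypothetical internal register - held by government, never shared in bulk.
ELIGIBILITY_REGISTER = {
    ("1943-05-02", "SW1A 1AA", "J Smith"): True,
}

def is_eligible_for_rebate(date_of_birth: str, postcode: str, name: str) -> bool:
    """Return only an eligibility flag for one queried individual."""
    return ELIGIBILITY_REGISTER.get((date_of_birth, postcode, name), False)

# An energy company queries each candidate on its own customer list:
print(is_eligible_for_rebate("1943-05-02", "SW1A 1AA", "J Smith"))  # True
print(is_eligible_for_rebate("1950-01-01", "OX1 1AA", "A Jones"))   # False
```

Governing access then becomes a matter of authenticating and logging queries to the service, rather than tracking what happens to a dataset once it has left the building.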

For example, when the General Records Office representative today posed the question of “In what circumstances would it be appropriate to share civil registration (e.g. Birth, Adoption, Marriage and Death) information?”, the use-cases that surfaced were all to do with verification of identity: something that could be achieved much more safely by providing a digital service than by handing over datasets in bulk.

Indeed, approached as a question of systems design, rather than data sharing, the fight against fraud may in practice be better served by allowing citizens to digitally access their own civil registration information and to submit that as evidence in their transactions with government, helping narrow the number of cases where fraud may be occurring – and focussing investigative efforts more tightly, instead of chasing after problematic big data analysis approaches.

(Aside #1: As one participant in today’s workshop insightfully noted, there are thousands of valid marriages in the UK which are not civil marriages and so may not be present in Civil Registers. A big data approach that seeks to match records of who is married to records of households who have declared they are married, to identify fraudulent claims, is likely to flag these households wrongly, creating new forms of discrimination. By contrast, an approach that helps individuals submit their evidence to government allows such ‘edge cases’ to be factored in – recognising that many ‘facts’ about citizens are not easily reduced to simple database fields, and that giving account of ones self to the state is a performative act which should not be too readily sidelined.)

(Aside #2: The case of civil registers also illustrates an interesting and significant qualitative difference between public records, and a bulk public dataset. Births, marriages and deaths are all ‘public events’: there is no right to keep them private, and they have long been recorded in registers which are open to inspection. However, when the model of access to these registers switches from focussed inspection, looking for a particular individual, to bulk access, they become possible to use in new ways – for example, creating a ‘primary key’ of individuals to which other data can be attached, eroding privacy in ways which was not possible when each record needed to be explored individually. The balance of benefits and harms from this qualitative change will vary from dataset to dataset. For example, I would strongly advocate the open sharing of company registers, including details of beneficial owners, both because of the public benefit of this data, and because registering a company is a public act involving a certain social contract. By contrast, I would be more cautious about the full disclosure of all civil registers, due to the different nature of the social contract involved, and the greater risk of vulnerable individuals being targeted through intentional or unintentional misuse of the data.)

All of which is a long way to say:

  • Where the cross-agency or cross-departmental use-cases for access to a particular dataset can be reduced to sharing assertions about individuals, rather than bulk datasets, this route should be explored first.

This does not remove the need for governance of both access and data use. However, it does ease the governance of access, and audit logs of access to a service are easier to manage than audit logs of what users in possession of a dataset have done.

Even the sharing of a ‘flag’ that can be applied to an individual’s data record needs careful thought: and those in receipt of such flags need to ensure they govern the use of that data carefully. For example, as one participant today noted, pensioners have raised fears that energy companies may use a ‘fuel poverty’ flag in their records to target them with advertising. Ensuring that later analysts in the company do not stumble upon the rebate figures in invoices, and feed this into profiling of customers, will require very careful data governance – and it is not clear that companies’ practices are robust enough to protect against this right now.

2. Algorithmic transparency

Last year the Detroit Digital Justice Coalition produced a great little zine called ‘Opening Data’ which takes a practical look at some of the opportunities and challenges of open data use. They look at how data is used to profile communities, and how the classifications and clustering approaches applied to data can create categories that may be skewed and biased against particular groups, or that reinforce rather than challenge social divides (see pg 30 onwards). The same issues apply to data sharing.

Whilst current data protection legislation gives citizens a right to access and correct information about themselves, the algorithms used to process that data, and derive analysis from it are rarely shared or open to adequate scrutiny.

In the process of establishing new frameworks for data sharing, the algorithms used to process that data should be brought into view as much as the datasets themselves.

If, for example, someone is offered a targeted public service, or targeted in a fraud investigation, there is a question to be explored of whether they should be told which datasets, and which algorithms, led to them being selected. This, and associated transparency, could help to surface otherwise unseen biases which might lead to particular groups being unfairly targeted (or missed) by analysis. Transparency is no panacea, but it plays an important role as a safeguard.

3. Systematic transparency of sharing arrangements

On the theme of transparency, many of the proposals discussed today mentioned oversight groups, Privacy Impact Assessments, and publication of information on either those in receipt of shared data, or those refused access to datasets – yet across the piece no systematic framework for this was put forward.

This is an issue Reuben Binns and I wrote about in 2014, putting forward a proposal for a common standard for disclosure of data sharing arrangements that, in its strongest form (sketched after the list below), would require:

  • Structured data on origin, destination, purpose, legal framework and timescales for sharing;
  • Publication of Privacy Impact Assessments and other associated documents;
  • Notices published through a common venue (such as the Gazette) in a timely fashion;
  • Consultation windows where relevant before a power comes into force;
  • Sharing to only be legally valid when the notice has been published.
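As a sketch of what a structured disclosure notice meeting those requirements might look like – with field names invented for illustration, since no such schema yet exists:

```python
import json

disclosure_notice = {
    "origin": "HM Revenue and Customs",
    "destination": "Department for Work and Pensions",
    "purpose": "Identifying households eligible for fuel poverty rebates",
    "legal_framework": "s.XX of a (hypothetical) enabling Act",
    "sharing_starts": "2016-04-01",
    "sharing_ends": "2017-03-31",
    "privacy_impact_assessment": "https://example.gov.uk/pia/0001",  # placeholder URL
    "consultation_closes": "2016-03-01",  # window before the power comes into force
    "published_in": "The Gazette",
}

print(json.dumps(disclosure_notice, indent=2))
```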

Without such a framework, we are likely to remain stuck with the current confused system, in which no-one knows which shares are in place, how they are being used, and which legal gateways are functioning well or not. With a scattered set of spreadsheets and web pages listing approved sharing, citizens have no hope of understanding how their data is being used.

If only one of the above issues can be addressed in the upcoming consultation on data sharing, then I hope it is this one: putting in place the missing piece of a robust common framework through which the transparency principles of data sharing can be put into practice.

Towards a well governed infrastructure?

Ultimately, the discussion of data sharing is a discussion about one aspect of our national data infrastructure. There has been a lot of smart work going on, both inside and outside government, on issues such as identity assurance, differential privacy, and identifying core derived datasets which should be available as open data to bypass need for sharing gateways. A truly effective data sharing agenda needs to link with these to ensure it is neither creating over-broad powers which are open to abuse, nor establishing a new web of complex and hard to operate gateways.

Further reading

My thinking on these issues has been shaped in part by inputs from the following:

Data & Discrimination – Collected Essays

White House Report on Big Data, and associated papers/notes from The Social, Cultural & Ethical Dimensions of “Big Data” conference

A distributed approach to co-operative data

[Summary: rough notes from a workshop on cooperative sector data.]

Principle 6 of the International Co-operative Alliance calls for ‘co-operation amongst co-operatives’. Yet for many co-ops, finding other worker-owned businesses to work with can be challenging. Although there are over 7,000 co-operatives in the UK, and many more worldwide, it can be hard to find out much about them.

This was one of the key drivers behind a convening at the Old Music Hall in Oxford just before Christmas where cooperators from the Institute for Solidarity Economics, Open Data Services Co-operative, Coops UK and Transformap gathered to explore opportunities for ‘Principle 6 Data’: open data to build up a clearer picture of the co-operative economy.

We started out articulating different challenges to be explored through the day, including:

  • Helping researchers better understand the co-operative sector. With co-ops employing thousands of people, and co-operatives adding £37bn to the UK economy last year, having a clearer picture of where they operate, what they do and how they work is vital. Yet information is scarce. For researchers at the Institute for Solidarity Economics, there is a need to dig beyond headline organisation types to understand how the activities of organisations contribute to worker owned, social impact enterprise.

  • Support trade between co-operatives. For example, earlier this year when we were planning a face-to-face gathering of Open Data Services Co-op we tried to find co-operatively run venues to use, and we’ve been trying to understand where else we could support co-ops in our supply chain. Whilst Coops UK provide a directory of co-operatives, it is focussed on business-to-consumer, not business-to-business information.

  • Enabling distributed information management on co-ops. Right now, the best dataset we have for the UK comes from Coops UK, the membership body for the UK sector, who hold information on 7000 or so co-operatives, built up over the years from various sources. They have recently released some of this as open data, and are keen to build on the dataset in future. Yet if it can only be updated via Coops UK this creates a bottleneck to the creation of richer data resources.

My Open Data Services colleague, Edafe Onerhime, did some great work looking at the existing Coops UK dataset, which is written up here, and Dan from ISE explored ways of getting microformat markup into the Institute for Solidarity Economics website to expose more structured data about the organisation, including the gender profile of the workforce. We also took a look at whether data from the .coop domain registry might provide insights into the sector, and set about exploring whether microformats were already in use on any of the websites of UK co-operatives.

Building on these experiments, we came to an exploration of potential social, organisational and technical challenges ahead if we want to see a distributed approach to greater data publication on the co-op sector. Ultimately, this boiled down to a couple of key questions:

  • How can co-operatives be encouraged to share more structured data on their activities?

  • How can the different data needs of different users be met?

  • How can that data be fed into different data-driven projects for research, or cooperative collaboration?

There are various elements to addressing these questions.

There is a standards element: identifying the different kinds of things about co-operatives that different users may want to know about, and looking for standards to model these. For example, alongside the basic details of registered organisations and their turnover collected for the co-operative economic report, business-to-business use cases may be interested in branch locations and product/service offerings from co-ops, and solidarity economics research may be interested in the different value commitments a co-operative has, and details of its democratic governance. We looked at how certifications, from formal Fairtrade certifications for products of a co-op, to social certifications where someone a user trusts vouches for an organisation, might be an important part of the picture also.

For many of the features of a cooperative that are of interest, common data standards already exist, such as those provided by schema.org. Although these need to be approached critically, they provide a pragmatic starting point for standardisation; an example using the Coops UK Co-operative Economy dataset can be seen here.
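By way of illustration, here’s a rough sketch of schema.org markup for a fictional co-operative, expressed as JSON-LD ready to embed in the co-op’s own website. schema.org has no dedicated co-operative type, so generic Organization properties are used:

```python
import json

coop = {
    "@context": "http://schema.org",
    "@type": "Organization",
    "name": "Example Workers Co-operative",  # fictional organisation
    "legalName": "Example Workers Co-operative Ltd",
    "url": "http://example.coop",
    "location": {
        "@type": "Place",
        "address": {"@type": "PostalAddress", "addressLocality": "Oxford"},
    },
    "numberOfEmployees": {"@type": "QuantitativeValue", "value": 12},
}

# Serialise for embedding in a <script type="application/ld+json"> tag
print(json.dumps(coop, indent=2))
```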

There is a process element of working out how data that co-operatives might publish using common standards will be validated, aggregated and made into shared datasets. Here, we looked at how an annual process of data collection, such as the UK Co-operative Economy report might bootstrap a process of distributed data publishing.

Imagine a platform where co-operatives are offered three options to provide their data into the annual co-operative economy report:

  1. Fill in a form manually;

  2. Publish a spreadsheet of key data to your own website;

  3. Embed JSON-LD / microformat data in your own website;

Although 2 and 3 are more technically complex, they can provide richer, more open and re-usable data, and a platform can explain the advantages of taking the extra steps. Moving co-operatives from Step 1 to Step 2 can be bootstrapped by the form-driven process generating a spreadsheet for co-ops to publish on their own sites at a set location, and then encouraging them to update those sheets in year 2.

With good interaction design, and a feedback loop that helps validate data quality and show the collective benefits of providing additional data, such a platform could provide the practical component of a campaign for open data publication and use by co-ops.

This points to the crucial **social element**: building community around the open data process, and making sure it is not a technical exercise, but one that meets real needs.

Did we end the day with a clear picture of where next for co-op sector data? Not yet. But it was clear that all the groups participating will continue to explore this space into 2016, and we’ll be looking for more opportunities to collaborate together.

Principles for responding to mass surveillance and the draft Investigatory Powers Bill

[Summary: notes written up on the train back from Paris & London, and following a meeting with Open Rights Group focussing on the draft Investigatory Powers Bill]

It can be hard to navigate the surveillance debate. On the one hand, whistleblower revelations, notably those from Edward Snowden, have revealed the way in which states are accumulating mass communications data, creating new regimes of deeply intrusive algorithmic surveillance, and unsettling the balance of power between citizens, officials and overseers in politics and the media. On the other, as recent events in Paris, London, the US and right across the world have brought into sharp focus, there are very real threats to life and liberty posed by non-state terrorist actors – and meeting the risks posed must surely involve the security services.

Fortunately, rather than following the common pattern of rushed legislative proposals in the wake of terrorist attacks, the UK has, since the attacks in Paris, kept to the planned timetable for debate of the proposed Investigatory Powers Bill.

The Bill primarily works to put on a legal footing many of the actions that surveillance agencies have already been engaged in when it comes to bulk data collection and bulk hacking of services (equipment interference, and obtaining data). But the Bill also proposes a number of further extensions of powers, including provisions to mandate storage of ‘Internet Connection Records’ – branded as creating a ‘snoopers charter’ in media debates because of the potential for law enforcement and other government agencies to gain access to this detailed information on individuals’ web browsing histories.

Page 33 of the draft includes a handy ‘Investigatory Powers at a Glance’ table, setting out who will have access to Communications Data, powers of Interception and Bulk Datasets – and what the access and oversight processes might be.

[Table: Investigatory Powers at a Glance, from page 33 of the draft Bill]

Reading through the case put in the preamble to the Bill, it is important to critically unpack the claims made for new powers. For example, point 47 notes that “From a sample of 6025 referrals to the Child Exploitation and Online Protection Command (CEOP) of the NCA, 862 (14%) cannot be progressed”. The document extrapolates from this “a minimum of 862 suspected paedophiles, involved in the distribution of indecent imagery of children, who cannot be identified without this legislation.”, yet this is premised on the proposed storage of Internet Connection Records being a ‘magic bullet’ securing investigation of all these suspects. In reality, the number may be much lower.

Yet getting drawn into a calculus of costs and benefits – trading off the benefits of the protection of one group against the harms of surveillance to another – is a tricky business, and unlikely to create a well reasoned surveillance debate. As a society, we are generally not very good at reasoning where risks are involved. And there will always be polarisation between those who weight apparently opposing goods (security/liberty?) particularly highly.

The alternative to this cost/benefit calculus is to develop a response based on principles. Principles we can check against evidence, but clear guiding principles none-the-less.

Here’s my first attempt at four principles to consider in exploring how to respond to the Investigatory Powers Bill:

(1) Data minimisation without suspicion. We should collect and store the minimum possible amount of data about individuals where there is no reason to suspect the threat of harm to others, or of serious crime.

This point builds upon both principles and pragmatism. Individuals should be innocent until proven guilty, and space for individual freedom of thought and action respected. Equally, surveillance services need more signal, not more noise.

When it comes to addressing terrorism, creating an environment in which whole communities feel subject to mass surveillance is an entirely counterproductive strategy: undermining rather than promoting the liberal values we must work to protect.

(2) Data maximisation with suspicion. Where there is suspicion of individuals posing a threat, or of serious crime, then proportionate surveillance is justified, and should be pursued.

As far as I understand, few disagree with targeted surveillance. Unlike mass surveillance, targeted approaches can be intelligence-led rather than algorithmically led, and more tightly connect information collection, analysis and consideration of the actions that can be taken against those who pose threats to society.

(3) Strong scrutiny. Sustained independent oversight of secret services is hard to achieve – but is vital to ensure targeted surveillance capabilities are used responsibly, and to balance the power this gives to those who wield them.

The current Investigatory Powers Bill includes notable scrutiny loopholes, in which once issued, a Warrant can be modified to include new targets without new review and oversight.

(4) A safe Internet. Bulk efforts to undermine encryption and Internet security are extremely risky. Our societies rely upon a robust Internet, and it is important for governments to be working to make the network stronger for all.


Of course, putting principles into practice involves trade-offs. But identifying principles is an important starting point for a deeper debate.

Do these principles work for you? I’ll be reflecting more on whether they capture enough to provide a route through the debate, and what their implications are for responding to the Investigatory Powers Bill in the coming months.

(P.S. If you care about the future of the Investigatory Powers Bill in the UK, and you are not already a member of the Open Rights Group – do consider joining to support their work as one of very few dedicated groups focussing on promoting digital freedoms in this debate.

Disclosure: I’m a member of the ORG Advisory Council)

Is Generation Open Growing Up? ODI Summit 2015

[Summary: previewing the upcoming Open Data Institute Summit (discount registration link)]


In just over two weeks’ time the Open Data Institute will be convening their second ‘ODI Summit‘ conference, under the banner ‘Celebrating Generation Open’.

The framing is broad, and rich in ideals:

“Global citizens who embrace network thinking

We are innovators and entrepreneurs, customers and citizens, students and parents who embrace network thinking. We are not bound by age, income or borders. We exist online and in every country, company, school and community.

Our attitudes are built on open culture. We expect everything to be accessible: an open web, open source, open cities, open government, open data. We believe in freedom to connect, freedom to travel, freedom to share and freedom to trade. Anyone can publish, anyone can broadcast, anyone can sell things, anyone can learn and everyone can share.

With this open mindset we transform sectors around the world, from business to art, by promoting transparency, accessibility, innovation and collaboration.”

But it’s not just idealistic language. Right across the programme are projects which are putting those ideals into action in concrete ways. I’m fortunate to get to spend some of my time working with a number of the projects and people who will be presenting their work, including:

Plus, my fellow co-founder at Open Data Services Co-operative, Ben Webb, will be speaking on some of the work we’ve been doing to support Open Contracting, 360Giving and projects with the Natural Resource Governance Institute.

Across the rest of the Summit there are also presentations on open data in arts, transport, biomedical research, journalism and safer surfing, to name just a few.

What is striking about this line-up is that very few of these projects will be presenting one-off demonstrations: they will be sharing increasingly mature work, and work which is increasingly diverse, recognising that data is one element of a theory of change, and that being embedded in specific sectoral debates and action is just as important.

In some ways, it raises the question of how much a conference on open data in general can hold together: with so many different domains represented, is open data a strong enough thread to bind them together? On this question, I’m looking forward to Becky Hogge’s reflections when she launches a new piece of research at the Summit, five years on from her widely cited Open Data Study. In a preview of her new report, Becky argues that “It’s time for the open data community to stop playing nice” – moving away from trying to tie together divergent economic and political agendas, and putting full focus into securing and using data for specific change.

With ‘generation open’ announced, the question is how generation open copes with growing up. As the projects showcased at the summit move beyond the rhetoric, and we see that whilst in theory ‘anyone can do anything’ with data, in practice access and ability are unequally distributed – how will debates over the ends to which we use the freedoms brought by ‘open’ play out?

Let’s see.


I’ll be blogging on the ideas and debates at the summit, as the folk at ODI have kindly invited Open Data Services as a media supporter. As a result they’ve also given me this link to share, which will get anyone still to book 20% off their tickets. Perhaps see you there.

Data, openness, community ownership and the commons

[Summary: reflections on responses to the GODAN discussion paper on agricultural open data, ownership and the commons – posted ahead of Africa Open Data Conference GODAN sessions]

Photo Credit: CC-BY South Africa Tourism

Key points

  • We need to distinguish between claims to data ownership, and claims to be a stakeholder in a dataset;
  • Ownership is a relevant concept for a limited range of datasets;
  • Openness can be a positive strategy, empowering farmers vis-a-vis large corporate interests;
  • Openness is not universally good: can also be used as a ‘data grab’ strategy;
  • We need to think critically about the configurations of openness we are promoting;
  • Commons and cooperative based strategies for managing data and open data are a key area for further exploration;

Open or owned data?

Following the publication of a discussion paper by the ODI for the Global Open Data for Agriculture and Nutrition initiative, putting forward a case for how open data can help improve agriculture, food and nutrition, debate has been growing about how open data should be approached in the context of smallholder agriculture. In this post, I explore some provisional reflections on that debate.

Respondents to the paper have pointed to the way in which, in situations of unequal power, and in complex global markets, greater accessibility of data can have substantial downsides for farmers. For example, commodity speculation based on open weather data can drive up food prices, or open data on soil profiles can be used to extract greater margins from farmers when selling fertilizers. A number of responses to the ODI paper have noted that much of the information that feeds into emerging models of data-driven agriculture is coming from small-scale farmers themselves: whether through statistical collection by governments, or hoovered up by providers of farming technology, all aggregated into big datasets that are practically inaccessible to local communities and farmers.

This has led some respondents to focus on the concept of data ownership: asserting that more emphasis should be placed on community ownership of the data generated at a local level. Equally, it has led to the argument that “opening data without enabling effective, equitable use can be considered a form of piracy”, making direct allusion to the biopiracy debate, and to the responses such concerns have prompted in the form of interventions like the International Treaty on Plant Genetic Resources.

There are valid concerns here. Efforts to open up data must be interrogated to understand which actors stand to benefit, and to identify whether the configuration of openness sought is one that will promote the outcomes claimed. However, claims of data ownership and data sovereignty need to be taken as a starting point for designing better configurations of openness, rather than as a blocking counter-claim to ideas of open data.

Community ownership and openness

My thinking on this topic is shaped, albeit not to a set conclusion, by a debate that took place last year at a Berkman Centre Fellows Hour based on a presentation by Pushpa Kumar Lakshmanan on the Nagoya Protocol which sets out a framework for community ownership and control over genetic resources.

The debate raised the tension between the rights of communities to gain benefit from the resources and knowledge they have stewarded, potentially over centuries, and an open knowledge approach which argues that social progress is better served when knowledge is freely shared.

It also raised important questions about how communities can be demarcated (a long-standing and challenging issue in the philosophy of community rights) – and whether drawing a boundary to protect a community from external exploitation risks leaving internal patterns of power and exploitation within the community unexamined. For example, might community ownership of data in practice mean control of that data by community elites?

Ultimately, the debate taps into a conflict between those who see the greatest risk as being the exploitation of local communities by powerful economic actors, and those who see the greater risk as a conservative hoarding of knowledge in local communities in ways that inhibit important collective progress.

Exploring ownership claims

It is useful to note that much of the work on the Nagoya Protocol that Pushpa described was centred on controlling borders to regulate the physical transfer of plant genetic material. Thinking about rights over intangible data raises a whole new set of issues: ownership cannot just be filtered through a lens of possession and physical control.

Much data is relational. That is to say, it represents a relationship between two parties, or represents objects that may stand in ownership relationships with different parties. For example, in his response to the GODAN paper, Ajit Maru reports how “John Deere now considers its tractors and other equipment as legally ‘software’ and not a machine… [and] claims [this] gives them the right to use data generated as ‘feedback’ from their machinery”. Yet this data about a tractor’s operation is also data about the farmer’s land, crops and work. The same kinds of ‘trade data for service’ concerns that have long been discussed with reference to social media websites are becoming an increasing part of the agricultural world. The concern here is with a kind of corporate data-grab, in which firms extract data, asserting absolute ownership over something which is primarily generated by the farmer, and which is at best a co-production of farmer and firm.

It is in response to this kind of situation that grassroots data ownership claims are made.

These ownership claims can vary in strength. For example:

  • The first runs that ‘this is my data’: I should have ultimate control over how it is used, and the ability to treat it as a personally held asset;

  • The second runs that ‘I have a stake in this data’: as a consequence, I should have access to it, and a say in how it is used.

Which claim is relevant depends very much on the nature of the data. For example, we might allow ownership claims over data about the self (personal data), and over the direct property of an individual. For datasets that are more clearly relational, or collectively generated (for example, local statistics collected by agricultural extension workers, or weather data funded by taxation), the stakeholding claim is the more relevant.
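For readers who think in code, the distinction can be made concrete with a minimal sketch – my own illustrative model, with hypothetical dataset names, rather than anything proposed in the GODAN paper:

```python
# A sketch distinguishing ownership claims from stakeholding claims.
# All dataset names and parties here are invented for illustration.
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict

class Claim(Enum):
    OWNERSHIP = "ownership"        # 'this is my data': ultimate control
    STAKEHOLDING = "stakeholding"  # 'I have a stake': access and a say

@dataclass
class Dataset:
    name: str
    claims: Dict[str, Claim] = field(default_factory=dict)

# Personal data about an individual farm might support an ownership claim...
diary = Dataset("farm-field-diary", {"farmer": Claim.OWNERSHIP})

# ...while collectively generated data supports stakeholding claims
# from several parties at once.
weather = Dataset("district-weather-observations")
weather.claims["farmers"] = Claim.STAKEHOLDING
weather.claims["met-office"] = Claim.STAKEHOLDING
```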

It is important at this point to note that not all (perhaps not even many) concerns about the potential misuse of data can be dealt with effectively through a property rights regime. Uses of data to abuse privacy, or to speculate on and manipulate markets, may be much better addressed by regulating and prohibiting those activities, rather than by attempting to restrict the flow of data through assertions of data ownership.

Openness as a strategy

Once we know whether we are dealing with ownership claims or stakeholding claims in data, we can start thinking about different strategic configurations of openness: configurations that take power relationships into account, and that seek to balance protection against exploitation with the benefits that can come from collaboration and sharing.

For example, each farmer on their own has limited power vis-a-vis a high-tech tractor maker like John Deere. Even if they can assert a right to access their own data, John Deere will most likely retain the power to aggregate data from thousands of farmers, maintaining an inequality of access to data vis-a-vis the farmer. If the farmer seeks to deny John Deere the right to aggregate their data with that of others, the chances are that (a) they will be unsuccessful, as making an absolute ownership claim here is difficult – using the tractor was a choice, after all; and (b) they will potentially inhibit useful research and uses of the data that could improve cropping (even if some other uses of the data may run counter to the farmer’s interests). Some have suggested that creating a market in the data, where the data aggregator would pay farmers for the ability to use their data, offers an alternative path here: but it is not clear that the price would compensate the farmer adequately, or lead to an efficient re-use of the data.

However, in this setting openness potentially offers an alternative strategy. If farmers argue that they will only give data to John Deere if John Deere makes the aggregated data open, then they have the chance to challenge the asymmetry of power that otherwise develops. A range of actors and intermediaries can then use this data to provide services in the interests of the farmers. Both the technology provider, and the farmer, get access to the data in which they are both stakeholders.
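To make the shape of that bargain concrete, here is a minimal sketch in Python – my own illustration, with entirely hypothetical names and figures, not anything drawn from the GODAN debate – of farm-level records being reduced to an open regional aggregate while the underlying records stay private:

```python
# Sketch of the 'open the aggregate' bargain: per-farm records remain
# private to the aggregator, but the derived regional summary is the
# artefact farmers can demand be published openly.
from statistics import mean

# Per-farm records as collected by the equipment maker (kept private).
farm_records = [
    {"farm_id": "f1", "region": "north", "yield_t_per_ha": 3.2},
    {"farm_id": "f2", "region": "north", "yield_t_per_ha": 2.8},
    {"farm_id": "f3", "region": "south", "yield_t_per_ha": 4.1},
]

def open_aggregate(records):
    """Aggregate yields to region level, dropping farm identifiers."""
    regions = {}
    for r in records:
        regions.setdefault(r["region"], []).append(r["yield_t_per_ha"])
    return {region: round(mean(values), 2) for region, values in regions.items()}

# The regional summary is what gets made open.
print(open_aggregate(farm_records))  # {'north': 3.0, 'south': 4.1}
```

The point is not the code but the boundary it draws: the farm-level records never leave the aggregator, while the aggregate becomes a shared resource that intermediaries can build farmer-facing services upon.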

This strategy (“I’ll give you data only if you make the aggregate set of data you gather open”) may require collective action from farmers. This may be the kind of arrangement GODAN can play a role in brokering, particularly as it may turn out to be in the interest of the firm as well. Information economics has demonstrated how firms often under-share information which, if open, could expand the overall market and lead to better equilibria in which, rather than a zero-sum game, there are benefits to be shared amongst market actors.

There will, however, be cases in which the power imbalances between data providers and those who could exploit the data are too large. For example, the discussion above assumes that intermediaries will emerge who can help make effective use of aggregated data in the interests of farmers. Sometimes, though, (a) the most valuable uses will need to be based on analysis of disaggregated data, which cannot be released openly; and (b) data providers will need to find ways to work together to make use of the data themselves. In these cases, there may be a lot to learn from the history of commons and co-operative structures in the agricultural realm.

Co-operative and commons based strategies

Many discussions of openness conflate the concept of openness and the concept of the commons. Yet there is an important distinction. Put crudely:

  • Open = anyone is free to use/re-use a resource;
  • Commons = mutual rights and responsibilities towards the resource.

In the context of digital works, Creative Commons provides a suite of licenses for content, some of which are ‘open’ (they place no responsibilities on users of a resource, but grant broad rights), and others of which adopt a more regulated commons approach, placing certain obligations on re-users of a document, photo or dataset, such as the responsibility to attribute the source, and to share any derivative work under the same terms.
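As a small, concrete illustration – using real license identifiers but invented dataset metadata of my own – the distinction might show up in a data catalogue like this:

```python
# Illustrative only: the metadata fields are invented, but CC0-1.0 and
# CC-BY-SA-4.0 are real Creative Commons license identifiers.
open_dataset = {
    "name": "regional-yield-statistics",
    "license": "CC0-1.0",  # 'open': broad rights, no obligations on re-users
}
commons_dataset = {
    "name": "village-soil-profiles",
    "license": "CC-BY-SA-4.0",  # 'commons': attribute and share alike
}
```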

Creative Commons draws upon imagery from the physical commons. These commons were often in the form of land over which farmers held certain rights to graze cattle, or fisheries in which each fisher took shared responsibility for avoiding overfishing. Such commons are, in practice, highly regulated spaces – but ones that pursue an approach based on sharing and stakeholding in resources, rather than on absolute ownership claims. As we think about data resources in agriculture, reflecting more on lessons from the commons is likely to prove fruitful. Of course, data, unlike land, is not finite in the same way, nor does it have the same properties of excludability and rivalrousness.

In thinking about how to manage data commons, we might look towards another feature prevalent in agricultural production: that of the cooperative. The core idea of a data cooperative is that data can be held in trust by a body collectively owned by those who contribute the data. Such data cooperatives could help manage the boundary between data that is made open at some suitable level of aggregation, and data that is analysed and used to generate products of use to those contributing the data.
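As a rough illustration of that trust boundary – again a sketch under my own assumptions (the minimum group size, field names and structure are invented for the example, not a reference design) – a data cooperative might look something like this:

```python
# Sketch of a data cooperative: raw records are held in trust for
# members; only aggregates over a minimum group size are released openly.
from statistics import mean

class DataCooperative:
    MIN_GROUP_SIZE = 3  # don't publish aggregates that expose individuals

    def __init__(self):
        self._records = {}  # member_id -> list of contributed values

    def contribute(self, member_id, value):
        self._records.setdefault(member_id, []).append(value)

    def my_data(self, member_id):
        """Members retain access to their own raw contributions."""
        return list(self._records.get(member_id, []))

    def open_summary(self):
        """Published openly, but only once enough members have contributed."""
        if len(self._records) < self.MIN_GROUP_SIZE:
            return None
        values = [v for vals in self._records.values() for v in vals]
        return {"members": len(self._records), "mean": round(mean(values), 2)}

coop = DataCooperative()
for member, value in [("a", 3.2), ("b", 2.8), ("c", 4.1)]:
    coop.contribute(member, value)
print(coop.open_summary())  # {'members': 3, 'mean': 3.37}
```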

With Open Data Services Co-operative I’ve just started to dig deeper into the cooperative movement, having co-founded a workers cooperative that supports open data projects. We’ve also been thinking about how data cooperatives might work – and I’m certain there is scope for a lot more work in this area, helping to address some of the critical questions the GODAN discussion paper has raised for open data.

Enabling the Data Revolution: IODC 2015 Conference Report

The International Open Data Conference in Ottawa in May this year brought together over 200 speakers and close to 1000 in-person attendees to explore the open data landscape. I had the great privilege of working with the conference team to co-ordinate a series of sessions designed to weave together discussions from across the conference into a series of proposals for action, supporting shared work to take forward a progressive open data agenda. From the Open Data Research Symposium, Data Standards Day and other pre-events, to the impact presentations, panel discussions and individual action track sessions, a wealth of ideas was introduced and explored.

Since the conference, we’ve been hard at work on a synthesis of the conference discussions, drawing on over 30 hours of video coverage, hundreds of slide decks and blog posts, and thousands of tweets, to capture some of the key issues discussed, and to put together a roadmap of priority areas for action.

The result has just been published in English and French as a report for download, and as an interactive copy on Fold: embedding video and links alongside the report section by section.

Weaving it together

The report was only made possible through the work of a team of volunteers – acting as rapporteurs for each session and blogging their reflections – and of session organisers, who prepared provocation blog posts in advance. That meant that, in working to produce a synthesis of the discussions, I not only had video recordings and tweets from most sessions, but also diverse views and take-away insights written up by different participants – ensuring that the report was not just about what I took from the conference materials, but was shaped by different delegates’ views. In the Fold version of the report I’ve tried to link out to the recordings and blog posts to provide extra context in many sections – particularly in the ‘Data Plus’ section, which covers open data in a range of contexts, from agriculture, to fiscal transparency and indigenous rights.

One of the most interesting, and challenging, sections of the report to compile has been the Roadmap for Action. The preparation for this began long in advance of the International Open Data Conference. Based on submissions to the conference open call, a set of action areas were identified. We then recruited a team of ‘action anchors’ to help shape inputs, provocations and conference workshops that could build upon the debates and case studies shared at the conference and its pre-events, and then look forward to set out an agenda for future collaboration and action in these areas. This process surfaced ideas for action at many different levels: from big-picture programmes, to small and focussed collaborative projects. In some areas, the conference could focus on socialising existing concrete proposals. In others, the need has been to move towards a shared vision, even if the exact next steps on the path there are not yet clear.

The agenda for action

Ultimately, in the report, the eight action areas explored at IODC2015 are boiled down to five headline categories in the final chapter, each with a couple of detailed actions underneath:

  • Shared principles for open data: “Common, fundamental principles are vital in order to unlock a sustainable supply of high quality open data, and to create the foundations for inclusive and effective open data use. The International Open Data Charter will provide principles for open data policy, relevant to governments at all levels of development and supported by implementation resources and working groups.”
  • Good practices and open standards for data publication: “Standards groups must work together for joined up, interoperable data, and must focus on priority practices rooted in user needs. Data publishers must work to identify and adopt shared standards and remove the technology and policy barriers that are frequently preventing data reuse.”
  • Building capacity to produce and use open data effectively: “Government open data leaders need increased opportunities for networking and peer-learning. Models are needed to support private sector and civil society open data champions in working to unlock the economic and social potential of open data. Work is needed to identify and embed core competencies for working with open data within existing organizational training, formal education, and informal learning programs.”
  • Strengthening open data innovation networks: “Investment, support, and strategic action is needed to scale social and economic open data innovations that work. Organizations should commit to using open data strategies in addressing key sectoral challenges. Open data innovation networks and thematic collaborations in areas such as health, agriculture, and parliamentary openness will facilitate the spread of ideas, tools, and skills— supporting context-aware and high-impact innovation exchange.”
  • Adopting common measurement and evaluation tools: “Researchers should work together to avoid duplication, to increase the rigour of open data assessments, and to build a shared, contextualized evidence base on what works. Reusable methodological tools that measure the supply, use, and outcomes of open data are vital. To ensure the data revolution delivers open data, open data assessment methods must also be embedded within domain-specific surveys, including assessments of national statistical data. All stakeholders should work to monitor and evaluate their open data activities, contributing to research and shared learning on securing the greatest social impact for an open data revolution.”

In the full report, more detailed actions are presented in each of these categories. The true test of the roadmap will come with the 2016 International Open Data Conference, where we will be able to look at progress made in each of these areas, and to see whether action on open data is meeting the challenge of securing increased impact, sustainability and inclusiveness.