OCDS – Notes on a standard

Today sees the launch of the first release of the Open Contracting Data Standard (OCDS). The standard, as I’ve written before, brings together concrete guidance on the kinds of documents and data that are needed for increased transparency in processes of public contracting, with a technical specification describing how to represent contract data and meta-data in common ways.

The video below provides a brief overview of how it works (or you can read the briefing note), and you can find full documentation at http://standard.open-contracting.org.

When I first jotted down a few notes on how to go forward from the rapid prototype I worked on with Sarah Bird in 2012, I didn’t realise we would actually end up with the opportunity to put some of those ideas into practice. However: we did – and so in this post I wanted to reflect on some aspects of the standard we’ve arrived at, some of the learning from the process, and a few of the ideas that have guided at least my inputs into the development process.

As, hopefully, others pick up and draw upon the initial work we’ve done (in addition to the great inputs we’ve had already), I’m certain there will be much more learning to capture.

(1) Foundations for ‘open by default’

Early open data advocacy called for ‘raw data now’, asking governments essentially to export and dump online existing datasets, with issues of structure and regular publishing processes to be sorted out later. Yet, as open data matures, the discussion is shifting to the idea of ‘open by default’. Taken seriously, this means more than openly licensing whatever data dumps are created: it should mean that data is released from government systems as a matter of course, as part of their day-to-day operation.

The full OCDS model is designed to support this kind of ‘open by default’, allowing publishers to provide small releases of data every time some event occurs in the lifetime of a contracting process. A new tender is a release. An amendment to that tender is a release. The contract being awarded, or then signed, are each releases. These data releases are tied together by a common identifier, and can be combined into a summary record, providing a snapshot view of the state of a contracting process, and a history of how it has developed over time.
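
To make this concrete, here is a minimal sketch in Python of the releases-and-records pattern. Field names are simplified from the OCDS documentation, the identifiers and values are invented, and the merge logic is deliberately naive:

```python
# A minimal sketch of the releases-and-records model. Field names are
# simplified from the OCDS documentation; identifiers are hypothetical.

# Each event in a contracting process is published as a small release,
# tied to the others by a shared open contracting identifier (ocid).
tender_release = {
    "ocid": "ocds-a1b2c3-0001",          # hypothetical ocid
    "id": "release-0001",
    "date": "2014-11-18T10:00:00Z",
    "tag": ["tender"],
    "tender": {"title": "Office refurbishment",
               "value": {"amount": 50000, "currency": "GBP"}},
}

award_release = {
    "ocid": "ocds-a1b2c3-0001",
    "id": "release-0002",
    "date": "2014-12-05T16:30:00Z",
    "tag": ["award"],
    "award": {"suppliers": [{"name": "Example Contractors Ltd"}]},
}

def compile_record(releases):
    """Combine releases sharing an ocid into a summary record:
    a snapshot of the latest state, plus the full history."""
    snapshot = {}
    for release in sorted(releases, key=lambda r: r["date"]):
        snapshot.update(release)  # naive merge: later releases win per field
    return {"ocid": releases[0]["ocid"],
            "releases": releases,
            "compiledRelease": snapshot}

record = compile_record([tender_release, award_release])
```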

This releases and records model seeks to balance different user needs: from the firm seeking information about tender opportunities, to the civil society organisation wishing to analyse across a wide range of contracting processes. And by allowing core stages in the business process of contracting to be published as they happen, and then joined up later, it is oriented towards the development of contracting systems that default to timely openness.

As I’ll be exploring in my talk at the Berkman Centre next week, the challenge ahead for open data is not just to find standards to make existing datasets line up when they get dumped online, but to envisage and co-design new infrastructures for everyday transparent, effective and accountable processes of government and governance.

(2) Not your minimum viable product

Different models of standard

Many open data standard projects adopt either a ‘Minimum Viable Product’ approach, looking to capture only the few most common fields between publishers, or are developed by focussing on the concerns of a single publisher or user. Whilst MVP models may make sense for small building blocks designed to fit into other standardisation efforts, when it came to OCDS there was a clear user demand to link up data along the contracting process. This required an overarching framework into which simple components could be placed, or from which they could be extracted, rather than the creation of ad-hoc components, with the attempt to join them up made later on.

Whilst we didn’t quite achieve the full abstract model + idiomatic serialisations proposed in the initial technical architecture sketch, we have ended up with a core schema, and then suggested ways to represent this data in both structured and flat formats. This is already proving useful, for example, in exploring how data published as part of the UK Local Government Transparency Code might be mapped to OCDS from existing CSV schemas.
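
As a rough illustration of that kind of flat-to-structured mapping (the CSV columns here are invented for the example, not taken from the Transparency Code guidance, and the release structure is simplified):

```python
import csv
import io

# An invented flat CSV of the kind a local authority might publish.
flat_csv = io.StringIO(
    "reference,supplier,amount,currency\n"
    "C-2014-001,Example Contractors Ltd,50000,GBP\n"
)

def row_to_release(row, ocid_prefix="ocds-a1b2c3"):
    """Map one flat row to a simplified OCDS-style release."""
    return {
        "ocid": f"{ocid_prefix}-{row['reference']}",
        "tag": ["award"],
        "award": {
            "suppliers": [{"name": row["supplier"]}],
            "value": {"amount": float(row["amount"]),
                      "currency": row["currency"]},
        },
    }

releases = [row_to_release(row) for row in csv.DictReader(flat_csv)]
```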

(3) The interop balancing act & keeping flex in the framework

OCDS is, ultimately, not a small standard. It seeks to describe the whole of a contracting process, from planning, through tender, to contract award, signed contract, and project implementation. And at each stage it provides space for capturing detailed information, linking to documents, tracking milestones and tracking values and line-items.

This shape of the specification is a direct consequence of the method adopted to develop it: looking at a diverse set of existing data, and spending time exploring the data that different users wanted, as well as looking at other existing standards and data specifications.

However, OCDS by no means covers all the things that publishers might want to state about contracting, nor all the things users may want to know. Instead, it focusses on achieving interoperability of data in a number of key areas, and then providing a framework into which extensions can be linked as the needs of different sub-communities of open data users arise.

We’re only in the early stages of thinking about how extensions to the standard will work, but I suspect they will turn out to be an important aspect: allowing different groups to come together to agree (or contest) the extra elements that are important to share in a particular country, sector or context. Over time, some may move into the core of the standard, and potentially elements that appear core right now might move into the realm of extensions, each able to have their own governance processes if appropriate.

As Urs Gasser and John Palfrey note in their work on Interop, the key in building towards interoperability is not to make everything standardised and interoperable, but is to work out the ways in which things should be made compatible, and the ways in which they should not. Forcing everything into a common mould removes the diversity of the real world, yet leaving everything underspecified means no possibility to connect data up. This is both a question of the standards, and the pressures that shape how they are adopted.

(4) Avoiding identity crisis

Data describes things. To be described, those things need to be identified. When describing data on the web, it helps if those things can be unambiguously identified and distinguished from other things which might have the same names or identification numbers. This generally requires the use of globally unique identifiers (GUIDs): some value which, in a universe of all available contracting data, for example, picks out a unique contracting process; or, in the universe of all organizations, uniquely identifies a specific organization. However, providing these identifiers can turn out to be both politically and technically challenging.

The Open Data Institute have recently published a report underlining how important identifiers are to processes of opening data. Yet, consistent identifiers often have key properties of public goods: everyone benefits from having them, but providing and maintaining them has some costs attached, which no individual identifier user has an incentive to cover. In some cases, such as goods and service identifiers, projects have emerged which take a proprietary approach to fund the maintenance of those identifiers, selling access to the lookup lists which match the codes for describing goods and services to their descriptions. This clearly raises challenges for an open standard: when proprietary identifiers are incorporated into data, users may face extra costs to interpret and make sense of that data.

In OCDS we’ve sought to take as distributed an approach to identifiers as possible, only requiring globally unique identifiers where absolutely necessary (identifying contracts, organizations and goods and services), and deferring to existing registration agencies and identity providers, with OCDS maintaining, at most, code lists for referring to each identity ‘scheme’.

In some cases, we’ve split the ‘scheme’ out into a separate field: for example, an organization identifier consists of a scheme field with a value like ‘GB-COH’ to stand for UK Companies House, and then the identifier given in that scheme, like ‘5381958’. This approach allows people to store those identifiers in their existing systems without change (existing databases might hold national company numbers, with the field assumed to come from a particular register), whilst making explicit in the OCDS the scheme they come from. In other cases, however, we look to create new composite string identifiers, combining a prefix and some identifier drawn from an organization’s internal system. This is particularly the case for the Open Contracting ID (ocid). By doing this, the identifier can travel between systems more easily as a GUID – and could even be incorporated in unstructured data as a key for locating documents and resources related to a given contracting process.
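
A small sketch of the two patterns described above (the scheme code follows the GB-COH example; the ocid prefix and internal identifier are invented):

```python
# Pattern 1: the identifier scheme is split out as a separate field,
# so existing systems can keep storing bare national company numbers.
organization_identifier = {
    "scheme": "GB-COH",   # code list entry standing for UK Companies House
    "id": "5381958",
}

# Pattern 2: a composite string identifier, as with the ocid: a registered
# prefix joined to an identifier from the publisher's internal system.
def make_ocid(prefix: str, internal_id: str) -> str:
    """Build a composite open contracting ID (prefix here is hypothetical)."""
    return f"{prefix}-{internal_id}"

ocid = make_ocid("ocds-a1b2c3", "TENDER-2014-0042")
# Because the composite form is globally unique, it can travel between
# systems, and even appear in unstructured documents, as a lookup key.
```

Only the systems that expose data to the world need to apply the prefix; internal databases can carry on holding the bare internal identifier.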

However, recent learning from the project is showing that many organisations are hesitant about the introduction of new IDs, and that adoption of an identifier scheme may require as much advocacy as adoption of a standard. At a policy level, bringing some external convention for identifying things into a dataset appears to be seen as affecting the, for want of a better word, sovereignty of a specific dataset: even if in practice the prefix approach of the ocid means it only needs to be hard-coded in the systems that expose data to the world, not necessarily stored inside organizations’ databases. However, this is an area I suspect we will need to explore more, and keep tracking, as OCDS adoption moves forward.

(5) Bridging communities of practice

If you look closely you might in fact notice that the specification just launched in Costa Rica is actually labelled as a ‘release candidate’. This points to another key element of learning in the project, concerning the different processes and timelines of policy and technical standardisation. In the world of funded projects and policy processes, deadlines are often fixed, and the project plan has to work backwards from there. In a technical standardisation process, there is no ‘standard’ until a specification is in use, and has been robustly tested. The processes for adopting a policy standard, and setting a technical one, differ – and whilst perhaps we should have spoken from the start of the project of an overall standard, embedding within it a technical specification, we were too far down the path towards the policy launch by the time this became clear. As a result, the Release Candidate designation is intended to suggest the specification is ready to draw upon, but that there is still a process to go (and future governance arrangements to be defined) before it can be adopted as a standard per se.

(6) The schema is just the start of it

This leads to the most important point: that launching the schemas and specification is just one part of delivering the standard.

In a recent e-mail conversation with Greg Bloom about elements of standardisation, linked to the development of the Open Referral standard, Greg put forward a list of components that may be involved in delivering a sustainable standards project, including:

  • The specification (with its various components and subcomponents);
  • Tools that assess compliance with the spec (e.g. validation tools, and more advanced assessment tools);
  • Some means of visualizing a given set of data’s level of compliance;
  • Incentives of some kind (whether positive or negative) for attaining various levels of compliance;
  • Processes for governing all of the above;
  • and of course the community through which all of this emerges and is sustained.

To this we might also add elements like documentation and tutorials, support for publishers, catalysing work with tool builders, guidance for users, and so-on.

Open government standards are not something to be published once, and then left, but require labour to develop and sustain, and involve many social processes as much as technical ones.

In many ways, although we’ve spent a year of small development iterations working towards this OCDS release, the work now is only just getting started, and there are many technical, community and capacity-building challenges ahead for the Open Contracting Partnership and others in the open contracting movement.

Two senses of standard

[Summary: technical standards play a role in both interoperability, and in target-setting for policy.]

I’ve been doing lots of thinking about standardisation recently, particularly as part of work on the Open Contracting Data Standard (feedback invited on the latest draft release…), and thanks to the opportunity to work with Samuel Goëta on a paper around data standards (hopefully out some time next year).

One of the themes I’ve been seeking to explore is how standards play both a technical and a political role, and how standards processes (at least at the level of content standards) can sensitively engage with this. Below is a repost of my earlier contribution to a GitHub thread discussing some of this in the context of Open Contracting.

Two senses of standard

In Open Contracting I believe we’re dealing with two different senses of ‘standard’, and two purposes which we need to keep in balance. Namely:

  • Standards as a basis for interoperability – as in, “their data complies with the standard, and can be used by standards-compliant tools”.
  • Standards as targets – as in, “they have achieved a high standard of disclosure”.

To unpack these a bit:

(Note: the arguments below are predominantly theoretical, and so some of the edge cases considered may not come up at all in practice in the Open Contracting Data Standard, but considering them is a useful exercise to test the intuitions and principles directing our action.)

Standards as interoperability

We’re interested in interoperability in two directions: vertical (can a single dataset be used by other actors and tools in a value-chain of re-use), and horizontal (can two datasets from different publishers be easily analysed alongside one another).

Where data is already published, the goal should be to achieve the largest possible set of data publishers who can richly represent their data in the standard, and of data users who can draw on data in the standard to meet their needs. This supports the idea that for any element in the standard where (a) data already exists; and (b) use cases already exist; we should be looking for reference implementations to test that data can be rendered in the standard, and that users (or tools they create) can read, analyse and use that data effectively.

However, it is important to look at both horizontal and vertical interoperability in making this judgement. E.g. there could be a country that is the sole publisher of a field that is used by 5 different users in that country. This should clearly not be a required field in a standard, but articulating how it is standardised is useful to this community of users (one way to accommodate such cases may be in extensions, although the judgement on whether or not to move something to an extension might come down to whether it is likely that other publishers could be providing this data in future).

In many cases, underlying data from different sources is not perfectly interoperable, or there is a mismatch between the requirements of users and the requirements of data holders. In these cases, the way a standard is designed affects the distribution of labour between publishers and users with respect to rendering data interoperable. For example, a use case might involve ‘Identifying which different government agencies, each publishing data independently, have contracts with a particular firm’. In this case, a standard could require all publishers, who may store different identifiers in their systems, to map these to a common identifier, or a standard could allow publishers to use whatever identifier they hold, leaving the costs of reconciling these on the user. Making things interoperable can then involve a process of negotiation, and this process may play out differently in different places at different times, leaving certain elements of a standard less stable than others. The concept of ‘designing for the tussle’ (PDF) may be relevant here, thinking about how we can modularise stable (or ‘neutral’) and unstable elements of a standard (this is what the proposed Organisation ID standard does, by having a common way to represent identifiers, but separating this off from the choice of identifier itself, and then allowing for the emergence of a set of third-party tools and validation routines to help manage the tussle).

In seeking to maximise the set of publishers and users interoperable through the standard we need to be critically aware of both short-term and long-term interoperability, as organisations modify their practices in order to be able to publish to, or draw upon, a common standard. We need to balance a ‘Lowest Common Denominator’ (LCD) or ‘Minimum Viable Product’ (MVP) approach, which means that the majority of publishers can achieve substantial coverage of the standard, against a richer standard that supports the greatest chance of different producer and consumer groups being able to exchange data through the standard.


(Initial attempt to sketch the distinction between maximising the set of common fields across publishers and users, and maximising the set of publishers and users)

Standards as targets

Open Contracting is a political process. The Open Contracting Partnership have articulated a set of Global Principles which set out the sorts of information about contracting that governments and other parties should disclose, and they are working to secure government sign-up to these principles. In policy circles, a standard is often seen as a form of measure, qualitative or quantitative, against which progress towards some policy goal is assessed. Some targets might be based on ‘best practice’; others are based on ‘stretch goals’: things which perhaps no-one is yet doing particularly well, but which a community of actors agree are worth aiming for. A standard, whether specified in terms of indicators and measures, or in terms of fields and formats, provides a means of agreeing what meeting the target will look like.

The Open Contracting Principles call for a lot of things which no governments appear to yet be publishing in machine-readable forms. In many cases we’ve not touched the standardisation of these right now (e.g. “Risk assessments, including environmental and social impact assessments”), recognising that standards for these will either exist in different domains that can be linked or embedded into our standard, or that interoperability of such information is hard to achieve and ultimately what is needed for most use cases may be legal text or plain language documents, rather than structured data. However, there may be cases where something is a strong candidate for standardisation, having both the potential to be published (i.e. this is something which evidence suggests governments either do, or could, capture in their existing information systems), and clearly articulated use cases. In these cases a proposed field-level standard can act as an important target for those seeking to provide this data to move towards. It also acts to challenge unwarranted ‘first mover advantage’, where the first person to publish, even if publishing less than an ideal target would require, gets to set the standard; instead it makes the ‘target’ subject to community discussion.

Clearly, ‘aspirational’ elements should not predominate or make up the majority of a standard if it seeks to effectively support interoperability; but they do have a legitimate place in standards that play a part in policy and political processes (as, in practice, all standards do to some extent; c.f. Lessig).

Implications for Open Contracting Data Standard

There are a number of ways we might respond to a recognition of the dual role that standardisation plays in Open Contracting.

Purposes and validation sets

One approach, suggested in the early technical scoping, is to identify different sets of users, or ‘purposes’ for the standard, and for each of these to identify the kinds of fields (subsets of the data) these purposes require. As Jeni Tennison’s work on the scoping describes: “…each purpose can have a status (eg proposed vs implemented) and … purposes are only marked as implemented when there are implementations that use the given subset of data for the specified purpose”.

If there are neither purposes requiring a field, nor datasets providing a field, then it would not be suitable for inclusion in the standard. And if a purpose either went unimplemented for a long period, or required a field that no supplier could publish, then careful evaluation would be needed of whether to remove that purpose (or remove that field from the purpose), since purposes provide the benchmark against which elements of the standard are evaluated for their relevance to remain in the model.

Purposes could also be used to validate datasets, identifying how many datasets are fit for which purpose.
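
A minimal sketch of how purposes could double as validation sets, with purpose names and required field paths invented for illustration:

```python
# Hypothetical purposes, each defined by the field paths it requires.
PURPOSES = {
    "tender-alerts": {"ocid", "tender.title", "tender.value.amount"},
    "award-analysis": {"ocid", "award.suppliers"},
}

def has_path(release: dict, path: str) -> bool:
    """Check that a dotted field path exists in a release (dicts only)."""
    node = release
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True

def purposes_met(release: dict) -> list:
    """Return the purposes for which a release provides every required field."""
    return [name for name, fields in PURPOSES.items()
            if all(has_path(release, field) for field in fields)]

# Applied across a whole collection of releases, counting purposes_met
# would show how many datasets are fit for which purpose.
```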

Stable, ordinary and target elements

We could maintain a distinction in how the standard is described between fields and elements which are ‘stable’ (and thus very unlikely to change), ‘ordinary’ elements (which may have reference implementations, but could change if there was some majority interest amongst those governing a standard in seeing changes), and ‘target’ elements, which may lack any reference implementations, but which are considered useful to help publishers moving towards implementing a political commitment to publish.

Q: Could we build this information into the schema meta-data somehow?

We might need to have quite a long time horizon for keeping target elements provisionally in the standard, and to only remove them if there is agreement that no-one is likely to publish to them. However, being able to represent them visually as distinct in the schema, and clearly documenting the distinction may be valuable.
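
One possible answer to the question above, sketched as an invented annotation in the schema meta-data (the “stability” keyword and the riskAssessments field are hypothetical, not part of the published schema):

```python
# A hypothetical fragment of schema meta-data, adding an invented
# "stability" annotation alongside ordinary JSON Schema keywords.
schema_fragment = {
    "properties": {
        "ocid": {"type": "string", "stability": "stable"},
        "tender": {"type": "object", "stability": "ordinary"},
        "riskAssessments": {"type": "array", "stability": "target"},
    }
}

def fields_by_stability(schema: dict, level: str) -> list:
    """List field names at a given stability level, e.g. so documentation
    tools can render target elements as visually distinct."""
    return [name for name, spec in schema["properties"].items()
            if spec.get("stability") == level]

target_fields = fields_by_stability(schema_fragment, "target")
# target_fields == ["riskAssessments"]
```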

Extensions

Some ‘target’ elements may best belong in extensions, with some process for merging extensions into the core standard if they are widely enough adopted.

Regular implementation monitoring

The IATI Team run a dashboard which tracks use of particular fields in the data. Doing similar for Open Contracting would be valuable, and it may even be useful to feed such information into the display of the schema or documentation (or at least to make it easy for publishers and users to look up who is implementing a given property).
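
A rough sketch of the kind of counting such a dashboard does (the publishers and data here are invented):

```python
from collections import Counter

# Invented example: one list of releases per publisher.
datasets = {
    "publisher-a": [{"ocid": "x1", "tender": {}}, {"ocid": "x2", "award": {}}],
    "publisher-b": [{"ocid": "y1", "tender": {}}],
}

def field_usage(datasets: dict) -> Counter:
    """Count how many publishers use each top-level field at least once."""
    usage = Counter()
    for releases in datasets.values():
        fields = set()
        for release in releases:
            fields.update(release.keys())
        usage.update(fields)
    return usage

usage = field_usage(datasets)
# usage["tender"] == 2; usage["award"] == 1
```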

Implementation schedules

Another approach IATI uses for ‘target elements’ is to ask publishers to prepare ‘Implementation Schedules’ which outline which fields they expect to be able to publish by when. This allows an indication of whether there is political will to reach some of the ‘stretch targets’ that might be involved in a standard, and holds out the potential to convene together those who are most likely to publish that data in the near to medium term, to define and refine target standardisations.

Discussion

What theoretical writing on standardisation could I be drawing on here?

What experience from other standards could we be drawing upon in Open Contracting and in other standard processes?

Exploring Wikidata

[Summary: thinking aloud – brief notes on learning about the wikidata project, and how it might help in addressing the organisational identifiers problem]

I’ve spent a fascinating day today at the Wikimania Conference at the Barbican in London, mostly following the programme’s ‘data’ track in order to understand in more depth the Wikidata project. This post shares some thinking aloud to capture some learning, reflections and exploration from the day.

As the Wikidata project manager, Lydia Pintscher, framed it, right now access to knowledge on wikipedia is highly skewed by language. The topics of articles you have access to, the depth of meta-data about them (such as the locations they describe), the detail of those articles, and their likelihood of being up to date, are all greatly affected by the language you speak. Italian or Greek wikipedia may have great coverage of places in Italy or Greece, but go wider and their coverage drops off. In terms of seeking more equal access to knowledge, this is a problem. However, whilst the encyclopedic narrative of a French, Spanish or Catalan page about the Barbican Centre in London will need to be written by someone in command of that language, many of the basic facts that go into an article are language-neutral, or translatable as small units of content, rather than sentences and paragraphs. The date the building was built, the name of the architect, the current capacity of the building – all the kinds of things which might appear in infoboxes – are all things that could be made available to bootstrap new articles, or that, when changed, could have their changes cascaded across all the different language pages that draw upon them.

That is one of the motivating cases for Wikidata: separating out ‘items’ and their ‘properties’ that might belong in Wikipedia from the pages, making this data re-usable, and using it to build a better encyclopedia.

However, wikidata is also generating much wider interest – not least because it is taking on a number of problems that many people want to see addressed. These include:

  • Somewhere ‘institutional’ and well governed on the web to put data – and where each data item also gains the advantage of a discussion page.
  • The long-term preservation, and versioning, of data;
  • Providing common identifiers on the web for arbitrary things – and providing URIs for these things that can be looked up (building on the idea of DBPedia as a crystallisation point for the web of linked data);
  • Providing a data model that can cope with change over time, and with data from heterogeneous sources – all of the properties in wikidata can have qualifiers, such as when the statement is true from, or until, source information, and other provenance data.

Wikidata could help address these issues on two levels:

  • By allowing anyone to add items and properties to the central wikidata instance, and making these available for re-use;
  • By providing an open source software platform for anyone to use in managing their own corpus of wikified, versioned data*;

A particular use case I’m interested in is whether it might help in addressing the perennial Organisational Identifiers problem faced by data standards such as IATI and Open Contracting, where it turns out that having shared identifiers for government agencies, and lots of existing, but non-registered, entities like charities and associations that give and receive funds, is really difficult. Others at Wikimania spoke of potential use cases around maintaining national statistics, and archiving the datasets underlying scientific publications.

However, in thinking about the use cases wikidata might have, it’s important to keep in mind its current scope:

  • It is a store of ‘items’ and then ‘statements’ about them (essentially a graph store). This is different from being a place to store datasets (as you might want to do with the archival of the dataset used in a scientific paper), and it means that, once created, items are the first class entities of wikidata, able to exist in multiple collections.
  • It currently inherits Wikipedia’s notability criteria for items. That is, the basic building blocks of wikidata – the items that can be identified and described, such as the Barbican, Cheese or Government of Grenada – can only be included in the main wikidata instance if they have a corresponding wikipedia page in some language wikipedia (or similar: this requirement is a little more complex).
  • It can be edited by anyone, at any time. That is, systems that rely on the data need to consider what levels of consistency they need. Of course, as wikipedia has shown, editability is often a great strength – and as Rufus Pollock noted in the ‘data roundtable’ session, updating and versioning of open data are currently big missing parts of our data infrastructures.

Unlike the entirely distributed open world assumption on the web of data, where the AAA assumption holds (Anyone can say Anything about Anything), wikidata brings both a layer of regulation to the statements that can be made, and the potential of community driven editorial control. It sits somewhere between the controlled description sets of Schema.org, and an entirely open proliferation of items and ontologies to describe them.

Can it help the organisational identifiers problem?

I’ve started to carry out some quick tests to see how far wikidata might be a resource to help with the aforementioned organisational identifiers problem.

Using Kasper Brandt’s fantastically useful linked data rendering of IATI, I queried for the names of a selection of government and non-government organisations occurring in the International Aid Transparency Initiative data. I then used Open Refine to look up a selection of these on the DBPedia endpoint (which it seems now incorporates wikidata info as well). This was very rough-and-ready (just searching for full name matches), but by cross-checking negative results (where there were no matches) by searching wikipedia manually, it’s possible to get a sense of how many organisations might be identifiable within Wikipedia.
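
For anyone wanting to repeat the experiment, the sketch below shows the kind of lookup involved: an exact English-label match against the public DBpedia SPARQL endpoint, much as the Open Refine reconciliation did. Error handling and rate limiting are omitted, and the example organisation name is illustrative:

```python
import requests

DBPEDIA_SPARQL = "https://dbpedia.org/sparql"

def lookup_organisation(name: str) -> list:
    """Return DBpedia resource URIs whose English label exactly matches name."""
    safe_name = name.replace('"', '\\"')
    query = (
        f'SELECT ?resource WHERE {{ '
        f'?resource rdfs:label "{safe_name}"@en . '
        f'}} LIMIT 5'
    )
    response = requests.get(
        DBPEDIA_SPARQL,
        params={"query": query, "format": "application/sparql-results+json"},
        timeout=30,
    )
    bindings = response.json()["results"]["bindings"]
    return [row["resource"]["value"] for row in bindings]

# e.g. lookup_organisation("Ministry of Finance (Malawi)")
```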

So far I’ve only tested the method, and haven’t run a large scale test – but I found around 1/2 the organisations I checked had a Wikipedia entry of some form, and thus would currently be eligible to be Wikidata items right away. For others, Wikipedia pages would need to be created, and whether or not all the small voluntary organisations that might occur in an IATI or Open Contracting dataset would be notable for inclusion is something that would need to be explored more.

Exploring the Wikidata pages for some of the organisations I did find threw up some interesting additional possibilities to help with organisation identifiers. A number of pages were linked to identifiers from Library Authority Files, including VIAF identifiers such as this set of examples returned for a search on Malawi Ministry of Finance. Library Authority Files would tend to only include entries when a government agency has a publication of some form in that library, but at a quick glance coverage seems pretty good.

Now, as Chris Taggart would be quick to point out, neither wikipedia pages, nor library authority file identifiers, act as a registry of legal entities. They pick out everyday concepts of an organisation, rather than the legally accountable body which enters into contracts. Yet, as they become increasingly backed by data, these identifiers do provide access to look up lots of contextual information that might help in understanding issues like organisational change over time. For example, the Wikipedia page for the UK’s Department for Education includes details on the departments that preceded it. In wikidata form, a statement like this could even be qualified to say if that relationship of being a preceding department is one that passes legal obligations from one to the other.

I’ve still got to think about this a lot more, but it seems that:

  • There are many things it might be useful to know about organisations, but which are not going to be captured in official registries anytime soon. Some of these things will need to be subject of discussion, and open to agreement through dialogue. Wikidata, as a trusted shared space with good community governance practices, might be a good place to keep these things, albeit recognising that in its current phase it has no goal of being a comprehensive repository of records about all organisations in the world (and other spaces such as Open Corporates are already solving the comprehensive coverage problem for particular classes of organisation).

  • There are some organisations for which, in many countries, no official registry exists (particularly Government Departments and Agencies). Many of these things are notable (Government Departments for example), and so even if no Wikipedia entry yet exists, one could and should. A project to manage and maintain government agency records and identifiers in Wikidata may be worth exploring.

Whether a shift from seeking to solve some aspects of the organisational identifiers problem through finding some authority to provide master lists, to developing a distributed best-efforts community approach is one that would make sense to the open government community is something yet to be explored.

Notes

*I should acknowledge here SJ Klein’s counsel that this (encouraging multiple domain-specific instances of a wikidata platform) is potentially a very bad idea, as the ‘forking’ of wiki-projects has rarely been a successful journey: particularly with respect to the sustainability of forked content. As SJ outlined, even though there may be technical and social challenges to a mega graph store, these could be compared to the apparent challenges of making the first encyclopedias (the idea of a 50,000 page book must have seemed crazy at first), or the social challenges envisioned for Wikipedia at its genesis (‘how could non-experts possibly edit an encyclopedia?’). On this view, it is only by setting the ambition of a comprehensive shared store of the world’s propositional data (with the qualifiers that Wikidata supports to make this possible without a closed world assumption) that such limits might be overcome. Perhaps with data there is a greater possibility to support forking, and remerging, of wikidata instances, permitting short-term pragmatic creation of datasets outside the core wikidata project, which can later be brought back in if they are considered, as a set, notable (although this still carries the risk that forked projects diverge in their values, governance and structure so far that re-connecting later is made prohibitively difficult).

Reflections on open development from OKFest

[Summary: trying to capture some of the depth of discussion from a session on open (international) development at the Open Knowledge Festival]

“I don’t have a problem with the word open, I have a problem with the word development.”
Philip Thigo, in visions of open development panel at OKFest.

To international development practitioners, or communities receiving development aid, much of the ‘visions of open development’ discussion at the Open Knowledge Festival will have sounded familiar. Calls for more participatory processes have a long history in the development field, and countless conferences have been spent focussing on the need for greater inclusion of local communities in setting priorities, and in holding institutions to account for what they deliver. Yet, for the Open Knowledge movement, where many are just now discovering and exploring the potential application of open technologies, data and knowledge to challenges of human development in the global South, engaging with well established critiques of development is important. Open data, open knowledge, open source and open hardware could all potentially be used in the pursuit of centralised, top-down models of development, rather than supporting emancipatory and participatory development practice; highlighting the need for visions of open development to import thinking and experience from development practice over recent decades, if open development is to avoid missed opportunities, or even oppressive forms of development practice.

Yet, articulating open development involves more than importing established critical perspectives into the application of open data, open technologies and open knowledge to development problems. It involves working out both how the application of these ‘open’ technologies can impact on development practice, and identifying new cross-cutting values, rules and institutional arrangements that can guide their adoption. As our panel in Helsinki explored, this exploration will have to deal with a number of tensions.

Decentralising development?
Linda Raftree opened the panel with an input that talked of the ‘horizontality’ of networked communication. Linda suggested that, whilst open development is not about the technology, it has much to learn from the structures and organising principles we find in contemporary technologies. The Internet, with its networked and broadly peer-to-peer architecture, in which anyone with access can participate without prior permission, offers a potential template for structuring development co-operation. Karina Banfi picked up the theme in arguing against ‘top-down’ development, and advocating consultation and active engagement of communities in setting development priorities and processes.

An illustration of the potential difference between centralised and decentralised development at the infrastructure level was offered by Urs Riggenbach of Solar Fire, who described the development of open source hardware for small-scale hydro-electric power generation. Urs argued that, rather than large-scale dam projects, with their massive costs, visible ecological impacts, potential to displace communities, and scope for corruption in their contracting arrangements, communities could make use of Intellectual Property-free designs to construct their own small-scale solutions.

There might be a distinction here to draw between ‘weak’ and ‘strong’ decentralisation. In the former, citizens are given access to information (perhaps via data) and channels through which to feed back to those who control budgets and power. Decision making and ultimate executive responsibility remain large scale, and final authority invested in representative institutions. In the latter, decision making and executive responsibility are devolved down to the local level, with open knowledge used to support communities to be more self-reliant. Underlying this (as underlying all choices about how we practice openness) is a political choice about the level at which communities should co-ordinate their activities, and the mechanisms through which that co-ordination should take place: from formal states, to voluntary associations, to distributed ‘market’ mechanisms.

Although Tariq Khokhar suggested that open development, achieved, would mean ‘all people have the freedom to make choices over their own development’, panelists and participants from the audience emphasised a number of times that it is important not to ignore power, and to recognise that open development shifts where power lies, but does not necessarily decentralise or remove it altogether. In fact, this is something the Internet potentially shows us too. Although theoretically a decentralised medium, in practice there are a small number of companies who wield significant power online, such as the search services that not only act as a gateway to available information, but also, in their choices about what to index or not, create incentives for other actors on the Web to shape their content in particular ways.

Rules for openness
I’ve suggested that most notions of openness are articulated in opposition to some set of closed arrangements, but that does not mean that openness involves just the negation of those arrangements. Rather, openness may need its own rules to function. In our panel, Blane Harvey emphasised that openness is not the same as de-regulation, although, as Jyrki Pulkkinen reminded us, the term open may be in active use with such connotations, as in the case of talk of ‘open and free markets’.

The need to scaffold openness with rules and institutions if it is to lead to positive development has gone relatively unexplored in past discussions. Yet it is an important debate for the open development community to engage in. Rules may be needed to protect the privacy and security of certain development actors through non-disclosure of information (Pernilla Nastfor of the Swedish International Development Agency highlighted the potential risks the human rights activists they fund may face in repressive regimes if full details of these projects were transparent). Rules may also be needed to ensure citizens can benefit from open knowledge, and to manage the distribution of benefits from openness.

Linda Raftree raised the question of whether the open development discourse is too often one of ‘trickle down openness’, where the fact that new technologies are securing greater openness for some, is assumed to mean that more openness for all will eventually result via some trickle-down process. This echoes the critique from Michael Gurstein that open data risks simply empowering the empowered. Some of the rules needed, like Right to Information guarantees, rather than just openness as an optional extra granted by governments, are well known – but there may be other rules required to ensure the benefits of open information and technologies are more equally distributed. For example Jyrki Pulkkinen noted that ‘open innovation’ was a key engine to convert open knowledge into enterprise and activity that can work for development, and yet so often innovation is frustrated by restrictive intellectual property and patent laws that create a thicket innovators may struggle to get through, even when much of the knowledge they need to innovate has been made more accessible. In a similar vein, Jyrki noted that open information in the political domain should not just be about freedom to receive, but should also open outwards into freedoms of expression that need to be guaranteed.

Before moving on from a consideration of the rules, regulations and institutions that enable or constrain equitable outcomes from openness, it is worth remembering Lessig’s phrase ‘Code as law’. Many of the ‘rules’ which will affect how open development operates in practice may not be within formal legal or regulatory frameworks, but may exist built into the technical artifacts and networks which deliver open content, data, information and hardware designs.

Culture, structure, policy
The importance of culture change was another theme that came out during our panel. Tariq Khokhar suggested that the World Bank’s policies on open data had brought new actors into the bank, creating the potential for a positive feedback loop, slowly shifting the culture of the organisation. Though Tariq also highlighted that big organisational change may require ‘principles of open development’: organisational tools that can be used to determine when projects are ‘open development’ projects or not – to avoid the latest buzz-word being applied to any project. Asked about how far development has shifted in recent years, Philip Thigo focussed on a perceived increase in the accessibility of staff from large institutions, and how more doors were open for conversation. Perhaps underplayed in our discussions so far has been the influence of e-mail, social media, search and generally accessible online information in creating more ‘open communications’ between development donors and others.

An input from Anahi Ayala Iacucci also got us thinking about the processes of development aid decision making, and the tensions between a desire for locally owned and defined projects, and a requirement from donors to have clear project plans and deliverables. Creating a culture supportive of emergent project plans is a challenge (as the aptly named ‘IKM Emergent’ programme discovered over its five-year duration), and it is possible that a focus on transparency and accountability, without looking carefully at the balance of power and who is doing the calling to account, could lead to a greater focus on fixed project plans rather than greater freedom and flexibility, and openness to local pressures and demands. As the technological and open information interventions of open development unfold, tracking how they feed into culture change in positive and negative ways is likely to be instructive.

Next steps in the conversation
When I started my last post on Open Development, I didn’t think I would reach any conclusions, but I ended with a rough minimal description of what I saw to be some essential elements of open development. This time, following an incredibly rich discussion at the Open Knowledge Festival, I find I’ve got a sense of many more jigsaw puzzle pieces of open development – from the role of rules and policies, to the tensions of decentralisation – yet I’m less sure how these fit together, or how far there is a clear concept of open development to be articulated.

In debriefing from the Open Knowledge Festival, one of the general feelings amongst the open development track team was that bringing together these conversations in Helsinki was important to open up a space in the Open Knowledge movement to recognise how the themes being discussed had impacts beyond the US and Europe. It may be that open development is ultimately about providing a space to critically bridge between knowledge and perspectives from development, and ideas and perspectives from the diverse networks of open access, open hardware, open data, open culture and open knowledge currently developing across the world. In any case, as the conversation moves forward hopefully we can combine the practical and critical edge that discussions at OKFest displayed…

Reflections on an open panel

[Summary: learning notes from an experimental approach to running a panel at Open Knowledge Festival]

How do you hold an open discussion about ‘open development’ in a theatre-style auditorium? That’s what we tried to explore at the Open Knowledge Festival in Helsinki with an experimental ‘open panel’. Our goal was to combine input from experts and key contributors to the field with a format that recognised that expertise and relevant insights were not just held by those on the pre-selected panel, but were also to be found amongst the audience in the room. The panel design we came up with drew upon ideas from ‘Fishbowl’ conversations, and involved creating space for members of the audience to join the panel after the initial inputs from pre-selected panelists.

The Format

Here’s a quick overview of the format we used:

1) We had six pre-selected panelists, each speaking for a maximum of five minutes without slides to introduce their views on the topic

2) We then opened the floor to inputs from the audience. The audience were told they could either come forward and ask a question, or could join the panel, taking a seat on stage to put forward their view. There were 10 seats on the stage overall, creating space for at least four audience panelists.

3) There was the option of anyone (initial panelists, or those joining from the audience) leaving the panel if they felt they had said enough and wanted to create space for anyone else.

4) We also invited questions via Twitter, and ran a number of online polls to gather views. (We originally considered using handset voting, but decided against this for simplicity.)

You can find the short presentation that I used to introduce the format here.

Instead of, as planned, having three podium microphones for panelists to come forward to, we passed a roving microphone along the panel.

How did it work?

Overall I believe it was a very successful panel: keeping inputs from panelists short kept the panel moving and let us cover a lot of ground. Listening back to the panel (LINK) it seems we had a reasonably good balance of voices. A number of points and themes were developed over the course of the session, although interwoven into one another – rather than as explicit threads.

When we opened the floor for audience contributions, these initially came just in the form of questions to the panel, rather than people taking up the empty seats on stage. It took some encouragement for people to ‘join’ the panel, although those participants who did join gave noticeably different contributions: being sat alongside other panelists seems to really change the tone of someone’s contribution – potentially bringing about much more relaxed and discursive contributions. With only four people choosing to take up the extra seats on the panel by the end of the session, we didn’t get to see what would happen when space ran out, and whether anyone would choose to leave.

The online voting and Twitter input in this case got relatively little traffic.

Learning points and reflection

I would experiment with this format again, although I would consider removing the option of just asking a question, making the only ways for the audience to input either taking a seat on the panel or asking questions via Twitter (getting people to take a seat and offer their input from within the panel is, I think, the key; even if they focus on asking a question and leave immediately after it is answered).

We had aimed for a relatively diverse pre-selected panel of speakers. We need to think more about whether the format risks the overall inputs being less diverse, as the most confident may be more likely to choose to come and join the panel (a bias greater than occurs among those who choose to come forward and ask questions). The facilitator perhaps needs to have some control over the queues coming to contribute, to ensure a balance of voices.

Having the roving microphone handed along the panel provided a good way of encouraging short contributions, and getting the panel to self-manage who was going to speak next. I stood outside the panel as facilitator, and at a number of points simply told the panelists how long they had to answer a question, and invited them to self-organise within that time to ensure everyone who wanted to speak got the chance. This appeared to be fairly effective, and kept the conversation flowing.

Whilst we decided not to use keypad voting, I would consider doing this at a future session where keypads are available, if only as a good way to get people arriving early to come down and fill in rows at the front of the auditorium, rather than hanging around the back.

If you were involved in the panel, in the audience, or you’ve watched the recording – then I’d really welcome your feedback and reflections too… drop in a comment below…

Conclusions

There’s mileage in the open panel format, and it’s certainly something I’ll be looking to explore more in future.

Participation, enterprise, legitimacy and power: reflections from the Dyroy Seminar

[Summary: Links and reflections from a day live-blogging with Web Scientists in North Norway]

The Dyroy context
(Web Science students, with the socks we were given on arrival to keep us warm inside the Arctic Circle)

I’ve spent the last few days with a group of Web Science students in the community of Dyroy, in Northern Norway, about two hours by boat from the regional capital Tromso, and located around 69 degrees North – inside the arctic circle. Dyroy, like many rural areas, is facing tough challenges in maintaining a vibrant community, as opportunities for employment draw young people away towards the cities, and as old industries and trade decline. Yet, as we heard from the Norwegian Minister for the Regions at today’s Dyroy Seminar, the area is not one to simply shrug and let decline set in – but is an area where citizens have come together to find new ways to sustain and develop the community. Although electricity only reached much of Dyroy in the 1950s, the Old Trading Post where we were staying brought a phone line into the area in the mid-1800s, and the modern development of Dyroy relies heavily on high-speed Internet connectivity (hence the Web Science connection…).

One of the ways the community comes together is through a bi-annual conference, exploring topics of interest to the local community. This year’s seminar focussed on youth – looking at issues of youth participation, as well as exploring questions of identity and sustainable entrepreneurship and employment. As Web Science students we were present to explore how the web could be used to amplify some of the discussions from the first day of the seminar, and to build new online connections between ideas from Dyroy and the wider world. The day before the seminar, we spent time with students at the local school, running a number of workshops, including one exploring how social media could be used to campaign on key issues.

You can find a wealth of live-blogging and social reporting from the seminar here, and on the Dyroy Seminar website you will find a number of Norwegian reports about our projects. However, in this post I wanted to draw out just a few reflections on some of the key youth participation themes of the last two days, in a way that I hope will be helpful both for those who took part in Dyroy, and for the wider readership of this blog.

Participation, politics and power
(The room with English translation – and our social reporting hub)

I was hopeful that in Norway, the first country to establish a Children’s Rights Ombudsman, when I asked a group of 13–15 year old students if they were aware of their right to participate under Article 12 of the UN Convention on the Rights of the Child, every hand would go up. However, translation issues aside, both in the school where we worked and in the seminar, ideas of participation did not appear to be explicitly rooted in the human rights of children and young people to have their views heard in matters that affect them. Building on a rights-based foundation is important to highlight (a) that children and young people’s participation needs to be about having an individual say, for example, in home life – as well as having a collective voice on community issues; and (b) that all young people have participation rights, not just those who shout loudest or who get involved in formal structures.

Forms of participation that involve an individual expression of views can sit alongside participation in more formal politics, where debates are often concerned with the allocation of scarce resources. However, as a number of youth councillors and young political party members debated during the seminar, it is important for young people engaged in political participation structures to be aware of the dangers of pursuing power for its own sake.

Structures, shared values and shared challenges
Much of the morning of the seminar involved discussion of how Norway is well on the way to having a Youth Council in every municipality, with the possibility of legislation to require Youth Councils to be established. There was some debate over whether national requirements for youth representation would lead to an over-prescriptive set of structures, and whether instead flexibility was needed for each local area to develop its own youth participation approaches. The importance of handing over real power to youth fora was discussed, including mention of youth-led grant making (such as existed at scale in the UK with the, now sadly much rarer, Youth Opportunity Funds and Youth Capital Funds, and as still exists in other youth led grant making globally), or youth involvement in budgeting (or perhaps budget monitoring and advocacy, as a number of global youth participation projects are exploring).

In my experience of working on youth participation structures in the UK, when approaches are formalised it is important to recognise that there is no single structure that can support effective participation and representation, or that provides a suitable means of engagement for all young people. Rather, good participation involves a spread of interlinked approaches, from good complaint and feedback systems, through one-off events and activities, to regular and structured representative bodies. With the right design and active facilitation, online social media tools are potentially very effective in ‘bridging the gap’ between forms of one-off engagement and more sustained engagement in local decision making.

Even with a good mix of approaches to youth participation, and many channels through which young people can get involved – without ‘shared values’ being clearly articulated, and a wide shared understanding in the community that children and young people are equal citizens – participation of all forms risks becoming tokenism.

One of the peculiar properties of youth participation structures over other participation structures, is the relatively rapid turnover of membership. By definition, one can only be a member of a youth council for quite a short period of time compared perhaps to the main council. This leads to a need for both shared values, and participation structures, to be regularly revisited, revived and regenerated. It can also lead to a structural disadvantage for young people seeking to express their views – as they have to spend comparatively longer picking up the background knowledge needed to engage in particular debates, or may have less access to prior experience that could support them to secure the outcomes they want.

I’ve long been interested in the potential of the web to create a stronger institutional memory for youth campaigns: with social reporting and regular online reporting of youth activities generating an open record that future young participants can pick up, able to benefit from the experience of their predecessors. However, although it is often claimed that the Internet never forgets, in practice keeping content updated and discoverable over many years turns out to be very challenging. For example, content from the Youth Council website I developed and maintained over 10 years ago is now only available in the Internet Archive, where you would only find it if you knew where to look, and the archive of Oxford’s Socially Responsible Investment campaign is scattered across a number of sites. Even if it were easy to deposit content from youth participation on the web as part of a long-term archive, we would need better approaches to curate it so that future generations of youth representatives and campaigners can quickly find the intelligence they need to strengthen their hands.

Shared challenges
Having drawn on a rather oppositional idea of youth participation above, with the need to strengthen the voices of young people in contrast to those of adults, I want to step back and examine whether that opposition is useful. There is a common platitude at youth participation events of talking about ‘young people as the future’. This is often met with the reply from young people that ‘we are part of the present too’, which is a very fair response. However, what concerns me more in this claim is that it often covers up an implied abdication of responsibility on the part of adults. By saying ‘we need the innovative ideas of young people to sort out future problems’, adults can be letting themselves off the hook for also being part of creating those innovative solutions. It can be a way of pushing the solving of the problem off into the future, perpetuating the generational injustice that has seen those currently in power create environmental problems, burden states with debt, and enable vastly unequal development (an accusation I target more at political leaders in the UK than Norway here).

In many cases the challenge is not to listen to the voice of youth, but to find ways for people to be involved in shared problem solving, regardless of age or background.

Entrepreneurship and legitimacy
Anders Waage Nilsen took us away from participation structures in his presentation to the seminar, highlighting how, particularly with the web, it is possible for people of all ages to self-organise, bringing an entrepreneurial spirit to problem solving. This approach, rooted in an impatience and desire to see change, suggests that young people should not wait around for access to formal decision-making power from which to call for alternative models of economic and environmental development, but should instead use their networks to actively create the sorts of future they want.

The forms of ad-hoc social innovation enabled by the web, by new practices and by emerging norms of self-organising truly offer great opportunities to attack persistent social challenges, but when they become one of our primary modes of acting they also raise challenges of legitimacy. How far can, and should, communities (from local communities like Dyroy, to national communities like Norway) exercise collective self-determination over what happens amongst them? When the ability to take advantage of technologies to self-organise is conditioned not only by access to technology, but also by wider access to social and financial capital, how can a community avoid those with money and networks over-dominating the shape of local development by simply getting on with what they want to do outside of representative structures?

I’m not at all suggesting here that social innovation should be curtailed, and I would generally celebrate the forms of entrepreneurial social action Nilsen described. Yet, the more that ‘political action’ is conducted through ad-hoc actions, the more we need to find new ways to respond to it. To some extent our representative structures are about striking a balance of power, and as power shifts in the network society, we may need to develop new ways to regulate its legitimate exercise.

Web Science reflections: bridging with artifacts and agency
I’ve already mentioned a few ways the web might impact upon youth participation: from helping maintain an institutional memory for youth fora, to supporting new models of social action and problem solving.

In our workshop with students yesterday, we used the Social Media Game (with some extra cards made specially for this workshop) to explore how students might use the web to campaign on issues that affected them – from the poor quality of some roads, to a lack of activities, and issues relating to drugs and crime. A number of the strategies for using the web the young people put together involved strong use of both online and offline channels – recognising, for example, that the support gathered on a Facebook page might need to be expressed through a letter to a politician to get their attention, or noting that out-and-about exploration of problems with potholes could be taken online through videos and shared to raise awareness of the problem.

As we have also been exploring while acting as social reporters today, bridging involves a mix of technical artifacts (tweets, blog posts, video clips and so on), digital networks, and human connections. Understanding how these interact, and the different dynamics that affect each (from the design of content and messages, to the structure of digital networks, and the social psychology of sharing content), should be an important part of the contribution Web Science makes to thinking about participation.

Where next in a social reporting cycle?

For many of the Web Science DTC students, today was a first taste of live blogging and social reporting. Even for a single-track conference, live blogging and social reporting generate a lot of content. Unlike events such as the Internet Governance Forum, where social reporting may be part of facilitating engagement in the live event, in the case of the Dyroy seminar our social reporting has served more to amplify and create a record of the event. Working out sustainable ways to create a legacy out of this content is a challenge. For me, a first reflective blog post is a way to draw out some themes to reflect on further, which might emerge into future writing. However, with such a wealth of content generated today, we do need to think more about how we might curate elements of it to further share ideas and debates from the event.

What is Open Development?

In just over a week I’ll be at the Open Knowledge Festival in Helsinki, where thanks to the work of an amazing team of volunteers, we will have a series of sessions taking place under the banner of ‘Open Development‘, looking at where Open Knowledge themes meet international development.

In one of those sessions we’ll be asking what we really mean by open development: inviting participants to share their own responses to the question ‘What does open development mean to you?’. I realised that, for all the time I’ve spent moderating the OKF open-development working group’s mailing list, and inputting to the OKFest Open Development stream, I’ve not had a clear answer to that question. I’m hoping that next week’s session will help address that, but in advance I thought it would be useful to jot down some reflections on how I might answer the question right now.

Of course, as luck would have it, I’m at just that stage in the PhD process of working out the questions, but not yet getting to the simplified crisp answers, so what follows is some thinking aloud, rather than a set answer…

The essence of open

I’ve written before about the way that the prefix ‘open’ does not necessarily pick out some common property across its wide usage in ‘open access’, ‘open source’, ‘open data’, ‘open content’, ‘open government’ and ‘open development’, but at best can be seen as offering these labels a broad ‘family resemblance‘. There is an important distinction to observe between openness focussed on artifacts such as data, source code, or academic articles, and openness of processes, such as democracy and development. Formal definitions of the former tend to be concerned more with the legal or technical status of the artifact, whereas definitions of the latter may focus on questions of who is participating, and how they are allowed to participate.

In so far as we can find a common family trait amongst ‘the opens’, then I would suggest ‘access and permission’ is a good candidate. Openness should remove barriers to access, and should grant relevant permissions that allow either use of an artifact, or participation in a process.

Note that whilst the artifact and process distinction might be possible to make at the level of formal definitions, terms like ‘open source’ or ‘open government’ are often deployed to refer to both artifacts and processes. For example, we might use open source to refer to the processes of the open source community and movement, rather than just the properties of the source code itself; or we might use the term open government to refer to the papers and documents of government, as well as to participative processes that let citizens input into governance. Open artifacts may in some cases be necessary, but not sufficient, for an open process. In their work on Open ICTs for Development, Smith et al. provide a definition that combines ‘artifact’ and ‘process’ elements in understanding how open ICTs may be a matter of access, participation and collaboration. In the case of development, though, I think it can be maintained that development is a process, and a process concerned primarily with increasing human quality of life.

Of course, development in practice involves many processes, and in assessing, in any given case, whether we have open development or not, we might have to ask about the relative openness of any number of processes, from priority setting, to planning, to spending, to monitoring and governance.

Open as oppositional

If openness is about ‘access and permission’, then generally it is articulated in opposition to some set of ‘closed’ arrangements. For example, open access is articulated in opposition to the tight intellectual property controls and high prices that restrict academics’ access to journal articles, and their permission to share them. Open movements are hard to isolate and specify separately from those arrangements they oppose (this tends to cloud the artifact/process distinction, as getting a process to open up might well involve some opening of its constituent artifacts).

So, in the case of international development, what is being opposed? It would be easy to generate a long list of things wrong with the way development is done, and to suggest that ‘open development’ is simply the negation of these – but that would overload the concept of open development, and lead to it being seen as a panacea for all that is wrong. Rather, where is there a lack of access, and a lack of permission, in development as it is currently practised? My own initial answer would focus on the fact that those whose human welfare is supposed to be increased by development often have very little stake in the decision making about where resources for development will be used, or in wider policy debates with an influence on their welfare. Access to decision making, and permission to participate, are limited right now – and open development should be about addressing the closed nature of information artifacts, and communication opportunities, that support exclusive processes of governance.

Others may want to focus on different ‘closed’ areas of the current development field, and in doing so, to articulate different visions, or different aspects of the same vision, for open development.

Open X for open development

Counter to the argument above, open development could be said to be simply the application of other open initiatives to the development field. That is, using open data for international development could, in itself, be said to be ‘open development’. However, I would argue that this is overly reductive, and indeed misses the fact that open technologies or artifacts could potentially be used for non-open development.

‘Open ICT for development’, ‘open source for development’ and ‘open data for development’ are all potentially very good things – but we might also want to ask about whether they need an extra open in there – as in ‘open data for open development’ and so-on.

Open is not enough

As I outlined above, openness removes specific barriers to access, and provides permissions to participate. However, this does not mean effective access to decision making for all. That requires additional attention.

Again, we could load this into the concept of open development, to suggest that openness of process necessarily requires us to ensure all potential participants can overcome barriers outside the process that inhibit their participation. For example, we could say that a community meeting which is formally open to all, is not truly open unless we have been able to pay all the travel costs of everyone who might want to participate and to translate it into all local languages, because without this, there are still barriers to access. However, rather than build these ideas into ‘open development’ I would suggest that we are better to see ‘open’ as amongst a number of desirable prefixes and modifiers for development, such as ‘inclusive’ and ‘egalitarian’.

So what is open development?

When I started writing, I wasn’t sure if I would get down to one clear sentence, or nothing at all. As it is, I think I can offer the following as an interim answer to the question:

  • Open development is a process
  • Open development is about providing access to information, and permission to participate
  • Open development is about challenging closed and distant decision making on development issues
  • Open development is a companion to inclusive development and can provide the foundations for greater inclusion
  • Open development is more than just using open data for development, or taking open source to developing countries
  • Open development is still open to debate

Whether I’ll say the same after next week’s debate we’ll find out – and if you want to suggest your own definition of open development to feed into the discussions, you can do so before 19th September 2012 in this Etherpad.

OGP Take Aways

[Summary: Ten observations and take-aways from #ogp2012]

In an attempt to use reflective blogging to capture thoughts from the Open Government Partnership meeting in Brasilia I’ve jotted down ten key learning points, take-aways, or areas I’ve been musing on. Where critical, I hope they are taken in the spirit of constructive critique.

1) Good ideas come from everywhere
Warren Krafchik made this point in the closing plenary, and it’s one that was apparent throughout OGP. The OGP provides a space for shared learning in all directions: across sectors and across countries. I’ve certainly found my own understanding of open data has been deepened by thinking about how the lessons from Transparent Chennai and Bangalore might apply in the UK context, and I look forward to OGP exchanges providing space for much more sharing of challenges and solutions.

2) The quality of Right to Information really matters
Another bit of shared learning from OGP was previewed in a Guardian article by Aruna Roy, writing about the potential strength of the Indian Right to Information (RTI) Act as against the UK Freedom of Information (FOI) Act. A lot of the civil society participants I spoke with had experience of working with their national RTI laws, or lobbying for them to be created, and the quality, rather than just the presence, of the laws was a key theme. Some RTI laws require payment to request data; some allow anonymity, others ensure every requester provides their full details. These differences matter, and that presents a challenge for the OGP mechanisms, which at the moment simply require an RTI Bill as a condition of joining.

3) Whistle blower protection is an important factor in the journey from openness to impact
In the closing plenary, Samantha Power summed this one up: “when you have access to information that challenges conventional wisdom, or when you witness some wrongdoing, you need the protection to come forward with it, and too often that protection is lacking”.

Open data, and access to information might give people working in organisations some of the pieces of the jigsaw they need to spot corruption and wrongdoing. But if they have no protection to highlight that, we may miss many of the opportunities for more open information to bring accountability and impact.

4) We’ve not yet cracked culture change and capacity building
The shift to open government is not just a shift of policy, it also involves culture shift inside government (and to an extent in how civil society interfaces with government). I heard a few mentions of the need for culture change in National Action Plan sessions, but no clear examples of concerted government efforts to address ‘closed cultures’.

5) Ditto effective large scale public engagement
Many countries hadn’t consulted widely on their National Action Plans, and few action plans I heard details of included much of substance on public participation. In part this was explained by the short lead time that many countries had to produce their action plans: but for me this seems to point to a number of significant challenges we need to work out how to address if open government is to be participative government. Working out more agile models of engagement that still meet desirable criteria of being inclusive and accessible is a big challenge. For the OGP, it’s also interesting to consider the role of engagement with citizens through mass participation, and engagement with CSOs, potentially as mediators of citizen voice. One idea I explored in a few conversations was whether, when OGP governments support mass participation in shaping action plans, the raw input should be shared and jointly analysed with CSOs.

6) There is a need to distinguish e-government, from open government
As one of the speakers put it in the closing plenary of day 1: “the open government partnership is not an e-government partnership”. E-government to make public service provision more effective has its place, and may overlap with open government, but in itself e-government is not one and the same as open government.

7) We need both data infrastructures, and an accessibility ecosystem, for open data
This is something I’ll write a bit more on soon, but broadly there needs to be a recognition that government and civil society both have a role not only in providing national infrastructures of open data to support governance, but also in stimulating ecosystems that turn that data into information and make it accessible. Some of that comes out a bit in the five stars of open data engagement, though stimulating ecosystems might involve more than just engagement around specific datasets.

8) We need to develop a deeper dialogue between technologists and issue activists
David Eaves has blogged about OGP highlighting a sense of a divide between many of the established civil society groups and the more emergent technology-skilled open data / open government community. The message that open government is broader than open data can be read in multiple ways. It can be taken as trying to avoid an OGP agenda being used to further ‘open data from government’ as opposed to ‘open data for open government’. It can be taken as a downplaying of the opportunity that technologies bring for opening government. Or it can be taken as calling for technologies to build upon, rather than to try and side-step or leap over, the hard and often very contested work that has gone into securing access to information policies and other open government foundations. Some of the best cases I heard about at the OGP were where, having secured a right to information, activists were then able to use technologies and data to more effectively drive accountability.

Finding the common ground, and admitting spaces of difference, between technology and issue-focussed open government communities is another key challenge as OGP develops.

9) Monitoring should ultimately be about change for citizens, not just commitments and process
One of the key tasks for the OGP Steering Committee over the coming months is to develop an Independent Review Mechanism to monitor country action plans. In one of the panel sessions this was described more as an ‘evidence collection’ mechanism, to ensure all voices in a country are heard, rather than an assessment and judgement mechanism – so it holds out real potential to support both third-party evaluation (i.e. non OGP) of country progress against action plans, and to support formative evaluation and learning.

One point which came up a number of times was that OGP should be about change for citizens, not just commitments and process. An IRM that asks the ‘What’s changed?‘ question of a wide range of citizens, particularly those normally excluded from decision making processes, would be good to see.

10) Deciding on the tenth item for a ten-item list is tricky
Instead you can just link to wisdom from @tkb.

5-Stars of Open Data Engagement?

[Summary: Notes from a workshop at UKGovCamp that led to sketching a framework for what engaging, impactful open data initiatives might contain]

Update: The 5 Stars of Open Data Engagement now have their own website at http://www.opendataimpacts.net/engagement/.

In short

* Be demand driven

** Provide context

*** Support conversation

**** Build capacity & skills

***** Collaborate with the community

The Context

I’ve spent the last two days at UKGovCamp, an annual open-space gathering of people from inside and around local and national government who are passionate about using digital technologies for better engagement, policy making and practice. This year’s event was split over two days: Friday for conversations and short open-space slots; Saturday for more hands-on discussions and action. Suffice to say, there were plenty of sessions on open data on both days – and this afternoon we tried to take forward some of the ideas from Day 1 about open data engagement in a practical form.

There is a general recognition of the gap between putting a dataset online and seeing data drive real social change. In a session on Day 1 led by @exmosis, we started to dig into different ways to support everyday engagement with data, leading Antonio from Data.gov.uk to suggest that open data initiatives really need some sort of ‘charter of engagement’ to outline ways they can get beyond simply publishing datasets, and get to supporting people to use data to create social, economic and administrative change. So we took that as a challenge for Day 2, and in a session on ‘designing an engaging open data portal’ a small group of us (including Liz Stevenson, Anthony Zacharzewski, Jon Foster and Jag Goraya) started to sketch what a charter might look like.

You can see the (still developing) charter draft in this Google Doc. However, it was Jag Goraya‘s suggestion that the elements of a charter we were exploring might also be distilled into a ‘5 Stars’ that seemed to really make some sense of the challenge of articulating what it means to go beyond publishing datasets to do open data engagement. Of course, 5-star rating scales have their limitations, but I thought it worth sharing the draft that was emerging.

What is Open Data Engagement?

We were thinking about open data engagement as the sorts of things an open data initiative should be doing beyond just publishing datasets. The engagement stars don’t relate to the technical openness or quality of the datasets (there are other scales for that), and are designed to be flexible enough to apply to a particular dataset, a thematic set of datasets, or an open data initiative as a whole.

We were also thinking about open government data in our workshop; though hopefully the draft has wider applicability. The ‘overarching principles’ drafted for the Charter might also help put the stars in context:

Key principles of open government data: “Government information and data are common resources, managed in trust by government. They provide a platform for public service provision, democratic engagement and accountability, and economic development and innovation. A commitment to open data involves making information and data resources accessible to all without discrimination; and actively engaging to ensure that information and data can be used in a wide range of ways.”

Draft sketch of five stars of Open Data Engagement

The names and explanatory text of these still need a lot of work; you can suggest edits as comments in the Google Doc where they were drafted.

* Be demand driven

Are your choices about the data you release, how it is structured, and the tools and support provided around it based on community needs and demands? Have you got ways of listening to people’s requests for data, and responding with open data?

** Provide good meta-data; and put data in context

Does your data catalogue provide clear meta-data on datasets, including structured information about frequency of updates, data formats and data quality? Do you include qualitative information alongside datasets, such as details of how the data was created, or manuals for working with the data? Do you link from data catalogue pages to analysis your organisation, or third parties, have already carried out with the data, or to third-party tools for working with the data?

Often organisations already have detailed documentation of datasets (e.g. analysis manuals and How To’s) which could be shared openly with minimal edits. It needs to be easy to find these when you find a dataset. It’s also common that governments have published analysis of the datasets (they collected it for a reason), or used it in some product or service, and so linking to these from the dataset (and vice-versa) can help people to engage with it.
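As a concrete (and purely illustrative) sketch of the kind of structured meta-data this star asks for, the snippet below builds a catalogue-style record in Python; every field name here is my own assumption rather than any published schema:

```python
import json

# Illustrative only: these field names are assumptions, not a published
# metadata standard; the point is that update frequency, formats, quality
# notes and links to context all travel with the dataset entry.
dataset_entry = {
    "title": "Grants awarded 2011/12",
    "update_frequency": "monthly",
    "formats": ["csv", "json"],
    "data_quality": "Amounts rounded to nearest pound; some names abbreviated",
    "how_created": "Extracted from the grants management system by the finance team",
    "documentation": "http://example.org/grants/analysis-manual",      # hypothetical URL
    "related_analysis": ["http://example.org/grants/annual-report"],   # hypothetical URL
    "tools": ["http://example.org/grants/explorer"],                   # hypothetical URL
}

print(json.dumps(dataset_entry, indent=2))
```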

*** Support conversation around the data

Can people comment on datasets, or create a structured conversation around data to network with other data users? Do you join the conversations? Are there easy ways to contact the individual ‘data owner’ in your organisation to ask them questions about the data, or to get them to join the conversation? Are there offline opportunities to have conversations that involve your data?

**** Build capacity, skills and networks

Do you provide or link to tools for people to work with your datasets? Do you provide or link to How To guidance on using open data analysis tools, so people can build their capacity and skills to interpret and use data in the ways they want to? Are these links contextual (e.g. pointing people to GeoData tools for a geo dataset, and to statistical tools for a performance monitoring dataset)? Do you go out into the community to run skill-building sessions on using data in particular ways, or using particular datasets? Do you sponsor or engage with community capacity building?

When you give people tools, you help them do one thing. When you give people skills, you open the possibility of them doing many things in future. Skills and networks are more empowering than tools.

***** Collaborate on data as a common resource

Do you have feedback loops so people can help you improve your datasets? Do you collaborate with the community to create new data resources (e.g. derived datasets)? Do you broker or provide support to people to build and sustain useful tools and services that work with your data?


It’s important for all the stars that they can be read not just with engaging developers and techies in mind, but also community groups, local councillors, individual non-techie citizens and so on. Providing support for collaboration can range from setting up source-code sharing space on GitHub, to hanging out in a community centre with print-outs and post-it notes. Different datasets and different initiatives will have different audiences, and so different approaches to the stars – but hopefully there is a rough structure showing how these build to deeper levels of engagement.

Where next?

Hopefully Open Data Sheffield will spend some time looking at this framework at a future meeting – and all comments are welcome on the Google Doc. Clearly there’s lots to be done to make these more snappy, focussed and neat – but if we do find there’s a fairly settled sense of a five stars of engagement framework (if not yet good language to express it) then it would be interesting to think about whether we have the platforms and processes in place anywhere to support all of this: finding the good practice to share. Of course, there might already be a good engagement framework out there that we missed when sketching this all out – so comments to that effect are welcome too…

 

Updates:

Amended 22nd January to properly credit Antonio of Data.gov.uk as originator of the charter idea

Exploring Open Charity Data with Nominet Trust

[Summary: notes from a pilot one-day workshop on open data opportunities in third-sector organisations]

On Friday I spent the day with Nominet Trust for the second of a series of charity ‘Open Data Days’ exploring how charities can engage with the rapidly growing and evolving world of open data. The goal of these hands-on workshops is to spend just one working day looking at what open data might have to offer to a particular organisation and, via some hands-on prototyping and skill-sharing, to develop an idea of the opportunities and challenges that the charity needs to explore to engage more with open data.

The results of ten open data days will be presented at a Nominet Trust, NCVO and Big Lottery Fund conference later in the year, but for now, here’s a quick run-down / brain-dump of some of the things explored with the Nominet Trust team.

What is Open Data anyway?

Open data means many different things to different people – so it made sense to start the day looking at different ways of understanding open data, and identifying the ideas of open data that chimed most with Ed and Kieron from the Nominet Trust Team.

The presentation below runs through five different perspectives on open data, from understanding open data as a set of policies and practices, to looking at how open data can be seen as a political movement or a movement to build foundations of collaboration on the web.


Reflecting on the slides with Ed and Kieron highlighted that the best route into exploring open data for Nominet Trust was the idea that ‘open data is what open data does’, which helped us to set the focus for the day on exploring practical ways to use open data in a few different contexts. However, a lot of the uses of open data we went on to explore also chime with the idea of a technical and cultural change that allows people to perform their own analysis, rather than just taking presentations of statistics and data at face value.

Mapping opportunities for open data

Even in a small charity there are many different places open data could have an impact. With Nominet Trust we looked at a number of areas where data is in use already:

  • Informing calls for proposals – Nominet Trust invite grant applications for ideas that use technology for disruptive innovation in a number of thematic areas, with two main thematic areas of focus live at any one time. New thematic areas of focus are informed by ‘State of the Art’ review reports. Looking at one of these, it quickly becomes clear that these are data-packed resources, but that the data, analysis and presentation are all smushed together.
  • Throughout the grant process – Nominet Trust are working not only to fund innovative projects, but also to broker connections between projects and to help knowledge and learning flow between funded projects. Grant applications are made online, and right now details of successful applicants are published on the Trust’s website. A database of grant investment is used to keep track of ongoing projects.
  • Evaluation – the Trust are currently looking at new approaches to evaluating projects, and identifying ways to make sure evaluation contributes not only to an organisation’s own reflections on a project, but also to wider learning about effective responses to key social issues.

With these three areas of data focus, we turned to identify three data wishes to guide the rest of the open data day. These were:

  • Being able to find the data we need when we need it
  • Creating actionable tools that can be embedded in different parts of the grant process – and doing this with open platforms that allow the Nominet Trust team to tweak and adapt these tools.
  • Improving evaluation – with better data in, and better data out

Pilots, prototypes and playing with data

The next part of our Open Data Day was to roll up our sleeves and try some rapid experiments with a wide range of different open data tools and platforms. Here are some of the experiments we tried:

Searching for data

We imagined a grant application looking at ways to provide support to young people not in education, employment or training (NEET) in the Royal Borough of Kensington and Chelsea, and set the challenge of finding data that could support the application, or that could support evaluation of it. Using the Open Data Cook Book guide to sourcing data, Ed and Kieron set off to track down relevant datasets, eventually arriving at a series of spreadsheets of education stats for London on the London Skills and Employment Observatory website, via the London Datastore portal. Digging into the spreadsheets allowed the team to put claims that could be made about levels of education and employment exclusion in RBKC in context, looking at the different interpretations that might be drawn from claims about trends and percentages, and claims about absolute numbers of young people affected.
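To illustrate that interpretation point, here is a minimal sketch, with invented figures purely for illustration, of how a falling absolute count and a rising percentage can come out of the same dataset:

```python
# Invented figures, purely illustrative: a NEET *count* can fall while the
# NEET *rate* rises, if the overall cohort shrinks faster.
cohort_size = {2010: 1850, 2011: 1600}   # hypothetical 16-18 population
neet_count = {2010: 120, 2011: 115}      # hypothetical NEET counts

for year in sorted(neet_count):
    rate = 100 * neet_count[year] / cohort_size[year]
    print(f"{year}: {neet_count[year]} young people NEET ({rate:.1f}% of cohort)")

# 2010: 120 young people NEET (6.5% of cohort)
# 2011: 115 young people NEET (7.2% of cohort)
# Two different headlines ("NEET numbers fall" / "NEET rate rises") from one dataset.
```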

Learning: The data is out there; and having access to the raw data makes it possible to fact-check claims that might be made in grant applications. But, the data still needs a lot of interpretation, and much of the ‘open data’ is hidden away in spreadsheets.

Publishing open data

Most websites are essentially databases of content with a template to present them to human readers. However, it’s often possible to make the ‘raw data’ underlying the website available as more structured, standardised open data. The Nominet Trust website runs on Drupal, and includes a content type for projects awarded funding, covering details of the project, its website address, and the funding awarded.

Using a demonstration Drupal website, we explored how, with the Drupal Views and Views Bonus Pack open source modules, it was easy to create a CSV open data download of information held in the website.

The sorts of ‘projects funded’ open data this would make available from Nominet Trust might be of interest to sites like OpenlyLocal.com which are aggregating details of funding to many different organisations.
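For a sense of what that structured output involves, here is a minimal Python sketch (sample records and field names invented for illustration) of writing ‘projects funded’ data to the kind of CSV file a Drupal view, or any other system, might expose:

```python
import csv

# Invented sample records standing in for the 'projects funded' content type;
# real field names would come from the site's own database.
projects = [
    {"project": "Example Youth Coding Club", "website": "http://example.org", "awarded": 25000},
    {"project": "Example Digital Inclusion Pilot", "website": "http://example.net", "awarded": 40000},
]

with open("projects-funded.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["project", "website", "awarded"])
    writer.writeheader()
    writer.writerows(projects)
```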

Learning: You can become an open data publisher very easily, and by hooking into existing places where ‘datasets’ are kept, keeping your open data up-to-date is simple.

Mashing-up datasets

Because open datasets are often provided in standardised forms, and the licenses under which data is published allow flexible re-use of the data, it becomes easy to mash-up different datasets, generating new insights by combining different sources.

We explored a number of mash-up tools. Firstly, we looked at using Google Spreadsheets and Yahoo Pipes to filter a dataset ready to combine it with other data. The Open Data Cook Book has a recipe that involves scraping data with Google Spreadsheets, and a Yahoo Pipes recipe on combining datasets.

Then we turned to the open data powertool that is Google Refine. Whilst Refine runs in a web browser, it is software you install on your own computer, and it keeps the data on your machine until you publish it – making it a good tool for a charity to use to experiment with their own data, before deciding whether or not it will be published as open data.

We started by using Google Refine to explore data from OpenCharities.org – taking a list of all the charities with the word ‘Internet’ in their description that had been exported from the site, and using the ‘Facets’ feature (and a Word Facet) in Google Refine to look at the other terms they used in their descriptions. Then we turned to a simple dataset of organisations funded by Nominet Trust, and explored how, with API access to OpenlyLocal.com’s spending dataset, we could get Google Refine to fetch details of which Nominet Trust funded organisations had also received money from particular local authorities or big funders like the Big Lottery Fund and the Arts Council. This got a bit technical, so a step-by-step How To will have to wait – but the result was an interesting indication of some of the organisations that might turn out to be common co-funders of projects with Nominet Trust – a discovery enabled by those funders making their funding information available as open data.
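For readers who prefer code to Refine menus, the sketch below captures the same mash-up logic in Python. The API URL and the response shape are hypothetical stand-ins (not OpenlyLocal’s actual endpoints), and the organisation names are invented:

```python
import requests

# Hypothetical spending API: the URL pattern and JSON shape here are
# assumptions, standing in for a service like OpenlyLocal's spending data.
SPENDING_API = "http://example.org/api/spending?supplier={name}&format=json"

funded_orgs = ["Example Community Media", "Example Digital Trust"]  # invented names

for org in funded_orgs:
    resp = requests.get(SPENDING_API.format(name=org), timeout=30)
    resp.raise_for_status()
    payments = resp.json()  # assumed: list of {"funder": ..., "amount": ...} records
    co_funders = {p["funder"] for p in payments}
    print(org, "also funded by:", ", ".join(sorted(co_funders)) or "no matches found")
```

This glosses over the fuzzy name-matching that Refine helps with; in practice, reconciling organisation names across datasets is most of the work.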

Learning: Mash-ups can generate new insights – although many mash-ups still involve a bit of technical heavy-lifting and it can take some time to really explore all the possibilities.

Open data for evaluation

Open data can be both an input and an output of evaluation. We looked at a simple approach using Google Spreadsheets to help a funder create online evaluation tools for funded projects.

With a Google Docs account, we looked at creating a new ‘Form’. Google Forms are easy to create, and let you design a set of simple survey elements that a project can fill in online, with the results going directly into an online Google Spreadsheet. In the resulting spreadsheet, we added an extra tab for ‘Baseline Data’, and explored how the =ImportData() formula in Google Spreadsheets can be used to pull in CSV files of open data from a third party, keeping a sheet of baseline data up-to-date. Finally, we looked at the ‘Publish as a Web Page’ feature of Google Spreadsheets, which makes it possible to provide a simple CSV file output from a particular sheet.

In this way, we saw that a funder could create an evaluation form template for projects in a Google Form/Spreadsheet, and with shared access to this spreadsheet, could help funded projects to structure their evaluations in ways that helped cross-project comparison. By using formulae to move a particular sub-set of the data to a new sheet in the Spreadsheet, and then using the ‘Publish as a Web Page’ feature, non-private information could be directly published as open data from here.
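As a rough sketch of the consumption side, the Python below pulls one of those published sheets back down as CSV; the URL is a placeholder for the address Google generates when you publish a sheet:

```python
import csv
import io
import urllib.request

# Placeholder URL: 'Publish as a Web Page' produces a stable CSV address of
# roughly this shape for the chosen sheet (KEY stands in for the real key).
PUBLISHED_CSV = "https://docs.google.com/spreadsheets/d/KEY/pub?output=csv"

with urllib.request.urlopen(PUBLISHED_CSV) as resp:
    reader = csv.DictReader(io.TextIOWrapper(resp, encoding="utf-8"))
    rows = list(reader)

print(f"Fetched {len(rows)} evaluation rows")
# From here, cross-project comparison is a loop over `rows`, since the shared
# form template keeps column headings consistent across projects.
```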

Learning: Open data can be both an input to, and an output from, evaluation.

Embeddable tools and widgets

Working with open data allows you to present one interpretation or analysis of some data, whilst also allowing users of your website or resources to dig more deeply into the data and find their own angles, interpretations, or specific facts.

When you add a ‘Gadget’ chart to a Google Spreadsheet of data you can often turn it into a widget to embed in a third party website. Using some of the interactive gadgets allows you to make data available in more engaging ways.

Platforms like IBM’s Many Eyes also let you create interactive graphs that users can explore.

Sometimes, interactive widgets might already be available, as in the case of Interactive Population pyramids from ONS. The Nominet Trust state of the art review on Aging and use of the Internet includes a static image of a population pyramid, but many readers could find the interactive version more useful.

Learning: If you have data in a report, or on a web page, you can make it interactive by publishing it as open data, and then using embeddable widgets.

Looking ahead

The Open Data Day ended with a look at some of the different ways to take forward learning from our pilots and prototypes. The possibilities included:

Sooner

  • Quick wins: Making funded project data available as structured open data. As this information is already published online, there are no privacy issues with making it available in a more structured format.
  • Developing small prototypes: taking the very rough proof-of-concept ideas from the Open Data Day a stage further, and using them to inform plans for future developments. Some of the prototypes might be interactive widgets.
  • A ‘fact check’ experiment: taking a couple of past grant applications, and using open data resources to fact-check the claims made in those applications. Reflecting on whether this process offers useful insights and how it might form part of future processes.
  • Commissioning open data along with research: when Nominet Trust commissions future State of the Art reviews it could include a request for the researcher to prepare a list of relevant open datasets as well, or to publish data for the report as open data.

Later

  • Explore open data standards such as the International Aid Transparency Initiative Standard for publishing project data in a more detailed form.
  • Building our own widgets and tools: for example, tools to help applicants find relevant open data to support their application, or tools to give trustees detailed information on applicant organisations to help their decision making.
  • Building generalisable tools and contributing to the growth of a common resource of software and tools for working with open data, as well as just building things for direct organisational use.

Where next?

This was just the second of a series of Open Data Days supported by Nominet Trust. I’m facilitating one more next month, and a team of other consultants are working with a varied set of other charities over the coming weeks. So far I’ve been getting a sense of the wide range of possible areas open data can fit into charity work (it feels quite like exploring the ways social media could work for charities did back in 2007/8…), but there’s also much work to be done identifying some of the challenges that charities might face, and sustainable ways to overcome them. Lots more to learn…