Following the money: preliminary remarks on IATI Traceability

[Summary: Exploring the social and technical dynamics of aid traceability: let’s learn what we can from distributed ledgers, without thinking that all the solutions are to be found in the blockchain.]

My colleagues at Open Data Services are working at the moment on a project for UN Habitat around traceability of aid flows. With an increasing number of organisations publishing data using the International Aid Transparency Initiative data standard, and increasing amounts of government contracting and spending data available online, the theory is that it should be possible to track funding flows.

In this blog post I’ll try and think aloud about some of the opportunities and challenges for traceability.

Why follow funds?

I can envisage a number of hypothetical use cases for the traceability of aid.

Firstly, donors want to be able to understand where their money has gone. This is important for at least three reasons:

  1. Effectiveness & impact: knowing which projects and programmes have been the most effective;
  2. Understanding and communication: being able to see more information about the projects funded, and to present information on projects and their impacts to the public to build support for development;
  3. Addressing fraud and corruption: identifying leakage and mis-use of funds.

Traceability is important because the relationship between donor and delivery is often indirect. A grant may pass through a number of intermediary organisations before it reaches the ultimate beneficiaries. For example, a country donor may fund a multi-lateral fund, which in turn commissions an international organisation to deliver a programme, and they in turn contract with country partners, who in turn buy in provision from local providers.

Secondly, communities where projects are funded, or where funds should have been received, may want to trace funding upwards: understanding the actors and policy agendas affecting their communities, and identifying when funds they are entitled to have not arrived (see the investigative work of Follow The Money Nigeria for a good example of this latter use case).

Short-circuiting social systems

It is important to consider the ways in which work on the traceability of funds potentially bypasses, ‘routes around’ or disrupts* (*choose your own framing) existing funding and reporting relationships – allowing donors or communities to reach beyond intermediaries to exert such authority and power over outcomes as they can exercise.

Take the example given above. We can represent the funding flows in a diagram as below:

[Diagram: funding flowing downwards through the chain of organisations]

But there are more than one-way flows going on here. Most of the parties involved will have some sort of reporting responsibility to those giving them funds, and so we also have reporting flows running back up the chain:

[Diagram: reporting flowing upwards through the chain of organisations]

By the time reporting gets to the donor, it is unlikely to include much detail on the work of the local partners or providers (indeed, the multilateral may not report specifically on this project at all, just on its development co-operation in general). The INGO may even have very limited information about what happens just a few steps down the chain on the ground, having to trust intermediary reports.

In cases where there isn’t complete trust in this network of reporting, and where there are no clear mechanisms to ensure each party is exercising its responsibility to secure the most effective, corruption-free use of resources by the next party down, the ability to see through this chain, tracing funds and directly assessing impacts and risks, is clearly desirable.

Yet – it also needs to be approached carefully. Each of the relationships in this funding chain is about more than just passing on some clearly defined packet of money. Each party may bring specific contextual knowledge, skills and experience. Enabling those at the top of a funding chain to leap over intermediaries doesn’t inevitably have a positive impact: particularly given what the history of development co-operation has to teach about how power dynamics and the imposition of top-down solutions can lead to substantial harms.

None of this is a case against traceability – but it is a call for consideration of the social dynamics of traceability infrastructures, and of how to ensure contextual knowledge is kept accessible when it becomes possible to traverse the links of a funding chain.

The co-ordination challenge of traceability

Right now, the IATI data standard has support for traceability at the project and transaction level.

  • At the project level the related-activity field can be used to indicate parent, child and co-funded activities.
  • At the transaction level, data on incoming funds can specify the activity-id used by the upstream organisation to identify the project the funds come from, and data on outgoing funds can specify the activity-id used by the downstream organisation.

This supports both upwards and downwards linking (e.g. a funder can publish the identifier of the funded project, or a recipient can publish the identifier of the donor project that is providing funds), but it is based on explicit co-ordination and the capture of additional data.
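
To make this concrete, here’s a rough sketch (in Python, working over a hand-written IATI-style snippet) of the traceability fields just described. The element and attribute names follow the IATI 2.x activity standard, but the organisation and activity identifiers, values and codes are invented for illustration – treat it as a sketch rather than validated IATI.

```python
# Sketch: extracting the upstream/downstream activity identifiers that make
# transaction-level traceability possible. Identifiers below are made up.
import xml.etree.ElementTree as ET

SAMPLE = """
<iati-activities version="2.03">
  <iati-activity>
    <iati-identifier>XX-NGO-12345-PROJECT-7</iati-identifier>
    <related-activity ref="XX-GOV-1-PROGRAMME-2" type="1"/>  <!-- 1 = parent -->
    <transaction>
      <transaction-type code="1"/>  <!-- incoming funds -->
      <provider-org ref="XX-GOV-1" provider-activity-id="XX-GOV-1-PROGRAMME-2"/>
      <value currency="GBP" value-date="2015-04-01">250000</value>
    </transaction>
    <transaction>
      <transaction-type code="3"/>  <!-- disbursement onwards -->
      <receiver-org ref="XX-NGO-99" receiver-activity-id="XX-NGO-99-LOCAL-4"/>
      <value currency="GBP" value-date="2015-06-01">80000</value>
    </transaction>
  </iati-activity>
</iati-activities>
"""

root = ET.fromstring(SAMPLE)
for activity in root.findall("iati-activity"):
    own_id = activity.findtext("iati-identifier")
    for tx in activity.findall("transaction"):
        provider = tx.find("provider-org")
        receiver = tx.find("receiver-org")
        if provider is not None and provider.get("provider-activity-id"):
            print(f"{own_id} received funds from activity {provider.get('provider-activity-id')}")
        if receiver is not None and receiver.get("receiver-activity-id"):
            print(f"{own_id} passed funds on to activity {receiver.get('receiver-activity-id')}")
```

The chain only joins up if every publisher fills in those provider/receiver activity identifiers – which is exactly where the co-ordination challenges below come in.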

As a distributed approach to the publication of open data, there are no consistency checks in IATI to ensure that providers and recipients agree on identifiers, and often there can be practical challenges to capture this data, not least that:

  • A) Many of the accounting systems in which transaction data is captured have no fields for upstream or downstream project identifier, nor any way of conceptually linking transactions to these externally defined projects;
  • B) Some parties in the funding chain may not publish IATI data, or may do so in forms that do not support traceability, breaking the chain;
  • C) The identifier of a downstream project may not be created at the time an upstream project assigns funds – exchanging identifiers can create a substantial administrative burden;

At the last IATI TAG meeting in Ottawa, this led to some discussion of other technologies that might be explored to address issues of traceability.

Technical utopias and practical traceability

Let’s start with a number of assorted observations:

  • UPS can track a package right around the world, giving me regular updates on where it is. The package has a barcode on it, and is being transferred by a single company.
  • I can make a faster-payments bank transfer in the UK with a reference number that appears in both my bank statements and the recipient’s statements, travelling between banks in seconds. Banks leverage their trust, and use centralised third-party providers as part of data exchange and reconciling funding transfers.
  • When making some international transfers, the money can effectively disappear from view for quite a while, with lots of time spent on the phone to sender, recipient and intermediary banks to track down the funds. Trust, digital systems and reconciliation services function less well across international borders.
  • Transactions on the BitCoin blockchain are, to some extent, traceable. BitCoin is a distributed system. (Given any BitCoin ‘address’ it’s possible to go back into the public ledger and see which addresses have transferred an amount of bitcoins there, and to follow the chain onwards. If you can match an address to an identity, the currency, far from being anonymous, is fairly transparent. This is the reason for BitCoin mixer services, designed to remove the trackability of coins.)
  • There are reported experiments with using blockchain technologies in a range of different settings, including for land registries.
  • There’s a lot of investment going into FinTech right now – exploring ways to update financial services.

All of this can lead to some excitement about the potential of new technologies to render funding flows traceable. If we can trace parcels and BitCoins, the argument goes, why can’t we have traceability of public funds and development assistance?

Although I think such an argument falls down in a number of key areas (which I’ll get to in a moment), it does point towards a key component missing from the current aid transparency landscape – in the form of a shared ledger.

One of the reasons IATI is based on a distributed data publishing model, without any internal consistency checks between publishers, is prior experience in the sector of submitting data to centralised aid databases. However, peer-to-peer and blockchain-like technologies now offer a way to separate the co-ordination and creation of consensus on the state of the world from the centralisation of data in a single database.

It is at least theoretically possible to imagine a world in which the data a government publishes about its transactions is only considered part of the story, and in which the recipient needs to confirm receipt in a public ledger to complete the transactional record. Transactions ultimately have two parts (sending and receipt), and open (distributed) ledger systems could offer the ability to layer an auditable record on top of the actual transfer of funds.
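
As a thought experiment, here’s a minimal sketch of what ‘completing the transactional record’ might look like: a flow only counts as confirmed when both the sending and the receiving party have published matching entries against a shared reference. All the organisation names, references and amounts are invented – this illustrates the reconciliation logic, not any real system.

```python
# Toy two-part ledger: a transfer is 'confirmed' only when sender and
# recipient have both reported it, and their amounts agree.
from dataclasses import dataclass

@dataclass(frozen=True)
class LedgerEntry:
    reporter: str      # who published this entry
    sender: str
    recipient: str
    reference: str     # shared transaction reference
    amount: float

ledger = [
    LedgerEntry("DonorGov", "DonorGov", "MultilateralFund", "TX-001", 1_000_000),
    LedgerEntry("MultilateralFund", "DonorGov", "MultilateralFund", "TX-001", 1_000_000),
    LedgerEntry("MultilateralFund", "MultilateralFund", "INGO", "TX-002", 400_000),
    # No matching entry from the INGO for TX-002, so it stays unconfirmed.
]

def reconcile(entries):
    """Group entries by reference and check both parties have reported."""
    by_ref = {}
    for e in entries:
        by_ref.setdefault(e.reference, []).append(e)
    for ref, group in by_ref.items():
        reporters = {e.reporter for e in group}
        parties = {group[0].sender, group[0].recipient}
        amounts = {e.amount for e in group}
        confirmed = parties <= reporters and len(amounts) == 1
        status = "confirmed" if confirmed else "unconfirmed"
        print(f"{ref}: {status} ({', '.join(sorted(reporters))} reported)")

reconcile(ledger)
```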

However (as I said, there are some serious limitations here), such a system is only an account of the funding flows, not the flows themselves (unlike BitCoin), which still leaves space for corruption through maintaining false information in the ledger. Although trusted financial intermediaries (banks and others) could be brought into the picture as additional parties responsible for confirming transactions, it’s hard to envisage how adoption of such a system could be brought about over the short and medium term (particularly globally). Secondly, although transactions between organisations might be made more visible and traceable in this way, the transactions inside an organisation remain opaque. Working out which funds relate to which internal and external projects is still a matter of the internal business processes of the organisations involved in the aid delivery chain.

There may be other traceability systems we should be exploring as inspirations for making aid and public money traceable. What my brief look at BitCoin leads me to reflect on is the potential role, over the short term, of reconciliation services that can, at the very least, report on the extent to which different IATI publishers are mutually confirming each other’s information. Over the long term, a move towards more real-time transparency infrastructures, rather than periodic data publication, might open up new opportunities – although with all sorts of associated challenges.

Ultimately – creating traceable aid still requires labour to generate shared conceptual understandings of how particular transactions and projects relate.

How much is enough?

Let’s loop back round. In this post (as in many of the conversations I’ve had about traceability), we started with some use cases for traceability; we saw some of the challenges; we got briefly excited about what new technologies could do to provide traceability; we saw the opportunities, but also the many limitations. Where do we end up then?

I think the important thing is to loop back to our use cases, and to consider how technology can help with, but not completely solve, the problems set out. Knowing which provider organisations might have been funded through a particular donor’s money could be enough to help that donor target investigations in cases of fraud. Or knowing all the funders who have a stake in projects in a particular country, sector and locality can be enough for communities on the ground to do further research to identify the funders they need to talk to.

Rather than searching after a traceability data panopticon, can we focus traceability-enabling practices on breaking down the barriers to specific investigatory processes?

Ultimately, in the IATI case, getting traceability to work at the project level alone could be a big boost. But doing this will require a lot of social coordination, as much as technical innovation. As we think about tools for traceability, thinking about tools that support this social process may be an important area to focus on.

Where next

Steven Flower and the rest of the Open Data Services team will be working over the coming weeks on a deeper investigation of traceability issues – with the goal of producing a report and toolkit later this year. They’ve already been digging into IATI data to look for the links that exist so far, and building on past work testing the concept of traceability against real data.

Drop in comments below, or drop Steven a line, if you have ideas to share.

OCDS – Notes on a standard

Today sees the launch of the first release of the Open Contracting Data Standard (OCDS). The standard, as I’ve written before, brings together concrete guidance on the kinds of documents and data that are needed for increased transparency in processes of public contracting, with a technical specification describing how to represent contract data and meta-data in common ways.

The video below provides a brief overview of how it works (or you can read the briefing note), and you can find full documentation at http://standard.open-contracting.org.

When I first jotted down a few notes on how to go forward from the rapid prototype I worked on with Sarah Bird in 2012, I didn’t realise we would actually end up with the opportunity to put some of those ideas into practice. However: we did – and so in this post I wanted to reflect on some aspects of the standard we’ve arrived at, some of the learning from the process, and a few of the ideas that have guided at least my inputs into the development process.

As, hopefully, others pick up and draw upon the initial work we’ve done (in addition to the great inputs we’ve had already), I’m certain there will be much more learning to capture.

(1) Foundations for ‘open by default’

Early open data advocacy called for ‘raw data now‘, asking governments essentially to export and dump online existing datasets, with issues of structure and regular publishing processes to be sorted out later. Yet, as open data matures, the discussion is shifting to the idea of ‘open by default’, and taken seriously this means more than just openly licensing whatever data dumps are created: it should mean that data is released from government systems as a matter of course, as part of their day-to-day operation.

The full OCDS model is designed to support this kind of ‘open by default’, allowing publishers to provide small releases of data every time some event occurs in the lifetime of a contracting process. A new tender is a release. An amendment to that tender is a release. The contract being awarded, or then signed, are each releases. These data releases are tied together by a common identifier, and can be combined into a summary record, providing a snapshot view of the state of a contracting process, and a history of how it has developed over time.
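
A much-simplified sketch of that releases-and-records idea: each event is a small release sharing a common ocid, and releases can later be merged into a record snapshot. Field names loosely follow OCDS, but the data and the merge logic here are invented simplifications (the real standard uses arrays for awards and defines its own merge rules).

```python
# Toy releases for one contracting process, published as events happen.
releases = [
    {"ocid": "ocds-xxx-0001", "id": "0001-01", "date": "2014-01-10",
     "tag": ["tender"], "tender": {"title": "School meals", "value": 100000}},
    {"ocid": "ocds-xxx-0001", "id": "0001-02", "date": "2014-02-20",
     "tag": ["tenderAmendment"], "tender": {"value": 120000}},
    {"ocid": "ocds-xxx-0001", "id": "0001-03", "date": "2014-04-01",
     "tag": ["award"], "award": {"supplier": "Acme Catering", "value": 118000}},
]

def compile_record(releases):
    """Merge releases (oldest first) into a snapshot of the latest state."""
    snapshot = {}
    for release in sorted(releases, key=lambda r: r["date"]):
        for key, value in release.items():
            if key in ("id", "tag", "date"):
                continue
            if isinstance(value, dict):
                snapshot.setdefault(key, {}).update(value)
            else:
                snapshot[key] = value
    return {"ocid": snapshot["ocid"], "releases": releases,
            "compiledRelease": snapshot}

record = compile_record(releases)
print(record["compiledRelease"]["tender"])  # {'title': 'School meals', 'value': 120000}
print(record["compiledRelease"]["award"])   # {'supplier': 'Acme Catering', 'value': 118000}
```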

This releases and records model seeks to combine together different user needs: from the firm seeking information about tender opportunities, to the civil society organisation wishing to analyse across a wide range of contracting processes. And by allowing core stages in the business process of contracting to be published as they happen, and then joined up later, it is oriented towards the development of contracting systems that default to timely openness.

As I’ll be exploring in my talk at the Berkman Centre next week, the challenge ahead for open data is not just to find standards to make existing datasets line-up when they get dumped online, but is to envisage and co-design new infrastructures for everyday transparent, effective and accountable processes of government and governance.

(2) Not your minimum viable product

Different models of standard

Many open data standard projects adopt either a ‘Minimum Viable Product‘ approach, looking to capture only the few most common fields between publishers, or are developed by focussing on the concerns of a single publisher or user. Whilst MVP models may make sense for small building blocks designed to fit into other standardisation efforts, when it came to OCDS there was a clear user demand to link up data along the contracting process, and this required an overarching framework into which simple components could be placed, or from which they could be extracted, rather than the creation of ad-hoc components with an attempt to join them up made later on.

Whilst we didn’t quite achieve the full abstract model + idiomatic serialisations proposed in the initial technical architecture sketch, we have ended up with a core schema, and then suggested ways to represent this data in both structured and flat formats. This is already proving useful, for example, in exploring how data published as part of the UK Local Government Transparency Code might be mapped to OCDS from existing CSV schemas.

(3) The interop balancing act & keeping flex in the framework

OCDS is, ultimately, not a small standard. It seeks to describe the whole of a contracting process, from planning, through tender, to contract award, signed contract, and project implementation. And at each stage it provides space for capturing detailed information, linking to documents, and tracking milestones, values and line-items.

This shape of the specification is a direct consequence of the method adopted to develop it: looking at a diverse set of existing data, and spending time exploring the data that different users wanted, as well as looking at other existing standards and data specifications.

However, OCDS by no means covers all the things that publishers might want to state about contracting, nor all the things users may want to know. Instead, it focusses on achieving interoperability of data in a number of key areas, and then providing a framework into which extensions can be linked as the needs of different sub-communities of open data users arise.

We’re only in the early stages of thinking about how extensions to the standard will work, but I suspect they will turn out to be an important aspect: allowing different groups to come together to agree (or contest) the extra elements that are important to share in a particular country, sector or context. Over time, some may move into the core of the standard, and potentially elements that appear core right now might move into the realm of extensions, each able to have their own governance processes if appropriate.

As Urs Gasser and John Palfrey note in their work on Interop, the key in building towards interoperability is not to make everything standardised and interoperable, but is to work out the ways in which things should be made compatible, and the ways in which they should not. Forcing everything into a common mould removes the diversity of the real world, yet leaving everything underspecified means no possibility to connect data up. This is both a question of the standards, and the pressures that shape how they are adopted.

(4) Avoiding identity crisis

Data describes things. To be described, those things need to be identified. When describing data on the web, it helps if those things can be unambiguously identified and distinguished from other things which might have the same names or identification numbers. This generally requires the use of globally unique identifiers (guid): some value which, in a universe of all available contracting data, for example, picks out a unique contracting process; or, in the universe of all organizations, uniquely identifies a specific organization. However, providing these identifiers can turn out to be both a politically and technically challenging process.

The Open Data Institute have recently published a report on the importance of identifiers that underlines how important identifiers are to processes of opening data. Yet, consistent identifiers often have key properties of public goods: everyone benefits from having them, but providing and maintaining them has some costs attached, which no individual identifier user has an incentive to cover. In some cases, such as goods and service identifiers, projects have emerged which take a proprietary approach to fund the maintenance of those identifiers, selling access to the lookup lists which match the codes for describing goods and services to their descriptions. This clearly raises challenges for an open standard, as when proprietary identifiers are incorporated into data, then users may face extra costs to interpret and make sense of data.

In OCDS we’ve sought to take as distributed an approach to identifiers as possible, only requiring globally unique identifiers where absolutely necessary (identifying contracts, organizations and goods and services), and deferring to existing registration agencies and identity providers, with OCDS maintaining, at most, code lists for referring to each identity ‘scheme’.

In some cases, we’ve split the ‘scheme’ out into a separate field: for example, an organization identifier consists of a scheme field with a value like ‘GB-COH’, to stand for UK Companies House, and then the identifier given in that scheme, like ‘5381958’. This approach allows people to store those identifiers in their existing systems without change (existing databases might hold national company numbers, with the field assumed to come from a particular register), whilst making explicit in the OCDS the scheme they come from. In other cases, however, we look to create new composite string identifiers, combining a prefix and some identifier drawn from an organization’s internal system. This is particularly the case for the Open Contracting ID (ocid). By doing this, the identifier can travel between systems more easily as a guid – and could even be incorporated in unstructured data as a key for locating documents and resources related to a given contracting process.
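
In code, the two identifier patterns look something like the sketch below. ‘GB-COH’ is the scheme code mentioned above; the company number, ocid prefix and internal reference are all invented for illustration.

```python
# Pattern 1: scheme kept as a separate field alongside the local identifier.
organisation_identifier = {
    "scheme": "GB-COH",  # the register the id comes from (UK Companies House)
    "id": "5381958",     # the id as already held in existing systems
}

# Pattern 2: a composite globally-unique string (the ocid), built from a
# registered prefix plus an identifier from the publisher's own system.
def make_ocid(registered_prefix: str, internal_reference: str) -> str:
    return f"{registered_prefix}-{internal_reference}"

ocid = make_ocid("ocds-a1b2c3", "TENDER-2014-0042")
print(ocid)  # ocds-a1b2c3-TENDER-2014-0042
```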

However, recent learning from the project is showing that many organisations are hesitant about the introduction of new IDs, and that adoption of an identifier scheme may require as much advocacy as adoption of a standard. At a policy level, bringing some external convention for identifying things into a dataset appears to be seen as affecting the, for want of a better word, sovereignty of a specific dataset: even if in practice the prefix approach of the ocid means it only needs to be hard-coded in the systems that expose data to the world, not necessarily stored inside organizations’ databases. However, this is an area I suspect we will need to explore more, and keep tracking, as OCDS adoption moves forward.

(5) Bridging communities of practice

If you look closely you might in fact notice that the specification just launched in Costa Rica is actually labelled as a ‘release candidate‘. This points to another key element of learning in the project, concerning the different processes and timelines of policy and technical standardisation. In the world of funded projects and policy processes, deadlines are often fixed, and the project plan has to work backwards from there. In a technical standardisation process, there is no ‘standard’ until a specification is in use and has been robustly tested. The processes for adopting a policy standard and setting a technical one differ – and whilst perhaps we should have spoken from the start of the project of an overall standard, embedding within it a technical specification, we were too far down the path towards the policy launch by the time this became clear. As a result, the Release Candidate designation is intended to suggest the specification is ready to draw upon, but that there is still a process to go (and future governance arrangements to be defined) before it can be adopted as a standard per se.

(6) The schema is just the start of it

This leads to the most important point: that launching the schemas and specification is just one part of delivering the standard.

In a recent e-mail conversation with Greg Bloom about elements of standardisation, linked to the development of the Open Referral standard, Greg put forward a list of components that may be involved in delivering a sustainable standards project, including:

  • The specification – with its various components and subcomponents;
  • Tools that assess compliance according to the spec (e.g. validation tools, and more advanced assessment tools);
  • Some means of visualizing a given set of data’s level of compliance;
  • Incentives of some kind (whether positive or negative) for attaining various levels of compliance;
  • Processes for governing all of the above;
  • and, of course, the community through which all of this emerges and is sustained.

To this we might also add elements like documentation and tutorials, support for publishers, catalysing work with tool builders, guidance for users, and so-on.

Open government standards are not something to be published once, and then left, but require labour to develop and sustain, and involve many social processes as much as technical ones.

In many ways, although we’ve spent a year of small development iterations working towards this OCDS release, the work now is only just getting started, and there are many technical, community and capacity-building challenges ahead for the Open Contracting Partnership and others in the open contracting movement.

Joined Up Philanthropy data standards: seeking simplicity, and depth

[Summary: technical notes on work in progress for the Open Philanthropy data standard]

I’m currently working on sketching out an alpha version of a data standard for the Open Philanthropy project (soon to be 360giving). Based on work Pete Bass has done analysing the supply of data from trusts and foundations, a workshop on demand for the data, and a lot of time spent looking at existing standards at the content layer (eGrant/hGrant, IATI, Schema.org, GML etc.) and deeper technical layers (CSV, SDF, XML, RDF, JSON, JSON-Schema and JSON-LD), I’m getting closer to having a draft proposal. But – ahead of that – and spurred on by discussions at the Berkman Center this afternoon about the role of blogging in helping the idea-formation process, here’s a rough outline of where it might be heading. (What follows is ‘thinking aloud’ from my work in progress, and does not represent any set views of the Open Philanthropy project.)

Building Blocks: Core data plus

Joined Up Data Components

There are lots of things that different people might want to know about philanthropic giving, from where money is going, to detailed information on the location of grant beneficiaries, information on the grant-making process, and results information. However, few trusts and foundations have all this information to hand, and very few are likely to have it in a single system, such that creating a single open data file covering all these different areas of the funding process would be an easy task. And if presented with a massive spreadsheet with 100s of columns to fill in, many potential data publishers are liable to be put off by the complexity. We need a simple starting point for new publishers of data, and a way for those who want to say more about their giving to share deeper and more detailed information.

The approach to that should be a modular, rather than monolithic standard: based on common building blocks. Indeed, in line with the Joined Up Data efforts initiated by Development Initiatives, many of these building blocks may be common across different data standards.

In the Open Philanthropy case, we’ve sketched out seven broad building blocks, in addition to the core “who, what and how much” data that is needed for each of the ‘funding activities’ that are the heart of an open philanthropy standard. These are:

  • Organisations – names, addresses and other details of the organisations funding, receiving funds and partnering in a project
  • Process – information about the events which take place during the lifetime of a funding activity
  • Locations – information about the geography of a funded activity – including the location of the organisations involved, and the location of beneficiaries
  • Transactions – information about pledges and transfers of funding from one party to another
  • Results – information about the aims and targets of the activity, and whether they have been met
  • Classifications – categorisations of different kinds that are applied to the funded activity (e.g. the subject area), or to the organisations involved (e.g. audited accounts?)
  • Documents – links to associated documents, and more in-depth descriptions of the activity

Some of these may provide more in-depth information about some core field (e.g. ‘Total grant amount’ might be part of the core data, but individual yearly breakdowns could be expressed within the transactions building block), whilst others provide information that is not contained in the core information at all (results or documents for example).
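
As a purely hypothetical illustration of ‘core data plus building blocks’, the sketch below shows a minimal funding activity with an optional transactions block adding the yearly breakdown. The field names are my own invention for this example, not the draft standard itself.

```python
# Core "who, what and how much" data that any publisher should manage.
core_activity = {
    "id": "GB-EXAMPLE-TRUST-GRANT-001",
    "title": "Community food co-op start-up grant",
    "funder": "Example Trust",
    "recipient": "Anytown Food Co-op",
    "totalAmount": 30000,
    "currency": "GBP",
}

# Optional 'transactions' building block: more depth on the same core field.
core_activity["transactions"] = [
    {"type": "payment", "date": "2014-04-01", "amount": 10000},
    {"type": "payment", "date": "2015-04-01", "amount": 10000},
    {"type": "payment", "date": "2016-04-01", "amount": 10000},
]

# The breakdown should reconcile with the core total.
assert sum(t["amount"] for t in core_activity["transactions"]) == core_activity["totalAmount"]
```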

An ontological approach: flat > structured > linked

One of the biggest challenges with sketching out a possible standard data format for open philanthropy is in balancing the technical needs of a number of different groups:

  • Publishers of the data need it to be as simple as possible to share their information. Publishing open philanthropy data must be simple, with a minimum of technical skills and resources required. In practice, that means flat, spreadsheet-like data structures.
  • Analysts like flat spreadsheet-style data too – but often want to be able to cut it in different ways. Standards like IATI are based on richly structured XML data, nested a number of levels deep, which can make flattening the data for analysts to use very challenging.
  • Coders prefer structured data. In most cases for web applications that means JSON. Whilst some expressive path languages for JSON are emerging, ideally a JSON structure should make it easy for a coder to simply drill down the tree to find what they want, so being able to look for activity.organisations.fundingOrganisation[0] is better than having to iterate through all the activity.organisation nodes to find the one which has "type":"fundingOrganisation" (a contrast sketched in the code example after this list).
  • Data integrators want to read data into their own preferred database structures, from noSQL to relational databases. Those wanting to integrate heterogeneous data sources from different ‘Joined Up Data’ standards might also benefit from Linked Data approaches, and graph-based data using cross-mapped ontologies.
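
Here’s the contrast from the coders’ bullet above, sketched with invented data: the first shape lets you drill straight down a path, while the second forces you to iterate and filter on a type field.

```python
# Shape 1: organisations nested by role - direct path access.
nested_by_role = {
    "activity": {
        "organisations": {
            "fundingOrganisation": [{"name": "Example Trust"}],
            "beneficiaryOrganisation": [{"name": "Anytown Food Co-op"}],
        }
    }
}
funder = nested_by_role["activity"]["organisations"]["fundingOrganisation"][0]

# Shape 2: one organisation list with a 'type' field - iterate and filter.
flat_with_types = {
    "activity": {
        "organisation": [
            {"name": "Example Trust", "type": "fundingOrganisation"},
            {"name": "Anytown Food Co-op", "type": "beneficiaryOrganisation"},
        ]
    }
}
funder_again = next(
    org for org in flat_with_types["activity"]["organisation"]
    if org["type"] == "fundingOrganisation"
)

assert funder["name"] == funder_again["name"]
```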

It’s pretty hard to see how a single format for representing data can meet the needs of all these different parties: if we go with a flat structure it might be easier for beginners to publish, but the standard won’t be very expressive, and will be limited to use in a small niche. If we go with richer data structures, the barriers to entry for newcomers will be too high. Standards like IATI have faced challenges through the choice of an expressive XML structure which, whilst able to capture much of the complexity of information about aid flows, is both tricky for beginners and programmatically awkward to parse for developers. There are a lot of pitfalls an effective, and extensible, open philanthropy data standard will have to avoid.

In considering ways to meet the needs of these different groups, the approach I’ve been exploring so far is to start from a detailed, ontology based approach, and then to work backwards to see how this could be used to generate JSON and CSV templates (and as JSON-LD context), allowing transformation between CSV, JSON and Linked Data based only on rules taken from the ontology.

In practice that means I’ve started sketching out an ontology using Protege in which there are top entities for ‘Activity’, ‘Organisation’, ‘Location’, ‘Transaction’, ‘Documents’ and so on (each of the building blocks above), and more specific sub-classed entities like ‘fundedActivity’, ‘beneficiaryOrganisation’, ‘fundingOrganisation’, ‘beneficiaryLocation’ and so on. Activities, Organisations, Locations etc. can all have many different data properties, and there are then a range of different object properties to relate ‘fundedActivities’ to other kinds of entity (e.g. a fundedActivity can have a fundingOrganisation and so on). If this all looks very rough right now, that’s because it is. I’ve only built out a couple of bits in working towards a proof-of-concept (not quite there yet): but from what I’ve explored so far it looks like building a detailed ontology should also allow mappings to other vocabularies to be easily managed directly in the main authoritative definition of the standard, and should mean that, when converted into Linked Data, heterogeneous data using the same or cross-mapped building blocks can be queried together. Now – from what I’ve seen, ontologies can tend to get out of hand pretty quickly – so as a rule I’m trying to keep things as flat as possible: ideally just relationships between Activities and the other entities, and then data properties.

What I’ve then been looking at is how that ontology could be programmatically transformed:

  • (a) Into a JSON data structure (and JSON-LD Context)
  • (b) Into a set of flat tables (possibly described with Simple Data Format if there are tools for which that is useful)

So that, using the ontology, it should be possible to take a set of flat tables and turn them into structured JSON and, via JSON-LD, into Linked Data. If the translation to CSV takes place using the labels of ontology entities and properties rather than their IDs as column names, then localisation of spreadsheets should also be within reach.
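
To show the direction of travel, here’s a rough proof-of-concept sketch: a column-to-path mapping (which in the full approach would be generated from the ontology’s labels and relationships) turns a flat spreadsheet row into nested JSON, and a JSON-LD @context links the same terms to vocabulary URIs. The column names, terms and URIs are all invented.

```python
import json

# One row of a flat, CSV-style table, keyed by human-readable column labels.
flat_row = {
    "Activity id": "ACT-001",
    "Funding organisation": "Example Trust",
    "Beneficiary location": "Anytown",
    "Amount": "30000",
}

# Mapping from column label to a path in the structured representation;
# in principle this mapping is derived from the ontology.
column_to_path = {
    "Activity id": ["activity", "id"],
    "Funding organisation": ["activity", "fundingOrganisation", "name"],
    "Beneficiary location": ["activity", "beneficiaryLocation", "name"],
    "Amount": ["activity", "amount"],
}

def row_to_json(row, mapping):
    """Build nested JSON by walking each column's path."""
    result = {}
    for column, value in row.items():
        node = result
        *parents, leaf = mapping[column]
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return result

structured = row_to_json(flat_row, column_to_path)

# A JSON-LD context mapping the same terms to (invented) vocabulary URIs,
# so the structured JSON can also be read as Linked Data.
structured["@context"] = {
    "activity": "http://example.org/openphil#fundedActivity",
    "fundingOrganisation": "http://example.org/openphil#fundingOrganisation",
    "beneficiaryLocation": "http://example.org/openphil#beneficiaryLocation",
}

print(json.dumps(structured, indent=2))
```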

Rough work in progress. From ontology to JSON structure (and then onwards to flat CSV model). Full worked example coming soon…

I hope to have a more detailed worked example of this to post shortly, or, indeed, a post detailing the dead-ends I came to when working this through further. But – if you happen to read this in the next few weeks, before that occurs – and have any ideas, experience or thoughts on this approach – I would be really keen to hear from you. I have been looking for any examples of this being done already – and have not come across anything: but that’s almost certainly because I’m looking in the wrong places. Feel free to drop in a comment below, or tweet @timdavies with your thoughts.

Young Rewired State at Oxfam

Update: Postponed – we weren’t quite quick enough off the blocks to recruit young people to take part in an Oxfam hack-day during the main Young Rewired State week, so the Oxfam YRS has been postponed. We’ll hopefully work out a new date / plan in the next few weeks. However, other Young Rewired State centres are still on the go…

What happens when you take 5 or 10 young coders and designers aged between 15 and 18; give them a room at the heart of Oxfam HQ; link them up with designers, campaigners and digital experts; and give them a week to create things with government data?

I’m not sure yet. But in a few weeks hopefully we’ll find out.

I’m helping to organise a Young Rewired State event at Oxfam HQ in Oxford to do just that – and right now we’re looking for young people from the local area to apply to take part.

You can download a flyer with lots more information to share with any young people you think might be interested, and a sign-up form is here. Deadline for applications is 25th July – but the sooner applications come in the more chance they have. Young Rewired State events are also taking place across the UK, so if you know young people who might be interested but can’t make it to Oxfam HQ in Oxford every day during the first week of August, point them in the direction of the national Rewired State Website.

Legacies of social reporting: an IGF09 example

[Summary: aggregating content from the Internet Governance Forum & exploring ways to develop the legacy of social reporting at events…]

Introducing social reporting to an event can bring many immediate benefits. From new skills for those participating in the social reporting, to increasing opportunities for conversation at the event, and building bridges between those present at an event, and those interested in the topic but unable to physically take part.

However, the wealth of content gathered through social reporting can also act as a resource ‘after the event’ – offering insights and narratives covering event themes, and offering contrasting and complementary perspectives to any ‘official’ event records that may exist.

Many of the tools I use when social reporting at an event have a certain ‘presentism’ about them. Newer content is prioritised over older content, and, in the case of dashboard aggregators like NetVibes, or services such as Twitter, good content can quickly disappear from the front page, or even altogether.

So, as we got towards the end of a frantic four days social reporting out at the Internet Governance Forum in Egypt earlier this year, I started thinking about how to make the most of the potential legacy impacts of the social reporting that was going on – both in the event-wide Twitterstream, and in the work of the young social reporters I was specifically working with.

Part of that legacy was about the skills and contacts gathered by the social reporters – so we quickly put together this handout for participants – but another part of that legacy was in the content. And gathering that together turned out to be trickier than I expected.

However, I now have a micro-site set up at http://igf2009.practicalparticipation.co.uk/ where you can find all the blog posts and blips created by our social reporters, as well as all the tagged tweets we could collect together. Over the coming weeks colleagues at Diplo will be tagging core content to make it easy to navigate and potentially use as part of online learning around Internet Governance. I’ve run the 3500+ twitter messages I managed to (eventually) aggregate through the Open Calais auto-tagging service, as an experiment to see if this provides ways to identify insights within them – and I’ve been exploring different ways to present the information found in the site.

Learning: Next time set up the aggregator in advance
I didn’t start putting together the site (a quick bit of Drupal + FeedAPI, with the later addition of Views, Panels, Autotagging, Timeline and other handy modules) till the final day of IGF09, by which time over 50 blog posts had been added to our Ning website, and over 3000 twitter messages tagged #igf09.

Frustratingly, Ning only provides the last 20 items in any RSS feed, and, as far as I can tell, no way to page through past items; and the Twitter search API is limited to fetching just 1500 tweets.

Fortunately when it came to Twitter I had captured all the Tweets in Google Reader – but still had to scrape Twitter message IDs back out of there – and set up a slow script to spend a couple of days fetching original tweets (given the rate limiting again on the Twitter API).

For Ning, I ended up having to go through and find all the authors who had written on IGF09, and to fetch the feeds of their posts, run through a Yahoo Pipe to create an aggregate feed of only those items posted during the time of the IGF.

It would have been a lot easier if I set up the Drupal + FeedAPI aggregator beforehand, and added new feeds to it whenever I found them.

Discoveries: Language and noise
I’ve spent most of my time just getting the content into this aggregator, and setting up a basic interface for exploring it. I’ve not yet had a chance to dive in and really explore the content itself. However, two things I noticed:

1) There is mention of a francophone hash-tag for IGF2009 in some of the tweets. Searching on that hash-tag now, over a month later, doesn’t turn up any results – but it’s quite possible that there were active conversations this aggregator fails to capture because we weren’t looking at the right tags.

Social Network Map of Tweets
Mapping Twitter @s with R and Iplot

2) A lot of the Twitter messages aggregated appear to be about the ‘censorship incident‘ that dominated external coverage of IGF09, but which was only a small part of all the goings on at IGF. Repeated tweeting and re-tweeting on one theme can drown out conversations on other themes unless there are effective ways to navigate and filter the content archives.

I’ve started to explore how @ messages, and RTs within Tweets could be used to visualise the structure, as well as content, of conversations – but have run up against the limitations of my meagre current skill set with R and iplot.

I’m now on the look out for good ways of potentially building some more intelligent analysis of tweets into future attempts to aggregate with Drupal – possibly by extracting information on @s and RTs at the time of import using the promising FeedAPI Scraper module from the great folk at Youth Agora.

Questions: Developing social reporting legacies
There is still a lot more to reflect upon when it comes to making the most of content from a socially reported event, not least:

1) How long should information be kept?

I’ve just been reading Delete, which very sensibly suggests that not all content should be online for ever – and particularly with conversational twitter messages or video clips, there may be a case for ensuring a social reporting archive only keeps content public for as long as there is a clear value in doing so.

2) Licensing issues

Aggregation on the model I’ve explored assumes licence to collect and share tweets and other content. Is this a fair assumption?

3) Repository or advocacy?

How actively should the legacy content from social reporting be used? Should managing the legacy of an event also involve setting up search and blog alerts, and pro-actively spreading content to other online spaces? If so – who should be responsible for that and how?


If you are interested in more exploration of Social Reporting, you may find the Social by Social network, and Social Reporters group there useful.

Voicebox – making the most of engagement

The process is depressingly familiar. Someone asks you to fill in a survey for research or consultation. They take away your results – and – in the rare cases where you ever hear of the research/consultation again – you see that your responses have been written up as part of a dull report, full of graphs made in Excel, and likely to sit on the book shelves of people whose behaviour betrays the probability that they’ve not really read or understood what was in the report.

Which is why it is refreshing to see the (albeit well funded) Vinspired team doing something rather different with their Voicebox survey of 16 – 25 year olds. Here’s how they introduce the project:

Journalists, politicians, academics, police and parents all have a point of view on what the ‘kids of today’ are like.

But has anyone ever asked the young people themselves, and not just in a focus group in Edmonton, but in an open and transparent way and on a national scale? And has anyone done anything smart, cool or fun with that data, that might, just might, make the truth about young people be heard?

These questions were the starting point for Voicebox; a project which aims to curate the views of 16-25s, visualise the results in creative ways, and then set that data free. Over the coming months, we’re going to try to find out how young people spend their time, what they care about, how many carry knives, what they really think about the area they live in and much more.

But not only are they breaking up their survey of views into manageable chunks, and giving instant feedback on the results to anyone filling the survey in – they are opening up the data they collect through an open XML API and CSV downloads, so anyone can take and use the data collected.

Plus – to make sure responses to the question ‘What do young people really care about?’ make it in front of decision makers – they’re planning to wire up the responses to a robot, ready to hand-write out each and every response as part of an installation in Parliament.

Of course, it’s not often that your budget stretches to custom-built flashy survey applications and internet-connected robots when you’re looking to gain young people’s input into local issues or policy making. But what Vinspired have done with VoiceBox does raise the questions: how will you make sure that you really make the most of the views young people give you? And how will you get young people’s views in front of decision makers in a way that makes them tricky to ignore?

Certainly two questions I’m going to be asking myself on any future consultation or engagement projects I work on…

Explaining Twitter in one page…

I’ve been trying to create a general purpose one page guide to Twitter for a while. I’ve made two attempts in the past for particular situations – although with the end of SMS based access to Twitter in the UK those guides are both out of date.

But – I think I’ve finally created a guide I’m happy with. This one was created for an Action Learning Set on Youth Participation and Social Network Sites I’m currently co-facilitating – but written to work as an introduction in just about any circumstance.

You can get the PDF of this one page guide to Twitter from Scribd.com (look for the download link) or, as with all the other one page guides, you can download an Open Office copy (ODT) to edit and re-purpose as you wish (just make sure you let me know about any updated versions).

(Thanks to Harry @ Neontribe for photos and feedback used in this guide)

Guide preview:

NYA Youth Issues News as RSS

I like to try and keep up to date with the latest news about goings on in the Youth sector. I’ve got a dashboard page in my NetVibes homepage devoted to the latest information on youth issues and initiatives – particularly useful for sparking ideas about youth participation or promoting positive activities.

One of the best sources of news about young people is The National Youth Agency’s press clippings service (Youth Issues News), which serves up a dose of the latest headlines every day. Frustratingly though, it’s not available as an RSS feed to slot nicely into my news dashboard – so, with a little help from folk on Twitter, I’ve used Dapper.net to create my own RSS feed from the NYA press clippings.

I thought others might find it useful as well, so if you want to use it, simply copy this link here into your RSS reader and (if it all works alright) get daily updated headlines of young people-related news.

(If you’re not sure what all this RSS thing is about then the BBC have a pretty good introduction on their website.)

Sharing learning from the Plings project…

[Summary: I’m going to be blogging for the Plings project – exploring new ways of collecting and sharing positive activity information for young people]

The Plings project has been one to watch for a while. Exploring new ways of collecting, processing and sharing information on positive activities for young people.

Local authorities are under a duty to provide information on the out-of-school activities in a local area that young people can get involved with – but collecting and disseminating all that information is a big challenge.

Plings, built by research cooperative Substance, is an open source project that has been seeking to pilot and explore ways of semantically aggregating and then distributing that data, through XML feeds, iCal widgets and other mash-ups. Now that Substance has won the contract to lead the DCSF funded Information and Signposting Project, they’re going to be accelerating the development of the Plings project, and working with 20 local authorities to generate stacks of shared learning about collecting, processing and sharing positive activity information. This week has already seen the data from Plings made available via DigiTV, and I’m in the midst of scoping how positive activity information could be shared through Social Network Sites.

And if I can keep up with all the learning being generated, I’ll hopefully be blogging as much of it as possible over on the Plings blog.

So, if you’re interested in public sector mash-ups, promoting positive activities to young people, or just exploring new ways of innovating in the youth sector, please do subscribe to the Plings blog and throw in your thoughts and reflections to the comments there as the project moves forward…


(Disclosure: My blogging for the Plings project is part of a paid contract with Substance. I’m sharing news of it here as I think the learning from the ISP/Plings project will be of interest to a lot of readers of this blog.)

A social media game without an evening lost laminating

[Summary: Using Moo.com to make workshop resources]

(This post is mainly for those who have spent far too long laminating little bits of card late at night in preparation for a workshop the next day…)

I’ve used variations on the Social Media Game in workshops before. The game, which works by setting scenarios and getting workshop participants to explore how they would use different tools or approaches to respond to those scenarios, is a really effective way to encourage knowledge sharing and practical learning in a session.

However, preparing the game cards for a workshop always turns into one of those nightmare jobs. Simply printing them on paper or card isn’t enough – they’re too flimsy – and it’s always surprising how much the quality of a resource affects people’s interaction with it. So, up until now, that’s always meant an evening of laminating little bits of printed paper to create good quality cards. And I know I’m not the only one who suffers this small but significant laminating challenge – @davebriggswife has rendered great services to social media in this country through laminating little bits of social media game card.

So, this time, as I started putting together the ‘Social Network Game’ for the Federation of Detached Youth Workers’ conference next Friday, I thought I’d try something different. And this morning a set of wonderful ‘Social Network Game’ postcards arrived on my doormat courtesy of Moo.com.



All I needed to do was to create each of the cards as an image, upload them to Moo, pay a few quid, and ta-da – high quality postcard-size workshop resources ready to go.

Why bother blogging this?

Well, aside from trying to save others who lose evenings to the laminating machine – I’m really interested by the potential that Print on Demand solutions like that of Moo.com can offer for:

  • Creating high quality resources – I’ve always been struck by how having good quality resources for workshops affects people’s responses. But often getting things professionally printed for a one-off workshop just isn’t viable… but can be with Print on Demand.
  • Resource sharing – Using the Moo.com API I could provide an easy way for anyone else to order a set of the Social Network Game cards I’ve designed. (In fact, once I’ve tested them out in a workshop I might try and create a set for others to get hold of…)
  • Promoting positive activities – Could the Information and Signposting project make use of the positive activities data and multi-media it’s collecting to make it really cheap and easy for activity providers to order promotional postcards to hand out?

Definitely something I’m keen to explore more. Would be great to hear about any other ideas or experience that you have…