Category Archives: Techie things

Joined Up Philanthropy data standards: seeking simplicity, and depth

[Summary: technical notes on work in progress for the Open Philanthropy data standard]

I’m currently working on sketching out an alpha version of a data standard for the Open Philanthropy project (soon to be 360giving). Based on work Pete Bass has done analysing the supply of data from trusts and foundations, a workshop on demand for the data, and a lot of time spent looking at existing standards at the content layer (eGrant/hGrant, IATI, Schema.org, GML etc.) and deeper technical layers (CSV, SDF, XML, RDF, JSON, JSON-Schema and JSON-LD), I’m getting closer to having a draft proposal. But – ahead of that – and spurred on by discussions at the Berkman Center this afternoon about the role of blogging in helping in the idea-formation process, here’s a rough outline of where it might be heading. (What follows is ‘thinking aloud’ from my work in progress, and does not represent any set views of the Open Philanthropy project.)

Building Blocks: Core data plus

Joined Up Data Components

There are lots of things that different people might want to know about philanthropic giving, from where money is going, to detailed information on the location of grant beneficiaries, information on the grant-making process, and results information. However, few trusts and foundations have all this information to hand, and very few are likely to have it in a single system such that creating a single open data file covering all these different areas of the funding process would be an easy task. And if presented with a massive spreadsheet with 100s of columns to fill in, many potential data publishers are liable to be put off by the complexity. We need a simple starting point for new publishers of data, and a way for those who want to say more about their giving to share deeper and more detailed information.

The approach to that should be a modular, rather than monolithic standard: based on common building blocks. Indeed, in line with the Joined Up Data efforts initiated by Development Initiatives, many of these building blocks may be common across different data standards.

In the Open Philanthropy case, we’ve sketched out seven broad building blocks, in addition to the core “who, what and how much” data that is needed for each of the ‘funding activities’ that are the heart of an open philanthropy standard. These are:

  • Organisations - names, addresses and other details of the organisations funding, receiving funds and partnering in a project
  • Process - information about the events which take place during the lifetime of a funding activity
  • Locations - information about the geography of a funded activity – including the location of the organisations involved, and the location of beneficiaries
  • Transactions - information about pledges and transfers of funding from one party to another
  • Results - information about the aims and targets of the activity, and whether they have been met
  • Classifications - categorisations of different kinds that are applied to the funded activity (e.g. the subject area), or to the organisations involved (e.g. audited accounts?)
  • Documents - links to associated documents, and more in-depth descriptions of the activity

Some of these may provide more in-depth information about some core field (e.g. ‘Total grant amount’ might be part of the core data, but individual yearly breakdowns could be expressed within the transactions building block), whilst others provide information that is not contained in the core information at all (results or documents for example).

An ontological approach: flat > structured > linked

One of the biggest challenges with sketching out a possible standard data format for open philanthropy is in balancing the technical needs of a number of different groups:

  • Publishers of the data need it to be as simple as possible to share their information. Publishing open philanthropy must be simple, with a minimum of technical skills and resources required. In practice, that means flat, spreadsheet-like data structures.
  • Analysts like flat spreadsheet-style data too – but often want to be able to cut it in different ways. Standards like IATI are based on richly structured XML data, nested a number of levels deep, which can make flattening the data for analysts to use it very challenging.
  • Coders prefer structured data. In most cases for web applications that means JSON. Whilst some expressive path languages for JSON are emerging, ideally a JSON structure should make it easy for a coder to simply drill down in the tree to find what they want, so being able to look for activity.organisations.fundingOrganisation[0] is better than having to iterate through all the activity.organisation nodes to find the one which has "type": "fundingOrganisation".
  • Data integrators want to read data into their own preferred database structures, from noSQL to relational databases. Those wanting to integrate heterogeneous data sources from different ‘Joined Up Data’ standards might also benefit from Linked Data approaches, and graph-based data using cross-mapped ontologies.
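To make the coders' point concrete, here is a minimal sketch of the two JSON shapes being contrasted. The record contents (the grant title and organisation names) are invented for illustration, and neither shape is a settled part of any standard.

```python
import json

# Hypothetical activity record, shape A: organisations keyed by their role.
keyed = json.loads("""
{
  "activity": {
    "title": "Community garden grant",
    "organisations": {
      "fundingOrganisation": [{"name": "Example Trust"}],
      "beneficiaryOrganisation": [{"name": "Green Streets"}]
    }
  }
}
""")

# Shape B: one flat list of organisation nodes, each tagged with a type.
typed = json.loads("""
{
  "activity": {
    "title": "Community garden grant",
    "organisation": [
      {"type": "beneficiaryOrganisation", "name": "Green Streets"},
      {"type": "fundingOrganisation", "name": "Example Trust"}
    ]
  }
}
""")

# Shape A: drill straight down the tree.
funder_a = keyed["activity"]["organisations"]["fundingOrganisation"][0]

# Shape B: iterate over every node and test its type.
funder_b = next(
    org
    for org in typed["activity"]["organisation"]
    if org["type"] == "fundingOrganisation"
)

print(funder_a["name"], funder_b["name"])
```

Both lookups find the same organisation, but the first needs no loop and no knowledge of how the type is encoded.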

It’s pretty hard to see how a single format for representing data can meet the needs of all these different parties: if we go with a flat structure it might be easier for beginners to publish, but the standard won’t be very expressive, and will be limited to use in a small niche. If we go with richer data structures, the barriers to entry for newcomers will be too high. Standards like IATI have faced challenges through the choice of an expressive XML structure which, whilst able to capture much of the complexity of information about aid flows, is both tricky for beginners, and programmatically awkward to parse for developers. There are a lot of pitfalls an effective, and extensible, open philanthropy data standard will have to avoid.

In considering ways to meet the needs of these different groups, the approach I’ve been exploring so far is to start from a detailed, ontology based approach, and then to work backwards to see how this could be used to generate JSON and CSV templates (and as JSON-LD context), allowing transformation between CSV, JSON and Linked Data based only on rules taken from the ontology.

In practice that means I’ve started sketching out an ontology using Protege in which there are top-level entities for ‘Activity’, ‘Organisation’, ‘Location’, ‘Transaction’, ‘Documents’ and so on (each of the building blocks above), and more specific sub-classed entities like ‘fundedActivity’, ‘beneficiaryOrganisation’, ‘fundingOrganisation’, ‘beneficiaryLocation’ and so on. Activities, Organisations, Locations etc. can all have many different data properties, and there are then a range of different object properties to relate ‘fundedActivities’ to other kinds of entity (e.g. a fundedActivity can have a fundingOrganisation and so on). If this all looks very rough right now, that’s because it is. I’ve only built out a couple of bits in working towards a proof-of-concept (not quite there yet): but from what I’ve explored so far it looks like building a detailed ontology should also allow mappings to other vocabularies to be easily managed directly in the main authoritative definition of the standard, and should mean that, when converted into Linked Data, heterogeneous data using the same or cross-mapped building blocks can be queried together. Now – from what I’ve seen ontologies can tend to get out of hand pretty quickly – so as a rule I’m trying to keep things as flat as possible: ideally just relationships between Activities and the other entities, and then data properties.
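As a rough illustration of what "working backwards" from an ontology could look like, the sketch below uses a plain Python dictionary as a stand-in for the Protege ontology and derives a JSON-LD context and a list of spreadsheet column labels from it. The dictionary layout, the example.org namespace and the property names are all assumptions made for this example, not the actual ontology.

```python
# Hypothetical, much-reduced stand-in for the ontology. Each property has an
# ID, a human-readable label, and the class it points to ("range"); a range
# of None marks a plain data property rather than an object property.
BASE = "http://example.org/philanthropy#"  # placeholder namespace

ONTOLOGY = {
    "fundedActivity": {
        "title": {"label": "Title", "range": None},
        "totalAmount": {"label": "Total grant amount", "range": None},
        "fundingOrganisation": {
            "label": "Funding organisation",
            "range": "Organisation",
        },
        "beneficiaryLocation": {
            "label": "Beneficiary location",
            "range": "Location",
        },
    }
}


def jsonld_context(ontology, base=BASE):
    """Build a JSON-LD @context mapping each property ID to an IRI."""
    context = {}
    for props in ontology.values():
        for prop_id, spec in props.items():
            if spec["range"] is None:
                # Data property: a simple term-to-IRI mapping.
                context[prop_id] = base + prop_id
            else:
                # Object property: string values are treated as IRIs.
                context[prop_id] = {"@id": base + prop_id, "@type": "@id"}
    return context


def csv_headers(ontology, class_name):
    """Use the labels, not the IDs, as spreadsheet column names."""
    return [spec["label"] for spec in ontology[class_name].values()]


context = jsonld_context(ONTOLOGY)
headers = csv_headers(ONTOLOGY, "fundedActivity")
```

Keeping labels separate from IDs is what would let the same definition drive both machine-readable keys and human-readable spreadsheet templates.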

What I’ve then been looking at is how that ontology could be programmatically transformed:

  • (a) Into a JSON data structure (and JSON-LD Context)
  • (b) Into a set of flat tables (possibly described with Simple Data Format if there are tools for which that is useful)

The aim is that, using the ontology, it should be possible to take a set of flat tables and turn them into structured JSON and, via JSON-LD, into Linked Data. If the translation to CSV takes place using the labels of ontology entities and properties rather than their IDs as column names, then localisation of spreadsheets should also be in reach.
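A minimal sketch of the flat-to-structured step, assuming a hand-written mapping from column labels to paths in the JSON tree. In the approach described here that mapping would be generated from the ontology; the column names and sample row below are invented.

```python
import csv
import io

# Hypothetical mapping from spreadsheet column labels to paths in the JSON
# structure. Assumed for illustration; not generated from a real ontology.
COLUMN_PATHS = {
    "Title": ("title",),
    "Total grant amount": ("totalAmount",),
    "Funding organisation name": ("fundingOrganisation", "name"),
    "Beneficiary location name": ("beneficiaryLocation", "name"),
}


def row_to_activity(row, column_paths=COLUMN_PATHS):
    """Turn one flat spreadsheet row into a nested activity dictionary."""
    activity = {}
    for label, value in row.items():
        path = column_paths.get(label)
        if path is None or value == "":
            continue  # skip unknown columns and empty cells
        node = activity
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return activity


FLAT = """Title,Total grant amount,Funding organisation name,Beneficiary location name
Community garden grant,5000,Example Trust,Oxford
"""

activities = [row_to_activity(r) for r in csv.DictReader(io.StringIO(FLAT))]
```

Because the conversion is driven entirely by the label-to-path table, a localised spreadsheet would only need a translated set of labels pointing at the same paths. Values stay as strings here; typing them would need further rules.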


Rough work in progress. From ontology to JSON structure (and then onwards to flat CSV model). Full worked example coming soon…

I hope to have a more detailed worked example of this to post shortly, or, indeed, a post detailing the dead-ends I came to when working this through further. But – if you happen to read this in the next few weeks, before that occurs – and have any ideas, experience or thoughts on this approach – I would be really keen to hear your ideas. I have been looking for any examples of this being done already – and have not come across anything: but that’s almost certainly because I’m looking in the wrong places. Feel free to drop in a comment below, or tweet @timdavies with your thoughts.

Young Rewired State at Oxfam

Update: Postponed. We weren’t quite quick off the blocks enough to recruit young people to take part in an Oxfam hack-day during the main Youth Rewired State week: so the Oxfam YRS has been postponed. We’ll hopefully work out a new date / plan in the next few weeks. However, other Young Rewired State centres are still on the go…

What happens when you take 5 or 10 young coders and designers aged between 15 and 18; give them a room at the heart of Oxfam HQ; link them up with designers, campaigners and digital experts; and give them a week to create things with government data?

I’m not sure yet. But in a few weeks hopefully we’ll find out.

I’m helping to organise a Young Rewired State event at Oxfam HQ in Oxford to do just that – and right now we’re looking for young people from the local area to apply to take part.

You can download a flyer with lots more information to share with any young people you think might be interested, and a sign-up form is here. Deadline for applications is 25th July – but the sooner applications come in the more chance they have. Young Rewired State events are also taking place across the UK, so if you know young people who might be interested but can’t make it to Oxfam HQ in Oxford every day during the first week of August, point them in the direction of the national Rewired State Website.

Legacies of social reporting: an IGF09 example

[Summary: aggregating content from the Internet Governance Forum & exploring ways to develop the legacy of social reporting at events...]

Introducing social reporting to an event can bring many immediate benefits. From new skills for those participating in the social reporting, to increasing opportunities for conversation at the event, and building bridges between those present at an event, and those interested in the topic but unable to physically take part.

However, the wealth of content gathered through social reporting can also act as a resource ‘after the event’ – offering insights and narratives covering event themes, and offering contrasting and complementary perspectives to any ‘official’ event records that may exist.

Many of the tools I use when social reporting at an event have a certain ‘presentism’ about them. Newer content is prioritised over older content, and, in the case of dashboard aggregators like NetVibes, or services such as Twitter, good content can quickly disappear from the front page, or even altogether.

So, as we got towards the end of a frantic four days social reporting out at the Internet Governance Forum in Egypt earlier this year, I started thinking about how to make the most of the potential legacy impacts of the social reporting that was going on – both in the event-wide Twitterstream, and in the work of the young social reporters I was specifically working with.

Part of that legacy was about the skills and contacts gathered by the social reporters – so we quickly put together this handout for participants – but another part of that legacy was in the content. And gathering that together turned out to be trickier than I expected.

However, I now have a micro-site set up where you can find all the blog posts and blips created by our social reporters, as well as all the tagged tweets we could collect together. Over the coming weeks colleagues at Diplo will be tagging core content to make it easy to navigate and potentially use as part of online learning around Internet Governance. I’ve run the 3500+ twitter messages I managed to (eventually) aggregate through the Open Calais auto-tagging service as an experiment to see if this provides ways to identify insights within them – and I’ve been exploring different ways to present the information found in the site.

Learning: Next time set up the aggregator in advance
I didn’t start putting together the site (a quick bit of Drupal + FeedAPI, with the later addition of Views, Panels, Autotagging, Timeline and other handy modules) till the final day of IGF09, by which time over 50 blog posts had been added to our Ning website, and over 3000 twitter messages tagged #igf09.

Frustratingly, Ning only provides the last 20 items in any RSS feed, and, as far as I can tell, no way to page through past items; and the Twitter search API is limited to fetching just 1500 tweets.

Fortunately when it came to Twitter I had captured all the Tweets in Google Reader – but still had to scrape Twitter message IDs back out of there – and set up a slow script to spend a couple of days fetching original tweets (given the rate limiting again on the Twitter API).

For Ning, I ended up having to go through and find all the authors who had written on IGF09, and to fetch the feeds of their posts, run through a Yahoo Pipe to create an aggregate feed of only those items posted during the time of the IGF.
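The merge-and-filter step can be sketched in a few lines of Python. This is not what the Yahoo Pipe did internally, just an illustration of the same idea; it assumes each author's feed has already been parsed into dictionaries with `link`, `title` and `published` keys, and the sample items and date window are placeholders.

```python
from datetime import datetime


def aggregate(feeds, start, end):
    """Merge several feeds, keeping unique items published in [start, end]."""
    seen = set()
    merged = []
    for items in feeds:
        for item in items:
            if item["link"] in seen:
                continue  # same post reached via more than one feed
            if start <= item["published"] <= end:
                seen.add(item["link"])
                merged.append(item)
    return sorted(merged, key=lambda item: item["published"])


# Illustrative data only: two per-author feeds with one overlapping post.
author_feeds = [
    [
        {"link": "http://example.org/post/1", "title": "Day one notes",
         "published": datetime(2009, 11, 15, 10, 0)},
        {"link": "http://example.org/post/0", "title": "Packing for Egypt",
         "published": datetime(2009, 11, 1, 9, 0)},
    ],
    [
        {"link": "http://example.org/post/2", "title": "Youth panel",
         "published": datetime(2009, 11, 16, 14, 30)},
        {"link": "http://example.org/post/1", "title": "Day one notes",
         "published": datetime(2009, 11, 15, 10, 0)},
    ],
]

event_items = aggregate(
    author_feeds,
    start=datetime(2009, 11, 15),
    end=datetime(2009, 11, 18, 23, 59),
)
```

De-duplicating on the link matters because fetching feeds author by author can return the same post more than once.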

It would have been a lot easier if I had set up the Drupal + FeedAPI aggregator beforehand, and added new feeds to it whenever I found them.

Discoveries: Language and noise
I’ve spent most of my time just getting the content into this aggregator, and setting up a basic interface for exploring it. I’ve not yet had chance to dive in and really explore the content itself. However, two things I noticed:

1) There is mention of a francophone hash-tag for IGF2009 in some of the tweets. Searching on that hash-tag now, over a month later, doesn’t turn up any results – but it’s quite possible that there were active conversations this aggregator fails to capture because we weren’t looking at the right tags.

Social Network Map of Tweets

Mapping Twitter @s with R and Iplot

2) A lot of the Twitter messages aggregated appear to be about the ‘censorship incident’ that dominated external coverage of IGF09, but which was only a small part of all the goings on at IGF. Repeated tweeting and re-tweeting on one theme can drown out conversations on other themes unless there are effective ways to navigate and filter the content archives.

I’ve started to explore how @ messages, and RTs within Tweets could be used to visualise the structure, as well as content, of conversations – but have run up against the limitations of my meagre current skill set with R and iplot.

I’m now on the look out for good ways of potentially building some more intelligent analysis of tweets into future attempts to aggregate with Drupal – possibly by extracting information on @s and RTs at the time of import using the promising FeedAPI Scraper module from the great folk at Youth Agora.
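As a rough sketch of what extracting @s and RTs at import time might involve (in Python rather than as a Drupal module, and with made-up tweets), the function below turns a list of (author, text) pairs into a weighted edge list. The regular expressions are simplistic assumptions and will also catch things like email addresses.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")
RETWEET = re.compile(r"\bRT\s+@(\w+)")


def edges(tweets):
    """Count (author, target, kind) edges from (author, text) pairs."""
    counts = Counter()
    for author, text in tweets:
        retweeted = {name.lower() for name in RETWEET.findall(text)}
        for name in MENTION.findall(text):
            kind = "retweet" if name.lower() in retweeted else "mention"
            counts[(author.lower(), name.lower(), kind)] += 1
    return counts


# Invented sample tweets, for illustration only.
sample = [
    ("alice", "RT @bob: Opening session starting #igf09"),
    ("carol", "@alice did you see the youth panel? #igf09"),
    ("alice", "@carol yes, great points from @bob"),
]

edge_counts = edges(sample)
```

An edge list like this could be written out as CSV and loaded into R or another graphing tool to draw the conversation network.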

Questions: Developing social reporting legacies
There is still a lot more to reflect upon when it comes to making the most of content from a socially reported event, not least:

1) How long should information be kept?

I’ve just been reading Delete, which very sensibly suggests that not all content should be online for ever – and particularly with conversational twitter messages or video clips, there may be a case for ensuring a social reporting archive only keeps content public for as long as there is a clear value in doing so.

2) Licensing issues

Aggregation on the model I’ve explored assumes licence to collect and share tweets and other content. Is this a fair assumption?

3) Repository or advocacy?

How actively should the legacy content from social reporting be used? Should managing the legacy of an event also involve setting up search and blog alerts, and pro-actively spreading content to other online spaces? If so – who should be responsible for that and how?

If you are interested in more exploration of Social Reporting, you may find the Social by Social network, and Social Reporters group there useful.

Voicebox – making the most of engagement

The process is depressingly familiar. Someone asks you to fill in a survey for research or consultation. They take away your results – and – in the rare cases where you ever hear of the research/consultation again – you see that your responses have been written up as part of a dull report, full of graphs made in Excel, and likely to sit on the book shelves of people whose behaviour betrays the probability that they’ve not really read or understood what was in the report.

Which is why it is refreshing to see the (albeit well funded) Vinspired team doing something rather different with their Voicebox survey of 16 – 25 year olds. Here’s how they introduce the project:

Journalists, politicians, academics, police and parents all have a point of view on what the ‘kids of today’ are like.

But has anyone ever asked the young people themselves, and not just in a focus group in Edmonton, but in an open and transparent way and on a national scale? And has anyone done anything smart, cool or fun with that data, that might, just might, make the truth about young people be heard?

These questions were the starting point for Voicebox; a project which aims to curate the views of 16-25s, visualise the results in creative ways, and then set that data free. Over the coming months, we’re going to try to find out how young people spend their time, what they care about, how many carry knives, what they really think about the area they live in and much more.

But not only are they breaking up their survey of views into manageable chunks, and giving instant feedback on the results to anyone filling the survey in – they are opening up the data they collect through an open XML API and CSV downloads, so anyone can take and use the data collected.

Plus – to make sure responses to the question ‘What do young people really care about?’ make it in front of decision makers – they’re planning to wire up the responses to a robot, ready to hand-write out each and every response as part of an installation in Parliament.

Of course, it’s not often that your budget stretches to custom-built flashy survey applications and internet-connected robots when you’re looking to gain young people’s input into local issues or policy making. But what Vinspired have done with VoiceBox does raise the questions: how will you make sure that you really make the most of the views young people give you? And how will you get young people’s views in front of decision makers in a way that makes them tricky to ignore?

Certainly two questions I’m going to be asking myself on any future consultation or engagement projects I work on…

Explaining Twitter in one page…

I’ve been trying to create a general purpose one page guide to Twitter for a while. I’ve made two attempts in the past for particular situations – although with the end of SMS based access to Twitter in the UK those guides are both out of date.

But – I think I’ve finally created a guide I’m happy with – with this guide created for an Action Learning Set on Youth Participation and Social Network Sites I’m currently co-facilitating – but written to work as an introduction in just about any circumstance.

You can get the PDF of this one page guide to Twitter from (look for the download link) or, as this guide, like all the other one page guides, is provided you can download an Open Office copy (ODT) to edit and re-purpose as you wish (just make sure you let me know about any updated versions).

(Thanks to Harry @ Neontribe for photos and feedback used in this guide)

Guide preview:

NYA Youth Issues News as RSS

I like to try and keep up to date with the latest news about goings on in the Youth sector. I’ve got a dashboard page in my NetVibes homepage devoted to the latest information on youth issues and initiatives – particularly useful for sparking ideas about youth participation or promoting positive activities.

One of the best sources for young people-related news is The National Youth Agency’s press clippings service (Youth Issues News), which serves up a dose of the latest headlines every day. Frustratingly though, it’s not available as an RSS feed to slot nicely into my news dashboard – so, with a little help from folk on twitter, I’ve created my own RSS feed from the NYA press clippings.

I thought others might find it useful as well, so if you want to use it, then simply copy this link here into your RSS reader and (if it all works alright) get daily updated headlines of young people-related news.

(If you’re not sure what all this RSS thing is about then the BBC have a pretty good introduction on their website.)

Sharing learning from the Plings project…

[Summary: I'm going to be blogging for the Plings project - exploring new ways of collecting and sharing positive activity information for young people]

The Plings project has been one to watch for a while, exploring new ways of collecting, processing and sharing information on positive activities for young people.

Local authorities are under a duty to provide information on the out-of-school activities in a local area that young people can get involved with – but collecting and disseminating all that information is a big challenge.

Plings, built by research cooperative Substance, is an open source project that has been seeking to pilot and explore ways of semantically aggregating and then distributing that data, through XML feeds, iCal widgets and other mash-ups. Now that Substance has won the contract to lead the DCSF funded Information and Signposting Project, they’re going to be accelerating the development of the Plings project, and working with 20 local authorities to generate stacks of shared learning about collecting, processing and sharing positive activity information. This week has already seen the data from Plings made available via DigiTV, and I’m in the midst of scoping how positive activity information could be shared through Social Network Sites.

And if I can keep up with all the learning being generated, I’ll hopefully be blogging as much of it as possible over on the Plings blog.

So, if you’re interested in public sector mash-ups, promoting positive activities to young people, or just exploring new ways of innovating in the youth sector, please do subscribe to the Plings blog and throw in your thoughts and reflections to the comments there as the project moves forward…

(Disclosure: My blogging for the Plings project is part of a paid contract with Substance. I’m sharing news of it here as I think the learning from the ISP/Plings project will be of interest to a lot of readers of this blog.)

A social media game without an evening lost laminating

[Summary: Using Moo to make workshop resources]

(This post is mainly for those who have spent far too long laminating little bits of card late at night in preparation for a workshop the next day…)

I’ve used variations on the Social Media Game in workshops before. The game, which works by setting scenarios, and getting workshop participants to explore how they would use different tools or approaches to respond to those scenarios is a really effective way to encourage knowledge sharing and practical learning in a session.

However, preparing the game cards for a workshop always turns into one of those nightmare jobs. Simply printing them on paper or card isn’t enough – they’re too flimsy – and it’s always surprising how much the quality of a resource affects people’s interaction with it. So, up until now, that’s always meant an evening of laminating little bits of printed paper to create good quality cards. And I know I’m not the only one who suffers this small but significant laminating challenge – @davebriggswife has rendered great services to social media in this country through laminating little bits of social media game card.

So, this time, as I started putting together the ‘Social Network Game’ for the Federation of Detached Youth Workers’ conference next Friday I thought I’d try something different. And this morning a set of wonderful ‘Social Network Game’ postcards arrived on my doormat courtesy of Moo.


All I needed to do was to create each of the cards as an image, upload them to Moo, pay a few quid, and ta-da – high quality postcard-size workshop resources ready to go.

Why bother blogging this?

Well, aside from trying to save others who lose evenings to the laminating machine – I’m really interested in the potential that Print on Demand solutions like Moo’s can offer for:

  • Creating high quality resources – I’ve always been struck by how having good quality resources for workshops affects people’s responses. But often getting things professionally printed for a one-off workshop just isn’t viable… but can be with Print on Demand.
  • Resource sharing – Using the API I could provide an easy way for anyone else to order a set of the Social Network Game cards I’ve designed. (In fact, once I’ve tested them out in a workshop I might try and create a set for others to get hold of…)
  • Promoting positive activities – Could the Information and Signposting project make use of the positive activities data and multi-media it’s collecting to make it really cheap and easy for activity providers to order promotional postcards to hand out?

Definitely something I’m keen to explore more. Would be great to hear about any other ideas or experience that you have…

Free guide: analytics for social change organisations…

In social change organisations we want to change things. Real world things. Things that make a difference to people.

If changing the numbers in our website statistics can contribute towards that, then we want to change those numbers.

But it can be far too easy (doubly so, it seems, when reports for funders are involved) to get trapped looking at the numbers, and to lose sight of how those are part of creating change for people.

I recently had the chance to put together a training pack/guide for Participation Works about web analytics, and how they can be used in a social change focussed organisation.

Much of the guide was specific to Participation Works, but a lot is, I hope, relevant to other social change organisations as well. And as I must acknowledge much debt to shared content from Beth Kanter and many others in putting together this guide, it only seems right to share what I can of it back freely to non-profit organisations.

So, attached to the bottom of this post you will find an outline version of that guide for you to use, adapt and build upon. To quote from it:

This is a skeleton document for building a guide to web analytics for social change organisations.

It is shared under a creative commons non commercial license in the interests of supporting those working with not for profit organisations. If you wish to use or adapt this guide as part of paid consultancy to not-for-profit organisations, or in private sector settings, please contact in advance.

This guide is not out-of-the-box ready to be used. Throughout this document you will find text highlighted in yellow which will need customizing for the particular context where use of this guide is intended. This customization will require some technical knowledge. Other areas of the document not highlighted in yellow may also need to be changed depending on your context.

That said, most of Chapters 1, 2 and 4 can be taken and used fairly as-is.

Oh, and whilst you are thinking about ways of measuring the impact of your organisation, if you happen to be:

  • From a not for profit organisation,
  • Based in England,
  • Working with young people, or with young people as stakeholders in your work,

then you might want to get in touch with Participation Works to find out about their free programme of training and support for third-sector organisations in building their capacity to listen to and respond to the voice of children and young people. The web analytics only tell you so much… it’s the conversations with, and the handing of power to, service users that really help you know whether you’re heading in the right direction…

Attachment: Analytics for social change organisations.doc