Open Arms? Unlocking raw data

[Summary: Exploring the process of requesting access to a raw dataset]

Update 22nd December: Almost a month on, and whilst my post on the OPSI Data Unlocking Service has had 30 votes in favour (more than any other request I can see by far) I’ve not heard from either OPSI or the data owner/data.gov.uk in response to my comments/requests for raw data. So far, it looks like requesting new raw data through the advertised routes doesn’t meet with much action. I’ll wait till the Open Up competition closes in the New Year to see what results that might bring – and then it’s time to start looking at what other ways there might be to request this data…

A lot of the open government data that has been released in recent years is only available locked up in PDFs and website interfaces. As this definition seeks to explain, this radically limits the potential uses of that data.

Following a recent event organised by Campaign Against the Arms Trade I was curious about who the UK issues Export Control Licenses to, so I took a look on data.gov.uk. Sure enough, the Strategic Export Controls: Reports and Statistics Website is listed on the Data.gov.uk catalogue. But on closer investigation it turns out that the Strategic Export Controls: Reports and Statistics Website (a) requires registration before you can access it; (b) predominantly provides data as PDFs; (c) has a very complex search interface that generates reports in the background ready for download later – but reports which don’t include key information such as the month a license was issued. All the data is clearly in the system – as you can search by date – but in its current form, to extract meaningful information about where UK companies have gained arms export licenses (or been refused) would be a long and slow job.

I’ve heard about the OPSI Data Unlocking Service, and I’ve been in a number of presentations hearing senior government officials and Ministers talking about the commitment of government to releasing raw data, so I thought this would provide a good opportunity to test the process of requesting raw data.

So – as of this morning, I’ve tried three routes to ask for access to this data:

  1. Adding a comment to the package on Data.gov.uk requesting access to the data. I’ve also sent a copy of the comment via the ‘Feedback Form’ listed under ‘Contact Details’ for each dataset. From past experience, I think the comment form gets forwarded to the Data.gov.uk team who forward it on to the department – but I’m not certain where that message has gone, or who reads the comments on datasets.
  2. Submitting a request to the OPSI Data Unlocking Service. This appeared to submit an e-mail form to the OPSI webmaster, who is, I understand, supposed to check the request and then add it to the OPSI website for others to vote on – as well as, I presume, pass it to someone inside OPSI to review and act upon – although the process by which a request could lead to data is fairly unclear. My request is not yet on the website.
  3. Adding an idea submission to the TSO Open Up Competition which you can see here. As I understand, the TSO are working closely with government on open data projects, although they don’t have authority to open access to data themselves. However, there does appear to be an interest from the competition in what datasets people want to see – so I figured a request via here can’t harm.

I suspect a fourth route might be to submit a Freedom of Information Request, but I’m keen to explore in the first place how these open data requesting channels work in practice. Have I missed any? How else should I be requesting access to raw data? Do you have experience of requesting data? What worked and what didn’t?

I’ll report back on any updates on the process of getting access to this data…

Defining raw data

[Summary: explaining what raw data is and why it matters]

On the Friday of last week’s Open Government Data Camp, in a discussion on how to empower non-technical citizens, civil servants and community activists to make use of open government data, we hit upon the idea of an ‘Open Data Cook Book’ of simple recipes for working with data. The recipe analogy also emerged (via @exmosis) in a twitter discussion on Monday about ‘machine-readable data’ – and a bit of cook-book drafting later, here’s my attempt at describing good open data, whilst avoiding as much as possible any technical terms or getting caught up in the ambiguity of machine-readability.

Sourcing your ingredients for a raw data project:

For all of the recipes in the forthcoming open data cook book you will need to have access to some raw data to work with. You might already have the data you want to work with to hand, or you might have ideas for a great project, but no idea of where to get the data you need. In the cook book we will outline a range of places you can source your data, and how to prepare it ready to be part of your data-creations.

Identifying raw data

You can find data all over the place when you start looking, but all-too-often the data you want has been pre-prepared, locked down in written reports, or only available through complicated website interfaces that only let you glimpse a small bit of the data at any one time.

Raw data is easier to manipulate with a computer. When you have raw data you can sort it, edit it and remix it in new ways with the tools you want to use.

Locked, raw, linked

We can think of data on a continuum.

At one end, is locked-up data. This is the sort of data you find in reports, charts and maps. Someone has interpreted what the data means and has pinned it down in a particular context. To use this data in new ways you will probably have to spend time converting it into a raw format through scraping, crowd-sourcing, or lots of manual work.

In the middle is raw data. This is when the data is available in a structured way that you can load into the software or online tools of your choice and can explore, manipulate and remix it. Raw data is ready for use in open data recipes.

However, to make use of any raw dataset you will need to know what it contains. Often raw data can contain cryptic headings, titles and codes for columns, rows or other elements of the dataset, so you will need to make sure you have access to meta-data which tells you what all the things in your raw dataset are, and how the data was generated (sort of like the ingredients list, and list of additives and preservatives on the back of any food packet).

Linked data and RDF provide a way for the meta-data to be transferred along with the raw data, and for connections to be made between different datasets that make it possible to discover even more context about something in your data. Linked data can make it easier to integrate different datasets when they use the same ways of representing different parts of the data. The tools for working with linked data aren’t quite as widespread yet as the tools for working with standard raw data formats, so often linked data is transformed into a common raw data format like CSV (spreadsheets/tabular data), or JSON and XML (flexible structures for different sorts of data).
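To make the idea of ‘common raw data formats’ a little more concrete, here is a minimal sketch in Python showing the same small table as CSV and then as JSON. The column names and values are invented purely for illustration (they are not from any real dataset), and this is just one simple way of doing the conversion using Python’s standard library.

```python
import csv
import io
import json

# Hypothetical raw data: column names and values are invented for illustration.
CSV_TEXT = """destination,licences_issued
Exampleland,12
Samplestan,3
"""


def csv_to_records(text):
    """Read CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


records = csv_to_records(CSV_TEXT)

# The same rows, written out as JSON instead of CSV.
as_json = json.dumps(records, indent=2)
print(as_json)
```

Note that values read from CSV arrive as text: the headings tell you a column is called ‘licences_issued’, but not that it is a count. That is part of why the meta-data matters.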


I’ve still some more work to do tidying up these definitions – and I hope in the cook book we can make use of a few more visual metaphors to show the difference between locked-up, raw and linked information. The process of thinking through the relationship between raw and linked data as defined above, in conjunction with the DIKW model, also seems to hint at a useful point I’ve not found a good way of articulating yet: that in most mash-up creation/data-use, human understanding of both data and context (meta-data) as separate elements is important – so whilst linked data helps context travel with data, when it comes to working with data, most users need to decompose it back into raw data with separate data and context to work with it.

A fear of open data heresy? Time to move beyond zealotry?

[Summary: A quick post, mainly for folk at today’s Open Government Data Camp, on the need to raise critical perspectives about open government.]

There are strong normative arguments for opening up government data – and there is great potential to be realised from that.

However, whilst the broad brush idea can command widespread support, the details of how we do open government data matter, and attentiveness to the social impacts is vital.

I’ve heard many people at events, including the Open Government Data Camp, express nuanced views on openness. And yet, far too often such views have been followed by comments such as “but I’m not sure I should be saying that sort of thing here”, or a retreat from the critical argument in order to add voices to the call for ‘more data now’.

So – I’m for a bit more heresy. A bit more challenge to the zealotry. A slightly louder voice for the critical friends of the open data movement.

It’s possible to argue for greater openness of data, and to think critically about the impacts that open data will have. It’s important to ask the question ‘Open data + what?’ What do we need to be doing as well as releasing data to drive positive social change?

Young Lives Linked Data Demonstrator

[Summary: showcasing linked data for development project]

Over the past month or so I’ve been working for IKM Emergent on a demonstrator project to explore the potential implications of linked data for information management in the development sector – seeking to put a small sub-section of the survey micro-data from the Young Lives longitudinal study online in order to explore the process and potential of generating linked data in development-focussed settings.

The results of that project are now live and online for the time being, and accessible here. The most visually interesting part of the demonstrator (thanks to the work of Rupert Redington at NeonTribe) is the Comparator tool which does some pretty clever things to identify ‘Data Cubes’ in the Young Lives linked data dataset we’ve published, and to offer (in the case of the smoking prevalence data) comparisons between the Young Lives dataset, and another comparable dataset we’ve also loaded into our Young Lives datastore.

However, through the demonstrator we’ve also made the Health dataset from the Young Lives data available to browse via an OntoWiki interface, and to query via SPARQL – exploring how linked data structures give us the opportunity to annotate the questions from the Young Lives data – potentially helping future researchers to find questions and data of interest to them.

The presentation below steps through some of the basics of Linked Data, before, from slide 13 onwards, introducing the Young Lives Data Demonstrator.

I’ll be sharing some more learning notes from the Young Lives Linked Data Demonstrator over on the open data impacts blog soon.

Open Data Hack Day in Oxford – 4th December

Open Data Day Oxford on the 4th December 2010 is on the look out for designers, coders, copy-writers, policy people, journalists, statisticians, campaigners, data-geeks and anyone interested in exploring what can be done when you take some public data and spend a day creating things with it in order to contribute to some positive social change goals.

Thanks to Cowley based Web & Software Developers White October we’ve got a fantastic venue for an Oxford Open Data Hack Day* as part of the global Open Data Day events taking place right across the world.

Here’s how an open data hack day in Oxford should work:

  1. Anyone interested in taking part signs up using the registration form here, and, optionally, adds some notes to the planning Wiki page (Just click ‘Edit’ at the top-right of the wiki page, scroll to find where to add your notes, drop them in, ignoring any extra characters/symbols on the page you’re not sure about, and save the changed page. )

    You can sign-up with an idea for the project you want to work on on the day – or just to offer your skills. You don’t need to have taken part in a hack-day before, or to be an uber-geek to take part!

  2. The planning group will make sure we’ve got a good mix of people and possible project teams emerging – and might get in touch to link you up with potential collaborators for the day so you can have conversations in advance.
  3. On the day, we’ll start around 10am in the fantastic split-level and spacious White October offices, which are a walk/bus ride from the centre of Oxford (or a short bus-ride from the station) with coffee, refreshments and a chance to meet other participants and hear about different ideas for projects on the day.
  4. We’ll form into teams to work on particular projects. Teams will find a space, get laptops and computers out – and start building things. You can either spend your whole day working with a particular team, or you can take your skills between teams to help them out when they need.

    Teams usually develop fairly organically to have 3 – 5 people in (although some people choose to work in smaller or larger groups) and will have a mix of skills.

  5. In your teams you will identify the data you are working with and what you want to do – and start creating something. It could be anything. At past events we’ve built everything from mash-up maps, through to paper-based card-games and Facebook apps.

    Recent ideas I’ve heard for hack-day outputs include data-driven stencils for creating artworks; web applications for checking the best place to park a bike; mobile phone-based tools for finding transport routes – and lots more.

    I’m expecting to be spending a lot of my time helping source data – and helping people get hold of the data they want – and the team from White October will, I’m sure, be on hand offering their skills in all manner of digital webby stuff.

  6. By about 1pm we’ll get some lunch in – and depending on how work is going, we might break for people to feedback on progress so far and share any offers of, or requests for extra skills they have. We might even be able to link up by Skype with one of the other open data day events taking place around the world (tbc.)
  7. After an afternoon of making stuff, around 5pm, we’ll have a show and tell. If any kind sponsors get in touch we might even have some prizes to award to the best or most innovative creations.

    We’re thinking of inviting people from the City & County Council or other groups who might have an interest in releasing data along to see what has been created. Anyone with contacts who we could invite along to the show and tell, do let me know.

  8. We’ll tidy up and head to the pub – an optional ending to the day.

Are you up for it? If so – head over to the Wiki page to get registered. Offers of help organising, sourcing sponsorship, inviting show and tell participants etc. all welcome. Any questions? Drop them in as blog comments or on the Wiki.

(*Whilst some of the data we focus on might be Oxford/Oxfordshire based, participation is open to all, not just those based locally)

Brief practical notes on open data and activism

[Summary: Context, links, resources and ideas for working with open data in campaigning organisations and/or third-sector contexts.] (See other open data posts here.)

The rough notes below come from a short open session discussion held at the Campaign Against the Arms Trade (CAAT) annual gathering last Saturday exploring how open data could be useful to a campaigning organisation. A PDF copy is here: Open Data and Campaigning.

Background & Context

The last 18-months have seen an impressive array of policy initiatives and practical actions leading to the release of datasets from governments in the UK, the US and across the world in open and re-usable formats online. Datasets ranging from the location of educational institutions, to details of taxation and government spending, have been brought together in data portals such as data.gov and data.gov.uk.

The open government data ‘movement’ has three broad constituent parts:

  • An open Public Sector Information (PSI) movement – drawing upon economic arguments to call for government data to be released and made freely re-useable. Often drawing upon comparisons between the EU context, where government-collected data is copyright and restricted, and the US, where government datasets are more open and large industries have developed on the back of them (e.g. Weather data; Geodata etc.).

  • A transparency movement – linked to Access to Information and Freedom of Information movements – calling for the release of data in the interests of democratic empowerment, or data to be used in particular contexts and settings.

  • Digital government & semantic web computerization movements – focussed on the potential for innovation and more efficient working when data is made available for computer processing: and working to build open networks of knowledge across the Internet through linked-data approaches.

Many different groups can be found within the open government data ‘movement’ – from groups calling for aid transparency, to SME companies seeking to address what are seen as unfair data monopolies.

Policy context:

(See Open Government Data & Democracy report for a full timeline)

  • The http://data.gov initiative in the US proceeded from Obama’s first executive order on taking power as President.
  • http://data.gov.uk in the UK was initiated by Gordon Brown in 2009.
  • Since coming to power in 2010 the Coalition Government in the UK have continued to push open data initiatives – though with a slightly different ‘transparency’ and ‘accountability’ framing.
    • A requirement has been placed on local authorities to publish all spending over £500 by January, listing supplier and spend.
    • Government departments are under a similar requirement for all spend over £25,000, and have been asked to publish senior staff pay details and internal organizational diagrams.
    • Francis Maude has spoken of the need for a ‘Freedom of Data’ act, and has called for all responses to Freedom of Information requests that contain data to provide that data in machine-readable forms (i.e. Excel spreadsheets rather than print-outs or PDF files…)
    • Aid Transparency has been high up the government’s development agenda.
  • The World Bank have released significant amounts of their data as open data.
  • Australia, New Zealand and many European countries have ongoing open data initiatives and campaigns.

Beyond government data

It’s not only government supplied data that is of interest to campaigners:

  • Projects like TheyWorkForYou.com and PublicWhip.org generate structured data about politicians’ voting records by ‘scraping’ parliamentary records;
  • Data Journalists (led by innovators at The Guardian amongst other places) publish their research as open accessible spreadsheets of data that others can re-use.
  • Some NGOs and community organisations are publishing open datasets.

Why data?

One of the key properties of data is that it can be easily manipulated by computer – allowing datasets to be combined, visualized, explored and used in many more ways than a written report or printed document can.

Where to find data

For official government data – The Guardian’s World Government Data Search looks across a range of data catalogues like http://data.gov.uk. Find it at http://www.guardian.co.uk/world-government-data and search for keywords or topics of interest to you.

You can also search http://data.gov.uk direct to browse data by department or topic.

http://ckan.net/ provides a catalogue of open data from many different sources – including government data, NGOs and research projects. It is a good place to ‘register’ any open data you create. It is also wiki-like, meaning any user can edit the records – allowing the creation of ‘collections’ of data on a particular topic: e.g. ‘arms trade’.

ScraperWiki.com provides a collection of ‘scrapers’ which collect structured data from unstructured data-sources (i.e. make open data where the original publisher didn’t provide it). For example, a dataset of hospitality received by UK Government Ministers, originally only available as a large collection of different Word documents, is now here: http://scraperwiki.com/scrapers/government-meetings-with-external-organisations-ne/ and available for download (Update: it’s also now available from http://transparency.number10.gov.uk/)
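For anyone curious what a ‘scraper’ does under the bonnet, here is a rough sketch in Python using only the standard library. It is not ScraperWiki’s own code or API, and the page fragment, names and organisations are all invented for illustration: the point is simply that a table locked up in a web page can be turned into structured rows.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: invented for illustration, not from any real site.
PAGE = """
<table>
  <tr><th>Minister</th><th>Organisation</th></tr>
  <tr><td>Minister A</td><td>Example Ltd</td></tr>
  <tr><td>Minister B</td><td>Sample Trust</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of each table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def scrape_table(html):
    """Return the table as a list of dictionaries keyed by the header row."""
    scraper = TableScraper()
    scraper.feed(html)
    header, *body = scraper.rows
    return [dict(zip(header, row)) for row in body]


records = scrape_table(PAGE)
print(records)
```

Real pages are messier than this (nested tables, missing cells, data spread over many pages), which is why shared, community-maintained scrapers are so useful.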

If you are looking for a particular dataset – it can be worth asking in the data.gov.uk forums, or using the #opendata hash-tag on Twitter.

Data on MPs and voting records is available from www.theyworkforyou.com in the UK, and the PublicWhip.org project collects more detailed voting records and makes them available.

When data isn’t available

Try using the Public Data Unlocking Service to request that data is proactively published: http://www.opsi.gov.uk/unlocking-service/opsipage.aspx?page=unlockindex

If using the Freedom of Information Act to request data, remind the recipient of Francis Maude’s policy statements on the need to provide machine-readable data in return.

If the information is available on websites, but not as structured data – consider putting a request on http://www.scraperwiki.com for someone to build a tool to screen-scrape the data.

Consider using any of the ‘data competitions’ (e.g. http://openup.tso.co.uk) as a higher-profile way to ask for a dataset: emphasizing the government’s focus on accountability through transparency in other sectors such as local authority spending and aid.

Use the facts you can find from datasets like COINS (http://data.gov.uk/dataset/coins) to better structure Freedom of Information requests or crowdsourcing activities.

Explore ways to ‘crowd-source’ the data by calling on campaigners and supporters to find out particular facts – and to enter them into shared online spreadsheets (e.g. using Google Spreadsheets and Google Forms you can create an easy way for people to collaboratively input into a shared document – which can be instantly published online). Crowdsourcing tools like Ushahidi can also be used to develop projects such as http://WhereAreTheCuts.org – crowdsourcing reports of public spending cuts.

Working with data

Working with data scares many people – but it can start off very simply, and there are many approaches – including:

  1. Using data-driven websites such as http://TheyWorkForYou.com (MPs speeches and voting) or http://WhereDoesMyMoneyGo.com (government spending) which have taken government data and made it available in more accessible forms.
  2. Downloading and exploring a single dataset – many datasets can be opened in spreadsheet software like Excel. Sort and filter the columns to look for interesting information.
  3. Visualise the data – using a tool like IBM Many Eyes where you can upload simple datasets and explore a range of different ways of presenting the data.
  4. Building a mash-up – using tools like Google Spreadsheets and Google Fusion Tables, or Google Refine (available for free download) to explore and combine datasets. Google Fusion Tables will allow you to upload any spreadsheet, and, if it contains place names, quickly ‘geocode’ the data for displaying on a map. You can also combine two datasets – matching on any shared keys (e.g. MP name; Town name; Constituency) to build larger datasets.
  5. Holding a hack day – hack days like those organized by Rewired State bring together developers (coders/geeks) and people with problems to solve and spend one or two days of concerted effort creating ‘hacks’ (rapid prototypes) which address those issues, often using open data. For example, a hack-day could look to generate visualizations concerning arms licenses (CAAT Specific), or to create tools that support campaigners to get information to use when writing to MPs. (Update: We could have a campaigning strand at the Oxford Open Data Hack Day on 4th December if there was interest)
  6. Commissioning open data-based tools – developing hack-day created prototypes, or other ideas, into full working tools.

  7. Training activists in using data – through workshops and hands-on activities. (I’m mid way through developing a training workshop at the mo… suggestions of groups to pilot with welcome…)
  8. Releasing datasets – from in-house research or crowd-sourced data – and inviting supporters to use the data in creative ways. For example, putting researched data into Google Spreadsheets and, much as the Guardian Datablog does, sharing links to that data whenever posting news stories or website pages based upon it.
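For those comfortable with a little code, approaches 2 and 4 above (sorting and filtering one dataset, then combining two datasets on a shared key) can be sketched in a few lines of Python. Everything here is an assumption for illustration: the constituencies, names and figures are invented, and a spreadsheet or Fusion Tables would do the same job without any programming.

```python
import csv
import io

# Hypothetical datasets: constituencies, names and figures are invented.
MPS_CSV = """constituency,mp
Eastford,A. Example
Westham,B. Sample
"""

SPEND_CSV = """constituency,spend
Westham,2500
Eastford,900
Northby,400
"""


def read_rows(text):
    """Read CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on(left, right, key):
    """Combine rows from two datasets that share the same value for `key`."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


combined = join_on(read_rows(MPS_CSV), read_rows(SPEND_CSV), "constituency")

# Sort by spend (highest first), then keep only the rows above a threshold.
by_spend = sorted(combined, key=lambda row: int(row["spend"]), reverse=True)
large = [row for row in by_spend if int(row["spend"]) > 1000]

print(by_spend)
print(large)
```

Rows that only appear in one dataset (‘Northby’ here) are silently dropped by this kind of match, so it is worth checking what has fallen out whenever you combine datasets on a shared key.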

Going further

Search for the #opendata community on Twitter; or the ‘Open Government Data’ mailing lists run by Open Knowledge Foundation. Most of the links above will also provide access to further practical and background information on open government data.

Tim Davies, Practical Participation (tim@practicalparticipation.co.uk) can offer consultancy, training, workshops and support for organisations exploring the use of open data in campaigning. Please do get in touch to explore more…

Open government data is not just a one-way flow…

[Summary: How can citizens, community institutions and social enterprise be part of producing ‘government’ data as well as consuming it? Some quick reflections…] (Cross-posted to Open Data Impacts blog)

Alison Powell poses the question in this blog post of whether we are moving into an era of ‘policy-based evidence’: where ideologically-driven policy making may lead to an end of evidence collection on key indicators (justified, no doubt, in the interests of ‘efficiency’), but impoverishing our understanding of the impacts of key policy choices. Alison certainly has a point: collecting evidence on an issue has been a key political strategy for shifting the political debate: and when evidence on the impact of a policy is gone – showing the positive or negative impact it had becomes far trickier.

However, just because government stops collecting data, or requiring that data is collected, doesn’t necessarily have to mean the loss of important social-policy datasets. The same transformational technological forces that mean government no longer needs to, or can justify, monopolising the analysis of state data, mean that the monopoly power of government is no longer needed to collect and collate many social-policy relevant datasets.

For many datasets the state has acted as co-ordinator of data collection: using its authority to require data to be shared in a standardised form (more often than not, spreadsheets or forms filled in and mailed or e-mailed in to some official in central government, who then rekeys data into another spreadsheet…). But: with collaborative online tools, will from the grassroots, and the right co-ordination/leadership, many important datasets may be possible to generate without government involved at all.

Of course, if the strict definition of open government data is only applied to “produced or commissioned by government or government controlled entities” (though the definitions are a live debate…) then what I’m really talking about is community-created “open governance data” – or ‘data essential for informed democratic policy making’.

I don’t pretend that all the datasets Alison fears will be lost will survive: but it is worth thinking about how, if government no longer wants the data, those who care about the stories it will be telling in a few years’ time can keep collecting and take open, collaborative approaches to making governance data a two-way street…

(Some of the thoughts here are based on the lit review/analysis in §2.1 – 2.3 of my dissertation)

Online version: Open Data, Democracy and Public Sector Reform

Reposted from my Open Data Impacts research blog, which is where I’ll try and keep most open data related posts in future, with this blog maintaining its wider focus…

A public report based on the research for my MSc Dissertation is now available here*.

Thank you to everyone who contributed to the research whether in discussions, interviews or responding to the survey. As promised, I’ve started to share data from the survey, and will add to this as time allows in coming weeks.

Published for digressions

Over the weeks since I handed in my MSc Dissertation I’ve been trying to work out how best to share the final version. Each time I’ve started to edit it for release I’ve found more areas where I want to develop the argument further, or where I recognise that points I thought were conclusions are in fact the start of new questions. After trying out a few options, I settled on the fantastic Digress.it platform to put a copy of the report online – giving each paragraph its own URL and space for comments and trackbacks.

Hopefully this can help turn a static dissertation into something more dynamic as a tool for helping take forward thinking about the impacts of open government data. All comments, feedback, reflections and thinking aloud on the document welcome.


*Note: This is not the copy of the dissertation I submitted. That is still with the University being marked. When I submit a hard and digital library copy later this year I’ll post a link to those as the ‘official’ literature.

Oxfordshire: open & interactive

[Summary: A local post about open data, interactive working and social media in Oxford & Oxfordshire – and some rough ideas for making stuff happen…]

There’s not all that much open data published by local authorities in Oxfordshire right now, and whilst there are some great pockets of social media use, and digital technology projects across the different local authorities in the County, online interactivity from councillors, digital engagement from local councils, and hyperlocal community websites seem pretty sparse round here. We’ve got some great geek gatherings and social media meets, but not much that I can find in the way of social media surgery type activities.

But, having met with quite a few people from different local authorities across the County in the last month, it seems clear that there is real potential for more online engagement and open working in Oxfordshire, just some gaps in the knowledge, networks and catalysts to make things happen.

Which got me wondering about how the knowledge, networks and catalysts could be brought together. What would help…

  • …local authorities in Oxfordshire to understand, explore and release more open data;
  • …local authorities and community groups to get the most out of social media and interactive technology;
  • …turn Oxfordshire from a bit of a laggard in the worlds of open data and online interactivity, into a leading light…

And I realised: I’m not sure. But, here’s two modest proposals:

  • An informal gathering some time in September of people interested in catalysing more online engagement and open data action in the county to explore possibilities. Could we set up a regular social media surgery? What about some hack-days with local open data? Or should we head out and build a better directory of the hyperlocal websites across Oxford? Interested? Let me know in the comments below – and suggest when might be a good time on this Doodle and I’ll try and find a suitable venue… (offers of venues welcome…)
  • Running a half-day event for Oxfordshire local authorities sometime in the Autumn to provide an introduction to open data, social media and ideas for more interactive ways of working. Something to be discussed at an informal gathering perhaps – but I’d also be interested to hear direct from anyone in Oxfordshire Councils about whether this would be useful / what would be most useful… drop me an e-mail if you work with an Oxfordshire LA and you would be interested; or if you work with open data / social media locally and might be interested in helping organise something.

What do you think? If there is interest then I’d be up for spending a bit of time helping make something happen…

Of course – this may all already be happening? Or it might have been tried before? So comments / ideas on stuff already going / criticism / alternative ideas etc. welcome too…

Open data requires responsible reporting…

[Summary: Some initial reflections on the release and reporting of COINS government spending data]

The last week has seen big moves in the opening up of Government data, with the release today of the COINS database of government spending.

Since it was released at 9.30 this morning there has been a buzz of activity trying to clean the raw data up into usable forms (see the Open Knowledge Foundation and Guardian interfaces to exploring the COINS data) and I think it’s certainly fair to say that the race to create ways to explore the data has generated some impressive results – leading to tools beyond what may have been created by an internal government process to present the same data in user-friendly forms. We’re learning a lot right now about the potential of crowd-sourced collaborations between government and other groups. And thanks to the development of good ways to explore the data, it is already providing the basis for news stories on government spending… and this is where we’ve still got a lot to learn.

Responsible Reporting

Neither the Guardian (disappointingly), nor the Daily Mail (unsurprisingly), in reporting that government spent £1.8bn on consultants last year, give an account of how this figure was derived. Transparency can’t be for government alone.

It does not seem to be too much to ask that the reports give an account of how this data was derived, given they can very easily link to the raw data itself. The £1.8bn Consultancy Spend story is interesting. But without knowing what categories of codes from COINS were used to generate that figure – I’ve no way of using the transparency of the government data to explore that finding more for myself.

Interestingly, this may also fall foul of the terms under which the data is available: ‘Crown Copyright with Data.gov.uk Rights’. This requires attribution of the data ‘in the form the data provider specifies, or otherwise “Contains [insert name of Data Provider] data © Crown copyright and database right”’ and requires that users “do not misrepresent the Data or its source”.

As government develops new conventions for transparency – it would be good to see new conventions from mediators between data and the public too. Perhaps data.gov.uk should be clearer about attribution – and suggest that attribution should involve a clear link back to the dataset. If that was combined with some of the points Paul Clarke noted (and my comment on that post picks up on) around improving the user-friendly nature of data-stores, then simple steps might move us closer to ensuring transparency builds effective public debate – weaving data into the information.

Transparency in government means more than just change for government. And that’s important for advocates of open data and an open society not to lose sight of…