Content analysis, tagging, linked data and digital objectivities

I’ve tried to keep musings on research methodology & epistemology mostly off this blog (they are mostly to be found over on my just-out-of-stealth-mode ‘Open Data Impacts’ research blog). However, for want of somewhere better to park them, here are some brief(ish) reflections:

  • Content Analysis is a social science method that takes ‘texts’ and seeks to analyze them: usually involving ‘coding’ topics, people, places or other elements of interest in the texts, and seeking to identify themes that are emerging from them.
  • One of the challenges of any content analysis is developing a coding structure, and defending that coding structure as reasonable. In most cases, the coding structure will be driven by the research interest, and codes applied on the basis of subjective judgements by the researcher. In research based within more ‘objective’ epistemic frameworks, or at least research trying to establish conclusions as valid independently of the particular researcher, multiple people may be asked to code a text, and then tests of ‘inter-coder reliability’ (how much the coders agreed or disagreed) may be applied.
  • With the rise of social bookmarking sites such as Delicious, and the growth of conventions of tagging and folksonomy, much online content already has at least some set of ‘codes’ attached. For example, here you can see the tags people have applied to this blog on Delicious.
  • Looking up any tags that have been applied to an element of digital content could be useful for researchers as part of their reflective practice to ensure they have understood an element of content from a wide range of angles – beyond that which is primarily driving their research.
  • (With many caveats) It could also support some form of ‘extra-coder reliability’ providing a check of coding against ‘folk’ assessments of content’s meaning.
  • The growth of the semantic web also means that many of the objects which codes refer to (e.g. people, organizations, concepts) have referenceable URIs, and if not, the researcher can easily create them. Services such as Open Calais and Open Amplify also draw on vast ‘knowledge bases’ to machine-classify and code elements of text – identifying, with re-usable concept labels, people, places, organizations and even emotions. (The implications of machine classification for content analysis are not, however, the primary topic of this point or post).
  • Researchers could choose to code their content using semantic web URIs and conventions – contributing their meta-data annotations of texts to either local, or global, hypertexts. For example, if I’m coding a paragraph of text about the launch of data.gov.uk, instead of just adding my own arbitrary tags to it, I could mark up the paragraph based on some convention (RDFa?), and reference shared concepts. From a brief search of Subj3ct for ‘data’, I quickly find I have to make some fairly specific choices about which concepts of data I might be coding against, although hopefully, if those concepts have suitable relationships attached, I may be able to query my coded data in more flexible ways in the future.
  • All of this raises a mass of interesting epistemic issues, none of which I can do justice to in these brief notes, but which include:
    • Changing the relationship of the researcher to concept-creation – and encouraging both the re-use of concepts, and the shaping of shared semantic web concepts in line with the research;
    • The appropriateness, or not, of using concepts from the semantic web in social scientific research, where the relatively objectivist and context-free framing of most current semantic web projects runs counter to often subjectivist and interpretivist leanings within social science;
    • The role of key elements of the current web of concepts on the semantic web (for many social scientific concepts, primarily Wikipedia via the DBpedia project) where the choice of what concepts are easily referenceable or not depends on a complex social context involving both crowd-sourcing and centralised control (ref the policies of Wikipedia or other taxonomy / knowledge base providers).
  • The actual use of existing online tagging, and semantic web URIs as part of the content analysis coding process (or any other social scientific coding process for that matter) may remain, at present, both methodologically challenging, and impractical given the available tools – but is worth further reflection and exploration.
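To make the reliability ideas above a little more concrete, here is a rough Python sketch, very much an illustration rather than a worked-out method. It computes Cohen's kappa (one common inter-coder reliability statistic, chosen here as an assumption; the notes above don't name a particular test) for two coders, and a simple Jaccard overlap as a stand-in for ‘extra-coder reliability’ between a researcher's codes and folk tags. All of the codes, tags and the example.org concept URIs are invented for the example.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who each gave one code per text segment."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty list of segments")
    n = len(codes_a)
    # Proportion of segments where the two coders chose the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's own code frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


def jaccard_overlap(researcher_codes, folk_tags):
    """Share of distinct labels that appear in both sets (0.0 to 1.0)."""
    a = {c.strip().lower() for c in researcher_codes}
    b = {t.strip().lower() for t in folk_tags}
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical hand-made mapping from free-text labels to shared concept URIs.
# The example.org URIs are placeholders, not real semantic web identifiers.
CONCEPT_URIS = {
    "open data": "http://example.org/concept/open-data",
    "opendata": "http://example.org/concept/open-data",
    "government": "http://example.org/concept/government",
    "transparency": "http://example.org/concept/transparency",
    "uk": "http://example.org/concept/united-kingdom",
}


def to_uris(labels):
    """Replace each label with its concept URI, keeping unmapped labels as-is."""
    return [CONCEPT_URIS.get(label.strip().lower(), label) for label in labels]


if __name__ == "__main__":
    coder_a = ["policy", "data", "data", "community", "policy", "data"]
    coder_b = ["policy", "data", "community", "community", "policy", "policy"]
    print("kappa:", round(cohens_kappa(coder_a, coder_b), 2))

    my_codes = ["open data", "government", "transparency"]
    folk_tags = ["opendata", "government", "uk", "Transparency"]
    print("overlap on raw labels:", jaccard_overlap(my_codes, folk_tags))
    print("overlap on concept URIs:",
          jaccard_overlap(to_uris(my_codes), to_uris(folk_tags)))
```

In this toy data the raw-label overlap misses that ‘open data’ and ‘opendata’ are the same idea, while the URI-based comparison catches it. That is roughly the practical appeal of coding against shared concepts, although deciding which labels map to which URI is itself an interpretive judgement of exactly the kind raised above.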

Reflections; points to literatures that are already exploring this; questions etc. all welcome…

Curating a conference: young people in a digital world

This is a quick blog post to link to the videos and social reporting content from last week’s Young People in a Digital World conferences in Wales which are now available through the newly launched Digital Youth Wales network.

You can find over five hours of content, including a fantastic panel discussion with young people from Swansea schools and colleges, insights from e-Moderation and Moshi Monsters’ Chief Community & Safety Officer, my interview with Tanya Byron, and some great examples of digital youth work from Swansea. You might even find a clip of me trying to unpack how, through the lens of youth work values, the Internet provides an exciting opportunity space for youth work.

Curating social reporting

As well as the webcast recordings (created by the ever friendly and professional Richard Jolly and Diarmaid Lynch) the event was also comprehensively ‘socially reported’ with live-blogging, video interviews and more being co-ordinated by David Wilcox and Chie Elliott.

All of which, thanks to the kind support of Sangeet from WISE KIDS who organised the conference, gave me a chance to explore further how to curate content from social reporting. Building on the IGF09 Drupal+FeedAPI framework, I’ve put together a micro-site within the Digital Youth Wales site which links together a record of live-blogging with the webcast video and any informal social reporting videos for each session.

Take a look here to explore the individual sessions – and do let me know your ideas for how this sort of social reporting aggregation could be improved or further developed…

A campaign whose time has come: Robin Hood Tax #rht

I’ve long supported the idea of some form of Tobin-like tax (update 11th Feb) – taxing financial market transactions a minute amount to fund development and investment in poverty reduction (and more recently, climate change prevention, adaptation and mitigation). The trouble is, talking about a financial transaction tax isn’t the easiest thing to do.

So it’s fantastic to see the launch today of the Robin Hood Tax campaign. If you support one campaign this year…make it this one.