[Summary: aggregating content from the Internet Governance Forum & exploring ways to develop the legacy of social reporting at events…]
Introducing social reporting to an event can bring many immediate benefits: new skills for those participating in the social reporting, more opportunities for conversation at the event, and bridges between those present at an event and those interested in the topic but unable to physically take part.
However, the wealth of content gathered through social reporting can also act as a resource ‘after the event’ – offering insights and narratives covering event themes, and offering contrasting and complementary perspectives to any ‘official’ event records that may exist.
Many of the tools I use when social reporting at an event have a certain ‘presentism’ about them. Newer content is prioritised over older content, and, in the case of dashboard aggregators like NetVibes, or services such as Twitter, good content can quickly disappear from the front page, or even altogether.
So, as we got towards the end of a frantic four days of social reporting out at the Internet Governance Forum in Egypt earlier this year, I started thinking about how to make the most of the potential legacy impacts of the social reporting that was going on – both in the event-wide Twitterstream, and in the work of the young social reporters I was specifically working with.
Part of that legacy was about the skills and contacts gathered by the social reporters – so we quickly put together this handout for participants – but another part of that legacy was in the content. And gathering that together turned out to be trickier than I expected.
However, I now have a micro-site set up at http://igf2009.practicalparticipation.co.uk/ where you can find all the blog posts and blips created by our social reporters, as well as all the tagged tweets we could collect together. Over the coming weeks colleagues at Diplo will be tagging core content to make it easy to navigate and potentially use as part of online learning around Internet Governance. I’ve run the 3500+ Twitter messages I managed to (eventually) aggregate through the Open Calais auto-tagging service as an experiment to see if this provides ways to identify insights within them – and I’ve been exploring different ways to present the information found in the site.
Learning: Next time set up the aggregator in advance
I didn’t start putting together the site (a quick bit of Drupal + FeedAPI, with the later addition of Views, Panels, Autotagging, Timeline and other handy modules) till the final day of IGF09, by which time over 50 blog posts had been added to our Ning website, and over 3000 Twitter messages had been tagged #igf09.
Frustratingly, Ning only provides the last 20 items in any RSS feed, and, as far as I can tell, no way to page through past items; and the Twitter search API is limited to fetching just 1500 tweets.
Fortunately when it came to Twitter I had captured all the Tweets in Google Reader – but still had to scrape Twitter message IDs back out of there – and set up a slow script to spend a couple of days fetching original tweets (given the rate limiting again on the Twitter API).
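For anyone facing the same recovery job, the two steps can be sketched roughly as below. This is a minimal illustration rather than the script I actually ran: the permalink pattern is an assumption about how the saved links look, `fetch_one` is a hypothetical stand-in for whatever call retrieves a single tweet, and the 30 second delay is an arbitrary placeholder, not a documented Twitter limit.

```python
import re
import time

# Assumed permalink shape: .../status/<id> or .../statuses/<id>
STATUS_ID = re.compile(r"/status(?:es)?/(\d+)")


def extract_tweet_ids(links):
    """Return the unique tweet IDs found in the given links, in first-seen order."""
    seen = set()
    ids = []
    for link in links:
        match = STATUS_ID.search(link)
        if match and match.group(1) not in seen:
            seen.add(match.group(1))
            ids.append(match.group(1))
    return ids


def fetch_slowly(ids, fetch_one, delay_seconds=30.0, sleep=time.sleep):
    """Call fetch_one(tweet_id) for each ID, pausing between consecutive calls.

    fetch_one is hypothetical: supply your own function that retrieves one tweet.
    The sleep argument can be swapped out so the loop can be tested without waiting.
    """
    results = {}
    for position, tweet_id in enumerate(ids):
        if position > 0:
            sleep(delay_seconds)  # stay under the rate limit
        results[tweet_id] = fetch_one(tweet_id)
    return results
```

Keeping the fetching function separate from the throttling loop means the slow part can be left running for days while the retrieval call is changed or retried independently.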
For Ning, I ended up having to go through and find all the authors who had written on IGF09, fetch the feeds of their posts, and run them through a Yahoo Pipe to create an aggregate feed of only those items posted during the time of the IGF.
It would have been a lot easier if I had set up the Drupal + FeedAPI aggregator beforehand, and added new feeds to it whenever I found them.
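The merge-and-filter step that the Yahoo Pipe performed could be approximated in a few lines of Python. This is only a sketch of the idea, not the pipe itself: it assumes each feed has already been parsed into a list of dictionaries with a `link` string and a `published` datetime (both names are my own for illustration), and the date window is whatever you pass in.

```python
from datetime import datetime


def aggregate_feeds(feeds, start, end):
    """Merge several lists of feed items into one, oldest first.

    Keeps only items published between start and end (inclusive) and
    drops duplicates that share a link, keeping the first copy seen.
    """
    by_link = {}
    for items in feeds:
        for item in items:
            if start <= item["published"] <= end:
                by_link.setdefault(item["link"], item)
    return sorted(by_link.values(), key=lambda item: item["published"])


# Illustrative usage with made-up items and an assumed event window.
author_a = [
    {"link": "http://example.org/a/1", "published": datetime(2009, 11, 16, 9, 0)},
    {"link": "http://example.org/a/0", "published": datetime(2009, 10, 1, 9, 0)},
]
author_b = [
    {"link": "http://example.org/b/1", "published": datetime(2009, 11, 15, 14, 0)},
]
during_event = aggregate_feeds(
    [author_a, author_b],
    start=datetime(2009, 11, 15),
    end=datetime(2009, 11, 18, 23, 59),
)
```

Deduplicating on the link matters here because the same post can turn up in more than one author or group feed.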
Discoveries: Language and noise
I’ve spent most of my time just getting the content into this aggregator, and setting up a basic interface for exploring it. I’ve not yet had chance to dive in and really explore the content itself. However, two things I noticed:
1) There is mention of a francophone hash-tag for IGF2009 in some of the tweets. Searching on that hash-tag now, over a month later, doesn’t turn up any results – but it’s quite possible that there were active conversations this aggregator fails to capture because we weren’t looking at the right tags.
2) A lot of the Twitter messages aggregated appear to be about the ‘censorship incident’ that dominated external coverage of IGF09, but which was only a small part of all the goings on at IGF. Repeated tweeting and re-tweeting on one theme can drown out conversations on other themes unless there are effective ways to navigate and filter the content archives.
I’ve started to explore how @ messages, and RTs within Tweets could be used to visualise the structure, as well as content, of conversations – but have run up against the limitations of my meagre current skill set with R and iplot.
I’m now on the lookout for good ways of potentially building some more intelligent analysis of tweets into future attempts to aggregate with Drupal – possibly by extracting information on @s and RTs at the time of import using the promising FeedAPI Scraper module from the great folk at Youth Agora.
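To give a feel for what extracting @s and RTs at import time might involve, here is a rough Python sketch. It is not the FeedAPI Scraper module and makes no claim about how that module works: the regular expressions are my own assumptions about the 'RT @name' convention, and the function and field names are invented for illustration. The output is a simple edge list that a graphing tool could then draw.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")
RETWEET = re.compile(r"\bRT\s+@(\w+)", re.IGNORECASE)


def conversation_edges(tweets):
    """Count who addresses whom in an iterable of (author, text) pairs.

    Returns a Counter keyed by (source, target, kind), where kind is
    "retweet" if the target follows an 'RT' marker and "mention" otherwise.
    Names are lower-cased so @Bob and @bob count as the same person.
    """
    edges = Counter()
    for author, text in tweets:
        retweeted = {name.lower() for name in RETWEET.findall(text)}
        for name in MENTION.findall(text):
            target = name.lower()
            kind = "retweet" if target in retweeted else "mention"
            edges[(author.lower(), target, kind)] += 1
    return edges


# Made-up tweets for illustration only.
sample = [
    ("alice", "RT @bob: Great session on access #igf09"),
    ("carol", "@alice did you see @bob's notes?"),
    ("Alice", "RT @Bob: another good point #igf09"),
]
edges = conversation_edges(sample)
```

Counting edges rather than listing them makes heavily repeated re-tweets show up as a single weighted link, which may help with the noise problem described above. Note that this simple pattern would also pick up the part after the @ in an email address.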
Questions: Developing social reporting legacies
There is still a lot more to reflect upon when it comes to making the most of content from a socially reported event, not least:
1) How long should information be kept?
I’ve just been reading Delete, which very sensibly suggests that not all content should be online for ever – and particularly with conversational twitter messages or video clips, there may be a case for ensuring a social reporting archive only keeps content public for as long as there is a clear value in doing so.
2) Licensing issues
Aggregation on the model I’ve explored assumes licence to collect and share tweets and other content. Is this a fair assumption?
3) Repository or advocacy?
How actively should the legacy content from social reporting be used? Should managing the legacy of an event also involve setting up search and blog alerts, and pro-actively spreading content to other online spaces? If so – who should be responsible for that and how?