2013 in review: crowdsourcing, digital history, visualisation, and lots and lots of words

A quick and incomplete summary of my 2013 for those days when I wonder where the year went… My PhD was my main priority throughout the year, but the slow increase in word count across my thesis is probably only of interest to me and my supervisors (except where I've turned down invitations to concentrate on my PhD). Various other projects have spanned the years: my edited volume on 'Crowdsourcing our Cultural Heritage', working as a consultant on the 'Let's Get Real' project with Culture24, and I've continued to work with the Open University Digital Humanities Steering Group, ACH and to chair the Museums Computer Group.

In January (and April/June) I taught all-day workshops on 'Data Visualisation for Analysis in Scholarly Research' and 'Crowdsourcing in Libraries, Museums and Cultural Heritage Institutions' for the British Library's Digital Scholarship Training Programme.

In February I was invited to give a keynote on 'Crowd-sourcing as participation' at iSay: Visitor-Generated Content in Heritage Institutions in Leicester (my event notes). This was an opportunity to think through the impact of the 'close reading' people do while transcribing text or describing images, crowdsourcing as a form of deeper engagement with cultural heritage, and the potential for 'citizen history' this creates (also finally bringing together my museum work and my PhD research). This later became an article for Curator journal, From Tagging to Theorizing: Deepening Engagement with Cultural Heritage through Crowdsourcing (proof copy available at http://oro.open.ac.uk/39117). I also ran a workshop on 'Data visualisation for humanities researchers' with Dr. Elton Barker (one of my PhD supervisors) for the CHASE 'Going Digital' doctoral training programme.

In March I was in the US for THATCamp Feminisms in Claremont, California (my notes), where I ran a workshop on 'Data visualisation as a gateway to programming', and I gave a paper on 'New Challenges in Digital History: Sharing Women's History on Wikipedia' at the 'Women's History in the Digital World' conference at Bryn Mawr, Philadelphia (posted as 'New challenges in digital history: sharing women's history on Wikipedia – my draft talk notes'). I also wrote an article for Museum Identity magazine, Where next for open cultural data in museums?.

In April I gave a paper, 'A thousand readers are wanted, and confidently asked for': public participation as engagement in the arts and humanities, on my PhD research at Digital Impacts: Crowdsourcing in the Arts and Humanities (see also my notes from the event), and a keynote on 'A Brief History of Open Cultural Data' at GLAM-WIKI 2013.

In May I gave an online seminar on crowdsourcing (with a focus on how it might be used in teaching undergraduates wider skills) for the NITLE Shared Academics series. I gave a short paper on 'Digital participation and public engagement' at the London Museums Group's 'Museums and Social Media' at Tate Britain on May 24, and was in Belfast for the Museums Computer Group's Spring meeting, 'Engaging Visitors Through Play' then whipped across to Venice for a quick keynote on 'Participatory Practices: Inclusion, Dialogue and Trust' (with Helen Weinstein) for the We Curate kick-off seminar at the start of June.

In June the Collections Trust and MCG organised a Museum Informatics event in York and we organised a 'Failure Swapshop' the evening before. I also went to Zooniverse's ZooCon (my notes on the citizen science talks) and to Canterbury Cathedral Archives for a CHASE event on 'Opening up the archives: Digitization and user communities'.

In July I chaired a session on Digital Transformations at the Open Culture 2013 conference in London on July 2, gave an invited lightning talk at the Digital Humanities Oxford Summer School 2013, ran a half-day workshop on 'Designing successful digital humanities crowdsourcing projects' at the Digital Humanities 2013 conference in Nebraska, and had an amazing time making what turned out to be Serendip-o-matic at the Roy Rosenzweig Center for History and New Media at George Mason University's One Week, One Tool in Fairfax, Virginia (my posts on the process), with a museumy road trip via Amtrak and Greyhound to Chicago, Cleveland and Pittsburgh in between the two events.

In August I tidied up some talk notes for publication as 'Tips for digital participation, engagement and crowdsourcing in museums' on the London Museums Group blog.

October saw the publication of my Curator article and Creating Deep Maps and Spatial Narratives through Design with Don Lafreniere and Scott Nesbit for the International Journal of Humanities and Arts Computing, based on our work at the Summer 2012 NEH Advanced Institute on Spatial Narrative and Deep Maps: Explorations in the Spatial Humanities. (I also saw my family in Australia and finally went to MONA).

In November I presented on 'Messy understandings in code' at Speaking in Code at UVA's Scholars' Lab, Charlottesville, Virginia, gave a half-day workshop on 'Data Visualizations as an Introduction to Computational Thinking' at the University of Manchester and spoke at the Digital Humanities at Manchester conference the next day. Then it was down to London for the MCG's annual conference, Museums on the Web 2013 at Tate Modern. Later that month I gave a talk on 'Sustaining Collaboration from Afar' at Sustainable History: Ensuring today's digital history survives.

In December I went to Hannover, Germany for the Herrenhausen Conference: "(Digital) Humanities Revisited – Challenges and Opportunities in the Digital Age" where I presented on 'Creating a Digital History Commons through crowdsourcing and participant digitisation' (my lightning talk notes and poster are probably the best representation of how my PhD research on public engagement through crowdsourcing and historians' contributions to scholarly resources through participant digitisation are coming together). In the final days of 2013, I went back to my old museum metadata games, updated them to include images from the British Library and took a first pass at making them responsive for mobile and tablet devices.

So we made a thing. Announcing Serendip-o-matic at One Week, One Tool

So we made a thing. And (we think) it's kinda cool! Announcing Serendip-o-matic http://t.co/mQsHLqf4oX #OWOT
— Mia (@mia_out) August 2, 2013

Source code at GitHub Serendipomatic – go add your API so people can find your stuff! Check out the site at serendipomatic.org.

Update: and already we've had feedback that people love the experience and have found it useful – it's so amazing to hear this, thank you all! We know it's far from perfect, but since the aim was to make something people would use, it's great to know we've managed that:

Congratulations @mia_out and the team of #OWOT for http://t.co/cNbCbEKlUf Already try it & got new sources about a Portuguese King. GREAT!!!
— Daniel Alves (@DanielAlvesFCSH) August 2, 2013

Update from Saturday morning – so this happened overnight:

Cool, Serendipmatic cloned and local dev version up and running in about 15 mins. Now to see about adding Trove to the mix. #owot
— Tim Sherratt (@wragge) August 3, 2013

And then this:

Just pushed out an update to http://t.co/uM13iWLISU — now includes Trove content! #owot
— RebeccaSuttonKoeser (@suttonkoeser) August 3, 2013

From the press release: One Week | One Tool Team Launches Serendip-o-matic

After five days and nights of intense collaboration, the One Week | One Tool digital humanities team has unveiled its web application: Serendip-o-matic <http://serendipomatic.org>. Unlike conventional search tools, this “serendipity engine” takes in any text, such as an article, song lyrics, or a bibliography. It then extracts key terms, delivering similar results from the vast online collections of the Digital Public Library of America, Europeana, and Flickr Commons. Because Serendip-o-matic asks sources to speak for themselves, users can step back and discover connections they never knew existed. The team worked to re-create that moment when a friend recommends an amazing book, or a librarian suggests a new source. It’s not search, it’s serendipity.

Serendip-o-matic works for many different users. Students looking for inspiration can use one source as a springboard to a variety of others. Scholars can pump in their bibliographies to help enliven their current research or to get ideas for a new project. Bloggers can find open access images to illustrate their posts. Librarians and museum professionals can discover a wide range of items from other institutions and build bridges that make their collections more accessible. In addition, millions of users of RRCHNM’s Zotero can easily run their personal libraries through Serendip-o-matic.
Serendip-o-matic is easy to use and freely available to the public. Software developers may expand and improve the open-source code, available on GitHub. The One Week | One Tool team has also prepared ways for additional archives, libraries, and museums to make their collections available to Serendip-o-matic. 
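
The press release's 'takes in any text ... then extracts key terms' step can be pictured with a very small sketch. This is not the Serendip-o-matic code (that lives in the GitHub repository); it is a minimal illustration that assumes plain word-frequency counting, and the stopword list, the function names `extract_key_terms` and `build_query`, and the 'OR' query format are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real tool would use a fuller one.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "on", "for", "with",
    "is", "are", "was", "were", "it", "that", "this", "as", "by", "at", "from",
}


def extract_key_terms(text, limit=5):
    """Return up to `limit` of the most frequent non-stopword terms in `text`."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [term for term, _ in counts.most_common(limit)]


def build_query(terms):
    """Join terms into a simple OR query string (an assumed format, not any real API's)."""
    return " OR ".join(terms)


if __name__ == "__main__":
    sample = "The king of Portugal met the king of Spain; the king sailed."
    print(build_query(extract_key_terms(sample, limit=3)))
```

A query string like this would then be sent to each collection's search API; that part is left out here because each source has its own request format.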

Halfway through. Day three of OWOT.

Crikey. Day three. Where do I start?

We've made great progress on our mysterious tool. And it has a name! Some cool design motifs are flowing from that, which in turn means we can really push the user experience design issues over the next day and a half (though we've already been making lots of design decisions on the hoof so we can keep dev moving). The Outreach team have also been doing some great communications work, including a Press Release and have lots more in the pipeline. The Dev/Design team did a demo of our work for the Outreach team before dinner – there are lots of little things but the general framework of the tool works as it should – it's amazing how far we've come since lunchtime yesterday.  We still need to do a full deployment (server issues, blah blah), and I'll feel a lot better when we've got that process working and then running smoothly, so that we can keep deploying as we finish major features up to a few hours before launch rather than doing it at the end in a mad panic. I don't know how people managed code before source control – not only does Github manage versions for it, it makes pulling in code from different people so much easier.

There's lots to tackle on many different fronts, and it may still end up in a mad rush at the end, but right now, the Dev/Design team is humming along. I've been so impressed with the way people have coped with some pretty intense requirements for working with unfamiliar languages or frameworks, and with high levels of uncertainty in a chaotic environment.  I'm trying to keep track of things in Github (with Meghan and Brian as brilliant 'got my back' PMs) and keep the key current tasks on a whiteboard so that people know exactly what they need to be getting done at any time. Now that the Outreach team have worked through the key descriptive texts, name and tagline we'll need to coordinate content production – particularly documentation, microcopy to guide people through the process – really closely, which will probably get tricky as time is short and our tasks are many, but given the people gathered together for OWOT, I have faith that we'll make it work.


Things I have learnt today: despite two years working on a PhD in digital humanities/digital history, I still have a brain full of technical stuff – it's a relief to realise it hasn't atrophied through lack of use. I've also realised how much the work I've done designing workshops and teaching since starting my PhD has fed into how I work with teams, though it's hard right now to quantify exactly *how*. Finally, it's re-affirmed just how much I like making things – but also that it's important to make those things in the company of people who are scholarly (or at least thoughtful) about subjects beyond tech and inter-disciplinary, and ideally to make things that engage the public as well as researchers. As the end of my PhD approaches, it's been really useful to step back into this world for a week, and I'll definitely draw on it when figuring out what to do after the PhD. If someone could just start a CHNM in the UK, I'd be very happy.

I still can't tell you what we're making, but I *can* tell you that one of these photos in this post contains a clue (and they all definitely have nothing to do with mild lightheadedness at the end of a long day).

DHOxSS: 'From broadcast to collaboration: the challenges of public engagement in museums'

I'm just back from giving a lightning talk for the Cultural Connections strand of the Digital.Humanities@Oxford Summer School 2013, and since the projector wasn't working to show my examples during my talk I thought I'd share my notes (below) and some quick highlights from the other presentations.

Mark Doffman said that it's important that academic work challenges and provokes, but you should make sure you get headlines for the right reasons and not, for example, for how much the project costs. He concluded that impact is about provocation, not just getting people to say your work is wonderful.

Gurinder Punn of the university's Isis Innovation made the point that intellectual property and expertise can be transferred into businesses by consulting through your department or personally. (And it's not just for senior academics – one of the training sessions offered to PhD students at the Open University is 'commercialising your research').

Giles Bergel @ChapBookPro spoke on the Broadside Ballads Online (blog), explaining that folksong scholarship is often outside academia – there's a lot of vernacular scholarship and all sorts of domain specialists including musicians. They've considered crowdsourcing but want to be in a position to take the contributions as seriously as any print accession. They also have an image-match demonstrator from Oxford's Visual Geometry Group which can be used to find similar images on different ballad sheets.

Christian von Goldbeck-Stier offered some reflections on working with conductors as part of his research on Wagner. And perfectly for a summer's day:

Christian quotes Wilde on beauty: "one of the great facts of the world, like sunlight, or springtime…" http://t.co/8qGE9tLdBZ #dhoxss
— Pip Willcox (@pipwillcox) July 11, 2013

My talk notes: 'From broadcast to collaboration: the challenges of public engagement in museums'

I’m interested in academic engagement from two sides – for the past decade or so I was a museum technologist; now I’m a PhD student in the Department of History at the Open University, where I’m investigating the issues around academic and ‘amateur’ historians and scholarly crowdsourcing.

As I’ve moved into academia, I’ve discovered there’s often a disconnect between academia and museum practice (to take an example I know well), and that their different ways of working can make connecting difficult, even before they try to actually collaborate. But it’s worth it because the reward is more relevant, cutting-edge research that directly benefits practitioners in the relevant fields and has greater potential impact.

I tend to focus on engagement through participation and crowdsourcing, but engagement can be as simple as blogging about your work in accessible terms: sharing the questions that drive your research, how you’ve come to some answers, and what that means for the world at large; or writing answers to common questions from the public alongside journal articles.

Plan it

For a long time, museums worked with two publics: visitors and volunteers. They’d ask visitors what they thought in ‘have your say’ interactives, but to be honest, they often didn’t listen to the answers. They’d also work with volunteers, but sometimes they valued the volunteers' productivity more than the volunteers' own kinds of knowledge. But things are more positive these days – you've already heard a lot about crowdsourcing as a key example of more productive engagement.

Public engagement works better when it’s incorporated into a project from the start. Museums are exploring co-curation – working with the public to design exhibitions. Museums are recognising that they can’t know everything about a subject, and figuring out how to access knowledge ‘out there’ in the rest of the world. In the Oramics project at the Science Museum (e.g. Oramics to Electronica or Engaging enthusiasts online), electronic musicians were invited to co-curate an exhibition to help interpret an early electronic instrument for the public. 

There’s a model from 'Public Participation in Scientific Research' (or 'citizen science') I find useful in my work when thinking about how much agency the public has in a project, and it's also useful for planning engagement projects. Where can you benefit from questions or contributions from the public, and how much control are you willing to give up? 

  • Contributory projects designed by scientists, with participants involved primarily in collecting samples and recording data;
  • Collaborative projects in which the public is also involved in analyzing data, refining project design, and disseminating findings;
  • Co-created projects are designed by scientists and members of the public working together, and at least some of the public participants are involved in all aspects of the work.
(Source: Public Participation in Scientific Research: Defining the Field and Assessing Its Potential for Informal Science Education (full report, PDF, 3 MB))

Do it

Museums have learnt that engaging the public means getting out of their venues (and their comfort zones). One example is Wikipedians-in-Residence, including working with Wikipedians to share images, hold events and contribute to articles. (e.g. The British Museum and Me, A Wikipedian-in-Residence at the British Museum, The Children's Museum's Wikipedian in Residence).
It’s not always straightforward – museums don’t do ‘neutral’ points of view, which is a key goal for Wikipedia. Museums are object-centric, Wikipedia is knowledge-centric. Museums are used to individual scholarship and institutional credentials, Wikipedia is consensus-driven and your only credentials are your editing history and your references. Museums are slowly learning to share authority, to trust the values of other platforms. You need to invest time to learn what drives the other groups, how to talk with them and you have to be open to being challenged.

Mean it

Done right, engagement should be transformative for all sides. According to the National Co-ordinating Centre for Public Engagement, engagement ‘is by definition a two-way process, involving interaction and listening, with the goal of generating mutual benefit.’ Saying something is ‘open to the public’ is easy; making efforts to make sure that it’s intellectually and practically accessible takes more effort; active outreach is a step beyond open. It's not the same as marketing – it may use the same social media channels, but it's a conversation, not a broadcast. It’s hard to fake being truly engaged (and it's rude) so you have to mean it – doing it cynically doesn't help anyone.

Asking people to do work that helps your mission is a double win. For example, Brooklyn Museum's 'Freeze Tag' asks members of their community to help moderate tags entered by people elsewhere – they're trusting members of the community to clean up content for them.

Enjoy it

My final example is the National Library of Ireland on Flickr Commons, who do a great job of engaging people in Irish history, partly through their enthusiasm for the subject and partly through the effort they put into collating comments and updating their records, showing how much they value contributions. 

Almost by definition, any collaboration around engagement will be with people who are interested in your work, and they’ll bring new perspectives to it. You might end up working with international peers, academics from different disciplines, practitioner groups, scholarly amateurs or kids from the school down the road. And it’s not all online – running events is a great way to generate real impact and helps start conversations with potential for future collaboration.

You might benefit too! Talking about your research sometimes reminds you why you were originally interested in it… It’s a way of looking back and seeing how far you’ve come. It’s also just plain rewarding seeing people benefit from your research, so it's worth doing well.


Thanks again to Pip Willcox for the invitation to speak, and to the other speakers for their fascinating perspectives.  Participation and engagement lessons from cultural heritage and academia is a bit of a hot topic at the moment – there's more on it (including notes from a related paper I gave with Helen Weinstein) at Participatory Practices.

Setting off small fireworks: leaving space for curiosity

Remember when blog posts didn't need titles, didn't need to be long or take ages to write, and had nothing to do with your 'personal brand'? I've realised that while I'm writing up the PhD I'll barely blog at all if I don't blog like it's 2007 and just share interesting stuff when I've got a moment. Here goes…

I've been interested in the role of curiosity in engaging people with museum collections since I evaluated museum 'tagging' crowdsourcing games for my MSc project and learnt that the randomness of the objects presented made players really curious about what would appear next, and in turn that curiosity was one reason they kept playing. It turns out other metadata game designers have noticed the same effect. Flanagan and Carini (2012) wrote: 'Curiosity and doubt are key design opportunities. … In a number of instances, players became so curious about the images they were tagging that they would tag images with inquiry phrases, such as "want to know more about this culture."'

I returned to 'curiosity' for a talk I gave at the iSay conference in Leicester, where I related it to Raddick et al's (2009) 'Levels of Engagement' in citizen science, where Level 2 is participation in community discussion (e.g. forums on crowdsourcing sites) and Level 3 is 'working independently on self-identified research projects'. To me, this suggested you should leave room for curiosity and wonder to develop – it might turn into a new personal journey for the participant or visitor, or even a new research question for a crowdsourcing project.

The reason I'm posting now is that I just came across Langer's definition of 'mindfulness': 'the "state of mind that results from drawing novel distinctions, examining information from new perspectives, and being sensitive to context. It is an open, creative, probabilistic state of mind in which the individual might be led to finding differences among things thought similar and similarities among things thought different" (Langer 1993, p.44).' in Csikszentmihalyi and Hermanson (1995). Further:

'Exhibits that facilitate mindfulness display information in context and present various viewpoints. For example, Langer (1993, p.47) contrasts the statement "The three main reasons for the Civil War were…" with the statement "From the perspective of the white male living in the twentieth century, the main reasons for the Civil War were…" (p.47). The latter approach calls for thoughtful comparisons. For example, How did women feel during the Civil War? the old? the old from the North? the black male today? and so on.'

I don't know about you, but my curiosity was piqued and my mind started going in lots of different directions. The second question carefully creates a gap just big enough to let a hundred new questions through and is a brilliant example of why both museum interpretation and participatory projects should leave room for curiosity…

Works cited:

  • Csikszentmihalyi, Mihaly, and Kim Hermanson. 1995. “Intrinsic Motivation in Museums: Why Does One Want to Learn?” In Public Institutions for Personal Learning: Establishing a Research Agenda, edited by John Falk and Lynn D. Dierking, 66–77. Washington D.C.: American Association of Museums. [This is seriously ace, track down a copy if you can]
  • Flanagan, Mary, and Peter Carini. 2012. “How Games Can Help Us Access and Understand Archival Images.” American Archivist 75 (2): 514–537.
  • Raddick, M. Jordan, Georgia Bracey, K. Carney, G. Gyuk, K. Borne, J. Wallin, and S. Jacoby. 2009. “Citizen Science: Status and Research Directions for the Coming Decade.” In Astro2010: The Astronomy and Astrophysics Decadal Survey. Vol. 2010. http://www8.nationalacademies.org/astro2010/DetailFileDisplay.aspx?id=454.

(Ok, so a post with references is not exactly blogging like it's 2006, but you've got to start somewhere…)
(Someone is literally setting off fireworks somewhere nearby. I have no idea why.)
(And yeah, I am working on a Saturday night. Friends don't let friends do PhDs, innit.)

We're all looking at the stars: citizen science projects at ZooCon13

Last Saturday I escaped my desk to head to the Physics department at the University of Oxford and be awed by what we're learning about space (and more terrestrial subjects) through citizen science projects run by Zooniverse at ZooCon13. All the usual caveats about notes from events apply – in particular, assume any errors are mine and that everyone was much more intelligent and articulate than my notes make them sound. These notes are partly written for people in cultural heritage and the humanities who are interested in the design of crowdsourcing projects, and while I enjoyed the scientific presentations I am not even going to attempt to represent them!  Chris Lintott live-blogged some of the talks on the day, so check out 'Live from ZooCon' for more. If you're familiar with citizen science you may well know a lot of these examples already – and if you're not, you can't really go wrong by looking at Zooniverse projects.

Aprajita Verma kicked off with SpaceWarps and 'Crowd-sourcing the Discovery of Gravitational Lenses with Citizen Scientists'. She explained the different ways gravitational lenses show up in astronomical images, and that 'strong gravitational lensing research is traditionally very labour-intensive' – computer algorithms generate lots of false positives, so you need people to help. SpaceWarps includes some simulated lenses (i.e. images of the sky with lenses added), mostly as a teaching tool (to provide more examples and increase familiarity with what lenses can look like) but also to make it more interesting for participants. The SpaceWarps interface lets you know when you've missed a (simulated, presumably) lens as well as noting lenses you've marked. They had 2 million image classifications in the first week, and 8500 citizen scientists have participated so far, 40% of whom have participated in 'Talk', the discussion feature. As discussed in their post 'What happens to your markers? A look inside the Space Warps Analysis Pipeline', they've analysed the results so far on ranges between astute/obtuse and pessimistic/optimistic markers – it turns out most people are astute. Each image is reviewed by ten people, so they've got confidence in the results.
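
For anyone wondering how ten separate reviews of one image might be turned into 'confidence in the results', here is a minimal sketch of one standard approach: updating the probability that an image contains a lens with Bayes' rule after each classification. It is not taken from the talk or from the SpaceWarps code, and everything named here is an assumption for illustration: the function name `update_lens_probability`, the starting prior, and the two per-volunteer accuracy figures (which could in principle be estimated from how a volunteer marks the simulated lenses).

```python
def update_lens_probability(prior, said_lens, p_lens_hit, p_dud_hit):
    """Bayes update of P(image contains a lens) after one volunteer's classification.

    prior      -- probability of a lens before this classification
    said_lens  -- True if the volunteer marked a lens, False otherwise
    p_lens_hit -- assumed P(volunteer marks a lens | image has a lens)
    p_dud_hit  -- assumed P(volunteer marks nothing | image has no lens)
    """
    if said_lens:
        likelihood_if_lens = p_lens_hit
        likelihood_if_dud = 1.0 - p_dud_hit
    else:
        likelihood_if_lens = 1.0 - p_lens_hit
        likelihood_if_dud = p_dud_hit
    numerator = likelihood_if_lens * prior
    return numerator / (numerator + likelihood_if_dud * (1.0 - prior))


def combine_classifications(prior, classifications):
    """Apply the update once per (said_lens, p_lens_hit, p_dud_hit) tuple."""
    probability = prior
    for said_lens, p_lens_hit, p_dud_hit in classifications:
        probability = update_lens_probability(probability, said_lens, p_lens_hit, p_dud_hit)
    return probability


if __name__ == "__main__":
    # Ten hypothetical volunteers, each right 80% of the time, eight of whom mark a lens.
    votes = [(True, 0.8, 0.8)] * 8 + [(False, 0.8, 0.8)] * 2
    print(combine_classifications(0.01, votes))
```

In this sketch a volunteer who is right more often than not on both kinds of image moves the probability in a useful direction, while one whose two accuracy figures are both 0.5 leaves it unchanged, which is one way to see why a crowd of mostly careful markers gives usable results.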

Karen Masters talked about 'Cosmic Evolution in the Galaxy Zoo', taking us back to the first Galaxy Zoo project's hopes to have 30,000 volunteers and contrasting that with subsequent peer-reviewed papers that thanked 85,000, 160,000 or 200,000 volunteers. The project launched in 2007 (before the Zooniverse itself) to look at spiral vs elliptical galaxies and it's all grown from there. The project has found rare objects, most famously the pea galaxies, and as further proof that the Zooniverse is doing 'real science online', the team have produced 36 peer-reviewed papers, some with 100+ citations. At least 50 more papers have been produced by others using their data.

Phil Brohan discussed 'New Users for Old Weather'. The Old Weather project is using data from historic ships' logs to help answer the question 'is this climate change or just weather?'. Some data was already known but there's a 'metaphorical fog' from missing observations from the past. Since the BBC won't let him put a satellite in a Tardis, they've been creative about finding other sources to help lift 'the fog of ignorance'. This project has long fascinated me because it started off all about science: in Phil's words, 'when we started all this, I was only thinking about the weather', but ended up being about history as well: 'these documents are intrinsically interesting' – he learnt what else was interesting about the logs from project participants who discovered the stories of people, disasters and strange events that lay within them. The third thing the project has generated (after weather and history) is 'a lot of experts'. One example he gave was evidence of the 1918-19 Spanish flu epidemic on board ship, which was investigated after forum posts about it. There's still a lot to do, with more logs (possibly including French and Dutch ones) to come, and things would ideally speed up 'by a factor of ten'.

In Brooke Simmons' talk on 'Future plans for Galaxy Zoo', she raised the eternal issue of what to call participants in crowdsourcing: 'just call everyone collaborators'. 'Citizen scientists' makes a distinction between paid and unpaid scientists, as does 'volunteers'. She wants to help people do their own science, and they're working on making it easier than downloading and learning how to use more complicated tools. As an example, she talked about people collecting 'galaxies with small bulges' and analysing the differences in bulges (like a souped-up Galaxy Zoo Navigator?). She also talked about Zoo Teach, with resources for learning at all ages.

After the break we learnt about 'The Planet 4 Invasion', the climate and seasons of Mars from Meg Schwamb and about Solar Stormwatch in 'Only you can save planet Earth!' from Chris Davis, who was also presenting research from his student Kim Tucker-Wood (sp?). Who knew that solar winds could take the tail off a comet?!

Next up was Chris Lintott on 'Planet Hunting with and without Kepler'. Science communication advice says 'don't show people graphs', and since Planet Hunters is looking at graphs for fun, he thought no-one would want to do Planet Hunters. However, the response has surprised him. And 'it turns out that stars are actually quite interesting as well'. In another example of participants going above and beyond the original scope of the project, project participants watched a talk streamed online on 'heartbeat binaries', then went and found 30 of them in archives and their own records and posted them on the forum.  Now a bunch of Planet Hunters are working with the Kepler team to follow them up.  (As an aside, he showed a screenshot of a future journal paper – the journal couldn't accept the idea that you could be a Planet Hunter and not be part of an academic team so they're listed as the Department of Astronomy at Yale.)

The final speaker was Rob Simpson on 'The Future of the Zooniverse'.  To put things in context, he said the human race spends 16 years cumulatively playing the game Angry Birds every day; people spend 2 months every day on the Zooniverse. In the past year, the human race spent 52 years on the Zooniverse's 15 live projects (they've had 23 projects in total). The Andromeda project went through all their data in 22 days – other projects take longer, but still attract dedicated people.  In the Zooniverse's immediate future are 'tools for (citizen) scientists' – adding the ability to do analysis in the browser, 'because people have a habit of finding things, just by being given access to the data'. They're also working on 'Letters' – public versions of what might otherwise be detailed forum posts that can be cited, and as a form of publication, it puts them 'in the domain'.  They're helping people communicate with each other and embracing their 'machine overlords', using Galaxy Zoo as a training tool for machine learning.  As computers get more powerful, the division of work between machines and people will change, perhaps leaving the beautiful, tricky, or complex bits for humans. [Update, June 29, 2013: Rob's posted about his talk on the Zooniverse blog, '52 Years of Human Effort', and corrected his original figure of 35 years to 52 years of human effort.]
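
The 'years per year' and 'months per day' figures above describe the same kind of quantity in different units, so they can be checked against each other with a little arithmetic. The sketch below is my own back-of-envelope conversion, not anything shown in the talk; the average month length and the helper name `effort_days_per_calendar_day` are assumptions.

```python
DAYS_PER_YEAR = 365.25
DAYS_PER_MONTH = DAYS_PER_YEAR / 12  # average month length, an assumption


def effort_days_per_calendar_day(years_of_effort, calendar_days=DAYS_PER_YEAR):
    """Days of human effort contributed, on average, per calendar day."""
    return years_of_effort * DAYS_PER_YEAR / calendar_days


# 52 years of effort spread over one calendar year.
zooniverse_days = effort_days_per_calendar_day(52)
zooniverse_months = zooniverse_days / DAYS_PER_MONTH

# The Angry Birds figure is already 'per day', so scale it up to a year for comparison.
angry_birds_years_per_year = 16 * DAYS_PER_YEAR

if __name__ == "__main__":
    print(f"Zooniverse: {zooniverse_days:.0f} days of effort per day")
    print(f"Zooniverse: {zooniverse_months:.1f} months of effort per day")
    print(f"Angry Birds: {angry_birds_years_per_year:.0f} years of play per year")
```

On these assumptions, 52 years of effort in a year works out at about 52 days, or roughly 1.7 months, of effort per calendar day, which is consistent with the rounded '2 months every day'. It is also the equivalent of 52 people working around the clock all year.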

At one point a speaker asked who in the room was a moderator on a Zooniverse project, and nearly everyone put their hand up. I felt a bit like giving them a round of applause because their hard work is behind the success of many projects. They're also a lovely, friendly bunch, as I discovered in the pub afterwards.

Conversations in the pub also reminded me of the flipside of people learning so much through these projects – sometimes people lose interest in the original task as their skills and knowledge grow, and it can be tricky to find time to contribute outside of moderating.  After a comment by Chris at another event I've been thinking about how you might match people to crowdsourcing projects or tasks – sometimes it might be about finding something that suits their love of the topic, or that matches the complexity or type of task they've previously enjoyed, or finding another unusual skill to learn, or perhaps building really solid stepping stones from their current tasks to more complex ones. But it's tricky to know what someone likes – I quite like transcribing text on sites like Trove or Notes from Nature, but I didn't like it much on Old Weather. And my own preferences change – I didn't think much of Ancient Lives the first time I saw it, but on another occasion I ended up getting completely absorbed in the task. Helping people find the right task and project is also a design issue for projects that have built an 'ecosystem' of parts that contribute to a larger programme, as discussed in 'Using crowdsourcing to manage crowdsourcing' in Frequently Asked Questions about crowdsourcing in cultural heritage and 'A suite of museum metadata games?' in Playing with Difficult Objects – Game Designs to Improve Museum Collections.

An event like ZooCon showed how much citizen science is leading the way – there are lots of useful lessons for humanities and cultural heritage crowdsourcing. If you've read this thinking 'I'd love to try it for my data, but x is a problem', try talking to someone about it – often there are computational techniques for solving similar problems, and if it's not already solved it might be interesting enough that people want to get involved and work with you on it.

On the trickiness of crowdsourcing competitions: some lessons from Sydney Design

I generally maintain a diplomatic silence about crowdsourcing competitions when I'm talking about crowdsourcing in cultural heritage as I believe spec work (or asking people to invest time in creating designs then paying just one 'winner') is unethical, and it's really tricky for design competitions to avoid looking like 'spec work'. I discovered this for myself when I ran the 'Cosmic Collections' mashup competition, so I have a lot of sympathy for museums who unknowingly get it wrong when experimenting with crowdsourcing. I also tend not to talk about poorly conceived or executed crowdsourcing projects as it doesn't seem fair to single out cultural heritage institutions that were trying to do the right thing against odds that ended up beating them, but I think the lessons to be drawn from the Sydney Design festival's competition are important enough to discuss here.

'Is it a free poster yet?'

A crowdsourcing competition model that the museum had previously applied successfully (the Lace Award and Trainspotting, with prizes up to $AUD20,000 and display in the exhibition for winning designs) had a very different reception when the context and rewards changed. When the Powerhouse Museum's design competition to produce the visual identity for the Sydney Design festival was launched with a $US1000 prize, the design community's sensitivity to spec work and 'free pitching' was triggered, and they started throwing in some sarcastic responses.  The public feedback loop created as people could see previous designs and realised their own would also be featured on the site had a 4Chan-ish feel of a fun new meme about it, and once the norm of satirical responses was set, it was only going to escalate.

More importantly, there was a sense that Sydney Design was pulling a swifty. As Kate Sweetapple puts it in How the Sydney Design festival poster competition went horribly wrong:

'The fundamental difference [to the previous competitions], however, is that by running the competition, the Museum pulled a substantial job – worth tens of thousands of dollars – out of the professional marketplace. The submissions to Love Lace and Trainspotting did not have a commercial context one year, and none the next.'

If the previous reward was mostly monetary, offering a lesser intrinsic reward in exchange for a previously extrinsic reward is unlikely to work. If there's a bigger reward than the competition brief itself would suggest, one important lesson is to make it unavoidably obvious. In this case, the Sydney Design Team's response said 'the Museum would have engaged the winning designer for further work and remuneration required to roll out the winning design into a more comprehensive marketing campaign', but this wasn't clear in the original brief. Many museum competitions display highly-ranked entries in their gallery spaces, and being exhibited in the museum or festival spaces might have been another form of valid reward, but only if it worked as an aspiration for the competition's audience, who in this case might well have a breadth of experience and exposure that rendered it less valuable.

Finally, in working with museums online, I've noticed the harshness of criticism is often proportionate to how deeply people care about you or identify you with certain values they hold dear.  When you're a beloved institution, people who care deeply about you feel betrayed when you get things wrong. As one commentator said in With friends like these, who needs enemies?, 'Sydney Design are meant to be in our corner'. If you regard critics as 'critical friends' you can turn the relationship around (as Merel van der Vaart discusses in the 'Opening up' section of her post on lessons from the Science Museum's Oramics exhibition) and build an even stronger relationship with them. Maybe Sydney Design can still turn this around…

Notes from 'Crowdsourcing in the Arts and Humanities'

Last week I attended a one-day conference, 'Digital Impacts: Crowdsourcing in the Arts and Humanities' (#oxcrowd), convened by Kathryn Eccles of Oxford's Internet Institute, and I'm sharing my (sketchy, as always) notes in the hope that they'll help people who couldn't attend.

Stuart Dunn reported on the Humanities Crowdsourcing scoping report (PDF) he wrote with Mark Hedges and noted that if we want humanities crowdsourcing to take off we should move beyond crowdsourcing as a business model and look to form, nurture and connect with communities.  Alice Warley and Andrew Greg presented a useful overview of the design decisions behind the Your Paintings Tagger and sparked some discussion on how many people need to view a painting before it's 'completed', and the differences between structured and unstructured tagging. Interestingly, paintings can be 'retired' from the Tagger once enough data has been gathered – I personally think the inherent engagement in tagging is valuable enough to keep paintings taggable forever, even if they're not prioritised in the tagging interface.  Kate Lindsay brought a depth of experience to her presentation on 'The Oxford Community Collection Model' (as seen in Europeana 1914-1918 and RunCoCo's 2011 report on 'How to run a community collection online' (PDF)). Some of the questions brought out the importance of planning for sustainability in technology, licences, etc, and the role of existing networks of volunteers with the expertise to help review objects on the community collection days.  The role of the community in ensuring the quality of crowdsourced contributions was also discussed in Kimberly Kowal's presentation on the British Library's Georeferencer project. She also reflected on what she'd learnt after the first phase of the Georeferencer project, including that the inherent reward of participating in the activity was a bigger motivator than competitiveness, and the impact on the British Library itself, which has opened up data for wider digital uses and has more crowdsourcing projects planned. 
I gave a paper which was based on an earlier version, The gift that gives twice: crowdsourcing as productive engagement with cultural heritage, but pushed my thinking about crowdsourcing as a tool for deep engagement with museums and other memory organisations even further. I also succumbed to the temptation to play with my own definitions of crowdsourcing in cultural heritage: 'a form of engagement that contributes towards a shared, significant goal or research question by asking the public to undertake tasks that cannot be done automatically' or 'productive public engagement with the mission and work of memory institutions'.

Chris Lintott of Galaxy Zoo fame shared his definition of success for a crowdsourcing/citizen science project: it has to produce results of value to the research community in less time than could have been done by other means (i.e. it must have been able to achieve something with the crowd that it couldn't have without them). He discussed how the Ancient Lives project challenged that at first by turning 'a few thousand papyri they didn't have time to transcribe into several thousand data points they didn't have time to read'. While 'serendipitous discovery is a natural consequence of exposing data to large numbers of users' (in the words of the Citizen Science Alliance), they wanted a more sophisticated method for recording potential discoveries experts made while engaging with the material and built a focused 'talk' tool which can programmatically filter out the most interesting unanswered comments and email them to their 30 or 40 expert users. They also have Letters for more structured, journal-style reporting. (I hope I have that right.) He also discussed decisions around full text transcriptions (difficult to automatically reconcile) vs 'rich metadata', or more structured indexes of the content of the page, which contain enough information to help historians decide which pages to transcribe in full for themselves.

Some other thoughts that struck me during the day… humanities crowdsourcing has a lot to learn from the application of maths and logic in citizen science – lots of problems (like validating data) that seem intractable can actually be solved algorithmically, and citizen science's hypothesis-based approach to testing task and interface design would help humanities projects. Niche projects help solve the problem of putting the right obscure item in front of the right user (which was an issue I wrestled with during my short residency at the Powerhouse Museum last year – in hindsight, building niche projects could have meant a stronger call-to-action and no worries about getting people to navigate to the right range of objects). The variable role of forums and participants' relationship to the project owners and each other came up at various points – in some projects, interactions with a central authority are more valued, in others, community interactions are really important. I wonder how much it depends on the length and size of the project? The potential and dangers of 'gamification' and 'badgeification' and their potentially negative impact on motivation were raised. I agree with Lintott that games require a level of polish that could mean you'd invest more in making them than you'd get back in value, but as a form of engagement that can create deeper relationships with cultural heritage and/or validate some procrastination over a cup of tea, I think they potentially have a wider value that balances that.

I was also asked to chair the panel discussion, which featured Kimberly Kowal, Andrew Greg, Alice Warley, Laura Carletti, Stuart Dunn and Tim Causer.  Questions during the panel discussion included:

  • 'what happens if your super-user dies?' (Super-users or super contributors are the tiny percentage of people who do most of the work, as in this Old Weather post) – discussion included mass media as a numbers game, the idea that someone else will respond to the need/challenge, and asking your community how they'd reach someone like them. (This also helped answer the question 'how do you find your crowd?' that came in from twitter)
  • 'have you ever paid anyone?' Answer: no
  • 'can you recruit participants through specialist societies?' From memory, the answer was 'yes but it does depend'.
  • something like 'have you met participants in real life?' – answer, yes, and it was an opportunity to learn from them, and to align the community, institution, subject and process.
  • 'badgeification?'. Answer: the quality of the reward matters more than the levels (so badges are probably out).
  • 'what happens if you force students to work on crowdsourcing projects?' – one suggestion was to look for entries on Transcribe Bentham in a US English class blog
  • 'what's happened to tagging in art museums, where's the new steve.museum or Brooklyn Museum?' – is it normalised and not written about as much, or has it declined?
  • 'how can you get funding for crowdsourcing projects?'. One answer – put a good application in to the Heritage Lottery Fund. Or start small, prove the value of the project and get a larger sum. Other advice was to be creative or use existing platforms. Speaking of which, last year the Citizen Science Alliance announced 'the first open call for proposals by researchers who wish to develop citizen science projects which take advantage of the experience, tools and community of the Zooniverse. Successful proposals will receive donated effort of the Adler-based team to build and launch a new citizen science project'.
  • 'can you tell in advance which communities will make use of a forum?' – a great question that drew on various discussions of the role of communities of participants in supporting each other and devising new research questions
  • a question on 'quality control' provoked a range of responses, from the manual quality control in Transcribe Bentham to the high number of Taggers initially required for each painting in Your Paintings (which slowed things down), and led into a discussion of shallow vs deep interactions
  • the final questioner asked about documenting film with crowdsourcing and was answered by someone else in the audience, which seemed a very fitting way to close the day.
James Murray in his Scriptorium with thousands of word references sent in by members of the public for the first Oxford English Dictionary. Early crowdsourcing?

If you found this post useful, you might also like Frequently Asked Questions about crowdsourcing in cultural heritage or my earlier Museums and the Web paper on Playing with Difficult Objects – Game Designs to Improve Museum Collections.

Notes from 'The Shape of Things: New and emerging technology-enabled models of participation through VGC'

I've just spent two days in Leicester for 'The Shape of Things: New and emerging technology-enabled models of participation through VGC' conference at the School of Museum Studies, part of the AHRC-funded iSay project focusing on Visitor-Generated Content (VGC) in heritage institutions. There will be lots of posts on the conference blog, so these are just some things that struck me or that I've found to be useful concepts for thinking about my own museum practice.

I tweeted about the event as I headed to Leicester, and that started a conversation about the suitability of the term 'visitor-generated content' that continued through the event itself. I think it was Giasemi who said that one problem with 'visitor-generated content' is that the term puts the emphasis on content and that's not what it's about. Jeremy Ottevanger suggested 'inbound communications' as a possible replacement for VGC.

The first keynote was Angelina Russo, who reminded us of the importance of curiosity and of finding ways to make museum collections central to visitor engagement work. She questioned the value of some comments left on museum collections other than the engagement in the process of leaving the comment. Having spent too much time reviewing visitor comments, I have to agree that not all comments (particularly repetitive ones) have inherently valuable content or help enhance another visitor's experience – a subject that was debated during the conference. A conversation over twitter during the conference with Claire Ross helped me realise that designing interfaces that respect and value the experience of both the commenter and the reader is one of the interesting challenges in digital participation.

She then used Bourdieu's ideas around 'restricted cultural production' to characterise the work of curators as producers who create cultural goods for other producers, governed by specific norms and sanctions, within relatively self-contained communities where their self-esteem depends on peers. However, this creates a tension between what curators think their role is and what museums need it to be in an age when museums are sites of large-scale cultural production for 'the public at large', driven by a quest for market share and profits. Visitor-generated content and the related issues of trust, authority, or digitisation highlight the tensions between these models of restricted or large-scale cultural production – we need to find 'a pathway through the sand'. Angelina suggested that a version of Bourdieu's 'gift economies', where products are created and given away in return for recognition might provide a solution, then asked what's required to make that shift within the museum. How can we link the drive for participation with the core work of museums and curatorial scholarship? She presented a model (which I haven't gone into here) for thinking about 'cultural communication', or communication which is collection-led; curiosity-driven; is scholarly; experiential; and offers multi-platform opportunities for active cultural participation, engagement and co-creation.

Carl Hogsden from the Museum of Archaeology and Anthropology and University of Cambridge talked about the Reciprocal Research Network and moving beyond digital feedback to digital reciprocation. This project has been doing innovative work for a long time, so it was good to see it presented again.

Jenny Kidd from Cardiff University posed some useful questions in 'VGC and ethics – what we might learn from the media and journalism' – it's questionable how much VGC (or user-generated content, UGC) has actually changed journalism, despite the promise of increased civic engagement, diversity, more relevant news and a re-framing of the audience as active citizens rather than consumers. One interesting point was the impact of the 'Arab Spring' on UGC – content that couldn't be verified couldn't be shown by traditional media so protesters started including establishing shots and improving the quality of their recordings. This was also the first of several papers that referenced 'Whose cake is it anyway', a key text for conversations about visitor participation and museums, and Jenny suggested that being seen to engage in participatory activity is currently, perhaps, sometimes an end goal in itself for a museum. She presented questions for further research and debate including: is the museum interested in the quality of the process or the product of VGC, and do creators feel the same? How does VGC fit into the workflow models of museums?

Giasemi Vavoula's paper on 'The role of VGC in digital transformations in Museum Learning' (slides) was fascinating, particularly as it presented frameworks for audience engagement taken from learning theory that closely matched those I'd found from studies of citizen science and engagement in heritage and sport (e.g. cognitive engagement model – highest is theorising, then applying, relating, explaining, describing, note-taking, memorising… Good visitor experiences get most visitors to use the higher engagement level processes that the more focused visitors use spontaneously). I love learning from Learning people – in museum learning/visitor studies, social interaction facilitates learning; visitors negotiate the meanings of exhibits through conversation with their companions. Giasemi called for museums to weave VGC into the fabric of visitors' social contexts; to scaffold and embed it into the visiting experience; and to align with visitors' and organisations' social agendas.

In 'A Tale of Two Workhouses' Peter Rogers and Juliet Sprake spoke of 'filling in the gaps rather than being recipients of one-way information flow', which tied in nicely with discussion around the role of curiosity in audience participation.

In the afternoon there was a Q&A session with Nina Simon (via skype). A number of the questions were about sustainability and designing for mixed contexts, and the final question was 'where next from here?'. Nina advised designing participatory experiences so that people can observe the activity and decide to take part when they're comfortable with it – this also works for designing things that work as spectator experiences for people who don't want to join in. Nina's response to a question about 'designing better questions' – 'find questions where you have genuine interest in what the visitor has to say about it' – resonated with wider discussion about meaningful visitor participation. Nina talked about the cumulative effect of participatory work on the museum itself, changing not only how the museum sees itself but how others see it – I wonder how many museums in the UK are engaging with visitor participation to the extent that it changes the museum itself? Nina also made the point that you tend to have either a highly participatory process to make a conventional product, or a conventional process to make a highly participatory product, and that not everything has to be wholly participatory from start to finish, which is useful for thinking about co-creative projects.

On Friday morning I gave a keynote on 'Crowdsourcing as productive engagement with cultural heritage', and my slides are now online. I partly wanted to problematise the power relationships in participatory projects – whose voice can effect change? – and to tease out different ways of thinking about crowdsourcing in cultural heritage as productive both in terms of the process (engaging in cultural heritage) and the product (the sheer number of items transcribed, corrected, etc). I've been going back to research on motivations for volunteering in cultural heritage, working on open source projects and reviewing discussions with participants in crowdsourcing projects, and I hope it'll help people design projects that meet those altruistic, extrinsic and intrinsic motivations. Thinking about my paper in the context of the other presentations also got me thinking about the role of curiosity in audience engagement and encouraging people to start researching a subject (whether a ship's history, an individual or a general topic) more deeply. On a personal note, this paper was a good chance to reflect on the different types of audience engagement with museum collections or historic sources and on the inherent value of participation in cultural heritage projects that underpin my MSc and PhD research and my work in museums generally.

Areti Galani presented research she'd done with Rachel Clarke (Newcastle University), and asked 'how can accessible technology lead to inaccessible participation paradigms?'. I was really interested in the difference between the quality of visitor contributions in-gallery vs online (though of course 'quality' is a highly subjective term), a question that surfaced through the day. Areti's research might suggest that building in some delay in the process of contributing in-gallery could lead to better quality (i.e. more considered) contributions. The novelty of the technology used might also have an effect – 'pen-happy visitors' used the technology for the sake of interacting with it but didn't know what to do after they picked up the pen.

The paper from Jeremy Ottevanger (Imperial War Museums) on "Social Interpretation" as a catalyst for organisational change generated more discussion on possible reasons why online comments on museum sites tend to be more thoughtful than in-gallery comments, with one possible reason being that online commenters have deliberately sought out the content, so already have a deeper engagement with those specific items, rather than just coming across them while moving through the physical gallery. Jeremy talked about the need for the museum to find an internal workflow that was appropriately responsive to online comments – in my experience, this is one of the most difficult issues in planning for digital projects. Jeremy presented a useful categorisation of online contributions as personal (emotional, opinion, personal information, anecdotes, family history), requests and queries (object info, valuation, family history, digitisation and licensing, offering material, access, history, general/website), and informational (new information, corrections) and looked at which types of contribution were responded to by different departments. He finished with a vision of the IWM harnessing the enthusiasm and knowledge of their audiences to help serve the needs of other audiences, of connecting people with expertise with people who have questions.

Jack Ashby talked about finding the right questions for the QRator project at the Grant Museum of Zoology – a turtle is a turtle, and there's not a lot of value in finding out what visitors might want to call it, but asking wider questions could be more useful. Like the wider Social Interpretation project, QRator always raises questions for me about whether museums should actively 'garden' visitor interactives, pruning out less relevant questions to create a better experience for other visitors.

Rolf Steier and Palmyre Pierroux discussed their findings on the role of the affordances of social media and visitor contributions in museums. Rosie Cardiff talked about the Tate's motivations for participatory projects with audiences, and audience motivations for participating in Tate's projects. She presented some considerations for organisations considering participatory projects: who is the audience? What motivations for visitor and for organisation? What platform will you use? How will the content be moderated? (Who will do it?) Where will it sit in relation to organisational space online or in-gallery? How long will it run for? What plans for archiving and maintaining content beyond lifetime of project? How will you measure success? How will you manage audience expectations about what's going to happen to their work? This last point was also picked up in discussions about audience expectations about how long museums will keep their contributions.

The final presentation was Ross Parry's keynote on 'The end of the beginning: Normativity in the postdigital museum'. Based on new research into how six UK national museums (i.e. centrally funded, big, prestigious museums) have started to naturalise 'digital' into their overall museum vision, this paper gave me hope for the future. There's still a long way to go, but Ross articulated a vision of how some museums are integrating digital in the immediate future, and how it will be integrated once the necessary stage of highlighting 'digital' in strategies, organisational structures and projects has given way to a more cohesive incorporation of 'digital' into the fabric of museums. It also makes sense in the context of discussions about digital strategies in museums over the past year (e.g. at the Museums Association and UK Museums on the Web (themes, my report) conferences).

I had to leave before the final session, so my report ends here, but I expect there'll be more reports on the project blog and I've saved an archive of isayevent_tweets_2013_02_01 (CSV).

I think the organisers, Giasemi Vavoula and Jenny Kidd, did a great job on the conference programme. The papers and audience were a well-balanced combination of academics and practitioners – the academic papers gave me interesting frameworks to think with, and the case studies provided material to think about.

The ever-morphing PhD

I wrote this for the NEH/Polis Summer Institute on deep mapping back in June but I'm repurposing it as a quick PhD update as I review my call for interview participants. I'm in the middle of interviews at the moment (and if you're an academic historian working on British history 1600-1900 who might be willing to be interviewed I'd love to hear from you) and after that I'll no doubt be taking stock of the research landscape, the findings from my interviews and project analyses, and updating the shape of my project as we go into the new year. So it doesn't quite reflect where I'm at now, but at the very least it's an insight into the difficulties of research into digital history methodologies when everything is changing so quickly:

"Originally I was going to build a tool to support something like crowdsourced deep mapping through a web application that would let people store and geolocate documents and images they were digitising. The questions that are particularly relevant for this workshop are: what happens when crowdsourcing or citizen history meet deep mapping? Can a deep map created by multiple people for their own research purposes support scholarly work? Can a synthetic, ad hoc collection of information be used to support an argument or would it be just for the discovery of spatio-temporarily relevant material? How would a spatial narrative layer work?

I planned to test this by mapping the lives and intellectual networks of early scientific women. But after conducting a big review of related projects I eventually realised that there's too much similar work going on in the field and that inevitably something similar would have been created by someone with more resources by the time I was writing up. So I had to rethink my question and my methods.

So now my PhD research seeks to answer 'how do academic and family/local historians evaluate, use and contribute to crowdsourced resources, especially geo-located historical materials?', with the goal of providing some insight into the impact of digitality on research practices and scholarship in the humanities. … How do trained and self-taught historians cope with changes in place names and boundaries over time, and the many variations and similarities in place names? Does it matter if you've never been to the place and don't know that it might be that messy and complex?

I'm interested in how living in a digital culture affects how researchers work. What does it mean to generate as well as consume digital data in the course of research? How does user-created content affect questions of authorship, authority and trust for amateur historians and scholarly practice? What are the characteristics of a well-designed digital resource, and how can resources and tools for researchers be improved? It's a very Human-Computer Interaction/Informatics view of the digital humanities but it addresses the issues around discoverability and usability that are so important for people building projects.

I'm currently interviewing academic, family and local historians, focusing on those working on research on people or places in early modern England – very loosely defined, as I'll go 1600-1900. I'm asking them about the tools they currently use in their research; how they assess new resources; if or when they might use a resource created through crowdsourcing or user contributions (e.g. Wikipedia or ancestry.com); how they work out which online records to trust; and how they use place names or geographic locations in their research.

So far I've mostly analysed the interviews for how people think about crowdsourcing; I'll be focusing on the responses to place when I get back.

More generally, I'm interested in the idea of 'chorography 2.0' – what would it look like now? The abundance of information is as much of a problem as an opportunity: how to manage that?"