April 2013 – Open Objects

Does 'slow art day' work online?

Saturday was 'slow art day', and the Getty Museum (@GettyMuseum) shared a Robert Hughes clip that really resonated with me:

'We have had a gutful of fast art and fast food. What we need more of is slow art: art that holds time as a vase holds water: art that grows out of modes of perception and whose skill and doggedness make you think and feel; art that isn't merely sensational, that doesn't get its message across in 10 seconds, that isn't falsely iconic, that hooks onto something deep-running in our natures. In a word, art that is the very opposite of mass media.'

I was tied to my desk writing that day so I wondered how I could have a similar experience: can you 'do' slow art online? Assuming you can switch off all the other distractions of email, social media, flashing ads, etc, and ignore the fact that your house, office or library is full of other tasks and temptations, can you slow down and sit in front of one art work and have a similar experience through an image on a screen, or does being in a gallery add something to the process? On the other hand, high-resolution images and reflectance transformation imaging (RTI) mean you can see details you'd never see in a gallery so you can explore the artwork itself more deeply*. And to remove the screen from the equation, would looking at a really good print of a painting be as rewarding as looking at the original? And what of installations and sculpture?

Related to that, I've been wondering how to relate online collections (whether thematic, exhibition-style or old school catalogues) to audience motivations for visiting museums. I've just been reading a great overview of people's motivations for visiting museums in Dimitra Christidou's Re-Introducing Visitors: Thoughts and Discussion on John Falk’s Notion of Visitors’ Identity-Related Visit Motivations. Christidou summarises Falk and Storksdieck's 2005 research on 'museum-specific identities' reflecting visitor motivations:

Explorers are driven by their personal curiosity, their urge to discover new things.
Facilitators visit the museum on behalf of others’ special interests in the exhibition or the subject-matter of the museum.
Experience seekers are these visitors who desire to see and experience a place, such as tourists.
Professional hobbyists are those with specific knowledge in the subject matter of an exhibition and specific goals in mind.
Rechargers seek a contemplative or restorative experience, often to let some steam out of their systems.

Once I'd gotten past the amusing mental image of Facebook's Mark Zuckerberg's head exploding at the concept of 'big' and 'small' online identities that change according to context, interests, motivations, etc**, I thought the article provided a useful framework for returning to the question of 'what are museum websites for?'. We can safely assume that most gallery sites consider the needs of 'professional hobbyists', but what of the other motivations? Some of these motivations are embedded in social experiences – do art sites enable multi-user experiences online, or do they assume that 'sharing' or facilitation only happens via social media? Does looking at art online go deep enough to count as an 'experience'? And how much of the 'recharging' experience is tied to the act of getting to a particular space at a particular time, or to the affordances of the space itself and its physical separation from most distractions of the world?

What new motivations should be added for online experiences of museum exhibitions and objects? What's enabled by the convenience, accessibility and discoverability of art online? And to return to slow art, how can museums use text and design to cue people to slow down and look at art for minutes at a time without getting in the way of people who want a quick experience? (And is this the same basic question I'd asked earlier about 'enabling punctum' or 'what's the effect of all this aggregation of museum content on the user experience'?)

* Assuming you don't look so closely that you slip into 'inappropriate peering'.
** I'm sure Zuckerberg knows people have different identities in different situations, it's just more convenient for Facebook not to care. Christopher 'moot' Poole opposed this push quite well in a series of talks in 2011.

Notes from 'Crowdsourcing in the Arts and Humanities'

Last week I attended a one-day conference, 'Digital Impacts: Crowdsourcing in the Arts and Humanities' (#oxcrowd), convened by Kathryn Eccles of the Oxford Internet Institute, and I'm sharing my (sketchy, as always) notes in the hope that they'll help people who couldn't attend.

Stuart Dunn reported on the Humanities Crowdsourcing scoping report (PDF) he wrote with Mark Hedges and noted that if we want humanities crowdsourcing to take off we should move beyond crowdsourcing as a business model and look to form, nurture and connect with communities. Alice Warley and Andrew Greg presented a useful overview of the design decisions behind the Your Paintings Tagger and sparked some discussion on how many people need to view a painting before it's 'completed', and the differences between structured and unstructured tagging. Interestingly, paintings can be 'retired' from the Tagger once enough data has been gathered – I personally think the inherent engagement in tagging is valuable enough to keep paintings taggable forever, even if they're not prioritised in the tagging interface. Kate Lindsay brought a depth of experience to her presentation on 'The Oxford Community Collection Model' (as seen in Europeana 1914-1918 and RunCoCo's 2011 report on 'How to run a community collection online' (PDF)). Some of the questions brought out the importance of planning for sustainability in technology, licences, etc, and the role of existing networks of volunteers with the expertise to help review objects on the community collection days. The role of the community in ensuring the quality of crowdsourced contributions was also discussed in Kimberly Kowal's presentation on the British Library's Georeferencer project. She also reflected on what she'd learnt after the first phase of the Georeferencer project, including that the inherent reward of participating in the activity was a bigger motivator than competitiveness, and the impact on the British Library itself, which has opened up data for wider digital uses and has more crowdsourcing projects planned.

I gave a paper which was based on an earlier version, The gift that gives twice: crowdsourcing as productive engagement with cultural heritage, but pushed my thinking about crowdsourcing as a tool for deep engagement with museums and other memory organisations even further. I also succumbed to the temptation to play with my own definitions of crowdsourcing in cultural heritage: 'a form of engagement that contributes towards a shared, significant goal or research question by asking the public to undertake tasks that cannot be done automatically' or 'productive public engagement with the mission and work of memory institutions'.

Chris Lintott of Galaxy Zoo fame shared his definition of success for a crowdsourcing/citizen science project: it has to produce results of value to the research community in less time than could have been done by other means (i.e. it must have been able to achieve something with the crowd that it couldn't have without them) and discussed how the Ancient Lives project challenged that at first by turning 'a few thousand papyri they didn't have time to transcribe into several thousand data points they didn't have time to read'. While 'serendipitous discovery is a natural consequence of exposing data to large numbers of users' (in the words of the Citizen Science Alliance), they wanted a more sophisticated method for recording potential discoveries experts made while engaging with the material and built a focused 'talk' tool which can programmatically filter out the most interesting unanswered comments and email them to their 30 or 40 expert users. They also have Letters for more structured, journal-style reporting. (I hope I have that right.) He also discussed decisions around full text transcriptions (difficult to automatically reconcile) vs 'rich metadata', or more structured indexes of the content of the page, which contain enough information to help historians decide which pages to transcribe in full for themselves.
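To make the 'talk' tool idea a bit more concrete, here is a minimal sketch of how filtering unanswered comments for a small group of experts might work. This is not the Zooniverse implementation: the talk didn't say how 'most interesting' is measured, so the `flags` count, the `Comment` fields and the function names are all my own assumptions for illustration, and the sketch stops short of actually sending any email.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    comment_id: int
    text: str
    replies: int  # number of replies the comment has received so far
    flags: int    # hypothetical count of volunteers marking it "interesting"


def pick_for_experts(comments, limit=3):
    """Return up to `limit` unanswered comments, most-flagged first."""
    unanswered = [c for c in comments if c.replies == 0]
    # Sort by flags (descending), then by id so ties come out in a stable order.
    ranked = sorted(unanswered, key=lambda c: (-c.flags, c.comment_id))
    return ranked[:limit]


def build_digest(comments, experts, limit=3):
    """Map each expert address to the ids of the comments they should see."""
    chosen = [c.comment_id for c in pick_for_experts(comments, limit)]
    return {expert: list(chosen) for expert in experts}
```

In this sketch every expert gets the same short list; a real tool would presumably match comments to each expert's specialism, which is exactly the 'right obscure item in front of the right user' problem.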

Some other thoughts that struck me during the day… humanities crowdsourcing has a lot to learn from the application of maths and logic in citizen science – lots of problems (like validating data) that seem intractable can actually be solved algorithmically, and citizen science's hypothesis-based approach to testing task and interface design would help humanities projects. Niche projects help solve the problem of putting the right obscure item in front of the right user (which was an issue I wrestled with during my short residency at the Powerhouse Museum last year – in hindsight, building niche projects could have meant a stronger call-to-action and no worries about getting people to navigate to the right range of objects). The variable role of forums and participants' relationship to the project owners and each other came up at various points – in some projects, interactions with a central authority are more valued, in others, community interactions are really important. I wonder how much it depends on the length and size of the project? The potential and dangers of 'gamification' and 'badgeification' and their potentially negative impact on motivation were raised. I agree with Lintott that games require a level of polish that could mean you'd invest more in making them than you'd get back in value, but as a form of engagement that can create deeper relationships with cultural heritage and/or validate some procrastination over a cup of tea, I think they potentially have a wider value that balances that.

I was also asked to chair the panel discussion, which featured Kimberly Kowal, Andrew Greg, Alice Warley, Laura Carletti, Stuart Dunn and Tim Causer. Questions during the panel discussion included:

'what happens if your super-user dies?' (Super-users or super contributors are the tiny percentage of people who do most of the work, as in this Old Weather post) – discussion included mass media as a numbers game, the idea that someone else will respond to the need/challenge, and asking your community how they'd reach someone like them. (This also helped answer the question 'how do you find your crowd?' that came in from Twitter.)
'have you ever paid anyone?' Answer: no
'can you recruit participants through specialist societies?' From memory, the answer was 'yes but it does depend'.
something like 'have you met participants in real life?' – answer, yes, and it was an opportunity to learn from them, and to align the community, institution, subject and process.
'badgeification?'. Answer: the quality of the reward matters more than the levels (so badges are probably out).
'what happens if you force students to work on crowdsourcing projects?' – one suggestion was to look for entries on Transcribe Bentham in a US English class blog
'what's happened to tagging in art museums, where's the new steve.museum or Brooklyn Museum?' – is it normalised and not written about as much, or has it declined?
'how can you get funding for crowdsourcing projects?'. One answer – put a good application in to the Heritage Lottery Fund. Or start small, prove the value of the project and get a larger sum. Other advice was to be creative or use existing platforms. Speaking of which, last year the Citizen Science Alliance announced 'the first open call for proposals by researchers who wish to develop citizen science projects which take advantage of the experience, tools and community of the Zooniverse. Successful proposals will receive donated effort of the Adler-based team to build and launch a new citizen science project'.
'can you tell in advance which communities will make use of a forum?' – a great question that drew on various discussions of the role of communities of participants in supporting each other and devising new research questions
a question on 'quality control' provoked a range of responses, from the manual quality control in Transcribe Bentham to the high number of Taggers initially required for each painting in Your Paintings, which slowed things down, and led into a discussion of shallow vs deep interactions
the final questioner asked about documenting film with crowdsourcing and was answered by someone else in the audience, which seemed a very fitting way to close the day.

James Murray in his Scriptorium with thousands of word references sent in by members of the public for the first Oxford English Dictionary. Early crowdsourcing?

If you found this post useful, you might also like Frequently Asked Questions about crowdsourcing in cultural heritage or my earlier Museums and the Web paper on Playing with Difficult Objects – Game Designs to Improve Museum Collections.

'An (even briefer) history of open cultural data' at GLAM-Wiki 2013

These are some of my notes for my invited plenary talk at GLAM-Wiki 2013 (Galleries, Libraries, Archives, Museums & Wikimedia, #GLAMWiki), held at the British Library on April 12-13, 2013. I don't think I stuck that closely to them on the day, and in the interests of brevity I've left out the 'timeline' bits (but you can read about some of them in a related MuseumID article, 'Where next for open cultural data in museums?') to focus on the lessons to be learnt from changes so far. There were lots of great talks and discussion at the event; you can view some of the presentations on Wikimedia UK's YouTube channel.

A (now very) brief history of open cultural data

Firstly, thank you for the invitation to speak… This morning I want to highlight some key moments of change in the history of open cultural data – a history not only of licenses and data, but also of conversations, standards, and collaborations, of moments where things changed… I've included key moments from funders, legislative influences and the commercial sector too, as they create the context in which change happens and often have an effect on what's considered possible. I'll close by considering some of the lessons learnt.

[Please help improve this talk]

A caveat – there may well be a bias towards the English-speaking world (and to museums, because of my background). If you know of an open GLAM (gallery, library, archive, museum) data source I've missed, you can add it to the open cultural data/GLAM API wiki… or Lotte Belice's timeline of open culture milestones.

Definitions

'open cultural data' is data from cultural institutions that is made available for use in a machine-readable format under an open licence. But each word in open, cultural, data is slightly more complicated so I'll unpack them a little…

Open

Office clerks, FNV. Voorlichting.

While the degree of openness required to be 'open' data can be contentious, at its simplest, 'open' refers to content that is available for use outside the institution that created it, whether for school homework projects, academic monographs or mobile phone apps. 'Open' may refer to licences that clarify the permissions and restrictions placed on data, or to the use of non-proprietary digital technologies, or ideally, to a combination of both open licences and technologies.

Ideally, open data is freely available for use and redistribution by anyone for any purpose, but in reality there are often restrictions. GLAMs may limit commercial use by licensing content for 'non-commercial use only', but as there is no clear definition of 'non-commercial use' in Creative Commons licences, some developers may choose not to risk using a dataset with an unclear licence. GLAMs may also release data for commercial use but still require attribution, either to help retain the provenance of the content, to help people find their way to related content or just because they'd like some credit for their work. GLAMs might also release data under custom licences that deal with their specific circumstances, but they are then difficult to integrate with content from other openly-licensed datasets.

Hybrid licensing models are a pragmatic solution for the current environment. They at least allow some use and may contribute to greater use of open cultural data while other issues are being worked out. For example, some institutions in the UK are making lower-resolution images available for re-use under an open licence while reserving high-resolution versions for commercial sales and licensing. Or they may differentiate between scholarly and commercial use, or use more restrictive licences for commercially valuable images and release everything else openly.

I think this type of access is better than nothing, particularly if organisations can learn from the experience and release more data next time. Because these hybrid models are often experimental, their reception is important, and it's helpful for GLAMs to be able to show they've had a positive impact and hopefully helped create relationships with groups like Wikipedia.

Cultural

Cultural data is data about objects, publications (such as books, pamphlets, posters or musical scores), archival material, etc, created and distributed by museums, libraries, archives and other organisations.

Data

It's a useful distinction to discuss early with other cultural heritage staff as it's easy to be talking at cross-purposes: data can refer to different types of content, from metadata or tombstone records (the basic titles, names, dates, places, materials, etc of a catalogue record), to entire collection records (including data such as researched and interpretive descriptions of objects, bibliographic data, related themes and narratives) to full digital surrogates of an object, document or book as images or transcribed text. Some organisations release open metadata, others release all their data including their images. If you can't do open data (full content or 'digital surrogates' like photographs or texts) then at least open up the metadata (data about the content) as e.g. CC0 and the rest with another licence. Releasing data may involve licensing images; offering downloads from catalogue sites; 'content donations'; APIs and machine-facing interfaces; term lists, etc. Much of the data that isn't images isn't immediately interesting, and may be designed for inter-collections interoperability or mashups rather than media commons.
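As a rough illustration of the 'metadata as CC0, the rest with another licence' approach, here is a minimal sketch that splits a single collection record by licence. The field names, the sample record and the choice of 'CC BY-NC' for the non-metadata content are purely my assumptions for the example, not any institution's actual schema or policy.

```python
# Hypothetical set of "tombstone" fields that would be released as CC0.
METADATA_FIELDS = {"title", "maker", "date", "place", "materials"}


def licence_for(field, content_licence="CC BY-NC"):
    """Basic metadata gets CC0; everything else gets the content licence."""
    return "CC0" if field in METADATA_FIELDS else content_licence


def split_record(record, content_licence="CC BY-NC"):
    """Group a record's fields under the licence each would be released with."""
    released = {"CC0": {}, content_licence: {}}
    for field, value in record.items():
        released[licence_for(field, content_licence)][field] = value
    return released


# Invented sample record, used only to show the shape of the output.
record = {
    "title": "Teapot",
    "date": "c. 1760",
    "description": "A researched, interpretive description of the object.",
    "image": "teapot-high-res.jpg",
}
released = split_record(record)
```

Here `released["CC0"]` holds the title and date, while the description and image sit under the more restrictive licence, so a developer can at least build on the basic metadata without worrying about unclear terms.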

Why is open cultural data important?

Before I go on, why do we care? Open cultural data is the foundation on which many projects can be built. It helps achieve organisational goals and mission; can help increase engagement with content; can create a 'network effect' with related institutions; can be re-used by people who share your goals around access to knowledge and information – people like Wikipedians.

Some key moments in open cultural data

Events I discussed included the founding of Wikimedia, Europeana and Flickr Commons, previous GLAM-Wiki conferences, changes in licences for art images, library catalogue records and museum content, GLAM APIs and linked data services and the launch of the Digital Public Library of America next week.

Lessons learnt

Many of the changes are the results of years of conversation and collaboration – change is slow but it does happen. GLAMs work through slow iterations – try something, and if no-one dies, they'll try something else. We are all ambassadors, and we are all translators, helping each domain understand the other.

Contradictory things GLAMs are told they must do

Give content away for the benefit of all
Monetise assets; protect against loss of potential income; protect against mis-use of collections; conserve collections in perpetuity; protect the IP of artists; demonstrate ROI on digitisation

It's not easy for GLAMs to release all their data under an entirely open licence, but they don't do it just to be annoying – it's important to understand some of the pressures they're under. For example, GLAMs usually need to be able to track uses of their data and content to show the impact of digitising and publishing content, so they prefer attribution licences.

The issue of potential lost income – imaginary money that could be made one day if circumstances change, or profit that someone else makes off their opened data – is particularly hard to deal with [and here I ad-libbed, saying that it was like worrying about failing to meet the love of your life because you got on a different tube carriage – you can't live your life chasing ghosts]. Ideally, open data needs to be understood as an input to the creative economy rather than an item on the balance sheet of an individual GLAM.

GLAMs worry about reputational damage, whether appearing on the front page of a tabloid newspaper for the 'wrong' reasons, questions being asked in Parliament, or critique from Wikipedians. Over time, their mindset is changing from keeping 'our data' to being holders, custodians of our shared heritage.

Conversations, communities, collaborations

Conversations matter… we're all working towards the same goal, but we have different types of anxieties and different problems we have to address.

GLAMs are about collections, knowledge, and audiences. Unlike most online work, they are used to seeing the excitement people experience walking through their door – help GLAMs understand what Wikipedians can do for different audiences by making those audiences real to them. GLAMs are also used to being wined and dined before you lay the hard word on them. Just because you don't need to ask for permission to use content doesn't mean you shouldn't start a conversation with an organisation. There are lots of people with similar goals inside organisations, so try to find them and work with them. Trust is a currency, don't blow it!

Being truly collaborative sometimes means compromising (or picking your battles) and it definitely means practising empathy. Open data people could stop talking about open data as something you *do* to GLAMs, and GLAMs could stop thinking open data people just want to make their lives difficult.

The role of higher powers

Government attitudes to open data make a big difference and they can also change the risks associated with publishing orphan works. Governments can also help GLAMs open up their content by indemnifying them against the chance that someone else will monetise their data – consider it not a failure of the GLAM but a contribution to the creative and digital economy.

Things that are better than a poke in the eye with a sharp stick

Kittens (and puppies)
Cultural data that's available online but isn't (yet) openly licensed
Cultural data online that is licensed for non-commercial use

Yes, the last two aren't ideal, but they are a great deal better than nothing.

Into the future…

GLAMs and Wikipedians may move at different paces, and may have different priorities and different ways of viewing the world, but we're all working towards the same goals. Not everything is open, but a lot more is open than it used to be. I sensed yesterday [the first day of the conference] that there are still some tensions between Wikimedians and GLAMers, moments when we need to take a deep breath and put empathy before a pithy put-down, but I loved that Kat Walsh's welcome yesterday described how Wikipedia used to focus on how it was different from others but now focuses on reaching out to others and figuring out how we're the same.

GLAMs and Wikipedians have already used open cultural data to make the world a better place. Let's celebrate the progress we've made and keep working on that…

GLAM-WIKI 2013 Friday attendees photograph by Mike Peel (www.mikepeel.net).

Congratulations to everyone who helped make it a great event, but particularly to Daria Cybulska and Andrew Gray (@generalising) for making everything work so smoothly, and Liam Wyatt (@wittylama) for the original invitation to speak.