Notes from Culture Hack Day (#chd11)

Culture Hack Day (#chd11) was organised by the Royal Opera House (the team being @rachelcoldicutt, @katybeale, @beyongolia, @mildlydiverting, @dracos – and congratulations to them all on an excellent event). As well as a hack event running over two days, they had a session of five-minute 'lightning talks' on Saturday, with generous time for discussion between sessions. This worked quite well for providing an entry point to the event for the non-technical, and some interesting discussion resulted from it. My notes are particularly rough this time as I have one arm in a sling and typing up my hand-written notes is slow.

Lightning Talks
Tom Uglow @tomux “What if the Web is a Fad?”
'We're good at managing data but not yet good at turning it into things that are more than points of data.' The future is about the physical world, making things real and touchable.

Clare Reddington, @clarered, “What if We Forget about Screens and Make Real Things?”
Some ace examples of real things: Dream Director; Nuage Vert (the power consumption of Helsinki was projected onto the smoke plume of a power station – it changed people's behaviour through ambient augmentation of the city); Tweeture (a conch, a 'permission object' designed to get people looking up from their screens and starting conversations); the National Vending Machine from a Dutch museum.

Leila Johnston, @finalbullet talked about why the world is already fun, and looking at the world with fresh eyes. Chromaroma made Oyster cards into toys, playing with our digital footprint.

Discussion kicked off by Simon Jenkins about helping people 'get it' (the benefits of open data etc) – CR: it's about organisational change, fears about transparency, directors don't come to events like this. Understand what's meant by value – cultural and social as well as economic. Don't forget audiences; it has to be meaningful for the people we're making it (cultural products) for.

Comment from @fidothe: cultural heritage orgs have been screwed over by software companies. There's a disconnect between beautiful hacks around the edges and things that make people's lives easier. [Yes! People who work in cultural heritage orgs often have to deal with clunky tools, difficult or vendor-dependent data export processes, and agencies that over-promise and under-deliver. In my experience, cultural orgs don't usually have the internal skills for scoping and procuring software or selecting agencies, so of course they get screwed over.]

TU: desire to be tangible is becoming more prevalent, data to enhance human experience, the relationship between culture and the way we live our lives.

CR: don't spend the rest of the afternoon reinforcing silos, there shouldn't be a dichotomy between cultural heritage people and technologists. [Quick plug for http://museum30.ning.com/, http://groups.google.com/group/antiquist, http://museum-api.pbwiki.com/ and http://museumscomputergroup.org.uk/email-list/ as places where people interested in the intersection between cultural heritage and technology can mingle – please let me know of any others!] Mutual respect is required.

Tom Armitage, @infovore “Sod big data and mashups: why not hack on making art?”
Making culture is more important than using it. Three trends: 1) collection – tools to slice and dice across time or themes; 2) magic materials; 3) mechanical art, which displays the shape of the original content; 3a) satire – @kanyejordan, 'a joke so good a machine could make it'.

Tom Dunbar, @willyouhelp – story-telling possibilities of metadata embedded in media e.g. video [check out Waisda? for a game designed to get metadata added to audio-visual archives]. Metadata could be actors, characters, props, action…

Discussion [?]: remixing in itself isn't always interesting. Skillful appropriation across formats… A universe of editors and filterers, not only creators. 'In editing you end up making new things'.

Matthew Somerville, @dracos, Theatricalia, “What if You Never Needed to Miss a Show?”
'Quite selfish', makes things he needs. Wants not to miss theatre productions with people he likes in/working on them. Theatricalia also collects stories about productions. [But in discussion it came up that the National Theatre asked him to remove data – why?! A recommendation system would definitely get me seeing more theatre, and I say that as a fairly regular but uninformed theatre-goer who relies on word-of-mouth to decide where to spend ticket money.]

Nick Harkaway, @Harkaway on IP and privacy
IP as a way of ringfencing intangible ideas, requiring consent to use. Privacy is the same. Not exciting, kind of annoying, but we need to find ways to make it work more smoothly while still providing protection. 'Buying is voting' – if you buy from Tesco, you are endorsing their policies. 'Code for the change you want to see in the world' – build the tools you want cultural orgs to have so they can do better. [Update: Nick has posted his own notes at Notes from Culture Hack Day. I really liked the way he brought ethical considerations to hack enthusiasm for pushing the boundaries of what's possible – the ability to say 'no' is important even if it's a pain for others.]

Chris Thorpe, @jaggeree. ArtFinder, “What if you could see through the walls of every museum and something could tell you if you’d like it?”

Culture for people who don't know much about culture. Cultural buildings obscure the content inside, stop people being surprised by what's available. It's hard if you don't know where to start. Go for user-centric information. Government Art Collection Explorer – ace! Wants an angel for art galleries to whisper information about the art in his ear. Wants people to look at the art, not the screen of their device [museums also have this concern]. SAP – situated audio platform. Wants a 'flight data recorder' for trips around cultural places.

Discussion around causes of fear and resistance to open data – what do cultural orgs fear and how can they learn more and relax? Fear of loss of provenance – response was that for developers displaying provenance alongside the data gives it credibility; counter-response was that organisations don't realise that's possible. [My view is that the easiest way to get this to change is to change the metrics by which cultural heritage organisations are judged, and resolve the tension between demands to commercialise content to supplement government grants and demands for open access to that same data. Many museums have developed hybrid 'free tombstone, low-res, paid-for high-res' models to deal with this, but it's taken years of negotiation in each institution.] I also ranted about some of these issues at OpenTech 2010, notes at 'Museums meet the 21st century'.

Other discussion and notes from twitter – re soap/drama characters tweeting – I managed to out myself as a Neighbours watcher but it was worth it to share that Neighbours characters tweet and use Facebook. Facebook relationship status updates and events have been included as plot points, and references are made to twitter but not to the accounts of the characters active on the service. I wonder if it's script writers or marketing people who write the characters' tweets? They also tweet in sync with the Australian showings, which raises issues around spoilers and international viewers.

Someone said 'people don't want to interact with cultural institutions online. They want to interact with their content' but I think that's really dependent on the definition of content – as pointed out, points of data have limited utility without further context. There's a catch-22 between cultural orgs not yet making really engaging data and audiences not yet demanding it; hopefully hack days like CHD11 help bridge the gap and turn data into stories and other meaningful content. We're coming up against the limits of what can be done programmatically, especially given the variation in quality and extent of cultural heritage data (and most of it is data rather than content).

[Update: after writing this I found a post The lightning talks at Culture Hack Day about the day, which happily picks up on lots of bits I missed. Oh, and another, by Roo Reynolds.]

After the lightning talks I popped over the road to check out the hacking and ended up getting sucked in (the lure of free pizza had a powerful effect!).  I worked on a WordPress plugin with Ian Ibbotson @ianibbo that lets you search for a term on the Culture Grid repository and imports the resulting objects into my museum metadata games so that you can play with objects based on your favourite topic.  I've put the code on github [https://github.com/mialondon/mmg-import] and will move it from my staging server to live over the next few days so people can play with the objects.  It's such a pain only having one hand, and I'm very grateful to Ian for the chance to work together and actually get some code written.  This work means that any organisation that's contributed records to the Culture Grid can start to get back tags or facts to enhance their collections, based on data generated by people playing the games.  The current 300-ish objects have about 4400 tags and 30 facts, so that's not bad for a freebie. OTOH, I don't know of many museums with the ability to display content created by others on their collections pages or store it in their collections management systems – something for another hack day?
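
For the curious, here's a minimal sketch of the kind of search-and-import the plugin does. The endpoint URL, query parameters, response fields and post type below are assumptions for illustration only – see the github repository for the actual code.

```php
<?php
// Sketch only: search an aggregator for a term and import the results as
// WordPress posts. The endpoint URL, query parameters and response fields
// below are assumptions for illustration, not the real Culture Grid API or
// the actual plugin code.
function mmg_import_objects( $term ) {
    $url = add_query_arg(
        array( 'query' => urlencode( $term ), 'format' => 'json' ),
        'http://example.org/culturegrid/search' // hypothetical endpoint
    );
    $response = wp_remote_get( $url );
    if ( is_wp_error( $response ) ) {
        return $response;
    }
    $data    = json_decode( wp_remote_retrieve_body( $response ), true );
    $records = isset( $data['records'] ) ? $data['records'] : array(); // assumed field name
    foreach ( $records as $record ) {
        // One post per object, so the metadata games can serve it up.
        $post_id = wp_insert_post( array(
            'post_type'   => 'mmg_object', // assumed custom post type
            'post_title'  => $record['title'],
            'post_status' => 'publish',
        ) );
        // Keep the source identifier so tags and facts can be sent back later.
        update_post_meta( $post_id, 'source_id', $record['id'] );
        update_post_meta( $post_id, 'image_url', $record['thumbnail'] );
    }
}
```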

Something I think I'll play around with a bit more is the idea of giving cultural heritage data a quality rating as it's ingested.  We discussed whether the ratings would be local to an app (as they could be based on the particular requirements of that application) or generalised and recorded in the CultureGrid service.  You could record the provenance of a rating, which might combine the benefits of both approaches.  At the moment, my requirements for a 'high quality' record would be: title (e.g. 'The Ashes trophy', if the object has one), name or type of object (e.g. cup), date, place, a decent-sized image, and a description.
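
As a starting point, a toy rating function along those lines might look like the sketch below; the field names, thresholds and equal weighting are my own assumptions rather than anything agreed with Culture Grid.

```php
<?php
// Toy quality rating for an ingested record: one point per field that I'd
// treat as part of a 'high quality' record. The field names, thresholds and
// equal weighting are assumptions, not any CultureGrid standard.
function mmg_rate_record( array $record ) {
    $checks = array(
        'title'       => ! empty( $record['title'] ),
        'object_type' => ! empty( $record['object_type'] ),  // e.g. 'cup'
        'date'        => ! empty( $record['date'] ),
        'place'       => ! empty( $record['place'] ),
        'image'       => ! empty( $record['image_width'] ) && $record['image_width'] >= 500,
        'description' => ! empty( $record['description'] ) && strlen( $record['description'] ) >= 100,
    );
    // Fraction of desired fields present: 0.0 (empty record) to 1.0 (complete).
    return count( array_filter( $checks ) ) / count( $checks );
}
```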

Finally, if you're interested in hacking around cultural heritage data, there's also historyhackday next weekend. I'm hoping to pop in (dependent on fracture and MSc dissertation), not least because in March I'm starting a PhD in digital humanities, looking at participatory digitisation of geo-located historical material (i.e. getting people to share the transcriptions and other snippets of ad hoc digitisation they do as part of their research) and it's all hugely relevant.

What would a digital museum be like if there was never a physical museum?

This is partly an experiment in live-blogging a conversation that's mostly happening on twitter – in trying to bridge the divide between conversation that anyone can jump into, and a sometimes intimidating comment box on an individual blog; and partly a chance to be brave about doing my thinking in public and posing a question before I've worked out my own answer…

I've been thinking about the question 'if physical museums were never invented, how would we have invented digital museums?' for a while (I was going to talk about this at GLAM-WIKI but decided not to subject people to a rambling thought piece exploring the question).  By this I don't mean a museum without objects, rather 'what if museums weren't conceived as central venues?'.  Today, in the spirit of avoiding a tricky bit of PHP I have to deal with on my day off, I tweeted: "Museums on the web, social media, apps – stories in your everyday life; visiting physical museum – special treat, experience space, objects?".  By understanding how the physical museum has shaped our thinking, can we come up with models that make the most of the strengths, and minimise the weaknesses, of digital and physical museums? How and where can people experience museum collections, objects, stories, knowledge? How would the phenomenology of a digital museum, a digital object, be experienced?

And what is a 'museum' anyway, if it's not represented by a building?  In another twitter conversation, I realised my definition is something like: museums are for collections of things and the knowledge around them.

Then a bit of explanation: "Previous tweet is part of me thinking re role of digital in museums; how to reconcile internal focus on physical with reach of digital etc" (the second part has a lot to do with a new gallery opening today at work, and casting my mind back to the opening of Who Am I? and Antenna in June).

Denver Art Museum's Koven J. Smith has been discussing similar questions: 'What things do museums do *exclusively* because of tradition? If you were building a museum from scratch, what would you do differently?'. My response was "a museum invented now would be conversational and authoritative – here's this thing, and here's why it's cool".


Other questions: Did the existence of the earlier model muddy our thinking?  How can we make online, mobile or app visitors as visible (and as important) as physical visitors?  (I never want to see another email talking about 'real [i.e. physical] and online' visitors).

So, what do you think?  And if you've come here from twitter, I'd be so thrilled if you bridged the divide and commented!  I'll also update with quotes from tweets but that'll probably be slower than commenting directly.

Anyway, I can see lots of comments coming in from twitter so I'm going to hit 'publish post' now…

[Update – as it turns out, 'live blogging' has mostly turned into me updating the post with clarifications, and continuing discussion in the comments. I find myself reluctant to re-contextualise people's tweets in a post, but maybe I'm just too sensitive about accidentally co-opting other people's voices/content.  If you want to share something on twitter rather than in a comment, I'm @mia_out.]

Notes on 'User Generated Content' session, Open Culture Conference 2010

My notes from the 'user generated content' parallel track on the first day of the Open Culture 2010 conference. The session started with brief presentations by panellists, then group discussions at various tables on questions suggested by the organisers. These notes are quite rough, and of course any mistakes are mine. I haven't had a chance to look for the speakers' slides yet so inevitably some bits are missing, and I can only report the discussion at the table I was at in the break-out session. I've also blogged my notes from the plenary session of the Open Culture 2010 conference.

User-generated content session, Open Culture, Europeana – the benefits and challenges of UGC.
Kevin Sumption, User-generated content, a MUST DO for cultural institutions
His background – originally a curator of computer sciences. One of the first projects he worked on at the Powerhouse was D*Hub, which presented design collections from the V&A, Brooklyn Museum and Powerhouse Museum – it was for curators but also for the general public with an interest in design. It's been a source of innovation: an editorial crowd-sourcing approach and social tagging, about 8 years ago.

Two years ago he moved to the National Maritime Museum, Royal Observatory, Greenwich. One of the first things they did was get involved with Flickr Commons – get historic photographs into the public domain, get people involved in tagging. c1000 records in there. The general public have been able to identify some images as Adam Villiers images – specialists help provide attribution for the photographer. This is only for tens of records out of the thousands, but it was a good introduction to the power of UGC.

Building hybrid exhibition experiences – astronomy photographer of the year – competition on Flickr with real world exhibition for the winners of the competition. 'Blog' with 2000 amateur astronomers, 50 posts a day. Through power of Flickr has become a significant competition and brand in two years.

Joined citizen science consortia. Galaxy Zoo. Brainchild of Oxford – getting public engaged with real science online. Solar Stormwatch c 3000 people analysing and using the data. Many people who get involved gave up science in high school… but people are getting re-engaged with science *and* making meaningful contributions.

Old Weather – helping solve real-world problems with crowdsourcing. Launched two months ago.
Passion for UGC is based around where projects can join very carefully considered consortia, bringing historical datasets with real scientific problems. Can bring large interested public to the project. Many of the public are reconnecting with historical subject matter or sciences.

Judith Bensa-Moortgat, Nationaal Archief, Netherlands, Images for the Future project
Photo collection of more than 1 million photos. The Images for the Future project aims to save audio-visual heritage through the digitisation and conservation of 1.2 million photos.

Once digitised, they optimise by adding metadata and context. They have their own documentalists who can add metadata, but it would take years to go through it all, so they decided to try using the online community to help enrich the photo collections. Using existing platforms like Wikipedia, Flickr and OpenStreetMap, they aim to retrieve contextual info generated by the communities.  They donated political portraits to Wikimedia Commons and within three weeks more than half had been linked to relevant articles.

Their experiences with Flickr Commons – they joined in 2008. Main goal was to see if community would enrich their photos with comments and tags. In two weeks, they had 400,000 page views for 400 photos, including peaks when on Dutch TV news. In six months, they had 800 photos with over 1 million views. In Oct 2010, they are averaging 100,000 page views a month; 3 million overall.

But what about comments etc? Divided them into categories of comments [with percentage of overall contributions]:

  • factual info about location, period, people 5%; 
  • link to other sources e.g. Wikipedia 5%; 
  • personal stories/memories (e.g. someone in image was recognised); 
  • moral discussions; 
  • aesthetic discussions; 
  • translations.

The first two are most important for them.
13,000 tags in many languages (unique tags or total?).
10% of the contributed UGC was useful for contextualisation; tags ensure accessibility [discoverability?] on the web; increased (international) visibility. [Obviously the figures will vary for different projects, depending on what the original intent of the project was]

The issues she'd like to discuss are – copyright, moderation, platforms, community.

Mette Bom, 1001 Stories about Denmark
Story of the day is one of the 1001 stories. It's a website about the history and culture of Denmark. The stories have themes, are connected to a timeline.  Started with 50 themes, 180 expert writers writing the 1001 stories, now it's up to the public to comment and write their own stories. Broad definition of what heritage is – from oldest settlement to the 'porn street' – they wanted to expand the definition of heritage.

Target audiences – tourists going to those places; local dedicated experts who have knowledge to contribute. Wanted to take Danish heritage out of museums.

They've created the main website, mobile apps, widget for other sites, web service.  Launched in May 2010.  20,000 monthly users. 147 new places added, 1500 pictures added.

Main challenges – how to keep users coming back? 85% new, 15% repeat visitors (ok as aimed at tourists but would like more comments). How to keep press interested and get media coverage? Had a good buzz at the start cos of the celebrities. How to define participation? Is it enough to just be a visitor?

Johan Oomen, Netherlands Institute for Sound and Vision, Vrije Universiteit Amsterdam. Participatory Heritage: the case of the Waisda? video labelling game.
They're using game mechanisms to get people to help them catalogue content. [sounds familiar!]
'In the end, the crowd still rules'.
Tagging is a good way to facilitate time-based annotation [i.e. tag what's on the screen at different times].

Goal of game is consensus between players. Best example in heritage is steve.museum; much of the thinking about using tagging as a game came from Games with a Purpose (gwap.com).  Basic rule – players score points when their tag exactly matches the tag entered by another within 10 seconds. Other scoring mechanisms.  Lots of channels with images continuously playing.
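
As I understood it, the basic rule boils down to something like the sketch below (the ten-second window is from the talk; the data structures are my own guesses, not the real game):

```php
<?php
// Sketch of the basic Waisda? scoring rule as described in the talk: a player
// scores when their tag exactly matches a tag entered by another player within
// ten seconds. The data structures here are my own guesses.
function waisda_scores_point( $tag, $time, array $other_players_entries ) {
    foreach ( $other_players_entries as $entry ) { // each: array( 'tag' => ..., 'time' => seconds into the video )
        if ( $entry['tag'] === $tag && abs( $entry['time'] - $time ) <= 10 ) {
            return true;
        }
    }
    return false;
}
```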

Linking it to twitter – shout out to friends to come join them playing.  Generating traffic – one of the main challenges. Altruistic message 'help the archive' 'improve access to collections' came out of research with users on messages that worked. Worked with existing communities.

Results, first six months – 44,362 pageviews. 340,000 tags to 604 items, 42,068 unique tags.
Matches – 42% of tags were entered more than twice. Also looked at vocab (GTAA, Cornetto): 1/3 of the words were valid Dutch words, but only a few were part of the thesauri.  Tags were evaluated by documentalists. For a documentary film, 85% of tags were useful; for a reality series (with less semantic density) tags were less useful.

Now looking at how to present tags on the catalogue, Powerhouse Museum-style.  Experimenting with visualising terms and tag clouds showing when terms appear, which also makes it easy to navigate within the video – this would have been difficult to do with professional metadata.  Looking at 'tag gardening' – invite people to go back to their tags and click to confirm – e.g. show images with particular tags, get more points for doing it.

Future work – tag matching – synonyms and more specific terms – will get more points for more specific terms.

Panel overview by Costis Dallas, research fellow at Athena, assistant professor at Panteion University, Athens.
He wants to add a different dimension – user-generated content as it becomes an object for memory organisations. New body of resources emerging through these communication practices.
Also, we don't have a historiography anymore; memory resides in personal information devices.  Mashups, changes in information forms, complex composed information on social networks – these raise new problems for collecting – structural, legal, preservation in context, layered composition.  What do we need to do now in order to be able to make use of digital technologies in appropriate, meaningful ways in the future? New kinds of content, participatory curation are challenges for preservation.

Group discussion (breakout tables)
Discussion about how to attract users. [It wasn't defined whether it was how to attract specifically users who'll contribute content or just generally grow the audience and therefore grow the number of content creators within the usual proportions of levels of participation e.g. Nielsen, Forrester; I would also have liked to have discussed how to encourage particular kinds of contributions, or to build architectures of participation that provided positive feedback to encourage deeper levels of participation.]

Discussion and conclusions included – go with the strengths of your collections e.g. if one particular audience or content-attracting theme emerges, go with it.  Norway has a national portal where people can add content. They held lots of workshops for possible content creators; made contact with specialist organisations [from which you can take the lesson that UGC doesn't happen in a vacuum, and that it helps to invest time and resources into enabling participants and soliciting content].  Recording living history.  Physical presence in gallery, at events, is important.  Go where audiences already are; use existing platforms.

Discussion about moderation included – once you have comments, how are they integrated back into collections and digital asset management systems?  What do you do about incorrect UGC displayed on a page?  Not an issue if you separate UGC from museum/authoritative content in the interface design.  In the discussion it turned out that Europeana doesn't have a definition of 'moderation'.  IMO, it should include community management, including acknowledging and thanking people for contributions (or rather, moderation is a subset of community management).  It also includes approving or reviewing and publishing content, dealing with corrections suggested by contributors, dealing with incorrect or offensive UGC, adding improved metadata back to collections repositories.

User-generated content and trust – British Library apparently has 'trusted communities' on their audio content – academic communities (by domain name?) and 'everyone else'.  Let other people report content to help weed out bad content.

Then we got onto a really interesting discussion of which country or culture's version of 'offensive' would be used in moderating content.  Having worked in the UK and the Netherlands, I know that what's considered a really rude swear word and what's common vocabulary is quite different in each country… but would there be any content left if you considered the lowest common standards for each country?  [Though thinking about it later, people manage to watch films and TV and popular music from other countries so I guess they can deal with different standards when it's in context.]  To take an extreme content example, a Nazi uniform as memorabilia is illegal in Germany (IIRC) but in the UK it's a fancy dress outfit for a member of the royal family.

Panel reporting back from various table discussions
Kevin's report – discussion varied but similar themes across the two tables. One – focus on the call to action, why should people participate, what's the motivation? How to encourage people to participate? Competitions suggested as one solution, media interest (especially sustained). Notion of core group who'll energise others. Small groups of highly motivated individuals and groups who can act as catalysts [how to recruit, reward, retain]. Use social media to help launch project.

The 1001 Danish Stories promotional video effectively showed how easy the process of contributing content was, and that it doesn't have to be perfect (the video includes celebrities working the camera [and also being a bit daggy, which I later realised was quite powerful – they weren't cool and aloof]).
Giving users something back – it's not a one-way process. Recognition is important. Immediacy too – if participating in a project, people want to see their contributions acknowledged quickly. Long approval processes lose people.
Removal of content – when different social, political backgrounds with different notions of censorship.

Mette's report – how to get users to contribute – answers mostly to take away the boundaries, give the users more credit than we otherwise tend to. We always think users will mess things up and experts will be embarrassed by user content, but that's not the case. In 1001 they had experts correcting other experts. Trust users more, involve experts, ask users what they want. Show you appreciate users, have a dialogue, create community. Make it a part of the life and environment of users. Find out who your users are.

Second group – how Europeana can use the content provided in all its forms. Could build web services to present content from different places, linking between different applications.
How to set up goals for user activity – didn't get a lot of answers but one possibility is to start and see how users contribute as you go along. [I also think you shouldn't be experimenting with UGC without some goal in mind – how else will you know if your experiment succeeded?  It also focusses your interaction and interface design and gives the user some parameters (much more useful than an intimidating blank page)].

Judith's report (including our table) – motivation and moderation in relation to Europeana – challenging as Europeana are not the owners of the material; also dealing with multilingual collections. Culturally-specific offensive comments. Definition and expectations of Europeana moderation. Resources needed if Europeana does the moderation.
Incentives for moderation – improving data, idealism, helping with translations – people like to help translate.

Johan's report – rewards are important – place users in social charts or give them a feeling of contributing to larger thing; tap into existing community; translate physical world into digital analogue.
Institutional policy – need a clear strategy for e.g. how to integrate the knowledge into the catalogue. Provide training for staff on working with users and online tools. There's value in employing community managers to give people feedback when they leave content.
Using Amazon's Mechanical Turk for annotations…
Doing the projects isn't only of benefit in enriching metadata but also for giving insight into users – discover audiences with particular interests.

Costis commenting – if Europeana only has thumbnails and metadata, is it a missed opportunity to get UGC on more detailed content?

Is Europeana highbrow compared to other platforms like Flickr, FB, so would people be afraid to contribute? [probably – there must be design patterns for encouraging participation from audiences on museum sites, but we're still figuring out what they are]
Business model for crowdsourcing – producing multilingual resources is perfect case for Europeana.

Open to the floor for questions… Importance of local communities, getting out there, using libraries to train people. Local newspapers, connecting to existing communities.

Notes from Europeana's Open Culture Conference 2010

The Open Culture 2010 conference was held in Amsterdam on October 14 – 15. These are my notes from the first day (I couldn't stay for the second day). As always, they're a bit rough, and any mistakes are mine. I haven't had a chance to look for the speakers' slides yet so inevitably some bits are missing.  If you're in a hurry, the quote of the day was from Ian Davis: "the goal is not to build a web of data. The goal is to enrich lives through access to information".

The morning was MC'd by Costis Dallas, and there was a welcome and introduction from the chair of the Europeana Foundation before Jill Cousins (Europeana Foundation) provided an overview of Europeana. I'm sure the figures will be available online, but in summary, they've made good progress in getting from a prototype in 2008 to an operational service in 2010. [Though I have written down that they had 1 million visits in 2010, which is a lot less than many of the UK national museums – though obviously those museums have had longer to establish a brand, and a large percentage of their stats are probably in the 'visit us' areas rather than collections areas.]

Europeana is a super-aggregator, but doesn't show the role of the national or thematic aggregators or portals as providers/collections of content. They're looking to get away from a one-way model to the point where they can get data back out into different places (via APIs etc). They want to move away from being a single destination site to putting information where the user is, to continue their work on advocacy, open source code etc.

Jill discussed various trends, including the idea of an increased understanding that access to culture is the foundation for a creative economy. She mentioned a Kenneth Galbraith [?] quote on spending more on culture in a recession as that's where creative solutions come from [does anyone know the reference?]. Also, in a time of increasing nationalism, Europeana provides a counter-example of trans-European cooperation and culture. Finally, customer needs are changing as visitors move from passive recipients to active participants in online culture.

Europeana [or the talk?] will follow four paths – aggregation, distribution, facilitation, engagement.

  • Aggregation – build the trusted source for European digital cultural material. Source curated content, linked data, data enrichment, multilinguality, persistent identifiers. 13 million objects but 18-20thC dominance; only 2% of material is audio-visual [?]. Looking towards publishing metadata as linked open data, to make Europeana and cultural heritage work on the web, e.g. of tagging content with controlled vocabularies – Vikings as tagged by Irish and Norwegian people – from 'pillagers' to 'loving fathers'. They can map between these vocabularies with linked data.
  • Distribution – make the material available to the user wherever they are, whenever they want it. Portals, APIs, widgets, partnerships, getting information into existing school systems.
  • Facilitate innovation in cultural heritage. Knowledge sharing (linked data), IPR business models, policy – advocacy and public domain, data provider agreements. If you write code based on their open sourced applications, they'd love you to commit any code back into Europeana. Also, look at Europeana labs.
  • Engagement – create dialogue and participation. [These slides went quickly, I couldn't keep up]. Examples of the Great War Archive into Europe [?]. Showing the European connection – Art Nouveau works across Europe.

The next talk was Liam Wyatt on 'Peace, love and metadata', based in part on his experience at the British Museum, where he volunteered for a month to coordinate the relationship between Wikipedia as representative of the open web [might have mistyped that, it seems quite a mantle to claim] and the BM as representative of [missed it]. The goal was to build a proactive relationship of mutual benefit without requiring change in the policies or practices of either. [A nice bit of realism because IMO both sides of the museum/Wikipedia relationship are resistant to change and firmly attached to parts of their current models that are in conflict with the other.]

The project resulted in 100 new Wikipedia articles, mostly based on the BM/BBC A History of the World in 100 Objects project (AHOW). [Would love to know how many articles were improved as a result too]. They also ran a 'backstage pass' day where Wikipedians come on site, meet with curators, backstage tour, then they sit down and create/update entries. There were also one-on-one collaborators – hooking up Wikipedians and curators/museums with e.g. photos of objects requested.

It's all about improving content, focussing on personal relationships, leveraging the communities; it didn't focus on residencies (his own work), none of them are content donation projects, and every institution has different needs but can do some version of this.

[I'm curious about why it's about bringing Wikipedians into museums and not turning museum people into Wikipedians, but I guess that's a whole different project and may result from the personal relationships anyway.]

Unknown risks are accounted for and overestimated. Unknown rewards are not accounted for and underestimated. [Quoted for truth, and I think this struck a chord with the audience.]

Reasons he's heard for restricting digital access… The most common is 'preserving the integrity of the collection', but it sounds like a need to approve content so they can approve of usages. As a result he's seen convoluted copyright claims – it's an easy tool to use to retain control.

Derivative works. Commercial use. Different types of free – freedom to use, freedom to study and apply knowledge gained; freedom to make and redistribute copies; [something else].

There are only three applicable licences for Wikipedia. Wikipedia is a non-commercial organisation, but they don't accept any non-commercially licensed content as 'it would restrict the freedom of people downstream to re-use the content in innovative ways'. [But this rules out much museum content, whether rightly or not, and for varying reasons from legal requirements to preference. Licence wars (see the open source movement) are boring, but the public would have access to more museum content on Wikipedia if that restriction was negotiable. Whether that would outweigh the possible 'downstream' benefit is an interesting question.]

Liam asked the audience, do you have a volunteer project in your institution? do you have an e-volunteer program? Well, you do already, you just don't know it. It's a matter of whether you want to engage with them back. You don't have to, and it might be messy.

Wikipedia is not a social network. It is a social construction – it requires a community to exist but socialising is not the goal. Wikipedia is not user generated content. Wikipedia is community curated works. Curated, not only generated. Things can be edited or deleted as well as added [which is always a difficulty for museums thinking about relying on Wikipedia content in the long term, especially as the 'significance' of various objects can be a contested issue.]

Happy datasets are all alike; every unhappy dataset is unhappy in its own way. A good test of data is that it works well with others – technically or legally.

According to Liam, Europeana is the 21st-century version of the gallery painting – it's a thumbnail gallery, but it could be so much more if the content was technically and legally able to be re-used and integrated.
Data already has enough restrictions, e.g. copyright and donor restrictions, but if it comes without restrictions it's a shame to add them. 'Leave the gate as you found it'.

'We're doing the same thing for the same reason for the same people in the same medium, let's do it together.'

The next sessions were 'tasters' of the three thematic tracks of the second part of the day – linked data, user-generated content, and risks and rewards. This was a great idea because I felt like I wasn't totally missing out on the other sessions.

Ian Davis from Talis talked about 'linked open culture' as a preview of the linked data track. How to take practices learned from linked data and apply them to the open culture sector. We're always looking for ways to exchange info, communicate more effectively. We're no longer limited by the physicality of information. 'The semantic web fundamentally changes how information, machines and people are connected together'. The semantic web and its powerful network effects are enabling a radical transformation away from islands of data. One question is, does preservation require protection, isolation, or to copy it as widely as possible?

Conjecture 1 – data outlasts code. MARC stays forever, code changes. This implies that open data is more important than open source.
Conjecture 2 – structured data is more valuable than unstructured. Therefore we should seek to structure our data well.
Conjecture 3 – most of the value in our data will be unexpected and unintended. Therefore we should engineer for serendipity.

'Provide and enable' – UK National Archives phrase. Provide things you're good at – use unique expertise and knowledge [missed bits]… enable as many people as possible to use it – licence data for re-use, give important things identifiers, link widely.

'The goal is not to build a web of data. The goal is to enrich lives through access to information.'
[I think this is my new motto – it sums it up so perfectly. Yes, we carry on about the technology, but only so we can get it built – it's the means to an end, not the end itself. It's not about applying acronyms to content, it's about making content more meaningful, retaining its connection to its source and original context, making the terms of use clear and accessible, making it easy to re-use, encouraging people to make applications and websites with it, blah blah blah – but it's all so that more people can have more meaningful relationships with their contemporary and historical worlds.]

Kevin Sumption from the National Maritime Museum presented on the user-generated content track. A look ahead – the cultural sector and new models… User-generated content (UGC) is a broad description for content created by end users rather than traditional publishers. Museums have been active in photo-sharing, social tagging, Wikipedia editing.

Crowdsourcing e.g. – reCAPTCHA [digitising books, one registration form at a time]. His team was inspired by the approach and created a project called 'Old Weather' – people review logs of WWI British ships to transcribe the content, especially meteorological data. This fills in a gap in the meteorological dataset for 1914 – 1918, allows weather in the period to be modelled, and contributes to understanding of global weather patterns.

Also working with Oxford Uni, Rutherford Institute, Zooniverse – solar stormwatch – solar weather forecast. The museum is working with research institutions to provide data to solve real-world problems. [Museums can bring audiences to these projects, re-ignite interest in science, you can sit at home or on the train and make real contributions to on-going research – how cool is that?]

Community collecting. e.g. mass observation project 1937 – relaunched now and you can train to become an observer. You get a brief e.g. families on holidays.

BBC WW2 People's War – archive of WWII memories. [check it out]

RunCoCo – tools for people to set up community-led, community-generated projects.

Community-led research – a bit more contentious – e.g. the Guardian and MPs' expenses. Putting data in the hands of the public, trusting them to generate content. [Though if you're just getting people to help filter up interesting content for review by trusted sources, it's not that risky].

The final thematic track preview was by Charles Oppenheim from Loughborough University, on the risks and rewards of placing metadata and content on the web. Legal context – authorisation of copyright holder is required for [various acts including putting it on the web] unless… it's out of copyright, have explicit permission from rights holder (not implied licence just cos it's online), permission has been granted under licensing scheme, work has been created by a member of staff or under contract with IP assigned.

Issues with cultural objects – media rich content – multiple layers of rights, multiple rights holders, multiple permissions often required. Who owns what rights? Different media industries have different traditions about giving permission. Orphan works.

Possible non-legal ramifications of IPR infringements – loss of trust with rights holders/creators; loss of trust with public; damage to reputation/bad press; breach of contract (funding bodies or licensors); additional fees/costs; takedown of content or entire service.

Help is at hand – Strategic Content Alliance toolkit [online].

Copyright is less to do with law than with risk management – assess the risks and work out how you will minimise them.

Risks beyond IPR – defamation; liability for provision of inaccurate information; illegal materials e.g. pornography, pro-terrorism, violent materials, racist materials, Holocaust denial; data protection/privacy breaches; accidental disclosure of confidential information.

High risk – anything you make money from; copying anything that is in copyright and is commercially available.
Low risk – orphan works of low commercial value – letters, diaries, amateur photographs, films, recordings by lesser-known people.
Zero risk stuff.
Risks on the other side of the coin [aka excuses for not putting stuff up]

'Go forth and digitise' – Bill Thompson at OpenTech 2010

I've realised events like OpenTech are a bit like geek Christmas – a brief intense moment of brilliant fun with inspiring people who not only get what you're saying, they'll give you an idea back that'll push you further… then it's back to the inching progress of everyday life, but hopefully with enough of that event energy to make it all easier. Anyway, enough rambling and onto my sketchy notes from the talk. Stuff in square brackets is me thinking aloud, any mistakes are mine, etc.

Giving the Enlightenment Another Five Hundred Years, Bill Thompson
Session 3, Track A #3A
[A confession – working in a museum, and a science museum at that, I have a long-standing interest in conserving enough of the past to understand the present and plan for the future, and just because it's fascinating. It was ace to hear from someone passionate about the role of archives and cultural heritage in the defence of reason, and even more ace to see the tweets flying around as other people got excited about it too.]

The importance of the scientific method; of asking hard questions and looking for refutation not confirmation.

But surely history is all about progress – what could go wrong? But imagine President Palin… History has shown that it's possible for progress to go backwards.

What can we do? He's not speaking on behalf of the BBC here, but his job is to figure out what you can do with the BBC's archive. [Video of seeing the BBC charter – the powerful impact of holding the actual physical object is reason enough to conserve things from the past, it's an oddly visceral connection to the people who made it that I've noticed again and again while working in museums and archaeology.]

We need to remember. To remember is to understand, to resist. We need to digitise. Remembering comes along with digitising; our experience of the world is so mediated by bits that unless we make archives digital in some form, there's a real danger that they will be forgotten, inaccessible. We also need to build mechanisms so that stuff that's created now is preserved alongside the records of the past. We need to do it all. If we do it well, we'll give current and future generations the evidence they need to resist the onslaught of ignorance, the tide of unreason that's sweeping the world. We need to create reasonable digitisation of solid artefacts too.

We need to do it soon 'because the kids may not want to'. The technology exists, but he thinks there's a real danger that if it's not done in the next ten years, it won't be done; people won't realise the value of the archives and understand why it has to be done. Kids who've grown up on Google will never do the deep research that will take them to the stuff that's not digitised; non-digital stuff will fall into disuse; conservation/preservation will stop.
Don't let Google do it, they don't value the right things.

Once it's in bits, preserve the data and the artefact; catalogue it, make it findable, make it usable – open data world meets open knowledge world. Access to APIs and datasets is important to make sure material can be found. If you know it's there you can ask for it to be digitised. Build layers on top of the assets that have been digitised.

Need to make it usable so have to sort out the rights fiasco… Need a place to put it all, not sure that exists yet. New tools, services, standards so it can be preserved forever and found in future. Not a trivial task but vitally important. The information in the archives supports true understanding. Possibility of doing something transformative at the moment. [He finished with:] 'Go forth and digitise. And don't forget the metadata'.

Crowdsourcing metadata seems like a good idea; V&A gets a shout-out for crowdsourcing image cropping [with an ad hoc description 'which one of these are in focus' – they might be horrified to hear their photography described like that. I got all excited that other people were excited about crowdsourcing metadata, because creating interfaces with game dynamics to encourage people to create content about collections is my MSc dissertation project.]

OCRing text in digitised images – amazing [I need to find a reference to that – if we can do it it'd instantly make our archives and 2D collections much more accessible and discoverable]

Question re the Internet Archive – answer: it doesn't have enough curation – 'like throwing your archives down a well before the invaders arrive' – they might be there in a usable form when you come back for them, they might not be.

Question: preservation and digital archaeology are two different things, how closely are they aligned? [digital archaeology presumably not destructive though]

[And that's the end of my notes for that session, notes on the Guardian platform and game session to come]

'Museums meet the 21st century' – OpenTech 2010 talk

These are my notes for the talk I gave at OpenTech 2010 on the subject of 'Museums meet the 21st Century'. Some of it was based on the paper I wrote for Museums and the Web 2010 about the 'Cosmic Collections' mashup competition, but it also gave me a chance to reflect on bigger questions: so we've got some APIs and we're working on structured, open data – now what? Writing the talk helped me crystallise two thoughts that had been floating around my mind. One, that while "the coolest thing to do with your data will be thought of by someone else", that doesn't mean they'll know how to build it – developers are a vital link between museum APIs, linked data, etc and the general public; two, that we really need either aggregated datasets or data using shared standards to get the network effect that will enable the benefits of machine-readable museum data. The network effect would also make it easier to bridge gaps in collections, reuniting objects held in different institutions. I've copied my text below, slides are embedded at the bottom if you'd rather just look at the pictures. I had some brilliant questions from the audience and afterwards, I hope I was able to do them justice. OpenTech itself was a brilliant day full of friendly, inspiring people – if you can possibly go next year then do!

Museums meet the 21st century.
Open Tech, London, September 11, 2010

Hi, I'm Mia, I work for the Science Museum, but I'm mostly here in a personal capacity…

Alternative titles for this talk included: '18th century institution WLTM 21st century for mutual benefit, good times'; 'the Age of Enlightenment meets the Age of Participation'. The common theme behind them is that museums are old, slow-moving institutions with their roots in a different era.

Why am I here?

The proposal I submitted for this was 'Museums collaborating with the public – new opportunities for engagement?', which was something of a straw man, because I really want the answer to be 'yes, new opportunities for engagement'. But I didn't just mean any 'public', I meant specifically a public made up of people like you. I want to help museums open up data so more people can access it in more forms, but most people can't just have a bit of a tinker and create a mashup. “The coolest thing to do with your data will be thought of by someone else” – but that doesn’t mean they’ll know how to build it. Audiences out there need people like you to make websites and mobile apps and other ways for them to access museum content – developers are a vital link in the connection between museum data and the general public.

So there's that kind of help – helping the general public get into our data; and there's another kind of help – helping museums get their data out. For the first, I think I mostly just want you to know that there's data out there, and that we'd love you to do stuff with it.

The second is a request for help working on things that matter. Linkable, open data seems like a no-brainer, but museums need some help getting there.

Museums struggle with the why, with the how, and increasingly with the "we are reducing our opening hours, you have to be kidding me".

Chicken and the egg

Which comes first – museums get together and release interesting data in a usable form under a useful licence and developers use it to make cool things, or developers knock on the doors of museums saying 'we want to make cool things with your data' and museums get it sorted?

At the moment it's a bit of both, but the efforts of people in museums aren't always aligned with the requests from developers, and developers' requests don't always get sent to someone who'll know what to do with them.

So I'm here to talk about some stuff that's going on already and ask for a reality check – is this an idea worth pursuing? And if it is, then what next?
If there’s no demand for it, it won’t happen. Nick Poole, Chief Executive, Collections Trust, said on the Museums Computer Group email discussion list: "most museum people I speak to tend not to prioritise aggregation and open interoperability because there is not yet a clear use case for it, nor are there enough aggregators with enough critical mass to justify it.”

But first, an example…

An experiment – Cosmic Collections, the first museum mashup competition

The Cosmic Collections project was based on a simple idea – what if a museum gave people the ability to make their own collection website for the general public? Way back in December 2008 I discovered that the Science Museum was planning an exhibition on astronomy and culture, to be called ‘Cosmos & Culture’. They had limited time and resources to produce a site to support the exhibition and risked creating ‘just another exhibition microsite’. I went to the curator, Alison Boyle, with a proposal – what if we provided access to the machine-readable exhibition content that was already being gathered internally, and threw it open to the public to make websites with it? And what if we motivated them to enter by offering competition prizes? Competition participants could win a prize and kudos, and museum audiences might get a much more interesting, innovative site. Astronomy is one of the few areas where the amateur can still make valued scientific contributions, so the idea was a good match for museum mission, exhibition content, technical context, and hopefully developers – but was that enough?

The project gave me a chance to investigate some specific questions. At the time, there were lots of calls from some quarters for museums to produce APIs for each project, but there was also doubt about whether anyone would actually use a museum API, whether we could justify an investment in APIs and machine-readable data. And can you really crowdsource the creation of collections interfaces? The Cosmic Collections competition was a way of finding out.

Lessons? An API isn't a magic bullet, you still need to support the dev community, and encourage non-technical people to find ways to play with it. But the project was definitely worth doing, even if just for the fact that it was done and the world didn't end. Plus, the results were good, and it reinforced the value of working with geeks. [It also got positive coverage in the technical press. Who wouldn’t be happy to hear ‘the museum itself has become an example of technological innovation’ or that it was ‘bringing museums out into the open as places of innovation’?]

Back to the chicken and the egg – linking museums

So, back to the chicken and the egg… Progress is being made, but it gets bogged down in discussions about how exactly to get data online. Museums have enough trouble getting the suppliers they work with to produce code that meets accessibility standards, let alone beautifully structured, re-usable open data.

One of the reasons open, structured data is so attractive to museum technologists is that we know we can never build interfaces to meet the needs of every type of audience. Machine-readable data should allow people with particular needs to create something that supports their own requirements or combines their data with ours to make lovely new things.

Explore with us – tell museums what you need

So if you're someone who wants to build something, I want to hear from you about what standards you're already working with, which formats work best for you…

To an extent that's just moving the problem further down the line, because I've discovered that when you ask people what data standards they want to use and they tell you, it turns out they're all different… but at least progress is being made.

Dragons we have faced

I think museums are getting to the point where they can live with the 80% in the interest of actually getting stuff done.

Museums need to get over the idea that linkable data must be perfect – perfectly clean data, perfectly mapped to perfect vocabularies and perfectly delivered through perfect standards. Museums are used to mapping data from their collections management systems for a known end-use, they've struggled with open-ended requirements for unknown future uses.

The idea that aggregated data must be able to do everything that data provided at source can do has held us back. Aggregated data doesn't need to be able to do everything – sometimes discoverability is enough, as long as you can get back to the source if you need the rest of the data. Sometimes it's enough to be able to link to someone else's record that you've discovered.

Museum data and the network effect

One reason I'm here (despite the fact that public speaking is terrifying) is a vision of the network effect that could apply when we have open museum data.

We could re-unite objects across time and place and people, connecting visitors and objects, regardless of owning institution or what type of object or information it is. We could create highlight collections by mining data across museums, using the links people are making between our collections. We can help people tell their local stories as well as the stories about big subjects and world histories. Shared data standards should reduce the learning curve for people using our data, which would hopefully increase re-use.

Mismatches between museums and tech – reasons to be patient

So that's all very exciting, but since I've also learnt that talking about something creates expectations, here are some reasons to be patient with museums, and tolerant when we fail to get it right the first time…

IT is not a priority for most museums; keeping our objects secure and in one piece is, as is getting some of them on display in ways that make sense to our audiences.

Museums are slow. We'll be talking about stuff for a long time before it happens, because we have limited resources and risk-averse institutions. Museum project management is designed for large infrastructure projects, moving hundreds of delicate objects around while major architectural builds go on. It's difficult to find space for agility and experimentation within that.

Nancy Proctor from the Smithsonian said this week: "[Museum] work is more constrained than a general developer" – it must be of the highest quality; for everybody – public good requires relevance and service for all, and because museums are in the 'forever business' it must be sustainable.

How you can make a difference

Museums are slowly adapting to the participation models of social media. You can help museums create (backend) architectures of participation. Here are some places where you can join in conversations with museum technologists:

Museums Computer Group – events, mailing list http://museumscomputergroup.org.uk/ #ukmcg @ukmcg

Linking Museums – meetups, practical examples, experimenting with machine-readable data http://museum-api.pbworks.com/

Space Time Camp – Nov 4/5, #spacetimecamp

‘Museums and the Web’ conference papers online provide a good overview of current work in the sector http://www.archimuse.com/conferences/mw.html

So that's all fun, but to conclude – this is all about getting museums to the point where the technology just works, data flows like water and our energy is focussed on the compelling stories museums can tell with the public. If you want to work on things that matter – museums matter, and they belong to all of us – we should all be able to tell stories with and through museums.

Thank you for listening

Keep in touch at @mia_out or https://openobjects.org.uk/

Some thoughts on linked data and the Science Museum – comments?

I've been meaning to finish this for ages so I could post it, but then I realised it's more use in public in imperfect form than in private, so here goes – my thoughts on linked data, APIs and the Science Museum on the 'Museums and the machine-processable web' wiki. I'm still trying to find time to finish documenting my thoughts, and I've already had several useful comments that mean I'll need to update it, but I'd love to hear your thoughts, comments, etc.

On 'cultural heritage technologists'

A Requirements Engineering lecture at uni yesterday discussed 'satisfaction arguments' (a way of relating domain knowledge to the introduction of a new system in an environment), emphasising the importance of domain knowledge in understanding user and system requirements – an excellent argument for the importance of cultural heritage technologists in good project design.  The lecture was a good reminder that I've been meaning to post about 'cultural heritage technologists' for a while. In a report on April's Museums and the Web 2009, I mentioned in passing:

…I also made up a new description for myself as I needed one in a hurry for moo cards: cultural heritage technologist. I felt like a bit of a dag but then the lovely Ryan from the George Eastman House said it was also a title he'd wanted to use and that made me feel better.

I'd expanded further on it for the first Museums Pecha Kucha night in London:

Museum technologists are not merely passive participants in the online publication process. We have skills, expertise and experience that profoundly shape the delivery of services. In Jakob Nielsen's terms, we are double domain experts.  This brings responsibilities on two fronts – for us, and for the museums that employ us.

Nielsen describes 'double usability specialists' or 'double experts' as those with expertise in human-computer interaction and in the relevant domain or sector (e.g. ref).  He found that these double experts were more effective at identifying usability issues, and I've extrapolated from that to understand the role of dual expertise in specifying and developing online and desktop applications.
Commenters in the final session of the MW2009 conference described the inability of museums to recognise and benefit from the expertise of their IT or web staff, instead waiting until external gurus pronounced on the way of the future – which turned out to be the same things museum staff had been saying for years.  (Sound familiar?)

So my post-MW2009 'call to arms' said "museums should recognise us (museum technologists) as double domain experts. Don’t bury us like Easter eggs in software/gardens. There’s a lot of expertise in your museum, if you just look. We can save you from mistakes you don't even know you're making. Respect our expertise – anyone can have an opinion about the web but a little knowledge is easily pushed too far".

However, I'm also very aware of our responsibilities. A rough summary might be:

Museum technologists have responsibilities too.  Don’t let recognition as a double domain expert make you arrogant or a ‘know it all’. Be humble. Listen. Try to create those moments of understanding, both yours from conversation with others, and others from conversation with you – and cherish that epiphany.  Break out of the bubble that tech jargon creates around our discussions.  Share your excitement. Explain how a new technology will benefit staff and audiences, show them why it's exciting. Respect the intelligence of others we work with, and consider it part of our job to talk to them in language they understand. Bring other departments of the museum with us instead of trying to drag them along.

Don't get carried away with the idea that we are holders of truth; we need to take advantage of the knowledge and research of others. Yes, we have lots of expertise, but we need to constantly refresh that by checking back with our audiences and internal stakeholders. We also need to listen to their concerns and consider them seriously; to acknowledge and respect their challenges and fears.  Finally, don't be afraid to call in peers to help with examples, moral support and documentation.

My thoughts on this are still a work in progress, and I'd love to hear what you think.  Is it useful, is it constructive?  Does a label like 'cultural heritage technologist' or 'museum technologist' help others respect your learning and expertise?  Does it matter?

[Update, April 2012: as the term has become more common, its definition has broadened.  I didn't think to include it here, but to me, a technologist is more than just a digital producer (as important as they are) – while they don't have to be a coder, they do have a technical background. Being a coder also isn't enough to make one a technologist, as it's also about a broad range of experience, ideally across analysis, implementation and support.  But enough about me – what's your definition?]