Happy developers + happy museums = happy punters (my JISC dev8D talk)

This is a rough transcript of my lightning talk 'Happy developers, happy museums' at JISC's dev8D 'developer happiness' days last week. The slides are downloadable or embedded below. The reason I'm posting this is because I'd still love to hear comments, ideas, suggestions, particularly from developers outside the museum sector – there's a contact form on my website, or leave a comment here.

"In this talk I want to show you where museums are in terms of data and hear from you on how we can be more useful.

If you're interested in updates I use my blog to [crap on a bit, ahem] talk about development at work, and also to call for comment on various ideas and prototypes. I'm interested in making the architecture and development process transparent, in being responsive to not only traditional museum visitors as end users, but also to developers. If you think of APIs as a UI for developers, we want ours to be both usable and useful.

I really like museums, I've worked in three museums (or families of museums) now over ten years. I think they can do really good things. Museums should be about delight, serendipity and answers that provoke more questions.

A recent book, 'How does one become a scientist? : survey on the birth of a Vocation' states that '60% of scientists over 30 and 40% of scientists under 30 claim, without prompting, that the Palais de la Découverte [a science museum in Paris] triggered their vocation'.

Museums can really have an impact on how people think about the world, how they think about the possibilities of their lives. I think museums also have a big responsibility – we should be curating collections for current and future audiences, but also trying to provide access to the collections that aren't on display. We should be committed to accessibility, transparency, curation, respecting and enabling expertise.

So today I'm here because we want to share our stuff – we are already – but we want to share better.

We do a lot of audience research and know a lot about some of our users, including our specialist users, but we don't know so much about how people might use our data, it's a relatively new thing for us. We're used to saying 'here are objects in a case, interpretation in label', we're not used to saying 'here's unmediated access, access through the back door'.

Some of the challenges for museums: technology isn't that much of a challenge for us on the whole, except that there are pockets of excellence, people doing amazing things on small budgets with limited resources, but there are also a lot of old-fashioned monolithic project designs with big overheads that take a long time to deliver. Lots of people mean well but don't know what's possible – I want to spread the news about lightweight, more manageable and responsive ways of developing things that make sense and deliver results.

We have a lot of data, but a lot of it's crap. Some of what we have is wrong. Some of it was written 100 years ago, so it doesn't match how we'd describe things now.

We face big institutional challenges. Some curators – (though it does depend on the museum) – fear loss of control, fear intellectual vandalism, that mistakes in user-generated content published on museum sites will cause people to lose trust in museums. We have fears of getting the IT wrong (because for a while we did). Funding and metrics are a big issue – we are paid by how many people come through our door or come to our websites. If we're doing a mashup, how do we measure the usage of that? Are we going to cost our organisations money if we can't measure visits and charge back to the government? [This is particularly an issue for free museums in the UK, an interesting by-product of funding structures.]

Copyright is a huge issue. We might not even own an object that appears in our collections, we might not own the rights to the image of our object, or to the reproductions of an image. We might not have asked for copyright clearance at the time when an object was donated, and the cost of tracing it might be too high, so we can't use that object online. Until we come up with a reliable model that reduces the risk to an institution of saying 'copyright unknown', we're stuck.

The following are some ways I can think of for dealing with these challenges…
Limited resources – we can't build an interface to meet every need for every user, but we can provide the content that they'd use. Some of the semantic web talks here have discussed a 'thin layer' of application over data, and that's kind of where we want to go as well.

Real examples to reduce institutional fear and to provide real examples of working agile projects. [I didn't mean strictly 'agile' methodology but generally projects that deliver early and often and can respond to the changing technical and social environment]

Finding ways for the sector to reward intelligent failure. Some museums will never ever admit to making a mistake. I've heard over the past few days that universities can be the same. Projects that are hyped up suddenly aren't mentioned, and presumably they've failed, but no-one [from the project] ever talks about why so we don't learn from those mistakes. 'Fail faster, succeed sooner'.
I'd like to hear suggestions from you on how we could deal with those challenges.

What are museums known for? Big buildings, full of stuff; experts; we make visitors come to us; we're known for being fun; or for being boring.

Museum websites traditionally appear to be about where we are, when we're open, what's on, is there a cafe on site. Which is useful, but we can do a lot more.

Traditionally we've done pretty exhibition microsites, which are nice – they provide an experience of the exhibition before or after your visit. They're quite marketing-led, they don't necessarily provide an equivalent experience and they don't really let you engage with the content beyond the fact that you're viewing it.

We're doing lots of collections online projects, some of these have ended up being silos – sometimes to the extent if we want to get data out of them, we have to screen-scrape our own data. These sites often aren't as pretty, they don't always have the same design and usability budgets (if any).

I think we should stick to what we're really good at – understanding the data (collections), understanding how to mediate it, how to interpret it, how to select things that are appropriate for publication, and maybe open it up to other people to do the shiny pretty things. [Sounds almost like I'm advocating doing myself out of a job!]

So we have lots of objects, images, lots of metadata; our collections databases also include people, events, dates, places, businesses and organisations, lots of qualified information around things like dates, they're not necessarily simple fields but that means they can convey a lot more meaning. I've included that because people don't always realise we have information beyond objects and object metadata. This slide [11 below] is an example of one of the challenges – this box of objects might not be catalogued as individual instruments, it might just be catalogued as a 'box of stuff', which doesn't help you find the interesting objects in the box. Lots of good stuff is hidden in this way.

We're slowly getting there. We're opening up access. We're using APIs internally to share data between gallery interactives and the web, we're releasing them as data points, we're using them to provide direct access to collections. At the moment it still tends to be quite mediated access, so you're getting a lot of interpretation and fewer objects because of the resources required to create really nice records and the information around them.

'Read access' is relatively easy, 'write access' is harder because that's when we hit those institutional issues around authority, authorship. Some curators are vaguely horrified that they might have to listen to what the public have to say and actually take some of it back into their collections databases. But they also have to understand that they can't know everything about their collections, and there are some specialist users who will know everything there is to know about a particular widget on a particular kind of train. We'd like to capture that knowledge. [London Transport Museum have had a good go at that.]

Some random URLs of cool stuff happening in museums [http://dashboard.imamuseum.org/, http://www.powerhousemuseum.com/collection/database/menu.php, http://www.brooklynmuseum.org/opencollection/collections/, http://objectwiki.sciencemuseum.org.uk/] – it's still very much in small pockets, it's still difficult for museum staff to convince people to take what seems like a leap of faith and try these non-traditional things out.

We're taking our content to where people hang out. We're exploring things like Flickr Commons, asking people to tag and comment. Some museums have been updating collections records with information added by the public as a result. People are geo-tagging photos for us, which means you can do 'then and now' mashups without a big metadata enhancement budget.

I'd like to see an end to silos. We are kinda getting there but there's not a serious commitment to the idea that we need to let things go, that we need to make sure that collections online are shareable, that they're interoperable, that they can mesh with other things.

Particularly for an education audience, we want to help researchers help themselves, to help developers help others. What else do we have that people might find useful?

What we can do depends on who you are. I could hope that things like enquiry-based learning, mashups, linked data, semantic web technologies, cross-collections searches, faceted browsing to make complex searches easy would be useful, and that the concept of museums as a place where information lives – a happy home for metadata mapped around objects and authority records – is useful for people here, but I wouldn't want to put words into your mouths.

There's a lot we can do with the technology, but if we're investing resources we need to make sure that they're useful. I can try things in my own time because it's fun, but if we're going to spend limited resources on interfaces for developers then we need to know that it's actually going to help some group of people out there.

The philosophy that I'm working with is 'we've got really cool things, but we can have even cooler things if we can share what we have with everyone else'. "The coolest thing to do with your data will be thought of by someone else". [This quote turns out to be on the event t-shirts, via CRIG!] So that said… any ideas, comments, suggestions?"

And that, thankfully, is where I stopped blathering on. I'll summarise the discussion and post back when I've checked that people are ok with me blogging their comments.

[If the slide show below has a brown face on a black background, it's the right one – slideshare's embed seems to have had a hiccup. If it's not that, try viewing it online directly.]

[My slide images include the Easter Egg museum in Kolomyya, Ukraine and 'Laughter in Odd Places' event at the Museum of London.]

This is a quick dump of some of the text from an interview I did at the event, cos I managed to cover some stuff I didn't quite articulate in my talk:

[On challenges for museums:] We need to change institutional priorities to acknowledge the size of the online audience and the different levels of engagement that are possible with the online experience. Having talked to people here, museums also need to do a bit of a sell job in letting people know that we've changed and we're not just great big imposing buildings full of stuff.

[What are the most exciting developments in the museum sector, online?] For digital collections, going outside the walls of the museum using geo-location to place objects in their original context is amazing. It means you can overlay the streets of the city with past events and lives. Outsourcing curation and negotiating new models of expertise is exciting. Overcoming the fear of the digital surrogate as a competitor for museum visits and understanding that everything we do builds audiences, whether digital or physical.

What makes a good API? JISC want to know

Tony Hirst blogged about a JISC survey on good APIs, so if you're an API producer or consumer with a few minutes to spare then have your say on good APIs:

The aim of this survey is to identify best practice which should be adopted when making use of APIs (Application Programming Interfaces). The feedback will inform a report for JISC on best practices related to the development of and use of APIs in JISC's development activities and will be made freely available.

You might not be directly affected by JISC's funding decisions, but I think the entire cultural heritage sector could benefit from better information on the best practices for API creation and use. Early last year I heard a speaker say 'APIs are UIs for programmers' and the nicer the UI we get to work with, the easier our jobs are. Apart from anything else, the more good examples out there, the more creating an API for any digitisation project will become the norm.

Nice summary of web 2.0 for the digital humanities

It's an old post (2006, gasp!) but the points Web 2.0 and the Digital Humanities raises are still just as relevant in the digital cultural heritage sector today:

In summary:

  • Give users tools to visualise and network their own data. And make it easy.
  • Harness the self-interest of your users – "help the user with their own research interests as a first priority".
  • Have an API – "You don’t know what you’ve got until you give it away", "Sharing data in a machine readable and retrievable format, is the most important feature. It lets other people build features for you"
  • Embrace the chaos of knowledge – "a bottom-up method of knowledge representation can be more powerful and more accurate than traditional top-down methods".

Introducing modern bluestocking

[Update, May 2012: I've tweaked this entry so it makes a little more sense. These other posts from around the same time help put it in context: Some ideas for location-linked cultural heritage projects, Exposing the layers of history in cityscapes, and a more recent approach, '…and they all turn on their computers and say "yay!"' (aka 'mapping for humanists'). I'm also including below some content rescued from the ning site, written by Joanna:

What do historian Catharine Macauley, scientist Ada Lovelace, and photographer Julia Margaret Cameron have in common? All excelled in fields where women’s contributions were thought to be irrelevant. And they did so in ways that pushed the boundaries of those disciplines and created space for other women to succeed. And, sadly, much of their intellectual contribution and artistic intervention has been forgotten.

Inspired by the achievements and exploits of the original bluestockings, Modern Bluestockings aims to celebrate and record the accomplishments not just of women like Macauley, Lovelace and Cameron, but also of women today whose actions within their intellectual or professional fields are inspiring other women. We want to build up an interactive online resource that records these women’s stories. We want to create a feminist space where we can share, discuss, commemorate, and learn.

So if there is a woman whose writing has inspired your own, whose art has challenged the way you think about the world, or whose intellectual contribution you feel has gone unacknowledged for too long, do join us at http://modernbluestocking.ning.com/, and make sure that her story is recorded. You'll find lots of suggestions and ideas there for sharing content, and plenty of willing participants ready to join the discussion about your favourite bluestocking.

And more explanation from modernbluestocking on freebase:

Celebrating the lives of intellectual women from history…

Wikipedia lists bluestocking as 'an obsolete and disparaging term for an educated, intellectual woman'.  We'd prefer to celebrate intellectual women, often feminist in intent or action, who have pushed the boundaries in their discipline or field in a way that has created space for other women to succeed within those fields.

The original impetus was a discussion at the National Portrait Gallery in London held during the exhibition 'Brilliant Women, 18th Century Bluestockings' (http://www.npg.org.uk/live/wobrilliantwomen1.asp) where it was embarrassingly obvious that people couldn't name young(ish) intellectual women they admired.  We need to find and celebrate the modern bluestockings.  Recording and celebrating the lives of women who've gone before us is another way of doing this.

However, at least one of the morals of this story is 'don't get excited about a project, then change jobs and start a part-time Masters degree'. On the other hand, my PhD proposal was shaped by the ideas expressed here, particularly the idea of mapping as a tool for public history, e.g. by using geo-located stories to place links to content in the physical location.

While my PhD has drifted away from early scientific women, I still read around the subject and occasionally add names to modernbluestocking.freebase.com. If someone's not listed in Wikipedia it's a lot harder to add them, but I've realised that if you want to make a difference to the representation of intellectual women, you need to put content where people look for information – i.e. Wikipedia.

And with the launch of Google's Knowledge Graph, getting history articles into Wikipedia then into Freebase is even more important for the visibility of women's history: "The Knowledge Graph is built using facts and schema from Freebase so everyone who has contributed to Freebase had a part in making this possible." (Source: this post to the Freebase list). I'd go so far as to say that if it's worth writing a scholarly article on an intellectual woman, it's worth re-using your references to create or improve their Wikipedia entry.]

Anyway. On with the original post…]

I keep meaning to find the time to write a proper post explaining one of the projects I'm working on, but in the absence of time a copy and paste job and a link will have to do…

I've started a project called 'modern bluestocking' that's about celebrating and commemorating intellectual women activists from the past and present while reclaiming and redefining the term 'bluestocking'.  It was inspired by the National Portrait Gallery's exhibition, 'Brilliant Women: 18th-Century Bluestockings'.  (See also the review, Not just a pretty face).

It will be a website of some sort, with a community of contributors and it'll also incorporate links to other resources.

We've started talking about what it might contain and how it might work at modernbluestocking.ning.com (ning died, so it's at modernbluestocking.freebase.com…)

Museum application (something to make for mashed museum day?): collect feminist histories, stories, artefacts, images, locations, etc; support the creation of new or synthesised content with content embedded and referenced from a variety of sources. Grab something, tag it, display them, share them; comment, integrate, annotate others. Create a collection to inspire, record, commemorate, and build on.
What, who, how should this website look? Join and help us figure it out.

Why modernbluestocking? Because knowing where you've come from helps you know where you're going.

Sources could include online exhibition materials from the NPG (tricky interface to pull records from). How can this be a geek/socially friendly project and still get stuff done? Run a Modernbluestocking, community and museum hack day app to get stuff built and data collated? Have list of names, portraits, objects for query. Build a collection of links to existing content on other sites? Role models and heroes from current life or history. Where is relatedness stored? 'Significance' – thorny issue? Personal stories cf other more mainstream content? Is it like a museum made up of loan objects with new interpretation? How much is attribution of the person who added the link required? Login v not? Vandalism? How to deal with changing location or format of resources? Local copies or links? Eg images. Local don't impact bandwidth, but don't count as visits on originating site. Remote resources might disappear – moved, permissions changed, format change, taken offline, etc, or be replaced with different content. Examine the sources, look at their format, how they could be linked to, how stable they appear to be, whether it's possible to contact the publisher…

Could also be interesting to make explicit, transparent, the processes of validation and canonisation.

What would you create with public (UK) information?

Show Us a Better Way want to know, and if your idea is good they might give you £20,000 to develop it to the next level.

Do you think that better use of public information could improve health, education, justice or society at large?

The UK Government wants to hear your ideas for new products that could improve the way public information is communicated.

Importantly, you don't need to be a geek:

You don't have to have any technical knowledge, nor any money, just a good idea, and 5 minutes spare to enter the competition.

And they've made "gigabytes of new or previously invisible public information" available for the project, including health, crime and education data (but no personal information).

Scripting enabled – accessibility mashup event and random Friday link

Scripting Enabled, "a two day conference and workshop aimed at making the web a more accessible place", is an absolutely brilliant idea, and since it looks like it'll be on September 19 and 20, the weekend after BathCamp, I'm going to do my best to make it down. (It's the weekend before I start my Masters in HCI so it's the perfect way to set the tone for the next two years).

From the site:

The aim of the conference is to break down the barriers between disabled users and the social web as much as giving ethical hackers real world issues to solve. We talked about improving the accessibility of the web for a long time – let's not wait, let's make it happen.

A lot of companies have data and APIs available for mashups – let’s use these to remove barriers rather than creating another nice visualization.

And on a random Friday night, this is a fascinating post on Facial Recognition in Digital Photo Collections: "Polar Rose, a Firefox toolbar that does facial recognition on photos loaded in your browser."

Lonely Planet launch API

Lonely Planet launched their 'Explore API' and developer network at the BBC Mashed 2008 day. Available content includes 'destination content, including geocoded points of interest reviews, destination profiles, traveller-created "best of" lists and travel photographs' from their image library so as a travel junkie I'm already itching to have a play.

There's more background in this interview with Chris Heilmann and Chris Boden on the Yahoo! Developer blog, 'Lonely Planet starts developer program at mashed08 in London', but I thought it was worth pulling out this quote about the benefit of APIs, particularly as they're an organisation whose business model relies on its reputation and content:

Where do you see the benefit in releasing an API? How do you plan to monetize it or is it a loss-leader for you?

We don't have a funky web app like Twitter or Dopplr at this stage but we do have content – in a sense, that content is our platform. We want to take the Lonely Planet content and community experience onto relevant new platforms and make it accessible to travellers in new ways. We're not going to be able to do all of that on our own so we're looking to tap into external sources of innovation and creativity through open collaboration to help us imagine and execute the next generation of services that might enrich the lives of our community.

In terms of monetization, we'll look to work commercially with those developers who come up with innovations that we believe have the potential to create commercial value.

Some ideas for location-linked cultural heritage projects

I loved the Fire Eagle presentation I saw at the WSG Findability event [my write-up] because it got me all excited again about ideas for projects that take cultural heritage outside the walls of the museum, and more importantly, it made some of those projects seem feasible.

There's also been a lot of talk about APIs into museum data recently and hopefully the time has come for this idea. It'd be ace if it was possible to bring museum data into the everyday experience of people who would be interested in the things we know about but would never think to have 'a museum experience'.

For example, you could be on your way to the pub in Stoke Newington, and your phone could let you know that you were passing one of Daniel Defoe's hang outs, or the school where Mary Wollstonecraft taught, or that you were passing a 'Neolithic working area for axe-making' and that you could see examples of the Neolithic axes in the Museum of London or Defoe's headstone in Hackney Museum.

That's a personal example, and those are some of my interests – Defoe wrote one of my favourite books (A Journal of the Plague Year), and I've been thinking about a project about 'modern bluestockings' that will collate information about early feminists like Wollstonecraft (contact me for more information) – but ideally you could tailor the information you receive to your interests, whether it's football, music, fashion, history, literature or soap stars in Melbourne, Mumbai or Malmo. If I can get some content sources with good geo-data I might play with this at the museum hack day.

I'm still thinking about functionality, but a notification might look something like "did you know that [person/event blah] [lived/did blah/happened] around here? Find out more now/later [email me a link]; add this to your map for sharing/viewing later".
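To make that a bit more concrete, here's a rough Python sketch of the 'what's near me?' step. Everything in it is an assumption for illustration: the Story record, the function names, the 200 metre radius and especially the coordinates, which are placeholders rather than surveyed locations. It measures the great-circle (haversine) distance from the user's position to each geo-located story and turns the close ones into notification text.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


@dataclass
class Story:
    """A hypothetical geo-located record; field names are illustrative."""
    subject: str    # person or event
    what: str       # what they did or what happened
    lat: float
    lon: float
    more_info: str  # where to find out more


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_notifications(stories, lat, lon, radius_m=200):
    """Return notification text for stories within radius_m, nearest first."""
    hits = [(distance_m(lat, lon, s.lat, s.lon), s) for s in stories]
    hits = [(d, s) for d, s in hits if d <= radius_m]
    hits.sort(key=lambda pair: pair[0])  # sort on distance only
    return [
        f"Did you know that {s.subject} {s.what} around here? "
        f"Find out more: {s.more_info}"
        for _, s in hits
    ]


if __name__ == "__main__":
    # Placeholder coordinates, not real surveyed locations.
    stories = [
        Story("Daniel Defoe", "used to hang out", 51.5610, -0.0760,
              "Hackney Museum"),
        Story("a Neolithic axe-making area", "was in use", 51.5700, -0.0700,
              "Museum of London"),
    ]
    for message in nearby_notifications(stories, 51.5612, -0.0762):
        print(message)
```

Tailoring by interest would just be another filter on the list before the distance check. The hard part isn't this code, it's getting content sources with good geo-data in the first place.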

I've always been fascinated with the idea of making the invisible and intangible layers of history linked to any one location visible again. Millions of lives, ordinary or notable, have been lived in London (and in your city); imagine waiting at your local bus stop and having access to the countless stories and events that happened around you over the centuries. Wikinear is a great example, but it's currently limited to content on Wikipedia, and this content has to pass a 'notability' test that doesn't reflect local concepts of notability or 'interestingness'. Wikipedia isn't interested in the finds associated with an archaeological dig that happened at the end of your road in the 1970s, but with a bit of tinkering (or a nudge to me to find the time to make a better programmatic interface) you could get that information from the LAARC catalogue.

The nice thing about local data is that there are lots of people making content; the not nice thing about local data is that it's scattered all over the web, in all kinds of formats with all kinds of 'trustability', from museums/libraries/archives, to local councils to local enthusiasts and the occasional raving lunatic. If an application developer or content editor can't find information from trusted sources that fits the format required for their application, they'll use whatever they can find on other encyclopaedic repositories, hack federated searches, or they'll screen-scrape our data and generate their own set of entities (authority records) and object records. But what happens if a museum updates and republishes an incorrect record – will that change be reflected in various ad hoc data solutions? Surely it's better to acknowledge and play with this new information environment – better for our data and better for our audiences.

Preparing the data and/or the interface is not necessarily a project that should be specific to any one museum – it's the kind of project that would work well if it drew on resources from across the cultural heritage sector (assuming we all made our geo-located object data and authority records available and easily queryable; whether with a commonly agreed core schema or our own schemas that others could map between).
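As a sketch of the 'our own schemas that others could map between' option, here's a minimal Python example. The core field names, the two museum sources and their field names are all invented for illustration and aren't taken from any real collections system. Each institution publishes a small mapping from its own field names to a shared core, and a consumer applies it.

```python
# Hypothetical shared core schema; these field names are assumptions.
CORE_FIELDS = ("title", "place", "lat", "lon", "source")

# Hypothetical per-institution mappings: own field name -> core field name.
MAPPINGS = {
    "museum_a": {
        "object_name": "title",
        "findspot": "place",
        "latitude": "lat",
        "longitude": "lon",
    },
    "museum_b": {
        "name": "title",
        "location": "place",
        "y": "lat",
        "x": "lon",
    },
}


def to_core(record, source):
    """Map one institution-specific record onto the shared core fields.

    Fields without a mapping are dropped; core fields the record
    does not supply are left as None.
    """
    mapping = MAPPINGS[source]
    core = {field: None for field in CORE_FIELDS}
    for own_name, value in record.items():
        core_name = mapping.get(own_name)
        if core_name is not None:
            core[core_name] = value
    core["source"] = source  # keep track of where the record came from
    return core


if __name__ == "__main__":
    record = {
        "object_name": "axe",
        "findspot": "Stoke Newington",
        "latitude": 1.0,   # placeholder values
        "longitude": 2.0,
        "accession": "X",  # no core equivalent, so it is dropped
    }
    print(to_core(record, "museum_a"))
```

Keeping the source on every mapped record matters for the trust and attribution questions above: an application can always say which institution a fact came from.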

Location-linked data isn't only about official cultural heritage data; it could be used to display, preserve and commemorate histories that aren't 'notable' or 'historic' enough for recording officially, whether that's grime pirate radio stations in East London high-rise roofs or the sites of Turkish social clubs that are now new apartment buildings. Museums might not generate that data, but we could look at how it fits with user-generated content and with our collecting policies.

Or getting away from traditional cultural heritage, I'd love to know when I'm passing over the site of one of London's lost rivers, or a location that's mentioned in a film, novel or song.

[Updated December 2008 to add – as QR tags get more mainstream, they could provide a versatile and cheap way to provide links to online content, or 250 characters of information. That's more information than the average Blue Plaque.]

Fun with Freebase

A video of a presentation to the Freebase User Group with some good stuff on data mining, visualisation (and some bonus API action) via the Freebase blog.

If you haven't seen it before, Freebase is 'an open database of the world's information', 'free for anyone to query, contribute to, build applications on top of, or integrate into their websites'. Check out this sample entry on the early feminist (and Londoner) Mary Wollstonecraft. The Freebase blog is generally worth a look, whether you're interested in Freebase or just thinking about APIs and data mashups.

Google release AJAX loader

From the Google page, AJAX Libraries API:

The AJAX Libraries API is a content distribution network and loading architecture for the most popular open source JavaScript libraries. By using the Google AJAX API Loader's google.load() method, your application has high speed, globally available access to a growing list of the most popular JavaScript open source libraries including:

Google works directly with the key stake holders for each library effort and accepts the latest stable versions as they are released. Once we host a release of a given library, we are committed to hosting that release indefinitely.

The AJAX Libraries API takes the pain out of developing mashups in JavaScript while using a collection of libraries. We take the pain out of hosting the libraries, correctly setting cache headers, staying up to date with the most recent bug fixes, etc.

There's also more information at Speed up access to your favorite frameworks via the AJAX Libraries API.

To play devil's avocado briefly, the question is – can we trust Google enough to build functionality around them? It might be a moot point if you're already using their APIs, and you could always use the libraries directly, but it's worth considering.