Technology – Page 7 – Open Objects

Performance testing and Agile – top ten tips from Thoughtworks

I've got a whole week and a bit off uni (though of course I still have my day job) and I got a bit over-excited and booked two geek talks (and two theatre shows). This post is summarising a talk on Top ten secret weapons for performance testing in an agile environment, organised by the BCS's SPA (software practice advancement) group with Patrick Kua from ThoughtWorks.

His slides from an earlier presentation are online so you may prefer just to head over and read them.

[My perspective: I've been thinking about using Agile methodologies for two related projects at work, but I'm aware of the criticisms from a requirements engineering perspective that it doesn't deal with non-functional requirements (i.e. not requirements about what a system does, but how it does it and the qualities it has – usability, security, performance, etc) and of the problems integrating graphic and user experience design into agile processes (thanks in part to an excellent talk @johannakoll gave at uni last term). Even if we do the graphic and user experience design a cycle or two ahead, I'm also not sure how it would work across production teams that span different departments – much to think about.

Wednesday's talk did a lot to answer my own questions about how to integrate non-functional requirements into agile projects, and I learned a lot about performance testing – probably about time, too. It was intentionally about processes rather than tools, but JMeter was mentioned a few times.]

1. Make performance explicit.
Make it an explicit requirement upfront and throughout the process (as with all non-functional requirements in agile).
Agile should bring the painful things forward in the process.

Two ways: non-functional requirements can be dotted onto the corner of the story card for a functional requirement, or give them a story card to themselves, and manage them alongside the stories for the functional requirements. He pointed out that non-functional requirements have a big effect on architecture, so it's important to test assumptions early.

[I liked their story card format: So that [rationale] as [person or role] I want [natural language description of the requirement].]

2. One team.
Team dynamics are important – performance testers should be part of the main team. Products shouldn't just be 'thrown over the wall'. Insights from each side help the other. Someone from the audience made a comment about 'designing for testability' – working together makes this possible.

Bring feedback cycles closer together. Often developers have an insight into performance issues from their own experience – testers and developers can work together to triangulate and find performance bottlenecks.

Pair on performance test stories – pair a performance tester and developer (as in pair programming) for faster feedback. Developers will gain testing expertise, so rotate pairs as people's skills develop. E.g. in a team of 12 with 1 tester, rotate once a week or fortnight. This also helps bring performance into focus through the process.

3. Customer driven
Customer as in end user, not necessarily the business stakeholder. Existing users are a great source of requirements from the customers' point of view – identify their existing pain points. Also talk to marketing people and look at usage forecasts.

Use personas to represent different customers or stakeholders. It's also good to create a persona for someone who wants to bring the site down – try the evil hat.

4. Discipline
You need to be as disciplined and rigorous as possible in agile. Good performance testing needs rigour.

They've come up with a formula:
Observe test results – what do you see? Be data driven.
Formulate hypothesis – why is it doing that?
Design an experiment – how can I prove that's what's happening? Lightweight, should be able to run several a day.
Run experiment – take time to gather and examine evidence
Is hypothesis valid? If so –
Change application code

Like all good experiments, you should change only one thing at a time.

Don't panic, stay disciplined.

5. Play performance early
Scheduling around iterative builds makes it more possible. A few tests during build is better than a block at the end. Automate early.

6. Iterate, Don't (Just) Increment
Fishbone structure – iterate and enhance tests as well as development.

Sashimi slicing is another technique. Test once you have an end-to-end slice.

Slice by presentation or slice by scenario.
Use visualisations to help digest and communicate test results. Build them in iterations too, e.g. use colour to show the number of HTTP requests before you get error codes. If slicing by scenario, test by going through a whole scenario for one persona.

7. Automate, automate, automate.
It's an investment for the future, so the amount of automation depends on the lifetime of the project and its strategic importance. This level of discipline means you don't waste time later.

Automated compilation – continuous integration good.
Automated tests
Automated packaging
Automated deployment [yes please – it should be easy to get different builds onto an environment]
Automated test orchestration – playing with scenarios, put load generators through profiles.
Automated analysis
Automated scheduling – part of pipeline. Overnight runs.
Automated result archiving – can check raw output if discover issues later

Why automate? Reproducible and constant; faster feedback; higher productivity.
Can add automated load generation e.g. JMeter, which can also run in distributed agent mode.
Ideally run sanity performance tests for show stoppers at the end of functional tests, then a full overnight test.

8. Continuous performance testing
Build pipeline.
Application level – compilation and test units; functional test; build RPM (or whatever distribution thingy).
Into performance level – 5 minute sanity test; typical day test.

Spot incremental performance degradation – set tests to fail if the percentage increase is too high.
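
[A minimal sketch of what a 'fail if the percentage increase is too high' check might look like. This is my own illustration rather than anything shown in the talk: the function names, the millisecond units and the 10% default threshold are all assumptions.]

```python
def percent_increase(baseline_ms: float, current_ms: float) -> float:
    """Percentage change from baseline to current (positive means slower)."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be a positive duration")
    return (current_ms - baseline_ms) / baseline_ms * 100.0


def within_budget(baseline_ms: float, current_ms: float,
                  max_increase_pct: float = 10.0) -> bool:
    """True if the current run has not degraded past the allowed threshold.

    The 10% default is a hypothetical value; a real pipeline would pick
    its own budget and take the baseline from archived results.
    """
    return percent_increase(baseline_ms, current_ms) <= max_increase_pct
```

A pipeline step could call within_budget() with the previous build's archived timing and the current one, and fail the build when it returns False.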

9. Test drive your performance test code
Hold it to the same level of quality as production code. TDD useful. Unit test performance code to fail faster. Classic performance areas to unit test: analysis, presentation, visualisation, information collecting, publishing.
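
[To make 'unit test your analysis code' concrete, here's a rough sketch of a test-driven percentile helper. The nearest-rank method, the percentile() function and the test names are my assumptions for illustration, not something from the talk.]

```python
import math
import unittest


def percentile(samples, pct):
    """Nearest-rank percentile of a list of response times."""
    if not samples:
        raise ValueError("no samples to analyse")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


class PercentileTest(unittest.TestCase):
    def test_median_of_odd_sample(self):
        self.assertEqual(percentile([30, 10, 20], 50), 20)

    def test_p90_of_ten_samples(self):
        self.assertEqual(percentile(list(range(1, 11)), 90), 9)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            percentile([], 95)
```

Tests like these run in milliseconds, so a bug in the analysis code shows up long before an overnight performance run produces misleading numbers.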

V model of testing – performance testing at top righthand edge of the V.

10. Get feedback.
Core of agile principles.
Visualisations help communicate with stakeholders.
Weekly showcase – here's what we learned and what we changed as a result – show the benefits of on-going performance testing.

General comments from Q&A: can do load generation and analyse session logs of user journeys. Testing is risk mitigation – can't test everything. Pairing with clients is good.

In other news, I'm really shallow because I cheered on the inside when he said 'dahta' instead of 'dayta'. Accents FTW! And the people at the event seemed nice – I'd definitely go to another SPA event.

Cosmic Collections – the results are in. And can you help us ask the right questions?

For various reasons, the announcement of the winners of our mashup competition has been a bit low key – but we're working on a site that combines the best bits of the winners, and we'll make a bit more of a song and dance about it when that's ready.

I'd like to take the opportunity to personally thank the winners – Simon Willison and Natalie Down in first place, and Ryan Ludwig as runner-up – and equally importantly, those who took part but didn't win; those who had a play and gave us some feedback; those who helped spread the word, and those who cheered along the way.

I have a cheeky final request for your time. I would normally do a few interviews to get an idea of useful questions for a survey, but it's not been possible lately. I particularly want to get a sense of the right questions to ask in an evaluation because it's been such a tricky project to explain and 'market', and I'm far too close to it to have any perspective. So if you'd like to help us understand what questions to ask in evaluation, please take our short survey http://www.surveymonkey.com/s/5ZNSCQ6 – or leave a comment here or on the Cosmic Collections wiki. I'm writing a paper on it at the moment, so hopefully other museums (and also the Science Museum itself) will get to learn from our experiences.

And again – my thanks to those who've already taken the survey – it's been immensely useful, and I really appreciate your honesty and time.

'Cosmic Collections' launches at the Science Museum this weekend

I think I've already said pretty much everything I can about the museum website mashup competition we're launching around the 'Cosmos and Culture' exhibition, but it'd be a bit silly of me not to mention it here since the existence and design of the project reflects a lot of the issues I've written about here.

If you make it along to the launch at the Science Museum on Saturday, make sure you say hello – I should be easy to find cos I'm giving a quick talk at some point.

Right now the laziest thing I could do is to give you a list of places where you can find out more:

You can RSVP at eventbrite or simply find out more about the Cosmic Collections launch event and about the mashup competition.
You can read 'A new API and hack competition – this time not from a tech company but by a museum!', an interview with Chris Heilmann on the Yahoo Developer Network blog.
There are two separate interviews with me, 'Cosmic Collections: the geeky stuff' and Ali Boyle, the Curator of Astronomy, 'Background on our Cosmos & Culture exhibition'. (My apologies to the readers of the collections blog for my nerdtastic interruption. And that was me trying to speak like a normal person – tragic, really.)
You can also ask questions about it, connect with other participants, share tips, etc on the competition wiki.

Finally, you can talk to us @coscultcom on twitter, or tag content with #coscultcom.

Btw – if you want an idea of how slowly museums move, I think I first came up with the idea in January (certainly before dev8D because it was one of the reasons I wanted to go) and first blogged about it (I think) on the museum developers blog in March. The timing was affected by other issues, but still – it's a different pace of life!

On 'cultural heritage technologists'

A Requirements Engineering lecture at uni yesterday discussed 'satisfaction arguments' (a way of relating domain knowledge to the introduction of a new system in an environment), emphasising the importance of domain knowledge in understanding user and system requirements – an excellent argument for the importance of cultural heritage technologists in good project design. The lecture was a good reminder that I've been meaning to post about 'cultural heritage technologists' for a while. In a report on April's Museums and the Web 2009, I mentioned in passing:

…I also made up a new description for myself as I needed one in a hurry for moo cards: cultural heritage technologist. I felt like a bit of a dag but then the lovely Ryan from the George Eastman House said it was also a title he'd wanted to use and that made me feel better.

I'd expanded further on it for the first Museums Pecha Kucha night in London:

Museum technologists are not merely passive participants in the online publication process. We have skills, expertise and experience that profoundly shape the delivery of services. In Jakob Nielsen's terms, we are double domain experts. This brings responsibilities on two fronts – for us, and for the museums that employ us.

Nielsen describes 'double usability specialists' or 'double experts' as those with expertise in human-computer interaction and in the relevant domain or sector (e.g. ref). He found that these double experts were more effective at identifying usability issues, and I've extrapolated from that to understand the role of dual expertise in specifying and developing online and desktop applications.
Commenters in the final session of MW2009 conference described the inability of museums to recognise and benefit from the expertise of their IT or web staff, instead waiting until external gurus pronounced on the way of the future – which turns out to be the same things museum staff had been saying for years. (Sound familiar?)

So my post-MW2009 'call to arms' said "museums should recognise us (museum technologists) as double domain experts. Don’t bury us like Easter eggs in software/gardens. There’s a lot of expertise in your museum, if you just look. We can save you from mistakes you don't even know you're making. Respect our expertise – anyone can have an opinion about the web but a little knowledge is easily pushed too far".

However, I'm also very aware of our responsibilities. A rough summary might be:

Museum technologists have responsibilities too. Don’t let recognition as a double domain expert make you arrogant or a ‘know it all’. Be humble. Listen. Try to create those moments of understanding, both yours from conversation with others, and others from conversation with you – and cherish that epiphany. Break out of the bubble that tech jargon creates around our discussions. Share your excitement. Explain how a new technology will benefit staff and audiences, show them why it's exciting. Respect the intelligence of others we work with, and consider it part of our job to talk to them in language they understand. Bring other departments of the museum with us instead of trying to drag them along.

Don't get carried away with idea that we are holders of truth; we need to take advantage of the knowledge and research of others. Yes, we have lots of expertise but we need to constantly refresh that by checking back with our audiences and internal stakeholders. We also need to listen to concerns and consider them seriously; to acknowledge and respect their challenges and fears. Finally, don’t be afraid to call in peers to help with examples, moral support and documentation.

My thoughts on this are still a work in progress, and I'd love to hear what you think. Is it useful, is it constructive? Does a label like 'cultural heritage technologist' or 'museum technologist' help others respect your learning and expertise? Does it matter?

[Update, April 2012: as the term has become more common, its definition has broadened. I didn't think to include it here, but to me, a technologist is more than just a digital producer (as important as they are) – while they don't have to be a coder, they do have a technical background. Being a coder also isn't enough to make one a technologist as it's also about a broad range of experience, ideally across analysis, implementation and support. But enough about me – what's your definition?]

Top ten tips for selling your IT project

I'm spending two days in Manchester for the JISC event, Rapid Innovation in Development. I've already had some interesting, inspiring and useful conversations and I'm looking forward to tomorrow (and more importantly, getting some quality programming time to try them out).

The event has a focus on helping developers effectively market their projects or ideas to wider audiences (aka 'normal people'). With that in mind, here are my notes from Alice Gugan's talk on her 'top ten tips for selling your project'.

She pointed out that it's not exhaustive but does list the key tips to focus on.

Focus on your audience. Who they are, their interests, their technical level. If you're talking to a journalist, talk to who they're writing for.
USP – what is yours? How does your project really change the lives of your audience? This is your main message. What makes your project stand out?
Short and snappy sub-points. Not too many, make sure they lead logically on from your main message.
Be confident. Be sure of your ground, be believable, be enthusiastic.
Project your voice!
Engage eye contact with your interviewer – if you have to scan your notes, still try to make eye contact with the audience.
No gimmicks! They can be great but they won't necessarily make people remember what your project was about.
No jargon! It's often a barrier to your audience. This includes acronyms.
Practice, practice, practice. But keep it fresh, enthusiastic and believable.
Test it on a stranger and adjust according to their reactions.

All good points! Based on years of geek conversations across several domains, I'd suggest making your pitch into a story about the engaging/useful/inexpensive/secure (etc, you get the picture) experience someone has while using your product. You can always bring out the technical details and features list later – once you've got people interested.

It's often hard to step back from the detailed perspective and remember how to talk about your project to people who haven't been living with it daily, but if you can't do that it's hard to make the best of your work by sharing it with a wider audience.

Focusing on your audience can be tricky – it's easier for pitches than more general presentations, but working out how to address audiences with different levels of technical or sector knowledge can be tricky. Maybe that's why I like user stories as pitches – it makes you step back from the acronomic** detail and think about what really makes your idea unique.

** Yeah, I made that up, but it's a nice cross between acronyms, macro and moronic.

[Update: thanks to Paul Walk for Alice's surname.

Also, I've found myself thinking about the event quite a bit since Friday – both in terms of the tips for presenting technical projects to non-technical staff, and generally in terms of the useful tips and inspiring ideas I picked up in conversation with other attendees. Congratulations to all concerned for a great event!]

Museum pecha kucha night

The first museum pecha kucha night was held in London at the British Museum on June 18, 2009. I took rough notes during the presentations, and have included the slides and notes from my own presentation. The event used the tag 'mwpkn' to gather together tweets, photos, etc. The focus of this first museum pecha kucha was on sharing insights and inspiration from the Museums and the Web conference held in Indianapolis in April.

The event was organised by Shelley Mannion, who introduced the event, emphasising that it was about fun and connecting the museum tech community in an interesting way.

Gail Durbin (V&A), takeaways from MW2009
She's a practical person, looks for ideas to nick. Good idea as things get hazy after a conference, good intentions disappear.

First takeaway – Dina Helal let her play with her iPhone, decided she had to have one. She liked her mobile for the first time in her life.

Second – twittering was very important. Decided to do something with it. Twittering is hard, sending out messages that are interesting is difficult.

Enthusiasm at conferences is short lived – e.g. people excited about wedding site, but did they send in wedding photos? She talked to people about a self-portraiture idea, 'life on a postcard', but hasn't had a single response.

RSS feeds – came away knowing we had to review our RSS feeds, had been without attention for a long time.

Learnt that wikis are very hard work, they don't automatically look after themselves.
Creative use of Flickr – museum 'my karsh' collection

Resolved that had to work with Development. Looking at something like the British Library's – adopt a book for fathers day.

Something that bothers her – many museums think of 'Web 2.0' just as more channels to push out information, there's no sense of pulling in information about visitors.

Beck Tench, one of the most interesting people she met at the conference – practice and work go together very closely. Flickr plant project. She wants to get staff involved – has meeting on Fridays, in local bar, tweets to everyone, conducts something called Experimonth.

Last thing learnt – librarians have better cakes.

Silvia Filippini Fantoni (British Museum and Sorbonne University)
Silvia makes a plea for extra seconds as a non-native speaker (and synthesis not the best feature of Italians). Lecturer in museum informatics and evaluation methods at Sorbonne and project manager for multimedia guide project at British Museum.

So her focus at the conference was mostly on guides. Particularly Samis and Pau and others. Mini workshops and workshops on the topic before and during the conference. Demos from Paul Clifford (Museum of London). Exhibitors. Lots of museums are planning to develop applications.

Interest in using mobile technology as an interpretive tool is constantly growing, especially delivered on visitors' own devices. Proliferation of mobile platforms. Proliferation of different functionalities – not just audio – visual, games, way finding, web access and communication, notes and comments. Have all these new platforms and functionalities improved the visitor experience? Yes, but there are some disadvantages.

Asks: aren't we trying to do too much? Are we trying to turn a useful interpretive tool into something too complex? Aren't we forgetting about core audio guide audience?

Are people interested in using their own devices? Do they have the time to pre-download, do they bring their devices? Samis and Pau – the answer is no/not yet. For the medium and short term still need to provide media in the museums. Touch screen devices are easier to use. Limited functionality makes interface simpler. Focus on content – AV messages, touch and listen.
Importance of sharing and learning from best practice. Some efforts at and after MW2009 – handheldconference.org. Discussion of developing open source content management system for mobile devices – contact Nancy Proctor.

Daniel Incandela (Indianapolis Museum of Art)
He's from America so should have extra time too. Also sick and medicated (so at least one of us will have a good time during the presentation).

Enjoys robots, dinosaurs, football and a good point. On holiday while here.

Slide – Shelley's twitter profile – she's responsible for him being here while on holiday.

He blogged about preparing for the presentation and got a comment from one of the pecha kucha founders – the main thing is to have fun, be passionate about something you love.

Twitterfall on the big screen was a major breakthrough at MW2009, (#mw2009 trended as a topic and attracted the attention of) pantygirl.

Digital story telling and tech can't happen without support, Max Anderson has been dream leader.

He's here representing IMA so going to showcase some projects – Roman Art from Louvre webisodes – paved the way for informal, agile, multiple content source creation.

Art Babble. IMA blog – ripped off other museums – gives many departments from museum a digital voice.

Half time experiment with awkward silence (blank slide). [In the pub afterwards, I discovered that this actually made at least one of the English people feel socially awkward!]

Brooklyn Museum – for him the real innovators for digital content for museums, won many awards at MW2009.

Te Papa's 'build a squid' had him at 'hello'. First example of a museum project that actually went viral?

Perhaps we could upgrade MW site? Better integration of social media, multimedia from previous conferences.

Loves Bruce Wyman – reason to go to MW2010.

art:21 – smart team, good approaches to publishing across platforms.

Wonders about agility – love new and emerging projects (?) we hear about at conferences, but how do we face an idea and deal with own internal issues?

The Dutch at Indy (were great) – but somewhere outside north America next for Museums and the Web?

Philip Poole (British Museum)
Everything I got from MW2009 can be put into one statement – spread it about. Enable your content to be spread by other people through APIs.

Does spreading out content dilute our authority? By putting it onto other websites, putting it in contact with other people. No, of course not.

Video was big at MW2009.

If going to use different platforms, will people come? We need to tailor content to different websites – can't just build it and assume people will come. Persian coins vs. ritual Mayan sacrifice on YouTube – which will get bigger audience? [Pick content delivery to suit audience and context.]

Platforms include ArtBabble, YouTube (shorter, edgier), iTunes U. Viral content – we can put features on our website, but a YouTube or Vimeo audience are going to spread things better. iTunes U – can download and listen on train – takes out of website entirely.

Stats are important – e.g. need to include stats of video on different platforms, make sure people above you recognise the value in that. DCMS – very basic stats – perhaps they should be asking for different stats. "If DCMS ask how much video we put on YouTube, we'd all start doing it." [Brilliant point]

API – take content from website and put elsewhere. IMA Explore section – advertise the repeating pattern in their URLs – someone used them but wasn't going very well, they got in contact with him and helped him succeed, now biggest referrer outside search engines. He wants to do that for the British Museum – he knows the quirks, the data.

Why the 'softly softly' approach? Creating an entire API interface is huge mountain, people above you will want to avoid it if you show them the size of the whole mountain.

Digital NZ – fantastic example. Can create custom search, embed on website, also into gallery and people can vote for it

The British Museum is a museum of the world for the world, why should their web presence be any different?

Mia Ridge (Science Museum)
Yes, that's me. My slides on 'Bubbles and Easter eggs – Museum Pecha Kucha' are on slideshare – scroll down the page for full text and notes – or available as a PDF (2mb).

I talked about:

keeping the post-conference momentum going, particularly the 'do one thing' idea;
museum technologists as 'double domain experts';
not hiding museum geeks like Easter eggs but making more of them as a resource;
the responsibilities of museum geeks as their expertise is recognised;
breaking down internal silos; intelligent failure;
broken metrics and better project design (pitch the goal, not the method);
audience expectations in 2009;
possible first questions for digital projects and taking a whole museum view for new projects;
who's talking/listening to your audiences? trust and respect your audiences;
your museum is an iceberg (lots of the good stuff is hidden);
(s)mash the system (hold a mashup day);
and a challenge for your museum – has the web fundamentally changed your organisation?

Frankie Roberto (Rattle)
Went to the conference with a 'fan' hat on, just really enjoys museums. Loved the zoo – live exhibits are interactive, visceral. Role of live interpretation – how could it work with digital technology? Everyone loves dinosaur – Indy Children's Museum. All museums should have a carousel (can't remember what he was going to say about it).

The Power of Children; making a difference – really powerful stories.

Still thinking about the idea of creating visceral experiences.

ArtBabble – shouldn't generally create silos but ArtBabble spotted that YouTube wasn't working for certain types of content.

Davis LAB – kiosks and sofa. Said 'we are on the web'.

Drupal – lots of museums switching to it.

Richard Morgan (V&A) on APIs – ask, what is your museum good at?, and build an API for that – it may not be collections stuff.

'Things to do' page on V&A. Good way of highlighting ways to interact on website.

Semantic data, Aaron's talk on interpretation of bias, relocation from Flickr photos.
Breaking down ideas about authority on where an area is bounded by. OpenStreetMap – wants to add a historical layer to that so can scroll backwards and forwards in time. [I should ask whether this means layering old maps (with older street layouts like pre-Great Fire of London, or earlier representations?). Geo-rectification is expensive because it's time-consuming, but could it be crowdsourced? Geo-locating old images would be easier for the average person to do.]

Open Plaques – alpha project.

Thinks we won't need to digitise in the future as stuff will be born digital (ha, as if! Though it depends where you draw the lines about the end of collections – in my imagination they're like that warehouse scene at the end of Indiana Jones and the Raiders of the Lost Ark and we won't run out of things to properly digitise any time soon. Still, it's a useful question.)

Dan Zambonini (Box UK)
'Every film needs a villain'. In his impressions and insights from MW2009 he'll say things we may or may not agree with.

Slide – stuff we can do vs. stuff we can't do on either side of a gulf of perceived complexity. It's hard to progress from one to the other. Three questions to bridge gap – how to make relevant to everyday job, how to show advantages, how to make it easy.

Then he realised should talk about personal things – people and connections made. About people, stuff that happens in the evening. The evening drinks don't happen at UKMW – it's a shame we have to go to the other side of the world to talk to each other. [It does if you're at an event like mashed museum the day before – another reason to open it up to educators, curators, etc.]

Small museums vs. big museums – [should make stuff accessible to small museums.] Can get value by helping people. (He tells his ex-girlfriend that) small is the new big. Also small quick wins. Break down the big things into smaller things, find ways can get to them through small changes in behaviour, bits of information.

How small is small? Greater or less than one day. If less than a day, might as well try it. If it's going to take a week, not small.

Museums should share data – not just as API – share data on traffic, spill gossip on marketing costs, etc. [Information is power, etc]

Celebrate failure – admit that some things go wrong.

Bigger picture – be honest. Tell us when to shut up (on e.g. the

If not on twitter, get on it. The more people talking to each other, the more powerful we are as a group. [But what happens if you miss a few days of twitter? I like twitter, but it's inaccessible if you don't have time to constantly keep up, or don't have a computer at home. Still, getting more people talking is an excellent point, even if twitter itself doesn't work for some people.]

The sector is missing practical, specific blog, not news and opinions. [Do collections system specific user groups take the place of blogs?]

Use grants to innovate and produce open source stuff. Right now private agencies will take a lot of the strain of applying for grants.

Sort out that copyright stuff. How difficult can it be?

Final slide summing up and last bit of innuendo. 'Beer makes you more attractive' – it's the after sessions stuff at conferences that's so valuable.

Frankie, Dan and Daniel's slides are also available in the 'Museum Tech Pecha Kucha' event on slideshare (and mine has now got an audio track, thanks to Shelley).

Tom Morris, SPARQL and semweb stuff – tech talk at Open Hack London

Tom Morris gave a lightning talk on 'How to use Semantic Web data in your hack' (aka SPARQL and semantic web stuff).

He's since posted his links and queries – excellent links to endpoints you can test queries in.

Semantic web often thought of as long-promised magical elixir, he's here to say it can be used now by showing examples of queries that can be run against semantic web services. He'll demonstrate two different online datasets and one database that can be installed on your own machine.

First – dbpedia – scraped lots of wikipedia, put it into a database. dbpedia isn't like your average database, you can't draw a UML diagram of wikipedia. It's done in RDF and Linked Data. Can be queried in a language that looks like SQL but isn't. SPARQL – is a W3C standard, they're currently working on SPARQL 2.

Go to dbpedia.org/sparql – submit query as post. [Really nice – I have a thing about APIs and platforms needing a really easy way to get you to 'hello world' and this does it pretty well.]

[Line by line comments on the syntax of the queries might be useful, though they're pretty readable as it is.]

'select thingy, wotsit where [the slightly more complicated stuff]'

Can get back results in xml, also HTML, 'spreadsheet', JSON. Ugly but readable. Typed.
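[A minimal sketch of what 'submit query as post' might look like from Python, using only the standard library. The endpoint is the one from the talk; the query, the City class and the function names are my own assumptions for illustration, not Tom's examples. It asks for the JSON results format and flattens it into plain dictionaries.]

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint mentioned in the talk.
ENDPOINT = "http://dbpedia.org/sparql"

# Illustrative query only (an assumption, not one of the talk's queries):
# 'select thingy, wotsit where [the slightly more complicated stuff]'
QUERY = """
SELECT ?city ?label WHERE {
  ?city a <http://dbpedia.org/ontology/City> ;
        <http://www.w3.org/2000/01/rdf-schema#label> ?label .
  FILTER (lang(?label) = "en")
} LIMIT 5
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a form-encoded POST request asking for SPARQL JSON results."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        endpoint,
        data=body,  # supplying a body makes urllib send a POST
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def rows(results):
    """Flatten the SPARQL JSON results structure into a list of dicts."""
    names = results["head"]["vars"]
    return [
        # Unbound variables are simply missing from a binding, so skip them.
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in results["results"]["bindings"]
    ]


def run(query):
    """Send the query over the network and return the flattened rows."""
    with urlopen(build_request(query)) as response:
        return rows(json.load(response))


if __name__ == "__main__":
    for row in run(QUERY):
        print(row)
```

[The 'typed' part of the results lives in each binding's "type" (and sometimes "datatype") field, which this sketch throws away to keep things short.]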

[Trying a query challenge set by others could be fun way to get started learning it.]

One problem – fictional places are in Wikipedia e.g. Liberty City in Grand Theft Auto.

Libris – how library websites should be
[I never used to appreciate how much most library websites suck until I started back at uni and had to use one for more than one query every few years]

Has a query interface through SPARQL

Comment from the audience BBC – now have SPARQL endpoint [as of the day before? Go BBC guy!].

Playing with mulgara, open source java triple store. [mulgara looks like a kinda faceted search/browse thing] Has own query language called TQL which can do more interesting things than SPARQL. Why use it? Schemaless data storage. Is to SQL what dynamic typing is to static typing. [did he mean 'is to sparql'?]
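[To make 'schemaless data storage' a bit more concrete, here is a toy in-memory triple store. It is only a sketch of the general idea and has nothing to do with how mulgara or TQL actually work; the class and method names are made up. Every fact is a (subject, predicate, object) triple, so you can add a new kind of fact without declaring a column first.]

```python
class TripleStore:
    """A toy schemaless store: every fact is a (subject, predicate, object)."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        # No table definitions: any predicate can be used at any time.
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple
            for triple in self.triples
            if all(p is None or p == v for p, v in zip(pattern, triple))
        )


if __name__ == "__main__":
    store = TripleStore()
    store.add("Liberty City", "type", "FictionalPlace")
    store.add("Liberty City", "appearsIn", "Grand Theft Auto")
    store.add("London", "type", "City")
    # 'What do we know about Liberty City?' with no schema to consult.
    for triple in store.match(subject="Liberty City"):
        print(triple)
```

[The trade-off is the same as with dynamic typing: nothing stops you adding a misspelt predicate, so you find out about mistakes when you query rather than when you store.]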

Question from audience: how do you discover what you can query against?
Answer: dbpedia website should list the concepts they have in there. Also some documentation of categories you can look at. [Examples and documentation are so damn important for the uptake of your API/web service.]

Coming soon [?] SPARUL – update language, SPARQL2: new features

The end!

[These are more (very) rough notes from the weekend's Open Hack London event – please let me know of clarifications, questions, links or comments. My other notes from the event are tagged openhacklondon.

Quick plug: if you're a developer interested in using cultural heritage (museums, libraries, archives, galleries, archaeology, history, science, whatever) data – a bunch of cultural heritage geeks would like to know what's useful for you (more background here). You can comment on the #chAPI wiki, or tweet @miaridge (or @mia_out). Or if you work for a company that works with cultural heritage organisations, you can help us work better with you for better results for our users.]

There were other lightning talks on Pachube (pronounced 'patchbay', about trying to build the internet of things, making an API for gadgets because e.g. connecting hardware to the web is hard for small makers) and Homera (an open source 3d game engine).

Christian Heilmann on Yahoo!'s YQL, open data tables, APIs

My notes from Christian Heilmann's talk on 'Reaching those web folk' with Yahoo!'s new-ish YQL, open data tables and APIs at the National Maritime Museum [his slides]. My notes are a bit random, but might be useful for people, especially the idea of using YQL as an easy way to prototype APIs (or implement APIs without too much work on your part).

For him it's about data on the web, not just technology.

Number of users is a crap metric, [should consider the user experience].

Stats should be what you use to discover where the problems are, not to pat yourself on the back.

People with blackberries have no Javascript, no CSS. Don't have front-loading navigation they have to scroll through – cos they won't.

If you think of your site as content, then visitors can become 'broadcasting stations' and relay your message. Information flows between readers and content. They're passing it on through distribution channels you're not even aware of.

Content on the web is validated with links and quotes from other sources e.g. Wikipedia. People mix your information with other sources to prove a point or validate it. eg. photos on maps.

How can you be part of it?
Make it easy to access. Structure your websites in a semantic manner (plain old semantic HTML). Title is important, etc. Add more semantic richness with RDF and microformats. Provide data feeds or RSS. Consider the Rolls Royce of distribution – an API. Help other machines make sense of your content – search engines will love you too.
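[A rough sketch of why the title and feed links matter to machines: with clean HTML, a few lines of standard-library Python can pull out what a page is called and where its feeds live. The class name, function name and sample page are my own inventions for illustration.]

```python
from html.parser import HTMLParser

# Feed MIME types a page can advertise in <link rel="alternate">.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class PageSummary(HTMLParser):
    """Collect the page title and any advertised feed URLs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.feeds = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "link":
            rel = (attrs.get("rel") or "").lower().split()
            kind = (attrs.get("type") or "").lower()
            if "alternate" in rel and kind in FEED_TYPES:
                self.feeds.append(attrs.get("href"))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarise(html):
    """Parse an HTML string and return the populated PageSummary."""
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return parser


if __name__ == "__main__":
    page = summarise(
        "<html><head><title>Ship models</title>"
        '<link rel="alternate" type="application/rss+xml" href="/ships.rss">'
        "</head><body><h1>Ship models</h1></body></html>"
    )
    print(page.title, page.feeds)
```

[If the title is 'Untitled page' and there is no feed link, that is what the machines (and the people relaying your content) get to work with.]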

Yahoo index via BOSS API – Yahoo do it because they know 'search engines are dying'. Catch-all search engines are stupid. Apples are not the same apples for everyone. Build a cleverer web search.

http://ask-boss.appspot.com/ – nlp analysis of search results. Try 'who is batman in the dark knight' – amazing.

BOSS provides mainstream channel for semantic web and microformats. Microformats are chicken and egg problem. Using searchmonkey technology, BOSS lists this information in the results. BOSS can return all known information about a page, structured.

Key terms parameter in BOSS – what did people enter to find a site/page? http://keywordfinder.org/ – what successful websites have for a given keyword.

Clean HTML is the most important thing, semantic and microformats are good.

If your data is interesting enough, people will try to get to it and remix it.

[Curl has grown up since I last used it! Can be any browser, do cookies, etc.]

Now the web looks like an RSS reader.

Include RSS in your stats.

Guardian – any of their content websites put out RSS through CMS. They then provided an API so end users can filter down to the data they need.

Programmable Web – excellent resource but can be overwhelming.

The more data sources you use, the more time you spend reading API documentation, cos every API is different. Terms, formats, etc. The more sources you connect to, the more chances of error. The more stuff you pull in, the slower the performance of your website.
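[One way to picture the problem, as a sketch only: each source needs its own little adapter to turn its terms into yours, and each source is one more thing that can fail. The two source shapes, the field names and the function names below are all hypothetical.]

```python
def from_source_a(item):
    # Hypothetical source A calls its fields 'title' and 'link'.
    return {"title": item["title"], "url": item["link"]}


def from_source_b(item):
    # Hypothetical source B calls the same things 'name' and 'href'.
    return {"title": item["name"], "url": item["href"]}


def aggregate(sources):
    """Combine several sources into one list of records in a common shape.

    sources is a list of (name, fetch, adapt) tuples, where fetch() returns
    raw items and adapt(item) converts one item. A failing source is recorded
    in the error list rather than breaking the whole page.
    """
    records, errors = [], []
    for name, fetch, adapt in sources:
        try:
            # Convert the whole batch first so a bad item adds nothing.
            batch = [adapt(item) for item in fetch()]
        except Exception as exc:
            errors.append((name, repr(exc)))
        else:
            records.extend(batch)
    return records, errors


if __name__ == "__main__":
    records, errors = aggregate([
        ("a", lambda: [{"title": "Astrolabe", "link": "/a/1"}], from_source_a),
        ("b", lambda: [{"name": "Sextant", "href": "/b/9"}], from_source_b),
    ])
    print(records, errors)
```

[Every new source means another adapter to write and another entry that can turn up in the error list, which is the maintenance cost he was describing.]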

So you need systems to aggregate sources painlessly. Yahoo Pipes. A visual interface, changes have to be made by hand.

You can't quickly use a pipe in your code and change it on the fly. e.g. change a parameter for one implementation. No version control.

So that's one of the reasons for YQL: Yahoo Query Language. SQL style interface to all yahoo data (all Yahoo APIs) and the web. Yahoo build things with APIs cos it's the only way to scale. Book: 'scalable websites', all about APIs.

Build queries to Yahoo APIs, try them out in YQL console. Provides diagnostics – which URLs, how long it took, any problems encountered. Allows nesting of API calls.

Outputs XML or JSON, consistent format so you know how to use that information.
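[A small sketch of what calling YQL from code might look like. The endpoint address and the 'q', 'format' and 'diagnostics' parameter names are from my memory of the public YQL service rather than from the talk, so treat them as assumptions and check whether the service is still running before relying on it. The function name is made up.]

```python
from urllib.parse import urlencode

# Assumed public endpoint (from memory, not from the talk).
YQL_ENDPOINT = "https://query.yahooapis.com/v1/public/yql"


def yql_url(statement, fmt="json", diagnostics=False):
    """Build the GET URL for a YQL statement.

    fmt is 'json' or 'xml'; diagnostics asks the service to report which
    URLs it called and how long they took.
    """
    params = {"q": statement, "format": fmt}
    if diagnostics:
        params["diagnostics"] = "true"
    return YQL_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # 'show tables' lists what can be queried; the second statement is an
    # illustrative query over a made-up feed address.
    print(yql_url("show tables"))
    print(yql_url('select title from rss where url="http://example.org/feed"',
                  fmt="xml", diagnostics=True))
```

[Because the answer always comes back in the same XML or JSON envelope, the code that consumes it doesn't need to change when the statement does.]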

YQL also helped internally because of varying APIs between departments.

Gives access to all Yahoo services, any data sources on the web, including html and microformats, and can scrape any website.

Open tables
Easy way to add own information to YQL. Tell Yahoo the end point where it can get the info.

Jim wanted to allow people to access data without building an API. All it needed was a simple XML file.

[Though you do need RSS results from a search engine to point to – I'm going to see what we can output from our Google Mini and will share any code – or would appreciate some time-saving pointers if anyone has any. Yes, hello, lazyweb, that's my coat, thanks.]

Basically it's a way of providing an API without having to develop one.

Concluding: you can piggyback on people's social connections with other people by making data shareable. [Then your data is shared, yay. Assuming your institution is down with that, and no copyrights or puppies were hurt in the process.]

APIs are a commitment – have to be available all the time, lot of traffic, but hard to measure traffic and benefits. Making APIs scale is a pain and have to be clever to do it. Pointing a YQL open data table at the search engine on your site also works.

Saves documenting API? [??]

YQL handles the interface, caching and data conversion for you. Also limits the access to sensible levels – 10,000 hits/hour.
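[For a sense of what 'caching' and 'limits the access' save you from writing yourself, here is a bare-bones sketch of both wrapped around a fetch function. It is not how YQL does it internally; the class name, the five-minute cache lifetime and the fixed hourly window are my assumptions. Only the 10,000 hits/hour figure comes from the talk.]

```python
import time


class CachedLimitedSource:
    """Wrap a fetch function with a time-limited cache and an hourly cap."""

    def __init__(self, fetch, limit=10_000, window=3600, ttl=300,
                 clock=time.monotonic):
        self.fetch = fetch
        self.limit = limit      # maximum hits per window
        self.window = window    # window length in seconds
        self.ttl = ttl          # how long a cached answer stays fresh
        self.clock = clock      # injectable so the class can be tested
        self.window_start = clock()
        self.hits = 0
        self.cache = {}         # key -> (expiry time, value)

    def get(self, key):
        now = self.clock()
        # Start a fresh counting window once the old one has run out.
        if now - self.window_start >= self.window:
            self.window_start = now
            self.hits = 0
        if self.hits >= self.limit:
            raise RuntimeError("rate limit exceeded, try again later")
        self.hits += 1  # cached answers still count as hits
        cached = self.cache.get(key)
        if cached is not None and cached[0] > now:
            return cached[1]
        value = self.fetch(key)
        self.cache[key] = (now + self.ttl, value)
        return value


if __name__ == "__main__":
    source = CachedLimitedSource(lambda term: "results for " + term)
    print(source.get("ship models"))
    print(source.get("ship models"))  # served from the cache
```

[Even this toy version has decisions hiding in it (should cached answers count towards the limit? what happens when the window rolls over?), which is a fair argument for letting someone else's service make them.]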

Jim – 'images from collection' displayed on page as badge thing with YQL as RSS browser. Can just create an RSS feed for an exhibition, then can have a new badge for each new exhibition.

Using YQL protects against injection attacks.

Comment from audience – YQL as meta-API.

Registering is basically making the XML file. You need a Yahoo ID to use the console. [The console is cool, basically like a SQL 'enterprise' system console, with errors and transaction processing costs.]

We had questions about adding in metrics, stats, to use both for reporting and keeping funders/bosses happy and for diagnostics – to e.g. find out which areas of the collection are being queried, what people are finding interesting.

github repository as place to register open tables to make them discoverable.

There's a YQL blog.

[So, that's it – it's probably worth a play, and while your organisation might not want to use it in production without checking out how long the service is likely to be around, etc, it seems like an easy way of playing with API-able data. It'd be really interesting to see what happened if a few museums with some overlap in their collections coverage all made their data available as an open table.]

Running notes, day 3 (Saturday) of MW2009

These are my running notes from day 3 of the Museums and the Web conference – as the perfect is the enemy of the good I'm getting these up 'as is'. I did a demo [abstract] in the morning but haven't written up my notes yet – shame on me!

The session 'Building and using online collections' included three papers, I've got notes from all three but my laptop battery died halfway through the session so only some of them are already typed – I'll update this entry when I can sneak some time.

Paul Rowe presented on NZMuseums: Showcasing the collections of all New Zealand museums (the linked abstract includes the full paper and slides).

National Services Te Paerangi (NSTP).

4 million NZers, 400 museums. NZMuseums website – focal point for all NZ museums. NSTP administers the site, Vernon Systems is solution provider.

Each museum has a profile page including highlights of their collections. Web-based collection management system.

What needs to be in place for small museums to contribute? How can a portal be built with limited resources? What features of the website would encourage re-use of the data?

Some museums had good web presences, but what about the small museums? Facing same issues that small or local govt museums in the UK face.

Museums are treasures of the country, they show who we are. Website needs to reflect that.

Focus groups – volunteers are important – keep it simple; keep costs low; some places had limited internet connectivity; reservations about content being on the internet were common.

Promoting involvement to the sector – used existing national monthly newsletters to advertise workshops and content deadlines. Minimum of 20 items for placement on site to avoid 'box ticking' [some real commitment required]. Used online forum for FAQs.

Lack of skills – NSTP were trained so could then train staff and volunteers in museums. Digitising, photography for the web.

Had to explain benefits to small museums. It gave them an easy start to getting an online presence.

They overcame resistance by allowing watermarking and clear copyright statements; they showed existing museum sites that allowed tagging; promoted that it would help them reach a diverse, dispersed audience.

First tag on site – 'shiny nose'. First comment was someone admitting they'd touched the nose on a bronze sculpture.

eHive.

Could also import Excel spreadsheets as content management system didn't exist at early stage of project. Also provided a workaround for people with lack of internet – the spreadsheet could be posted on CD.

API provides glue to connect eHive (Collections Management System) and NZMuseums site together.

Tips for success
Use OS software where possible; use existing online forums and communication networks to save answering questions over again.

90% of these collection items not previously available on the internet. 99% of collection items have images.

[Kiwis are heroes! Everyone was incredibly modest about their achievements, but I think they're amazing.]

Next was Eero Hyvönen on CultureSampo – Finnish Culture on the Semantic Web 2.0: Thematic Perspectives for the End-user (the linked abstract includes the full paper and slides).

Helsinki semantic web thingies
Part of national ontology project, Finland
Vision – international semantic web of cultural heritage. Marriage between semweb and web 2.0

Challenges – content heterogeneity, complexity

Other challenge relates to the way cultural content is produced – Freebase, Wikipedia, open street maps, etc.

Semweb for data integration; web 2.0 approach for content production

Automatically enriched by each piece of knowledge.

In Finnish the sampo is a magic drum that makes everything possible.

Portal intended for human users and machines. Trying to establish a national way of producing content so can be published automatically.

Infrastructure – 37,000 class concepts in ontology. MAO, TAO – museum ontologies, collaboratively built ontologies, then mapped to national system. End user sees one unified ontology. [A little pause while I pick my jaw up from the ground.] 66 vocabularies, taxonomies and ontologies available online as services, can be used as AJAX widgets. Some vocabularies are proprietary so can't be published online in the service.

28 content providers, 22 libraries and museums and some international associates like Getty places, Wikipedia.

16 different metadata schemas. [Including some for poetry!]

134,000 cultural collection items (artefacts, books, videos, etc)

285,000 other resources (places, people etc)

Annotation channel for content items – web 2.0 type interface.

Semantic web 2.0 portal

Portal users – for humans, Google-like but semantic search. Nine perspectives into cultural heritage. Three languages. Recently viewed items, recently commented items.

Map view.

With one line of JavaScript on own website, can incorporate CultureSampo on own website.

[Sadly my laptop died here and the rest of my notes are handwritten. You can probably get the gist from the published paper and the slides, but the coolness of their project was summed up by this tweet: Musebrarian: What can you do with a semantic knowledgebase? Search for "beard fashion in Finland" across time and place. #mw2009

It might not sound like much, but the breadth of content, and the number of interfaces onto it was awe-inspiring.]

Sadly my notes from Brian Dawson's paper, Collection effects: examining the actual use of on-line archival images are also still on notepaper. The paper was a really useful examination of analytical approaches to understanding the motivations of people using cultural heritage collections.

Notes from the closing plenary, MW2009

These are my quick and dirty notes from the closing plenary of the 2009 Museums and the Web conference. If I've quoted you but gotten your name wrong, I'm very sorry – please let me know and I'll correct it. I haven't put links in for anyone yet so I'll be editing the entry anyway.

'We are the program.' Awards for blog posts, tweets, Flickr photos then David Bearman invited people to come up and talk about what they've learnt, what they'll take away.

Nina, Museum 2.0 – inspired by Max's keynote address. But she didn't feel that difference in the institution. Didn't see the transparency and openness that you get on the web, on their dashboard. Not saying they have to do that, but wants to bring up idea of participatory ghetto… forming relationships with visitors on the web, who'll show up at museums and wonder why the same relationship isn't reflected in the building. Pushing in institutions to establish parity, not to give up on physical space also being somewhere for openness and transparency. IMA – had experience of extreme cognitive dissonance. How can you start the conversation, taking great stuff from web world into physical environment of institutions. Her first time at MW.

Heather from Balboa – new to conference and museum world, great introduction.

Nate, Walker Art Centre – I always leave inspired, seen it happen every time – a month's worth of trying new things, then it trickles off and fades… go to the wiki and take the post-conference challenge to do one thing in April – choose one task that you can achieve by the end of April. Distributed agile development … beyond API, everyone can benefit from going home and immediately doing just one thing. [eek I feel weird taking notes about my ideas]

Frankie, Rattle – be excited about tin mining.

Brian, UKOLN – danger that we're losing accessibility cos we're doing innovative things, but there have been some really great examples. Universally accessible – pushing (the definition of) it forward.

Seb, Powerhouse – need to bring people in, curators, management.

Julie (?) – boundaries between web and physical boundaries – problematising the name of the conference. Is 'web' starting to constrain what we're about?

Nina – comment on that – conference in US called WebWise – lousy content but less funded projects, mostly director level people who go. How do we get these people in a situation that's more blended with the kind of people who are here?

Victoria, Smithsonian? carrying on Nina and Seb's point – spends first month being excited, but directors etc aren't going to come to conferences like this. You may have five minutes to articulate why something is important – and it's not heard when it's someone outside, even if you've been saying it on the inside for years. Having someone who's succeeded from outside, doing snippets of video or whatever – convincing.

David – seeing what can share back. Spend time at conference demanding people write papers, share slides… would really love for the post-conference discussion that takes place online to incorporate thoughts, experience about what doing. Extension into social space of a discourse we've never really had – how do you use that post-conference excitement… how do organisations change, which is becoming the centre of the discourse… take it further, keep talking to each other about how do you make it work.

Jennifer – the thing we can do by the end of April, if you write a report, share it with your colleagues. Let people pinch your ideas, send it out. Share the reports as well as the stuff that happens when we're right here.

Jon Pratty – we need a more social media within the museum.

Peter Samis – can remember this camaraderie in 1991… hearing it just as fresh now with people who are coming to their first conference, loving it… this is going to have legs, it's going to keep running, continue this spirit throughout the year.

Rich (another Rich) – haven't really felt the amount of community before, but have been coming since 1999. Being able to catch up on the things he missed while he was here.

Brian – people in the community can fall out, it's happened in the UK. People have strongly held views, need to depersonalise disputes, constructive criticism.

Scott (?) – we're not the only people talking about these subjects, it's happening in higher education, the commercial sector, not a whole lot of discussion here about what's happening out there and what impact it has here. Would be neat to do some headlines on what's going on in the world outside museums, add to the implications for this audience.
[This final session probably contributed quite a bit to my summary of MW2009 – I'd written the 'MW2009 challenge' a little while before (after discussions at the ice cream API meet) and it was wonderful to feel so much excitement (tempered with realistic cynicism) in the room about the positive changes we could make when we went back to our home institutions.]