What better way to fill in stopover time in Abu Dhabi than continuing to post my notes from DHA2012? [Though I finished off the post and re-posted once I was back home.] These are my very rough notes from day 2 of the inaugural Australasian Association for Digital Humanities conference (see also Quick and dirty Digital Humanities Australasia notes: day 1 and Slow and still dirty Digital Humanities Australasia notes: day 3). In the interests of speed I'll share my notes and worry about my own interpretations later.
Keynote panel, 'Big Digital Humanities?'
Day 2 was introduced by Craig Bellamy, and began with a keynote panel with Peter Robinson, Harold Short and John Unsworth, chaired by Hugh Craig. [See also Snurb's liveblogs for Robinson, Short and Unsworth.] Robinson asked 'what constitutes success for the digital humanities?' and further, what do the visible successes of digital humanities mask? He said it's harder for scholars to do high quality research with digital methods now than it was 20 years ago. But the answer isn't more digital humanists, it's having the ingredients to allow anyone to build bridges… He called for a new generation of tools and methods to support the scholarship that people want to do: 'It should be as easy to make a digital edition (of a document/book) as it is to make a Facebook page'; it shouldn't require collaboration with a digital humanist. To allow data made by one person to be made available to others, all digital scholarship should be made available under a Creative Commons licence (publishers can't publish it now if it's under a non-commercial licence), and digital humanities data should be structured and enriched with metadata and made available for re-use with other tools. The model for sustainability depends on anyone and everyone being able to access data.
Harold Short talked about big (or at least inescapable) data and the 'Svensson challenge' – rather than trying to work out how to take advantage of infrastructure created by and for the sciences, use your imagination to figure out what's needed for the arts and humanities. He called for a focus on infrastructure and content rather than 'data'.
John Unsworth reminded us that digital humanities is a certain kind of work in the humanities that uses computational methods as its research methods. It's not just using digital materials, though it does require large collections of data – it also requires a sense of how the tools work.
What is the digital humanities?
Very different versions of 'digital humanities' emerged through the panel and subsequent discussion, leaving me wondering how they related to the different evolutionary paths of digital history and digital literature studies mentioned the day before. Meanwhile, on the back channel (from the tweets that are to hand), I wondered if a two-tier model of digital humanities was emerging – one that uses traditional methods with digital content (DH lite?); another that disrupts traditional methods and values. Though thinking about it now, the 'tsunami' of data mentioned is disruptive in its own right, regardless of the intentional choices one makes about research practices (which might have been what Alan Liu meant when he asked about 'seamless' and 'seamful' views of the world)… On Twitter, other people (@mikejonesmelb, @bestqualitycrab, @1n9r1d) wondered if the panel's interpretation of 'big' data was gendered, generational, sectoral, or any other combination of factors (including the messiness and variability of historical data compared to literature) and whether it could have been about 'disciplinary breadth and inclusiveness' rather than scale.
Data morning session
The first speaker was Toby Burrows on 'Using Linked Data to Build Large‐Scale e‐Research Environments for the Humanities'. [Update: he's shared his slides and paper online and see also Snurb's liveblog.] Continuing some of the themes from the morning keynote panel, he said that the humanities has already been washed away in the digital deluge, the proliferation of digital stuff is beyond the capacity of individual researchers. It's difficult to answer complex humanities questions only using search with this 'industrialised' humanities data, but large-scale digital libraries and collections offer very little support for functions other than search. There's very little connection between data that researchers are amassing and what institutions are amassing.
He's also been looking at historians' and humanists' research practices [and selfishly I was glad to see many parallels with my own early findings]. The tools may be digital rather than paper and scissors, but historians are still annotating and excerpting as they always have. The 'sharing' part of their work has changed the most – it's easier to share, and they can share at an earlier stage if they choose to do that, but not a lot has changed at the personal level.
Burrows said applying a linked data approach to manuscript research would go a long way to addressing the complexity of the field. For example, using global URIs for manuscripts and parts; separating names and concepts from descriptive information; and using linked data functions to relate scholarly activities (annotations, excerpts, representations etc) to manuscript descriptions, objects and publications. Linked data can provide a layer of entities that sits between research activities and descriptions/collections/publications, which avoids conflating the entities and the source material. Multiple naming schemes are necessary for describing entities and relationships – there's no single authoritative vocabulary. It's a permanent work in progress, with no definitive or final structure. Entities need to include individuals as well as categories, with a network graph showing relatedness and the evidence for that relatedness as the basic structure.
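To make that entity layer a bit more concrete, here's a minimal sketch in plain Python. It isn't taken from the talk or from HuNI: the class names, predicates and example.org URIs are all invented for illustration, and a real project would more likely use an RDF library than hand-rolled classes. It just shows the three ideas above together: global URIs for things, several naming schemes for the same entity, and relationships that carry their evidence with them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A link between two entities, stored together with its evidence."""
    subject: str
    predicate: str
    obj: str
    evidence: str  # the source that supports the link


class EntityLayer:
    """Sits between research activities and catalogue descriptions."""

    def __init__(self):
        self.labels = defaultdict(set)  # URI -> {(naming scheme, label)}
        self.relations = []

    def add_label(self, uri, scheme, label):
        # Several schemes may name the same entity; none is authoritative.
        self.labels[uri].add((scheme, label))

    def relate(self, subject, predicate, obj, evidence):
        self.relations.append(Relation(subject, predicate, obj, evidence))

    def relations_for(self, uri):
        # Every relation the entity takes part in, as subject or object.
        return [r for r in self.relations if uri in (r.subject, r.obj)]


# Hypothetical URIs, labels and predicates (assumptions, not real data).
MS = "https://example.org/manuscript/ms-1"
SCRIBE = "https://example.org/person/scribe-a"
NOTE = "https://example.org/annotation/42"

layer = EntityLayer()
layer.add_label(SCRIBE, "catalogue-a", "Scribe A")
layer.add_label(SCRIBE, "catalogue-b", "Anonymous copyist")
layer.relate(MS, "copiedBy", SCRIBE, evidence="colophon, folio 12v")
layer.relate(NOTE, "annotates", MS, evidence="researcher's working notes")

for r in layer.relations_for(MS):
    print(r.predicate, "|", r.evidence)
```

Because the annotation points at the manuscript's URI rather than at a particular catalogue record, the scholarly activity and the source description stay separate, which is the conflation the talk warned against.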
He suggested a focus on organising knowledge, not collections, whether objects or texts. Collaborative activities should be based around this knowledge, using tools that work with linked data entities. This raised the issue of contested ground and the application of labels and meaning to data: your 'discovery' is my 'invasion'. This makes citizen humanities problematic – who gets to describe, assign, link, and what does that mean for scholarly authority?
My notes aren't clear but I think Burrows said these ideas were based on analysis of medieval manuscript research, which Jane Hunter had also worked on, and they were looking towards the architecture for HuNI. It was encouraging to see an approach to linked data so grounded in the complexity of historians' research practices and data, and it's yet another reason I'm looking forward to following HuNI's progress – I think it will have valuable lessons for linked data projects in the rest of the world. [These slides from the Linked Open Data workshop in Melbourne a few weeks later show the academic workflow HuNI plans to support and some of the issues they'll have to tackle.]
The second speaker was the University of Sydney's Stephen Hayes on 'how linked is linked enough?'. [See also Snurb's liveblog.] He's looking at projects through a linked data lens, trying to assess how much further projects need to go to comfortably claim to be linked data. He talked about the issues projects encountered trying to get to be 5 star Linked Data.
He looked at projects like the Dictionary of Sydney, which expresses data as RDF as well as in a public-facing HTML interface and comes close to winning 5 stars. It is a demonstration of the fact that once data is expressed in one form, it can be easily expressed in another form – stable entities can be recombined to form new structures. The project is powered by Heurist, a tool for managing a wide range of research data. The History of Balinese Painting could not find other institutions that exposed Balinese collection data in programmable form so they could link to them (presumably a common problem for early adopters but at least it helps solve the 'chicken or the egg' problem that dogs linked data in cultural heritage and the humanities). The site's URLs don't return useful metadata but they do try to refer to image URLs so it's 'sorta persistent'. He gave it a rating of 3.5 stars. Other projects mentioned (also built on Heurist?) were the Charles Harpur Critical Archive, rated at 3.5 stars and Virtual Zagora, rated at 3 stars.
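For anyone who hasn't met the star ratings before, they come from Tim Berners-Lee's 5-star scheme for open data, where each star builds on the one before. Here's a rough Python sketch of that cumulative logic. The `Dataset` fields and the example values are my own invention, not Hayes's assessment of any real project, and the half-star ratings he gave reflect finer-grained judgements that this sketch doesn't try to model.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """One yes/no answer per star (field names are illustrative)."""
    on_web_open_licence: bool  # 1 star: on the web under an open licence
    structured: bool           # 2 stars: machine-readable structured data
    open_format: bool          # 3 stars: in a non-proprietary format
    uses_uris: bool            # 4 stars: URIs identify things (e.g. via RDF)
    links_out: bool            # 5 stars: links to other people's data


def star_rating(dataset):
    """Count stars in order, stopping at the first criterion not met."""
    criteria = (
        dataset.on_web_open_licence,
        dataset.structured,
        dataset.open_format,
        dataset.uses_uris,
        dataset.links_out,
    )
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# A made-up project that publishes RDF with stable URIs but has found
# nobody else's data to link to yet.
example = Dataset(True, True, True, True, links_out=False)
print(star_rating(example))
```

The made-up example stops one short of the fifth star because of missing outbound links, which is much the same position as a project that can't find other institutions exposing comparable data.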
The paper was an interesting discussion of the teamwork required to get the full 5 stars of linked data, and the trade-offs in developing functions for structured data (e.g. implementing schema.org's painting markup versus focussing on the quality of the human-facing pages); reassuring curators about how much data would be released and what would be kept back; developing ontologies throughout a project or in advance; and the overhead in mapping other projects' concepts to their own version of Dublin Core.
The final paper in the session was 'As Curious An Entity: Building Digital Resources from Context, Records and Data' by Michael Jones and Antonina Lewis (abstract). [See also Snurb's liveblog.] They said that improving the visibility of relationships between entities enriches archives, as does improving relationships between people. The title quote in full is 'as curious an entity as bullshit writ on silk' – if the parameters, variables and sources of data are removed from material, then it's just bullshit written on silk. Visualisations remove sources, complexity and 'relative context', and would be richer if they could express changes in data over time and space. They asked how one would know that information presented in a visualisation is accurate if it doesn't cite sources? You must seek and reference original material to support context layers.
They presented an overview of the Saulwick Archive project (Saulwick ran polls for the Fairfax newspapers for years) and the Australian Women's Register, discussed common issues faced in digital humanities, and the role of linked data and human relationships in building digital resources. They discussed the value of maintaining relationships between archives and donors after the transfer of material, and the need to establish data management plans to make provision for raw data and authoritative versions of related contextual material, and to retain data to make sense of the archives in the future. The Australian Women's Register includes content written for the site and links out to the archival repositories and libraries where the records are held. In a lovely phrase, they described records as the 'evidential heart' for the context and data layers. They also noted that the keynote overlooked non-academic re-use of digital resources, but it's another argument for making data available where possible.
Digital histories session
The first paper was 'Community Connections: The Renaissance of Local History' by Lisa Murray. Murray discussed the 'three Cs' needed for local history: connectivity, community, collaboration.
Is the process of geo-referencing forcing historians to be more specific about when or where things happened? Are people going from the thematic to the particular? Is it exciting for local historians to see how things fit into state or national narratives? Digital history has enormous potential for local and family history and to represent complicated relationships within a community and how they've changed over time. Digital history doesn't have to be article-centric – it enables new forms of presentation. Historians have to acknowledge that Wikipedia is aligned to historians' processes. Local history is strongly represented on Wikipedia. The Dictionary of Sydney provides a universal framework for accessing Sydney's history.
The democratisation of historical production is exciting but raises challenges for public understandings of how history is undertaken and represented. Are some histories privileged? Making History (a project by Museum Victoria and Monash University) encourages the use of online resources but does that privilege digitised sources, and will others be neglected? Are easily accessible sources privileged, and does that change what history is written? What about community collections or vast state archives that aren't digitised?
History research methodologies are changing – Google etc is shaping how research is undertaken; the ubiquity of keyword searching reinforces the primacy of names. She noted the impact of family historians on how archives prioritise work. It's not just about finding sources – to produce good history you need to analyse the sources. Professional historians are no longer the privileged producers of knowledge. History can be parochial, inclusive, but it can also lack a sense of historical perspective and context. Digital history production amplifies tensions between popular history and academic history [and presumably between amateur and academic historians?].
Apparently primary school students study more local history than university students do. Local and community history is produced by a broad spectrum of the community but relatively few academic historians are participating. There's a risk of favouring quirky facts over significance and context. Unless history is more widely taught, local history will be tarred with the same brush as antiquarians. History is not only about narrative and context… Historians need to embrace the renaissance of local and community history.
In the questions there was some discussion of the implications of Sydney's city archives being moved to a more inconvenient physical location. The justification is that it's available through Ancestry but that removes it from all context [and I guess raises all the issues of serendipity etc in digital vs physical access to archives].
The next speaker was Tim Sherratt on 'Inside the bureaucracy of White Australia'. His slides are online and his abstract is on the Invisible Australians site. The Invisible Australians project is trying to answer the question of what the White Australia policy looked like to a non-white Australian. He talked about how digital technology can help explore the practice of exclusion as legislation and administrative processes were gradually elaborated. Chinese Australians who left Australia and wanted to return had to prove both their identity and their right to land to convince officials they could return: 'every non-white resident was potentially a prohibited immigrant just waiting to be exposed'. He used topic modelling on file titles from archival series and was able to see which documents related to the White Australia policy. This is a change from working through hierarchical structures of archives to working directly through the content of archives. This provides a better picture of what hasn't survived, what's missing and would have many other exciting uses. [His post on Topic modelling in the archives explains it better than my summary would.]
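If topic modelling is new to you, here's a toy sketch of the general idea in Python. To be clear, this is not Sherratt's code or data and I'm not claiming it's the tool he used: the file titles are invented, and the model is a bare-bones LDA fitted by collapsed Gibbs sampling using only the standard library. The point is just that each title gets split into counts per topic, and each topic ends up as a set of word counts, without anyone telling the model which files belong together.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy LDA via collapsed Gibbs sampling over tokenised documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per topic, per doc
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics
    assignments = []

    # Start with a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    # Repeatedly resample each token's topic given all the other tokens.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


# Invented file titles, for illustration only.
titles = [
    "application certificate exemption dictation test",
    "certificate exemption dictation test return",
    "dictation test prohibited immigrant deportation",
    "lighthouse supplies tender contract",
    "tender contract lighthouse repairs",
    "lighthouse keeper appointment supplies",
]
docs = [t.lower().split() for t in titles]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)

for k, counts in enumerate(topic_word):
    print("topic", k, [w for w, _ in counts.most_common(3)])
```

On a corpus this tiny the topics won't mean much and will vary with the seed; the method only becomes interesting at the scale of whole archival series.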
The final paper was Paul Turnbull on 'Pancake history'. He noted that in e-research there's a difference between what you can use in teaching and what makes people nervous in the research domain. He finds it ironic that professional advancement for historians is tied to writing about doing history rather than doing history. He talked about the need to engage with disciplinary colleagues who don't engage with digital humanities, and issues around historians taking digital history seriously.
Sherratt's talk inspired discussion of funding small-scale as well as large-scale infrastructure, possibly through crowdfunding. Turnbull also suggested 'seeding ideas and sharing small apps is the way to go'.
[Note from when I originally posted this: I don't know when my flight is going to be called, so I'll hit publish now and keep working until I board – there's lots more to fit in for day 2! In the afternoon I went to the 'Digital History' session. I'll tidy up when I'm in the UK as I think Blogger is doing weird LTR things because it may be expecting Arabic.]
See also Slow and still dirty Digital Humanities Australasia notes: day 3.