The case for opening up library data: #jiscmosaic at Wolves


I made the (deceptively long) journey over to the city campus of the University of Wolverhampton yesterday (18 Nov 2009) for the concluding JISC MOSAIC project event. I was without an Internet connection all day, so I wasn’t able to contribute to the backchannel by tweeting from the event; here’s my writeup instead.

Interlude 1: what’s all this about?

MOSAIC stands for ‘making our shared activity information count’ (but took its name from an earlier project with the acronym TILE. Lots of TILEs = a MOSAIC. Geddit?). It’s being funded by JISC, and it’s “investigating the technical feasibility, service value and issues around exploiting activity data”. For ‘activity data’, read (in the main) library book-circulation data. Say the JISC:

“MOSAIC aims to build on [the TILE project] by aggregating library activity data from several institutions and making it available for re-use and experimentation. The Talis podcast with Dave [Pattern, University of Huddersfield] provides further background.”

I.e., if lots of university libraries shared their anonymised circulation data in a common format, what Web2.0-type-o’-magic could we build on top of that data, and how would it benefit our users?

We’ve been invited to contribute some of Lincoln’s own (anonymised) Horizon circulation data to this project. I’ll write more about our own involvement in a future blog post.

After introductions, we were welcomed to Wolverhampton by Fiona Parsons (Director of Learning Centres at Wolves, and vice-chair of SCONUL), then talked through the project’s progress-to-date by David Kay of Sero Consulting Ltd, the lead institution in the project. There was discussion about some of the challenges that have faced institutions wanting to contribute their own activity data – in particular, the difficulties involved in extracting the data from different models of LMS, and institutional concerns about privacy, utility, cost, and the ownership and re-use of ‘their’ data*.

The keynote presentation came from Paul Miller of Cloud of Data Ltd - ‘Activity data and the global information economy: the who, what, when, where, how, why of an emerging future’. I hope Paul (or the MOSAIC project team) will put his slides online soon – they’re well worth a read.

In the meantime you might like to look at Paul’s blog: http://cloudofdata.com/

Next came coffee and the first of two breakout sessions. I was in a group with Ken Chad, Jill Griffiths (MMU), Alex Parker (So’ton University) and a guy from Wolverhampton**. The breakout session was titled ‘Being Practical’ and we were tasked to come up with real-life use-case scenarios for one or more different types of HE library user.

We focused on the undergraduate, and spent a bit of time discussing students’ trust in the reading materials given to them by their institution, students’ reading behaviour and information literacy, how improvements to library processes (including considerations of VfM) impacted on the student experience, how to ‘sell’ the utility of library usage data to universities, and particularly students’ motivation to read in particular ways.

This led us to what was meant to be our use-case scenario, which we were a bit nervous about, so we rather tentatively posed it as a question instead! (I’ve had to rephrase it from memory below because I left our original notes in Wolverhampton, but this was the gist…)

Interlude 2: our use-case scenario: “Read Your Way to a First”

Can we use library activity data to learn anything about the reading behaviour of students who get higher degree classifications that we could use to inform the reading behaviour of all students?

Obviously, there are some huge questions and potential dangers hidden behind that innocuous question, and a hypothesis (i.e., that there’s some relationship between your reading behaviour and the degree you end up with) that would have to be tested first. It sparked some lively discussion in the run up to lunch, as did the other groups’ use-case scenarios for undergrads, researchers, academic staff, library directors, and developers of web applications for libraries.
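To make the hypothesis a bit more concrete, here’s a rough sketch of the very first step you might take to test it: group anonymised loan counts by degree classification and compare the averages. Everything in it is an assumption on my part. The record layout, the tokens and the numbers are made up for illustration; they are not MOSAIC data or any real LMS export format.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical anonymised records: (student_token, degree_class, total_loans).
# These values are invented purely for illustration.
records = [
    ("a1", "1st", 120),
    ("b2", "1st", 95),
    ("c3", "2:1", 80),
    ("d4", "2:1", 60),
    ("e5", "2:2", 40),
    ("f6", "3rd", 25),
]


def mean_loans_by_class(rows):
    """Group loan counts by degree classification and average each group."""
    groups = defaultdict(list)
    for _token, degree_class, loans in rows:
        groups[degree_class].append(loans)
    return {cls: mean(counts) for cls, counts in groups.items()}


summary = mean_loans_by_class(records)
for cls, avg in summary.items():
    print(cls, avg)
```

Even if real data showed a pattern like this, it would only be a correlation: it wouldn’t tell us whether borrowing more books leads to a better degree, or whether students heading for a better degree just happen to borrow more.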

Over lunch (good sandwiches; always important) most people sat down to watch a pre-recorded slideshow presentation with voiceover from an absent Dave Pattern about how Huddersfield are really using real circulation data to really improve their students’ experiences of the library. (It’s an adaptation of a similar presentation I saw the flesh-and-blood D.P. deliver at last year’s Mash Oop North event.)

I recommend you take a look at his presentation - it’s only 17 minutes long and it’ll be time well spent…

After lunch we ran through a series of ‘perspectives on the problem space’ – some excellent and genuinely thought-provoking presentations from Mark van Harmelen, Ken Chad, Paul Walk, Jenny Craven & Jill Griffiths of MMU, and Helen Harrop of Sero, which led to some equally thought-provoking discussions.

Once the presentations are available online, I’ll post a link.

At about 3:15pm the group broke up for tea and started another breakout session & discussions - unfortunately I had to take this opportunity to go and tackle the M6.

Interlude 3: what next for us?

As I said, I’ll blog more about how Lincoln can and will [fingers crossed] contribute some of our own anonymised Horizon circ. data (also, I hope, some e-resource usage data) to this project before 2009 is out. In the meantime, the data’s still out there, just waiting for someone to come along and build innovative library services on top of it…

* I’d argue that it’s not libraries’ data at all. It’s our users’ data; we’re just keeping it safe for them.

** Sorry! I hadn’t had coffee when we did introductions.