David Kay is going to talk about ‘attention data’ – what users are looking at or showing interest in – and how it relates to user-generated content; he is starting to believe that attention data is key to getting user engagement.
The TILE project looked at library attention data – could this inform recommendations for students? David mentions well-known ‘recommendation’ services – e.g. Amazon – and also in the physical world, where Clubcard and Nectar card data inform marketing etc.
David Pattern at University of Huddersfield – “Libraries could gain valuable insights into user behaviour by data mining borrowing data and uncovering usage trends.”
Types of attention data:
- Attention – behaviour indicating interest/connections – such as queries, navigation, details display, save for later
- Activity – formal transactions such as requesting, borrowing, downloading
- Appeal – formal and informal lists – types of recommendations – such as reading lists – can be a proxy for activity
- And …
We could concentrate and contextualise the intelligence (patterns of user activity) existing in HE systems at institutional level whilst protecting anonymity – we know which institution a user is in, what course they are on, what modules they are doing. This contextual data is a mix of HE ‘controlled’ (e.g. studies, book borrowing), user controlled (e.g. social networks) and some automatically generated data.
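One way to concentrate activity data at institutional level while protecting anonymity is to replace the direct user identifier with a one-way pseudonym before aggregation, keeping only the coarse context (institution, course, module). A minimal sketch of that idea – the salt, field names, and event shape here are illustrative assumptions, not anything described by the projects above:

```python
import hashlib
import hmac

# Illustrative secret held by the institution (assumption, not a real key);
# rotating it would break long-term linkability of the pseudonyms.
INSTITUTION_SALT = b"example-secret-salt"

def pseudonymise(user_id: str) -> str:
    """One-way pseudonym: the same user always maps to the same token,
    but the token cannot be reversed to recover the user ID."""
    return hmac.new(INSTITUTION_SALT, user_id.encode(), hashlib.sha256).hexdigest()[:16]

# An activity event keeps its useful context but loses the direct identifier.
event = {
    "user": pseudonymise("s1234567"),   # hypothetical student number
    "institution": "example-uni",
    "course": "BSc Computing",
    "module": "CS101",
    "action": "borrow",
    "item": "book_x",
}
```

Because the pseudonym is stable, patterns of activity can still be linked per user across systems, which is what makes recommendations possible without exposing who the user is.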
The possibility of a critical mass of activity data from ‘day 1’ brings to life the opportunity and motivation to embrace and curate user contribution – such as ratings, reviews, bookmarks, lists. To achieve this we need to make barriers to contribution as low as possible.
What types of questions might be asked? Did anyone highly rate this textbook? What did last year’s students download most?
At what level is this type of information useful – institutional, consortial, national, international?
MOSAIC ran a developer competition based on usage data from the University of Huddersfield (see Read to Learn). The six entries fell into three areas:
- Improving Resource Discovery
- Supporting learning choices
- Supporting decision making (in terms of collection management and development)
However, there are some dangers – does web-scale aggregation of data to provide personalised services threaten the privacy of the individual? David says they believe it does not, as long as good practice is followed. We need to be careful but not scared off. There are already examples:
- California State University shows how you can responsibly use recommendation data
- MESUR project – contains 1 billion usage events (2002-2007) to drive recommendations
In MOSAIC project, CERLIM did some research with MMU (Manchester Metropolitan University) students – 90% of students said they would like to be able to find out what other people are using:
- To provide a bigger picture of what is available
- To aid retrieval of relevant resources
- To help with course work
CERLIM found students were very familiar and happy with the idea of recommendations – from Amazon, eBay etc.
University of Huddersfield have done this:
- suggestions based on circ data – people who borrowed this also borrowed…
- suggestions for what to borrow next – most people who read book x, go on to read book y next
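The first of these – ‘people who borrowed this also borrowed…’ – can be derived from circulation data with simple co-occurrence counting. A minimal sketch; the loan records and field shapes are illustrative assumptions, not Huddersfield’s actual schema:

```python
from collections import Counter, defaultdict
from itertools import permutations

# Illustrative circulation log: (borrower, book) pairs -- not real data.
loans = [
    ("u1", "book_x"), ("u1", "book_y"), ("u1", "book_z"),
    ("u2", "book_x"), ("u2", "book_y"),
    ("u3", "book_x"), ("u3", "book_z"),
]

# Group loans by borrower, then count how often each pair of books
# appears in the same borrower's history.
by_user = defaultdict(set)
for user, book in loans:
    by_user[user].add(book)

co_borrowed = defaultdict(Counter)
for books in by_user.values():
    for a, b in permutations(books, 2):
        co_borrowed[a][b] += 1

def also_borrowed(book, n=3):
    """People who borrowed `book` also borrowed... (top n co-borrowings)."""
    return [b for b, _ in co_borrowed[book].most_common(n)]

print(also_borrowed("book_x"))  # book_y and book_z, ordered by co-borrow count
```

The second variant – ‘most people who read book x go on to read book y next’ – would use the same idea but count ordered pairs from time-sorted loan histories rather than unordered co-occurrences.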
Impact on borrowing – when recommendations were introduced into the catalogue, the range of books borrowed by students increased and the average number of books borrowed went up – really striking correlations here.
They have also analysed usage data by faculty – so they can see which faculties have well-used collections, and also identify low-usage material.
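This kind of collection analysis is essentially counting: loans grouped by faculty, and held items that never circulate. A quick sketch with made-up records (the faculties, items, and holdings list are all illustrative assumptions):

```python
from collections import Counter

# Illustrative loan records tagged with the borrower's faculty -- not real data.
loans = [
    {"faculty": "Health", "item": "book_a"},
    {"faculty": "Health", "item": "book_a"},
    {"faculty": "Computing", "item": "book_b"},
    {"faculty": "Computing", "item": "book_a"},
]
holdings = {"book_a", "book_b", "book_c"}

loans_per_faculty = Counter(rec["faculty"] for rec in loans)
loans_per_item = Counter(rec["item"] for rec in loans)

# Low-usage material: held items that never circulated in this period,
# i.e. candidates for collection-management review.
low_usage = sorted(h for h in holdings if loans_per_item[h] == 0)

print(loans_per_faculty)  # which faculties borrow most
print(low_usage)          # items with no recorded loans
```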
They have not only done this for themselves – they have also released the data publicly.
Conclusion/thoughts from a recent presentation by Dave Pattern:
- serendipity is helping change borrowing habits
- analysis of usage data allows greater insights into how our services are used (or not)
- would national aggregation of usage data be even more powerful?
Now David (Kay) moving onto some thoughts from Paul Walk – should we be looking at aggregating usage data, or engaging with people more directly? Paul asks the question “will people demand more control over their own attention data?”
Paul suggests that automated recommendation systems might work at undergraduate level, but in academia we need to look beyond automatic recommendations – because it is ‘long tail all the way’. Recommendations from peers/colleagues are going to work much, much better.
David relating how user recommendations appear on bittorrent sites and drive decisions about which torrents to download. Often very small numbers – but sometimes one recommendation (from the right person) can be enough. Don’t need to necessarily worry about huge numbers – quality is powerful.
Q & A
Comment: (Dan Greenstein) At Los Alamos use usage data for researchers moving between disciplines (interdisciplinary studies) – fast way of getting up to speed.
Comment: (Liz Lyon) Flips peer-review on its head – post review recommendation – and if you know who is making that recommendation allows you to make better judgements about the review…
Comment: (Chris Lintott) Not all ‘long tail’ – if you aggregate globally – there are more astronomer academics than there are undergraduates at the University of Huddersfield.
Comment: (Andy Ramsden) Motivation of undergraduate changes over time – assessment changes across years of study – and groups of common study become smaller. Need to consider this when thinking about how we make recommendations
Q: (Peter Burnhill) Attention data is about ‘the now’ – what about a historical perspective on this?
A: There are examples on bittorrent sites of preserving older reviews – in some cases the material posted is 30 years old – so yes, this is important.