INF11 – Repositories take up and embedding strand

This blog post is written on behalf of JISC.

The programme manager for this strand is Balviar Notay, and a detailed description is at http://infrastructurecalloct2010.jiscpress.org/appendix-g-repositories-take-up-and-embedding/. A briefing paper is available at http://inf11briefingoct2010.jiscpress.org/repositories-take-up-and-embedding/

This strand is about embedding proven good practice to develop an existing repository – not about new development. But it is not just about ‘JISC good practice’ – good practice from other places/communities is welcome – but you need to be clear about what good practice you are going to implement – and perhaps ideally be in contact with those who originated the relevant good practice.

Need to see service improvement – not just about ‘better technology’. Also need to show how a wide range of institutions will benefit from this work – and need to see this – including working with the RSP. This also has to be sustainable – about sustainable innovation – needs to go beyond the lifetime of the funding available from JISC.

Q: Would there be an expectation that the originators of good practice you are going to use would become a partner in the bid?

A: No – especially given the size of the bid – so a consultancy role may be appropriate

Q: Is partnering/good practice limited to the UK?

A: No – good practice from around the world wherever appropriate – but have to ensure you can manage this within the funding and time available (e.g. think about travel etc.)

Q: Are there hashtags for this strand?

A: No hashtag mentioned for this particular strand (I’ll clarify this if I can)

INF11 – Activity Data

This blog post is written on behalf of JISC

This strand is looking for projects that explore user activity data to improve services to institutional staff and students – also there will be a single ‘synthesis’ project in this strand. The detailed description of this strand is at http://infrastructurecalloct2010.jiscpress.org/appendix-f-activity-data/ and a briefing paper is available at http://inf11briefingoct2010.jiscpress.org/infrastructure-for-resource-discovery/

All about identifying tools and techniques that can work for the sector. Looking for very practical projects – looking at how services will be improved, who will be affected, and how they will be affected. Each project should start with a hypothesis (see http://infrastructurecalloct2010.jiscpress.org/appendix-f-activity-data/?paragraph=27#27 and http://infrastructurecalloct2010.jiscpress.org/appendix-f-activity-data/?paragraph=33#33) and projects are expected to look at proving/exploring the hypothesis.

Expect projects to release datasets using an open licence wherever possible – but to be clear about any legal or moral problems with this within the bid.

Activity data related to all institutional systems is in scope.

There is a related call out at the moment – 12/10 the JISC Business Intelligence Programme – Andy highly recommends that anyone thinking of bidding under the #inf11 Activity data strand should also read this call. Note JISC does not want duplicate bids to both of these calls.

INF11 – Infrastructure for Resource Discovery

This blog post is written on behalf of JISC.

Projects to release open metadata about the collections and resources of HE libraries, museums and archives – details in Appendix E of the call at http://infrastructurecalloct2010.jiscpress.org/appendix-e-infrastructure-for-resource-discovery/ – Andy McGregor (giving this briefing) suggests this is a good place to ask questions via the commenting system, and also may be a way of finding possible partners for bids through the comments. Also see the briefing paper at http://inf11briefingoct2010.jiscpress.org/infrastructure-for-resource-discovery/

Projects in this strand should take into account that they are part of a wider vision, and consider how they contribute to it (and whether they have the time/resource to do so).

Look very carefully at the strict methodology in place – bids that don’t adhere to this won’t get funded. ‘Linked data’ is encouraged but not compulsory – see http://infrastructurecalloct2010.jiscpress.org/appendix-e-infrastructure-for-resource-discovery/?paragraph=15#15 and http://infrastructurecalloct2010.jiscpress.org/appendix-e-infrastructure-for-resource-discovery/?paragraph=18#18

Funding is focussed on HE institutions – but partnerships with institutions outside HE are welcome.

Projects are about establishing practices that can be adopted by other institutions to spread the benefits around the sector – looking for projects that have ways of doing this embedded into them – not just lip-service to the concept.

Data and process must be sustainable – looking for more than just a simple declaration in the bid here but clear ideas of how projects will tackle this.

INF11 – Geospatial strand

This blog post is written on behalf of JISC.

David stressing that projects in this area should have user facing/user interface expertise involved.

Details of this strand are at http://infrastructurecalloct2010.jiscpress.org/appendix-d-geospatial/, and a briefing paper is available at http://inf11briefingoct2010.jiscpress.org/geospatial/

Looking for 8-12 projects, with total funding available £700k – David noting the very tight deadlines on these projects – 6 months – so projects with teams already in place will probably have an advantage.

Proposals required to:

  • Contribute to wider geo community
  • Define end user requirements
  • Describe interoperability as part of other geospatial tools, services and infrastructure
  • Have experts in end user, geospatial data, and development
  • Applicability to current and future policies – David mentions the Inspire Directive specifically
  • Reuse existing tools/components where applicable
  • Participate in programme activity (13 days)
  • Support from senior institution staff that active participation in community is viable – and that it fits with institutional strategy/objectives

Q: Look at LandMap – something that might be reused by projects

A: Absolutely – LandMap mentioned in briefing document – reuse important

Again David is on Skype as david.flanders

INF11 – Identifiers Programme Area

This blog post is written on behalf of JISC.

The tag for this strand is #jiscpid (JISC Persistent Identifiers). A full description of this strand is available at http://infrastructurecalloct2010.jiscpress.org/appendix-c-identifiers/, and a briefing paper is available at http://inf11briefingoct2010.jiscpress.org/identifiers/

Who will want to bid in this strand?

  • Web managers – the Institutional Web Managers community supported by UKOLN specifically mentioned

David Flanders (who is giving this part of the briefing) is emphasising that he is very happy for potential bidders to get in contact with him – either by email or Skype (he is david.flanders on Skype).

David recommends subscribing to the Information Environment team blog http://infteam.jiscinvolve.org/wp/

David paraphrasing from call – but highlighting the need for projects to:

  • Engage stakeholders in the institutions
  • Build consensus for structure as a data model
  • Communicate structure to a large audience

David suggests posting questions to the JISC Press site for the call

Q: Call implies it should be about identifiers at institutional level – but could it focus on a more specific set of resources?

A: As long as it is public facing and about persistent identifiers – doesn’t have to be entire website. E.g. could just be course websites

Q: There is an EC project on research identifiers – should there be a connection between bids and this recent project?

A: Looking to make projects part of larger communities – so could definitely include this, but not limited to this

INF11 – Infrastructure for education and research briefing day

Today I’m doing something different here and blogging on behalf of JISC at a briefing day on the “Infrastructure for education and research briefing day” funding call. The briefing is happening at Goodenough College in London, and kicks off with a “Welcome and overview of e-infrastructure innovation” by Neil Jacobs.

Neil starts with the Vision, and what Infrastructure means:

  • Services, policies, tools, and frameworks

The vision talks about the persistence of information, and the extent to which it can be understood, trusted and reused – so semantics and provenance are key concepts.

Neil notes that programmes are supported by the Innovation Support Centres – that is UKOLN, CETIS and OSSWatch – and these also are key in ensuring coherence across projects and programmes.

Various places to look for information about the programme:

Bouncing, Chunking and Squirrelling

I’ve made some brief notes on this talk by Lynn Silipigni Connaway (OCLC) about the behaviours of digital information seekers. I have to admit I found lots to disagree with in this presentation – but food for thought as well!

[Update: See comment from Dr Connaway below with some further information about the work she was summarising in this presentation – including the very important point that the themes she covered were common themes from the 12 studies that were reviewed rather than her opinions – sorry if this isn’t clear from the notes]


Lynn carried out a JISC funded analysis of 12 user behaviour studies conducted in the US and UK, all published within the last 5 years. 5 of the studies came out of OCLC, and others from JISC, and the RIN User Behaviour Project. A brief summary of this is available at http://www.jisc.ac.uk/publications/reports/2010/digitalinformationseekers.aspx

Essentially users want access to digital content. Convenience dictates choice between physical and virtual library. Even in situations where the difference is very small – e.g. walking over to a reference desk, or sitting at a desk in the library and asking the question via a virtual service – the users will do the latter because it is more convenient.

Found users spent very little time using the content (while they are in ‘seeking’ mode I think this means) – they ‘squirrel’ downloads – get quick chunks of information. They tend to visit a resource for just a few minutes, and tend to use very basic search. There was no evidence that more advanced searching was needed.

Tended to use snippets from e-books, viewing only a few pages, using Google-like interfaces. Used ‘Power Browsing’ rather than doing more finessed searches. Users really valued ‘human resources’ – liked face to face (e.g. with librarians) [not sure how this works alongside previous statement about convenience?]

Users tended to associate libraries with collections of books, but on the other hand felt that the more digital content the better.

Tended to find ‘faculty’ praise physical collection – and when asked what they wanted, they said ‘wine & beer & easy chairs’!

Electronic databases not perceived as library sources – although there is an awareness that the University pays for access to content.

Users frustrated with locating and accessing full-text copies.

Found information literacy skills were lacking – they have not kept pace with digital literacy. Researchers are generally self-taught and have (often misplaced) confidence in their skills. Generally people stick with what is familiar. Found that doctoral students take cues from their professors/supervisors – will do what they see their ‘seniors’ doing – and this is probably what will get passed on in turn.

Found that the more familiar people were with a subject area, the broader they will be in their searching – they don’t want to miss anything, and they trust their judgement over those who might index the resources. Those less familiar with an area, will be more specific with their searching.

Found people often turned to general search engines to get overview of an area.

Users:

  • Value database and other online sources
  • Do not understand what resources are available in libraries
  • Cannot distinguish between databases held by a library and other online sources
  • Library OPACs difficult to use

Search behaviours vary by discipline

Desire seamless process from Discovery to Delivery. Sciences most satisfied, Social Science and Arts & Humanities have serious gaps – particularly difficult to find:

  • Foreign Language materials
  • Multi-author materials
  • Journal backfiles

Inadequately catalogued resources result in underuse

Library ownership of sources important – “where can I get this?”

Differences exist between the catalogue data quality priorities of users and librarians.

‘One size fits no one’

Conclusions

  • Simpler searches & power browsing
  • Squirrelling of downloads
  • Natural language
  • Convenience very important
  • Human resource valued
  • D2D of full-text digital content desired
  • Transparency of ranking results
  • Evaluative information included in catalog
  • More robust metadata

Implications for librarians:

  • Serve different constituencies
  • Adapt to changing user behaviours – look at 12 year-olds now
  • Offer service in multiple formats
  • Provide seamless access to digital resources
  • Better branding/marketing of our services …

Implications for library systems:

  • Build on and integrate search engine features
  • Provide search help at time of need – e.g. Chat and IM help during search
  • Adopt user-centered development approach

What does this mean for libraries?

  • Keep talking
  • Keep moving – and we need to move faster
  • Keep the gates open – make it easier to get to stuff
  • Keep it simpler

VuFind Virtual Bootcamp

As part of the Lucero project I’m currently working on at the Open University, I’m looking at lots of library catalogue records. While exploring the first set of data I was playing with (around 25,000 records in MARC format) it struck me that one of the more recent library ‘search’ products might be helpful. These new products (sometimes known as ‘next gen’ (NG) discovery platforms) are being taken up by libraries to replace their (often aging, rarely pretty) ‘OPACs’ (online public access catalogues) which tend to be a web interface onto what is, at heart, a ‘business’ system – one that administers books, users, serials, and other library stuff.

These discovery platforms tend to work by taking an import of data from the library catalogue on a regular basis, and specialise in indexing the data, rather than the many other administrative tasks that the library catalogue hides. Using dedicated software that isn’t worrying about any other functionality, these new platforms tend to be much faster at returning search results, and give a lot of flexibility in how indexes are built on the data.
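To give a feel for why a dedicated index can answer searches quickly, here is a minimal Python sketch of an inverted index built from a batch of imported records. The records and field names are made up for illustration, and real discovery platforms do far more (stemming, relevance ranking, facets), so treat this as a toy rather than a description of how any particular product works.

```python
from collections import defaultdict

# Hypothetical records, standing in for a regular export from the catalogue.
records = [
    {"id": "r1", "title": "Introduction to Linked Data"},
    {"id": "r2", "title": "Linked Open Data in Libraries"},
    {"id": "r3", "title": "Cataloguing for Beginners"},
]


def build_index(recs):
    """Map each lower-cased title word to the set of record ids containing it."""
    index = defaultdict(set)
    for rec in recs:
        for word in rec["title"].lower().split():
            index[word].add(rec["id"])
    return index


def search(index, query):
    """Return the ids of records whose titles contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = build_index(records)
print(sorted(search(index, "linked data")))
```

The work is done once at import time, so each search is just a handful of set lookups rather than a scan of every record.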

While many of the available products are commercial pieces of software (or increasingly, services), there are a couple of relatively high profile open source solutions – VuFind and Blacklight. If you are interested in a comparison of these two systems, keep an eye on the CREDAUL project at the University of Sussex (http://credaul.wordpress.com) which is looking at both.

So I decided I’d try installing VuFind and use that to explore the data. VuFind is PHP based, but also makes use of the Solr search platform, which runs on Java. It took me a couple of hours or so of fiddling to get the whole thing working – but I thought that was pretty good going – by the end of it, I had my 25k records fully indexed, and was ready to use the system to explore the data.
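Because the index sits in Solr, you can also explore the data by querying Solr directly over HTTP. The Python sketch below only builds a query URL for Solr's standard select handler. The host, port, core name and field names are assumptions about a local install, so check them against your own setup before relying on them.

```python
from urllib.parse import urlencode

# Assumed values: adjust the host, port and core name to match your install.
SOLR_BASE = "http://localhost:8080/solr/biblio"


def solr_select_url(query, rows=10, facet_field=None):
    """Build a URL for Solr's select handler, asking for a JSON response."""
    params = [("q", query), ("rows", str(rows)), ("wt", "json")]
    if facet_field:
        # Faceting counts how many matching records share each field value.
        params += [("facet", "true"), ("facet.field", facet_field)]
    return SOLR_BASE + "/select?" + urlencode(params)


# "title" and "format" are assumed field names, used here for illustration.
print(solr_select_url("title:darwin", rows=5, facet_field="format"))
```

Pasting a URL like this into a browser is a quick way of seeing how values are distributed across a field, which is often where data problems show up.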

All of this gave me an idea – this is something you can run on a laptop, and is a great way of looking at your library catalogue data – often exposing issues with the data that you can correct in the catalogue if you want to as well. So, I had the idea that at the next Mashed Library event (Mashspa in Bath) we could run a VuFind ‘bootcamp’, helping delegates get VuFind installations up and running.

Being an impatient sort, 29th October was far too long to wait to get started, so then I thought that maybe I could do a ‘virtual’ version of the bootcamp beforehand (and that would also make sure I was prepared on the day!). So, the idea is that I’m going to post weekly blog posts dealing with the installation of VuFind step by step. I’ll focus on Windows, but already have some people who are interested in doing an install on Linux and Mac OS X. Alongside these, I’ll run weekly ‘support sessions’ where I’ll be online to try to help work through problems/issues that people are having – the idea is that these will be live sessions – although I don’t know whether that will be via chat, voice or something else.

Anyway, the starting point is this blog post, and this forum on the Mashed Library site. If you are interested in joining in, sign up to http://www.mashedlibrary.com/groups/vufind-virtual-bootcamp/ and follow along – I’m intending to post the first set of instructions within the week, with a support session to follow shortly after.

Finally if you are interested in the various ‘next gen’ discovery interfaces for libraries, I’d recommend having a look at this list of JISC projects http://code.google.com/p/jisclms/w/list that all deal with improving/experimenting with the library discovery interface and experience.

Sir Louie

I’m working with the University of Oxford on a new project called ‘Sir Louie‘ (which has a website and a blog) to integrate Reading Lists with their online learning environment called WebLearn (which is Sakai under the bonnet). This project has some similarities to the JISC funded TELSTAR project I recently finished at the Open University – but with some different angles, approaches and different systems involved.

Sakai already has a ‘resource list’ functionality called the ‘Citation Helper‘ (which came out of the Sakaibrary project I first heard about at IGeLU 2006 – not 2008 as I originally stated – thanks to Lukas for the correction to the date).

With the Sir Louie Project, the hope is we can further enhance the Citation Helper through some quite ‘loose coupling’ of various systems. In essence we want to enable:

  • The addition of citations to a Citation Helper resource list from the ‘resource discovery’ system run by the library service at Oxford University called SOLO (actually Primo by Ex Libris under the bonnet)
  • The addition of holdings/availability information to resources in a Citation Helper resource list so that students (or staff) can see at a glance what is available (and where)

For the first part, we are hoping to emulate the existing functionality the Citation Helper has for Google Scholar (described in this blog post). This adds an extra button to search results in Google Scholar to import the reference into a Citation Helper resource list. However, where Google Scholar seems to push the metadata across in a reasonably arbitrary format, we instead want to enable the Citation Helper to translate any citation formatted as an OpenURL – which should mean that the Citation Helper can then import citations from any database/search interface that provides OpenURLs for results.
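To illustrate the kind of translation involved, here is a minimal Python sketch that turns an OpenURL 1.0 (Z39.88-2004) key/encoded-value query string into a flat citation. This is not the Citation Helper's code (which is part of Sakai), and the resolver address and article details are invented. It only shows that the `rft.` keys in an OpenURL map fairly directly onto citation fields.

```python
from urllib.parse import urlsplit, parse_qs


def citation_from_openurl(openurl):
    """Extract a flat citation dict from an OpenURL 1.0 KEV query string."""
    params = parse_qs(urlsplit(openurl).query)
    citation = {}
    for key, values in params.items():
        if key.startswith("rft."):
            # Referent metadata, e.g. rft.atitle becomes "atitle".
            citation[key[len("rft."):]] = values[0]
        elif key == "rft_id":
            # Identifiers such as DOIs may be repeated, so keep them all.
            citation["identifiers"] = values
    return citation


# Hypothetical resolver address; the article metadata is made up.
example = (
    "http://resolver.example.org/openurl?"
    "ctx_ver=Z39.88-2004"
    "&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal"
    "&rft.atitle=An+Example+Article"
    "&rft.jtitle=Journal+of+Examples"
    "&rft.date=2010"
    "&rft.issn=1234-5678"
    "&rft_id=info%3Adoi%2F10.1234%2Fexample"
)
print(citation_from_openurl(example))
```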

For the second part, we are planning to use the Juice framework, which in turn is built on jQuery. The Juice framework is designed to enable additional functionality, generally in library systems, using relatively simple JavaScript. Juice has two main components:

  • Metadefs
  • Extensions

Metadefs enable Juice to grab relevant pieces of metadata from a webpage. Essentially it is a way of telling Juice where specific pieces of information are stored on a page – a typical example is to define where an ISBN is stored. So we will be creating a new Metadef for the Citation Helper screens. However, rather than simply creating a metadef that just works with Citation Helper, we are intending to create a metadef that understands COinS – a way of inserting an OpenURL into an HTML ‘span’ tag.
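A real metadef would be written in JavaScript against the Juice API, which I won't try to reproduce here. As a language-neutral illustration of the idea, this Python sketch finds COinS spans (a span with class "Z3988" whose title attribute holds the OpenURL data) in a page and pulls out any ISBNs. The page fragment and ISBN are made up.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs


class CoinsParser(HTMLParser):
    """Collect the OpenURL data carried by COinS spans (class="Z3988")."""

    def __init__(self):
        super().__init__()
        self.items = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "Z3988" in classes:
            # HTMLParser has already unescaped &amp; in attribute values.
            self.items.append(parse_qs(attrs.get("title") or ""))


# Made-up page fragment containing a single COinS span.
page = (
    '<ul><li>An example book '
    '<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;'
    'rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;'
    'rft.btitle=An+Example+Book&amp;rft.isbn=9780000000002"></span>'
    '</li></ul>'
)

parser = CoinsParser()
parser.feed(page)
isbns = [item["rft.isbn"][0] for item in parser.items if "rft.isbn" in item]
print(isbns)
```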

COinS are already used by a variety of systems, including the Zotero reference management software, and the LibX library browser plugin/toolbar – so if we add COinS to the Citation Helper lists (it already supports OpenURL), not only can we use it for our own purposes, but we are also enabling these existing applications to work with Citation Helper.
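For anyone who hasn't seen one, a COinS is just an empty span with the OpenURL key/value pairs in its title attribute. Here is a minimal Python sketch of generating one from citation metadata. The function name and the example book are hypothetical, and the Citation Helper itself would do this in its own code, so this is only meant to show the shape of the markup.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(metadata, fmt="info:ofi/fmt:kev:mtx:book"):
    """Return a COinS span carrying the given referent metadata."""
    pairs = [("ctx_ver", "Z39.88-2004"), ("rft_val_fmt", fmt)]
    pairs += [("rft." + key, value) for key, value in metadata.items()]
    # URL-encode the pairs, then HTML-escape so & becomes &amp; in the attribute.
    title = escape(urlencode(pairs), quote=True)
    return '<span class="Z3988" title="{}"></span>'.format(title)


# Made-up book details, for illustration only.
print(coins_span({"btitle": "An Example Book", "isbn": "9780000000002"}))
```

Tools that understand COinS look for the "Z3988" class and decode the title attribute, so the span needs no visible content of its own.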

There has already been some work done in making Juice work with COinS as part of the VuFind metadef, so I’m hoping that it won’t be too much of a stretch to get this working with Citation Helper.

Once COinS has been added to the Citation Helper, and we have the metadef working, we can look at the Juice ‘extension’ we need to build. This will need to use metadata from the Citation Helper page – probably an identifier (or set of identifiers) such as ISBN, DOI, ISSN, etc. – and then query appropriate systems to get holdings/availability data back. Rather than build a query to each relevant system (and deal with any cross-site scripting issues that may arise) we are planning to write an additional piece of software here to mediate these requests.
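We haven't designed the mediating software yet, but its rough shape might look something like the Python sketch below: take a list of identifiers, route each to whichever back-end systems can answer for that type of identifier, and return one combined response. Everything here is hypothetical (the function names, the identifier prefixes and the canned holdings data), with the back-ends stubbed out so the sketch runs on its own.

```python
def check_catalogue(identifier):
    """Stand-in for a real catalogue lookup (canned, made-up data)."""
    holdings = {
        "isbn:9780000000002": [{"location": "Main Library", "available": True}],
    }
    return holdings.get(identifier, [])


def check_ejournals(identifier):
    """Stand-in for a real e-journal lookup (canned, made-up data)."""
    holdings = {
        "issn:1234-5678": [{"location": "Online", "available": True}],
    }
    return holdings.get(identifier, [])


# Which back-ends to ask for each identifier scheme (hypothetical routing).
BACKENDS = {
    "isbn": [check_catalogue],
    "issn": [check_ejournals],
}


def get_availability(identifiers):
    """Return availability items for each identifier, keyed by identifier."""
    results = {}
    for identifier in identifiers:
        scheme = identifier.split(":", 1)[0]
        items = []
        for backend in BACKENDS.get(scheme, []):
            items.extend(backend(identifier))
        results[identifier] = items
    return results


print(get_availability(["isbn:9780000000002", "issn:1234-5678", "doi:10.1234/x"]))
```

Putting the routing in one place means the Juice extension only has to talk to a single service on the same site, which is the point of mediating the requests.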

We hope to use a standard format for holdings/availability data, using the DLF-ILS ‘GetAvailability’ specification, and are possibly also looking at DAIA (Document Availability Information API) developed by Jakob Voss. We know Ex Libris (who provide the software for SOLO, and also the core library management system in use at Oxford) are committed to this approach (see the Ex Librian newsletter from 2009), and the DLF specification is also being used by other JISC funded projects, such as Summon4HN.

We are very interested in feedback on this approach – any issues people can spot in our approach, questions, or suggestions are very welcome – just leave a comment below.