Finally in this set of three ‘perspectives’ sessions: Ed Chamberlain from Cambridge University Library.
Why expose bibliographic data?
- Natural follow on from philosophy of ‘meeting reader in their (online) place’
- Already exposing data to others (OCLC, COPAC, SUNCAT etc.) – lots of work to set up each agreement and export – an Open Data approach might offer an easier way of doing this
- Offer value for money (for taxpayer)
- Internal academic pressure – ‘we are being asked for data’
e.g. Rufus Pollock wanted to do an analysis of the size and growth of the public domain using CUL bibliographic data (http://rufuspollock.org/tags/eupd)
The COMET (Cambridge Open METadata) project will be releasing large amounts of bibliographic data under an Open Data Commons License. Formats will include MARC21 and RDF – partnering with OCLC so linking into related services such as FAST and VIAF.
Ed thinks the library sector should have the following ambitions around resource discovery:
- Hope to see ‘long tail’ effect – exposing data to large audience
- ‘Out of domain’ discovery
- Multiple points of discovery at multiple levels for multiple audiences
- Services for Undergraduates, for academics AND for developers
Practicalities/Challenges:
- Licensing
- While individual records may not be protected by copyright, collections of records may be – and records are often obtained by the library from shared catalogue resources or commercial suppliers under contract
- Ideal is full unrestricted access
- Better to publish as much data as you can, even if it is necessary to attach more restrictive licensing
- RDF vocab and mappings – no standard
- Triplestores – for managing RDF – but new technology, seems complex
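To make the last two points a little more concrete, here is a minimal sketch (not from the talk, and not the COMET project's actual mapping) of what mapping a few MARC21 fields to RDF triples and then querying them might look like. The choice of Dublin Core Terms properties, the example.org record URI and the function names are all assumptions made for illustration; since there is no standard mapping, another library could reasonably choose differently. The list of tuples stands in for a triplestore, which does the same kind of pattern matching at scale.

```python
# Illustrative only: a hypothetical mapping from MARC21 (tag, subfield)
# pairs to Dublin Core Terms properties. Not a standard, not COMET's mapping.
DCT = "http://purl.org/dc/terms/"
MARC_TO_RDF = {
    ("245", "a"): DCT + "title",    # title statement
    ("100", "a"): DCT + "creator",  # main entry, personal name
    ("260", "c"): DCT + "issued",   # date of publication
}


def record_to_triples(subject_uri, fields):
    """Convert {(tag, subfield): value} into (subject, predicate, object) triples.

    Fields with no entry in the mapping are silently skipped.
    """
    triples = []
    for key, value in fields.items():
        predicate = MARC_TO_RDF.get(key)
        if predicate is not None:
            triples.append((subject_uri, predicate, value))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# A made-up record; the URI is a placeholder.
record = {
    ("245", "a"): "An example title",
    ("100", "a"): "Example, Author",
    ("260", "c"): "1999",
    ("300", "a"): "200 p.",  # physical description: unmapped, so dropped
}
store = record_to_triples("http://example.org/bib/1", record)
titles = match(store, p=DCT + "title")
```

Here `store` holds one triple per mapped field, and `match` is a toy version of the pattern queries a real triplestore would answer with SPARQL. The unmapped 300 field shows how anything without an agreed vocabulary term simply falls out of the published data.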
Opportunities:
- Strong platform for future development
- Linked formats and open licenses are a virtuous pairing
- Huge scope for back office benefits
Need to also think beyond bibliography – what about holdings? libraries (physical locations)? librarians as linked data (!) (finding people with specialisms etc.)