Neil Wilson from the BL (British Library) is giving this talk.
Government has been pushing to open up data for a while. This has started to change some expectations around publishing of ‘publicly owned’ data.
BL wanted to start to develop an Open Metadata Strategy. They wanted to:
- Try to break away from library-specific formats and use more cross-domain, XML-based standards – but keep serving libraries not engaged in cutting-edge work
- Develop the new formats with communities using the metadata
- Get some form of attribution while also adopting a licensing model appropriate to the widest re-use of the metadata
- Adopt a multi-track approach to all this
So first steps were:
Develop capability to supply metadata using RDF/XML
Worked with a variety of communities and organisations etc…
Current status:
Created a new enquiry point for BL metadata issues
Signed up c. 400 organisations to the free MARC21 Z39.50 service
Worked with JISC, Talis and other linked data implementers on technical challenges, standards and licensing issues
Begun to offer sets of RDF/XML to various projects etc.
Some of the differences between traditional library metadata and Linked data
Traditional library metadata uses a self-contained, proprietary, document-based model
Linked data uses a more dynamic, data-based model to establish relationships between data
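To make that contrast concrete, here is a minimal Python sketch (mine, not from the talk). It assumes a flat record with invented fields, a placeholder `http://example.org/` base URI, and the Dublin Core terms namespace for property names; none of this reflects the BL's actual data model.

```python
# A self-contained, document-style record: every value is a literal string.
record = {
    "id": "b0001",  # hypothetical local identifier
    "title": "An Example Book",
    "author": "Doe, Jane",
}

DCT = "http://purl.org/dc/terms/"  # Dublin Core terms namespace
BASE = "http://example.org/"       # placeholder base URI, not a BL URI


def record_to_triples(rec):
    """Turn one flat record into (subject, predicate, object) statements."""
    book = BASE + "book/" + rec["id"]
    person = BASE + "person/" + rec["author"].replace(", ", "_").lower()
    return [
        (book, DCT + "title", rec["title"]),  # literal value stays a string
        (book, DCT + "creator", person),      # the author becomes a link
    ]


triples = record_to_triples(record)
for triple in triples:
    print(triple)
```

The point of the second statement is that the author is no longer a string locked inside one record but an identifier that other data can also point at.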
By migrating from traditional models libraries could begin to:
- Integrate their resources in the web
- Increase visibility, reach new users
- Offer users a richer resource discovery experience
- Move from niche, costly specialist technologies and suppliers to more ‘standard’ and widely adopted approaches
BL wanted to offer data allowing useful experimentation and advancing discussion from theory to practice. BNB (British National Bibliography) has lots of advantages – general database of published output – not just ‘BL stuff’; reasonably consistent; good identifiers.
Wanted to undertake the work as an extension of existing activities – wanted to develop local expertise, using standard hardware for conversion. Starting point was library MARC21 data. Wanted to focus on data issues, not building infrastructure, and also on linking to other resources.
First challenge – how to migrate the metadata:
Staff training in linked data – modelling concepts and increased familiarisation with RDF and XML concepts
Experience working with JISC Open Bibliography Project and others
Feedback on MARC to XML conversion
Incremental approach adopted – with several iterations around data and data model.
Wanted to place library data in a wider context and supplement or replace literal values in records. Linked to both library sites:
Dewey Info
LCSH SKOS
VIAF
but also non library sites:
GeoNames
…
Three main approaches:
Automatic Generation of URIs from elements in records (e.g. DDC)
Matching of text in records with linked data dumps – e.g. personal names to VIAF
Two stage crosswalking [? missed this]
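The first two approaches can be sketched roughly as follows. This is my own illustration, not the BL's code: the `http://dewey.info/class/` URI pattern is an assumption, the authority index and its `example.org` URI are invented, and a real index would be built from a linked data dump such as VIAF's. The third approach is not shown because the notes above missed it.

```python
import unicodedata

DEWEY_BASE = "http://dewey.info/class/"  # assumed URI pattern, for illustration


def ddc_to_uri(ddc_number):
    """Approach 1: build a URI directly from a classification number."""
    cleaned = ddc_number.strip().replace(" ", "")
    return DEWEY_BASE + cleaned + "/"


def normalise_name(name):
    """Reduce a name heading to a comparison key (case, accents, punctuation)."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    kept = "".join(c for c in no_accents if c.isalnum() or c.isspace())
    return " ".join(kept.lower().split())


def match_name(name, authority_index):
    """Approach 2: look a record's name string up in an index built from a dump.

    Returns the URI on an exact normalised match, otherwise None.
    """
    return authority_index.get(normalise_name(name))


# Hypothetical index; a real one would be built from a linked data dump.
authority_index = {
    normalise_name("Brontë, Charlotte"): "http://example.org/authority/0001",
}

print(ddc_to_uri("823.8"))
print(match_name("BRONTE, Charlotte.", authority_index))
print(match_name("Unknown, Person", authority_index))
```

Exact matching on a normalised key is deliberately conservative: it misses variant forms of a name, but it avoids linking a record to the wrong person.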
Lots of preprocessing of MARC records before tackling the transform to RDF/XML using XSLT
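As a rough picture of that two-step shape (tidy first, then transform), here is a small Python sketch. It is not the BL pipeline: it swaps Python's standard `xml.etree.ElementTree` in for XSLT, the input is a hand-made dict rather than real MARC, the subject URI is a placeholder, and the choice of Dublin Core properties is my assumption.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCT_NS = "http://purl.org/dc/terms/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dct", DCT_NS)


def preprocess(fields):
    """Tidy raw field values before the transform: trim whitespace and
    drop the trailing ISBD-style punctuation common in MARC data."""
    return {tag: value.strip().rstrip(" /:;,.") for tag, value in fields.items()}


def to_rdfxml(subject_uri, fields):
    """Serialise a few cleaned fields as a single RDF/XML description."""
    root = ET.Element("{%s}RDF" % RDF_NS)
    desc = ET.SubElement(
        root, "{%s}Description" % RDF_NS, {"{%s}about" % RDF_NS: subject_uri}
    )
    if "245a" in fields:  # MARC 245 $a: title
        ET.SubElement(desc, "{%s}title" % DCT_NS).text = fields["245a"]
    if "260c" in fields:  # MARC 260 $c: date of publication
        ET.SubElement(desc, "{%s}issued" % DCT_NS).text = fields["260c"]
    return ET.tostring(root, encoding="unicode")


raw = {"245a": "Jane Eyre /", "260c": "1847."}
clean = preprocess(raw)
print(to_rdfxml("http://example.org/book/b0001", clean))  # placeholder URI
```

Doing the clean-up before the transform keeps the mapping step simple, which is presumably part of why so much preprocessing was needed.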
Can see the data model at http://www.bl.uk/bibliographic/pdfs/british_library_data_model_v1-00.pdf and more information – http://www.bl.uk/bibliographic/datafree.html
Next steps:
Staged release over coming months for books, serials, multi-parts
Monthly updates [I think?]
New data sets being thought about
Lessons learnt…
It is a new way of thinking – legacy data wasn’t designed for this purpose
There are many opinions out there, but few real certainties – your opinion may well be as valid as anyone else’s – especially when it’s your data
Don’t reinvent the wheel – there are tools and experience you can use – start simple and develop in line with evolving staff expertise
Reality check by offering samples for feedback to wider groups
Be prepared for some technical criticism in addition to positive feedback and improve in response
Conversion inevitably identifies hidden data issues – and creates new ones
But better to release an imperfect something than a perfect nothing
There is a steep learning curve – but look for training opportunities for staff and develop skills; cultivate a culture of enquiry and innovation among staff to widen perspectives on new possibilities
It’s never going to be perfect first time – we expect to make mistakes – have to make sure we learn from them and ensure that everyone benefits from the experience. So if anyone is thinking of undertaking a similar journey – Just do it!
Q: How much of the pipeline will you ‘open source’?
A: Quite a few of the tools are ‘off the shelf’ (not clear if open source or not?). The BL written utilities could be released in theory – but would need work (not compiled with Open Source compilers at the moment) – so will be looked at…