PTFS Europe, Koha and Evergreen

Nick Dimant starts off by covering the growth in the PTFS Europe customer base – now including East Dunbartonshire Council in Scotland, who are to be the first Evergreen installation for PTFS Europe in the UK.

Growth in the customer base has meant PTFS Europe have had to expand their staff at the same time (quick run-down of their current staff).

Why both Koha and Evergreen? Koha meets the needs of a large number of libraries, while Evergreen is focussed on consortia.
Nick outlining how library automation has changed over recent years:

  • Growth in web tech
  • Increase in self-service
  • Acquisitions via EDI
  • Catalogue records imported
  • Serials mostly electronic
  • etc.

At the same time:

  • Growth in library consortia
  • Growth in shared services

These two (Nick suggests conflicting) trends are served by the two different systems.

PTFS Europe charge a daily rate for services as needed, and charge for hosting where desired (which is what most customers opt for).
To give broad idea of costs, for customers in 2011:

  • Y1 costs – £8k-£60k per org
  • Annual hosting and maintenance have ranged from £3k-£20k
  • Quotes on request

Nick says that typically the annual maintenance on the old system pays for a complete migration, with significant savings on hosting and maintenance thereafter. Nick stresses it is not just about cost, but also about good, modern library systems.

  • Hosting – PTFS Europe use Bytemark for hosting.
  • Support – help desk; bug fixes; upgrades – all part of annual service
  • Most customers don’t have custom software development done, but PTFS Europe do offer this where required

Evolution in the UK – enhanced functionality:

  • EDIFACT EDI
  • SIP2 Enhancements
  • Circulation and Acquisitions enhancements

PTFS Europe have formed partnerships with various organisations – work together with other Koha users/customers – enhancements going back into Koha code base.

Future plans for PTFS Europe – development of ILL, possibility of supporting other software (to be discussed later today).

Q: What is the relationship with PTFS Inc.?
A: Work with them where it makes sense, but PTFS Europe are committed to the community version of Koha, in contrast to the approach of PTFS Inc./LibLime in the US

Q: What is PTFS Europe's attitude to the FE market – what knowledge do they have of those needs etc.?
A: Price can be an issue – FE colleges are used to paying for low-cost library systems, which makes it difficult for PTFS Europe to compete. This led to some discussion – the feeling was that this isn’t always true, and some FE institutions pay much larger amounts – especially where there are consortia etc.

Open Source Library Management Systems

Today I’m at a PTFS Europe event looking at Open Source library (management) systems. PTFS Europe obviously have an interest here as they sell support for the Koha open source LMS, and it seems that they are also considering supporting other library systems – such as VuFind (for search/discovery), CUFTS (for electronic resource management), a reading list system etc.

The day is opening with a talk from Ken Chad. Ken notes we are in a period of continuous, disruptive change. Ken says libraries must compete with all kinds of other providers such as Google, Amazon, Wikipedia, LibraryThing, academia etc. The point, Ken says, is that users can go elsewhere to fulfil their information/library needs.

Ken going to talk about elements of strategy, business case and how these things fit together.

Ken starts by saying that strategy is not vision or mission – but how we pursue these. Universities could have the same or similar visions but very different strategies. Elements of strategy are:

  • objective – single precise objective that will drive the org for the next 3-5 years. While ‘single’ may seem limiting, a single objective can often require many other things to happen, but honing your aims down to a single precise objective can be hugely helpful.
  • scope – identifying what the organisation will not address
  • advantage – this is the most critical aspect in developing an effective strategy statement – it means really understanding the value that the organisation brings to the customer
Ken describes the ‘strategic sweet spot’ – intersection between library’s capabilities and customer’s needs, where no competitors can serve the needs.
One approach – answer the question: what are your capabilities?
  • what is your ‘essential advantage’ – an ingrained ability to succeed … sustained over time and is almost impossible to copy
  • ‘way to play’ – means a considered approach for creating and capturing value in a particular market – what are the things that set the organization apart from competitors
Some business case building blocks:
  • Summary – demonstration of strategic fit
  • description of proposal
  • market analysis
  • options
  • cost
  • resource requirements
  • risk assessment
  • project implementation and review
Common errors:
  • over-optimistic projections – too much to be delivered too soon
  • inadequate market research
  • underestimation of the resource requirements – particularly support
  • insufficient attention to full economic cost
  • writing the plan to give the answer you want – it doesn’t necessarily convince others
What factors can open source contribute to a strategy? In particular how will it make us more competitive and increase the value of our offering?
What makes OSS attractive? Surveys consistently highlight points such as:
  • lower costs
  • superior security
  • avoid vendor lock-in
Low cost is seen as the most attractive aspect. Need to look at ‘total cost of ownership’ (TCO) – all the other factors that add cost above the purchase price.
Freedom from vendor lock-in – open source software does not ‘go out of print’ – the source is available to all. If you don’t like the support you get you can go elsewhere or do it yourself. While not always true, open source tends to be based on open standards – there is no ‘vendor’ interest in locking you into a proprietary standard.
Community – collaborative efforts to build open source applications can produce software that better meets the needs of partner institutions. Belief is largely founded on hope of overcoming the historical disconnect between producers of software & HE users – “who have complex, unique and poorly understood needs” (from report by Paul Courant and one other – didn’t get the details)
Options and flexibility – OSS can offer these to orgs
“How to choose a Free and Open Source Integrated Library System” – from Tristan Muller of the Fondation pour une Bibliotheque Globale, Quebec

Spotlight on Names

A few people have been kind enough to test out my Composed bookmarklet and give some feedback (here on Google+ amongst other places). A couple of people identified composers on COPAC for which my bookmarklet didn’t produce any information, and when I checked this was because the underlying data I’m using, the MusicNet Codex, didn’t have a record for those examples.

This reinforced an idea that keeps coming back to me as I think about Library data and Linked Data – which is that we need good ways of capturing and then re-expressing in our data feedback from consumers/users of the data. In the case of the Composed bookmarklet it seems sensible to have a way to allow people to say at least:

  • “This record contains a composer [name or identifier] you don’t have in your data”
  • “For this record the bookmarklet displays a composer not mentioned in the record”

This triggered some further thoughts that a bookmarklet could be a nice way of generally allowing those interested (librarians or others) to add information to the record – specifically structured information and identifiers for people (and perhaps some other entities). Then, discussing the #discodev competition with Mathieu D’Aquin at the Open University today, he mentioned the DBpedia ‘Spotlight’ tool, which does entity extraction and gives back identifiers from DBpedia.

So how about a bookmarklet which:

  • Grabs the ‘people’ fields (100, 700, 600 – others?)
  • Passes the contents to Spotlight and gets back possible DBpedia matches
  • Links to VIAF (think this is possible where VIAF has a Wikipedia URI) (possibly do this after decision made below)
  • Allows the user to confirm or reject the suggestions – if they confirm allows them to state a relationship as defined by MARC relators (available as Linked Data at http://id.loc.gov/vocabulary/relators.html)
  • Posts a triple expressing a link between the catalogue record, the relator, and the DBpedia URI and/or VIAF URI

This could then be harvested back by libraries or others to get more expressive linked data relating bibliographic entities to people entities with a meaningful relationship. I haven’t looked at this in detail – but I don’t think it would be very difficult – my guess is just a few hours work.
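
As a rough sketch of how the Spotlight step and the final triple might fit together (everything below is an assumption on my part – the Spotlight endpoint and parameters, the example heading, the record URI and the choice of relator are illustrative, not a finished design):

```python
import json
import urllib.parse
import urllib.request

from rdflib import Graph, URIRef

# Hypothetical heading text, as pulled from a 100/700/600 field of a record
heading_text = "Handel, George Frideric, 1685-1759"

# Assumed DBpedia Spotlight REST 'annotate' endpoint, asked to return JSON
params = urllib.parse.urlencode({"text": heading_text, "confidence": 0.4})
req = urllib.request.Request(
    "http://spotlight.dbpedia.org/rest/annotate?" + params,
    headers={"Accept": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    candidates = json.loads(resp.read()).get("Resources", [])

# Each candidate carries a DBpedia URI; the user would confirm or reject these
for c in candidates:
    print(c["@URI"], c["@surfaceForm"])

# Once confirmed, express the link as a triple using a MARC relator term
# ('cmp' = composer); the record URI below is made up for illustration
if candidates:
    g = Graph()
    record = URIRef("http://copac.ac.uk/record/EXAMPLE")
    relator = URIRef("http://id.loc.gov/vocabulary/relators/cmp")
    g.add((record, relator, URIRef(candidates[0]["@URI"])))
    print(g.serialize(format="turtle"))
```

The confirm/reject step, and where the resulting triples get posted and stored, is the part that would need real interface work; the extraction and lookup themselves look straightforward.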

I think this also starts to address another issue that always comes up when discussing libraries and linked data: how might linked data become part of the metadata creation process in libraries? Since this approach relies on an existing record it doesn’t really get there – but if libraries are going to exploit linked data successfully, we need to play around with interfaces that help us integrate linked data into our data as it is created.

Compose yourself

‘Composed’ is my entry for the UK Discovery #discodev developer competition. Composed helps you link between information about composers of classical music by exploiting the MusicNet Codex.

Specifically, Composed currently enables linking from COPAC catalogue records mentioning a composer to other information and records about that composer.

What is it and how do I install it?
Composed is a Bookmarklet which you can install by dragging this link to your browser bookmark bar: Composed.

How do I use it?
If you haven’t already, drag this Composed bookmarklet to your browser bookmark bar (or otherwise add to your bookmarks). Next find a record on COPAC which mentions a classical composer –  such as this one, and once you are viewing it in your browser, click the bookmarklet.

Assuming it is all working, you should find that the display is enhanced by the addition of some links on the right hand side of the screen. If you’ve used the example given above you should see:

  • An image of the composer (Handel) [UPDATE 5th August 2011: See my comment below about problems with displaying images]
  • Links to:
    • Records in the British Library catalogue
    • More records in COPAC
    • The entry for Handel in the Grove Dictionary of Music (subscribers only)
    • Record in RISM (a music and music literature database)
    • A page about Handel on the BBC
    • A page about Handel in the IMDB
    • The Wikipedia page for Handel

These are based on the information available for Handel from the MusicNet Codex and the BBC.

Example COPAC page enhanced by 'Composed'

If the record you are looking at contains multiple composers, you should see multiple entries in the display. If there are no links available for the record you are viewing, you should see a message to this effect.

How does it work?
The mechanics are pretty simple. The bookmarklet itself is a simple piece of javascript, which calls another script. This script finds the COPAC Record ID from the COPAC record currently in the browser. This ID is passed to some php which uses a database I built specifically for this purpose to match a COPAC record ID to one or more MusicNet Codex URIs. For each MusicNet Codex URI retrieved, the script requests information from the URI, gets back some RDF, from which it extracts the links that are passed back to the javascript, which inserts them into the COPAC display. If the MusicNet RDF contains a link to the BBC, further RDF is grabbed from the BBC and the relevant information is added into the data passed back to the display.
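
Purely to illustrate that flow (the real thing is the PHP script described above, using Graphite), here is a minimal re-sketch in Python – the database table and column names, and the ‘collect every URI-valued object’ shortcut, are my assumptions rather than the actual implementation:

```python
import sqlite3

from rdflib import Graph, URIRef

def links_for_copac_record(copac_id, db_path="musicnet_mapping.db"):
    """Look up MusicNet Codex URIs for a COPAC record ID and collect links.
    The table and column names here are assumptions, not the real schema."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT musicnet_uri FROM copac_to_musicnet WHERE copac_id = ?",
        (copac_id,),
    ).fetchall()

    links = []
    for (musicnet_uri,) in rows:
        g = Graph()
        g.parse(musicnet_uri)  # dereference the MusicNet Codex URI and parse the RDF
        # Shortcut: treat every URI-valued object as a candidate link; the real
        # script picks out specific properties (BL, COPAC, Grove, RISM, BBC...)
        # and, where there is a BBC link, fetches further RDF for image/detail
        for _, _, obj in g:
            if isinstance(obj, URIRef):
                links.append(str(obj))
    return links

print(links_for_copac_record("EXAMPLE_COPAC_ID"))
```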

So what were the challenges?

Challenge 1

Manipulating RDF. Although I’ve done quite a bit of work with RDF in one form or another, I’ve never actually written scripts that consume it – so this was new to me. I ended up using php because the Graphite RDF parser, written by Chris Gutteridge at the University of Southampton, made it so easy to retrieve the RDF and grab the information I needed – although it took me a little while to get my head around navigating a graph rather than a tree (being pretty used to parsing xml).

So I guess I owe Chris a pint for that one 🙂

Challenge 2

The major challenge was getting a list of COPAC record IDs which mapped to MusicNet Codex URIs. Actually – I wasn’t able to do this and what I have is an approximation – almost certainly you can find examples where the bookmarklet populates the screen with a composer when there is no composer mentioned in the record.

Unfortunately MusicNet is unable to point at a COPAC identifier or URI for a person – like many library catalogues, COPAC identifies items in libraries (or perhaps more accurately, the records that describe those items), but not any of the entities (people, places, etc.) within the record. This means that while MusicNet can point at a specific BBC URI that represents (for example) ‘Handel’, with COPAC all it can do is give a URL which triggers a search that should bring back a set of records, all of which mention Handel.

There is a whole load of background as to how MusicNet got to this point, and how they build the searches against COPAC – but essentially it is based on the text strings in COPAC that were found by the MusicNet project to refer to the same composer. These text strings are what are used to build the search against COPAC. This is also the explanation as to why you sometimes see multiple links to COPAC/the British Library catalogue in the Composed bookmarklet display – because there are multiple strings that MusicNet found to represent the same composer.

What I’ve done to create a rough mapping between MusicNet and COPAC records is to run each search that MusicNet defines for COPAC and grab all the record ids in the resultant record set. This gives a rough and ready mapping, but there are bound to be plenty of errors in there. For example one of the  searches MusicNet holds for the composer Franz Schubert on the British Library catalogue is http://catalogue.bl.uk/F/?func=find-b&request=Schubert&find_code=WNA – which will actually find everything by anyone called ‘Schubert’ – if there are any similar searches in the COPAC data I’ll be grabbing a lot of irrelevant records in my searching. Since the number of searches, and resultant records, is relatively high (e.g. over 30k records mention Mozart), at the time of writing I’m still in the process of populating my mapping – it is currently listing around 50k [Update: 31/7/2011 at 15:33 – final total is 601,286] COPAC IDs, but I’ll add more as my searches run and produce results in the background.
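
For what it’s worth, the mapping build boils down to a loop like the one below – the search-running helper is a placeholder stub, because the details depend entirely on COPAC’s search interface, and the table is the same hypothetical one used in the earlier sketch:

```python
import sqlite3

def record_ids_for_search(search_url):
    """Placeholder stub – run one of the COPAC searches MusicNet holds and
    return the record IDs in the result set (details depend on COPAC)."""
    return []

def build_mapping(musicnet_searches, db_path="musicnet_mapping.db"):
    """musicnet_searches: iterable of (musicnet_uri, copac_search_url) pairs."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS copac_to_musicnet (copac_id TEXT, musicnet_uri TEXT)"
    )
    for musicnet_uri, search_url in musicnet_searches:
        # Every record returned by the search gets mapped to this composer's
        # MusicNet URI – hence the rough-and-ready nature of the mapping
        for copac_id in record_ids_for_search(search_url):
            conn.execute(
                "INSERT INTO copac_to_musicnet VALUES (?, ?)",
                (copac_id, musicnet_uri),
            )
    conn.commit()
```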

I’m talking to the MusicNet team to see if they are able at this stage to track back to the original COPAC records they used to derive their data, and so we could get an exact mapping of their URIs to lists of record IDs on COPAC – this would be incredibly useful and allow functions such as mine to work much more reliably.

None of this should be seen as a criticism of either the MusicNet or COPAC teams – without these sources I wouldn’t have even been able to get started on this!

Final Thoughts

I hope this shows how data linked across multiple sources can bring together information that would otherwise be separated. There is no reason in theory why the bookmarklet shouldn’t be expanded to take in the other data sources MusicNet knows about – and possibly beyond (as long as there is access to an ID that can eventually be brought back to MusicNet).

Libraries desperately need to move beyond ‘the record’ as the way they think about their data – and start adding the identifiers they already have available to their data – this would make this type of task much easier.

If you want to build other functionality on my rough and ready MusicNet to COPAC record mapping, you can pass COPAC IDs to the script:

http://www.meanboyfriend.com/composed/composed.php?copacid=<copac_record_id>

You’ll get back some JSON containing information about one or more composers with a Name, Links, and an Image if the BBC have a record of an image in their data.
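
For example, something along these lines – note that the JSON key names here are my guess at the shape of the response rather than a documented contract:

```python
import json
import urllib.request

copac_id = "EXAMPLE_COPAC_ID"  # substitute a real COPAC record ID
url = "http://www.meanboyfriend.com/composed/composed.php?copacid=" + copac_id

with urllib.request.urlopen(url) as resp:
    composers = json.loads(resp.read())

# Guessing at the shape: a list of composers, each with a name, some links,
# and possibly an image URL taken from the BBC data
for composer in composers:
    print(composer.get("Name"))
    for link in composer.get("Links", []):
        print("  ", link)
```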

Discovering Discovery

As I mentioned in a recent post I’ve been involved in UK Discovery (http://discovery.ac.uk) – an initiative to enable resource discovery through the publication and aggregation of metadata according to simple, open, principles.

Discovery is currently running a Developer competition. Others have already blogged the competition, but what I wanted to do here was note the reasons for running the competition, capture some ideas that I’ve had, and hopefully inspire others to enter the competition (as I hope to myself).

Firstly – why the developer competition? For me I hope we can achieve three things through the competition:

  1. Engage developers in/get them excited about Discovery
  2. Get feedback from developers on what works for them in terms of building on Discovery
  3. Start building a set of examples of what can be achieved in the Discovery ecosystem

If we achieve any of these I’ll be pretty happy. We are still in the early days of building an environment of open (meta)data for libraries, archives and museums, but the 10 data sets we are featuring in the competition provide good examples of the type of data we hope will be published with the encouragement and advice of the Discovery initiative.

On to ideas. The list below is basically just me brainstorming – my hope is that others might be inspired by one of the ideas, or others might contribute more ideas via the comments. (I’ve already picked one of the ideas below that I’m going to try and turn into an entry of my own – but for the purposes of dramatic tension, I won’t reveal this until the end of the post!)

  • Linked Library Catalogue. Rather than having a catalogue made up of MARC (or other format of choice) records, simply have a list of URIs which point to the bibliographic entities on the web. Build an OPAC on top of this list by crawling the URIs for metadata and indexing locally (e.g. with Solr) – see the sketch after this list. Could use the Cambridge University Library, Jerome and BNB featured datasets as well as other bibliographic information on the web.
  • What’s hot in research? Use the Mosaic Activity Data, the OpenURL Router data and other relevant data (e.g. from research publication repositories) to look at trends in research areas. Possibly mash up with Museum/Archive data to highlight relevant collections to the research community based on the current ‘hot topics’?
  • Composer Bookmarklet. Use the MusicNet Codex to power a bookmarklet that when installed and used would link from relevant pages/records in COPAC/BL/RISM/Grove/BBC/DbPedia/MusicBrainz to other sources. Focus on providing links from library catalogue records to other relevant sources (like recordings/BBC programmes)
  • Heritage Britain. Map various cultural heritage items/collections onto a map of Britain. Out of the featured datasets English Heritage data is the obvious starting point, but could include data from Archives Hub, National Archives Flickr collection, and the Tyne and Wear Museums data.
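
To make the first idea a little more concrete, the crawl-and-index loop might look something like this sketch – the title predicates, the Solr core name and the update URL are all assumptions; it is just the general shape of ‘dereference URIs, pull out literals, index them’:

```python
import json
import urllib.request

from rdflib import Graph
from rdflib.namespace import DC, DCTERMS

SOLR_UPDATE = "http://localhost:8983/solr/catalogue/update/json?commit=true"  # assumed local core

def index_bib_uris(uris):
    docs = []
    for uri in uris:
        g = Graph()
        g.parse(uri)  # dereference the bibliographic URI and parse whatever RDF comes back
        # Take any dcterms:title / dc:title literal as the indexable title
        titles = [str(o) for o in g.objects(None, DCTERMS.title)]
        titles += [str(o) for o in g.objects(None, DC.title)]
        if titles:
            docs.append({"id": uri, "title": titles[0]})
    # Push the documents to Solr's JSON update handler
    req = urllib.request.Request(
        SOLR_UPDATE,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

index_bib_uris(["http://bnb.data.bl.uk/id/resource/EXAMPLE"])  # hypothetical URI
```

Error handling, crawling politeness and richer field extraction would obviously be needed for anything real.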

Remember that although entries have to use data from one of the featured data sets (I’ve mentioned them all here), you can use whatever other data you like…

If you’ve got ideas (perhaps especially if you aren’t in a position to develop them yourself) that you think would be great demonstrations or just really useful, feel free to blog yourself, or comment here.

And the one I’m hoping to take forward? The Composer Bookmarklet – I’ll blog progress here if/when I make any (although don’t let that stop you if you want to develop one as well!)


Linked Data and Libraries: The record is dead

Rob Styles from Talis…
First of all Rob says – the record is not dead – it should be – it should have been dead 10 years ago, but it isn’t [comment from the floor – ‘it never will be’]

Rob says that what is missing from the record is relationships – this is the power you want to exploit.

If you think this stuff is complicated try reading the MARC Manuals and AACR2… 🙂 Linked Data is different but not complicated

MARC is really really good – but really really outdated – why 245$$a and not ‘Title’? Is it because ‘245’ is universally understandable? NO! It’s because when computing power and space were expensive they could only afford 3 characters – that’s why you end up with codes like 245 rather than ‘Title’.

If you look at the use of labels and language in linked data you’ll find that natural language is often used [note that this isn’t the case for lots of the library-created ontologies 🙁 ]

What is an ‘identifier’? How are they used in real life? They are links between things and information about things – think of a number in a catalogue – we don’t talk about the identifiers, we just use them when we need to get a thing, or information about the thing (e.g. Argos catalogue)

Call numbers are like URLs ….

Rob’s Law: “If you wish to link to things, use a link”

Rob pointing up the issues around the way libraries work with subject headings and ‘aboutness’. Libraries talk about ‘concepts’ rather than ‘things’ – a book is about ‘Paris’, not about ‘the concept of Paris’. Says we need to get away from the abstractness of concepts. He sees the use of SKOS as a reasonable thing to do to get stuff out there – but hopes it is a temporary workaround which will get fixed.

MARC fails us by limiting the amount of data that can be recorded – only 999 fields… we need approaches that allow more flexibility, and the ability to extend where necessary.

FRBR – introduces another set of artificial vocabulary – this isn’t how people speak, it isn’t the language they use.

We need to model data and use vocabulary so it can connect to things that people actually talk about – as directly as we realistically can. Rob praises the BL modelling on this front.

How do you get from having a library catalogue, to having a linked data graph… Rob says 3 steps:

  • training – get everyone on the team on the same page, speaking the same language
  • workshop – spend a couple of days away from the day job – looking at what others are doing, identifying what you want to do, what you can do – until you have a scope of something feasible and worthwhile
  • mentoring and oversight – keep team going, make sure they can ask questions, discuss etc.

Q & A:
Mike Taylor from IndexData asks – how many ‘MARC to RDF’ pipelines have been built by people in this room? Four or five in the room.
Rob says – lots of experimentation at this stage… this is good… but not sure if we will see this come together – but production level stuff is different to experimental stuff.

?? argues we shouldn’t drop ‘conceptual’ entities just because we start representing ‘real world’ things
Rob says ‘subject’ is a relationship – this is how it should be represented.
Seems to be agreement between them that conceptual sometimes useful – but that the more specific stuff is generally more useful… [I think – not sure I understood the argument completely]

Linked Data and Libraries: Creating a Linked Data version of the BNB

Neil Wilson from the BL doing this talk.

Government has been pushing to open up data for a while. This has started to change some expectations around publishing of ‘publicly owned’ data.

BL wanted to start to develop an Open Metadata Strategy. They wanted to:

  • Try and break away from library-specific formats and use more cross-domain, XML-based standards – but keep serving libraries not engaged in cutting-edge stuff
  • Develop the new formats with communities using the metadata
  • Get some form of attribution while also adopting a licensing model appropriate to the widest re-use of the metadata
  • Adopt a multi-track approach to all this

So first steps were:
Develop the capability to supply metadata using RDF/XML
Worked with a variety of communities and organisations etc.

Current status:
Created a new enquiry point for BL metadata issues
Signed up c.400 orgs to the free MARC21 Z39.50 service
Worked with JISC, Talis and other linked data implementers on technical challenges, standards and licensing issues
Begun to offer sets of RDF/XML to various projects etc.

Some of the differences between traditional library metadata and Linked data
Traditional library metadata uses a self-contained, proprietary, document-based model
Linked data uses a more dynamic, data-based model to establish relationships between data

By migrating from traditional models libraries could begin to:

  • Integrate their resources in the web
  • Increase visibility, reach new users
  • Offer users a richer resource discovery experience
  • Move from niche, costly, specialist technologies and suppliers to more ‘standard’ and widely adopted approaches

BL wanted to offer data allowing useful experimentation and advancing discussion from theory to practice. BNB (British National Bibliography) has lots of advantages – general database of published output – not just ‘BL stuff’; reasonably consistent; good identifiers.

Wanted to undertake the work as an extension of existing activities – wanted to develop local expertise, using standard hardware for the conversion. The starting point was the Library’s MARC21 data. Wanted to focus on data issues, not building infrastructure, and also on linking to stuff.

First challenge – how to migrate the metadata:
Staff training in linked data – modelling concepts and increased familiarisation with RDF and XML concepts
Experience working with JISC Open Bibliography Project and others
Feedback on MARC to XML conversion

Incremental approach adopted – with several iterations around the data and data model.

Wanted to place library data in a wider context and supplement or replace literal values in records. Linked to both library sites:
Dewey Info
LCSH SKOS
VIAF

but also non library sites:
GeoNames

Three main approaches:
Automatic Generation of URIs from elements in records (e.g. DDC)
Matching of text in records with linked data dumps – e.g. personal names to VIAF
Two stage crosswalking [? missed this]

Lots of preprocessing of MARC records before tackling the transform to RDF/XML using XSLT
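
As a toy illustration of the first of those approaches (generating URIs from record elements), something like the following would build Dewey links from the 082 field – the URI patterns and the use of pymarc are my guesses for illustration, not the BL’s actual pipeline, which as described uses preprocessing plus XSLT:

```python
from pymarc import MARCReader
from rdflib import Graph, URIRef
from rdflib.namespace import DCTERMS

g = Graph()
with open("bnb_sample.mrc", "rb") as fh:  # hypothetical sample of BNB MARC21 records
    for record in MARCReader(fh):
        if record is None:
            continue
        ids = record.get_fields("001")    # control number
        ddcs = record.get_fields("082")   # Dewey Decimal Classification
        if not ids or not ddcs or not ddcs[0].get_subfields("a"):
            continue
        # Assumed URI patterns: a record URI built from the control number, and
        # a dewey.info class URI generated from the 082 $a value
        rec_uri = URIRef("http://bnb.data.bl.uk/id/resource/" + ids[0].value())
        ddc_uri = URIRef("http://dewey.info/class/%s/" % ddcs[0].get_subfields("a")[0].strip())
        g.add((rec_uri, DCTERMS.subject, ddc_uri))

print(g.serialize(format="turtle"))
```

The second approach (matching personal name strings to VIAF) would follow the same pattern, but with name normalisation and lookup against a VIAF dump instead of direct URI construction.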

Can see the data model at http://www.bl.uk/bibliographic/pdfs/british_library_data_model_v1-00.pdf and more information – http://www.bl.uk/bibliographic/datafree.html

Next steps:
Staged release over coming months for books, serials, multi-parts
Monthly updates [I think?]
New data sets being thought about

Lessons learnt…
It is a new way of thinking – legacy data wasn’t designed for this purpose
There are many opinions out there, but few real certainties – your opinion may well be as valid as anyone else’s – especially when it’s your data
Don’t reinvent the wheel – there are tools and experience you can use – start simple and develop in line with evolving staff expertise
Reality check by offering samples for feedback to wider groups
Be prepared for some technical criticism in addition to positive feedback and improve in response
Conversion inevitably identifies hidden data issues – and creates new ones
But better to release an imperfect something than a perfect nothing

There is a steep learning curve – but look for training opportunities for staff and develop skills; Cultivate a culture of enquiry and innovation among staff to widen perspectives on new possibilities

It’s never going to be perfect first time – we expect to make mistakes – we have to make sure we learn from them and ensure that everyone benefits from the experience. So if anyone is thinking of undertaking a similar journey – Just do it!

Q: How much of the pipeline will you ‘open source’?
A: Quite a few of the tools are ‘off the shelf’ (not clear if open source or not?). The BL-written utilities could be released in theory – but would need work (they are not compiled with open source compilers at the moment) – so this will be looked at…

Linked Data and Libraries: W3C Library Linked Data Group

This talk by Antoine Isaac…
Also see http://www.w3.org/2005/Incubator/lld/wiki
W3C set up a working group on library linked data – coming to an end now. Mission of the group = help increase global interoperability of library data on the web…

Wanted to see more library data in the Linked Data cloud and also to start to put together the ‘technological bits and pieces’ – things like:

  • Vocabularies/schemas
  • Web services
  • Semantic Web search engines
  • Ontology editors

But – need for a map of the landscape – specifically for library sector…
And need to answer some of the questions librarians or library decision makers have, like:

  • What does it have to do with bibliography?
  • Does it make life better for patrons?
  • Is it practical?
  • etc.

About to report have got:
Use cases – grouped into 8 topical clusters – bib data; vocab alignment; citations; digital objects; social; new users

Available data:

  • Datasets
  • Value vocabularies (lists of stuff – like LCSH)
  • Element sets (Ontologies)

See http://ckan.net/group/lld

Finally and most important deliverable:
High level report – intended for a general library audience: decision makers, developers, metadata librarians, etc. Tries to expand on general benefits, issues and recommendations. Includes:

Relevant technologies
Implementation challenges and barriers to adoption

Still got a chance to comment:
http://blogs.ukoln.ac.uk/w3clld/

For the future Antoine says discussions and collaboration should continue – existing groups within libraries or with wider scope – IFLA Semantic Web special interest group; LOD-LAM

Possibility of creating a new W3C Community group…

We need a long-term effort – many of the issues are not technical

Comment from floor – also see http://reddit.com/r/librarylinkeddata

Linked Data and Libraries: OpenAIRE

OpenAIRE is a European funded project … 38 partners

Want to link project data (in institution, CRIS) to publications resulting from those projects. Data sources – Institutional Repositories – using OpenDOAR as authority on this.

Various areas needed vocabularies – some very specific, like ‘FP7 Area’, some quite general, like ‘Author’ (of a paper)

Various issues capturing research output from different domains:

  • Different responsibilities and tasks
  • Different metadata formats
  • Different exchange interfaces and protocols
  • Different levels of granularity

In the CRIS domain:
Covers research process
run by admin depts
broader view on research info
diverse data models – e.g. CERIF(-like) models; DDF-MXD, METIS, PURE – and some internal formats

In OAR domain:
Covers research publications
Uses Dublin Core

Interoperability between CRIS and OAR

Working group within the ‘quadrilateral Knowledge Exchange Initiative’ (involving SURF, JISC, DFG, DEFF) – aiming to increase interoperability between the CRIS and OAR domains – they want to define a lightweight metadata exchange format.

… sorry – suffering my usual post-lunch dip and distraction – didn’t get half of what I could have here 🙁