Following on from my previous post about BNB and SPARQL in this post I’m going to describe briefly building a Chrome browser extension that uses the SPARQL query described in that post – which given a VIAF URI for an author tries to find authors with the same birth year (i.e. contemporaries of the given author).
Why this particular query? I like it because it exposes data created and stored by libraries that wouldn’t normally be easy to query – the ‘birth year’ for people is usually treated as a field for display, but not for querying. The author dates are also interesting in that they give a range for the date a book was actually written rather than published which is the date that is used in most library catalogue searching.
The other reason for choosing this was that it nicely demonstrates how using ‘authoritative’ URIs for things such as people makes the process of bringing together data across multiple sources much easier. Of course whether a URI is ‘authoritative’ is a pretty subjective judgement – based on things like how much trust you have in the issuing body, how widely it is used across multiple sources, how useful it is. In this case I’m treating VIAF URIs as ‘authoritative’ in that I trust them to be around for a while, and they are already integrated into some external web resources – notably Wikipedia.
The plan was to create something that would work in a browser – from a page with a VIAF URI in it (with the main focus being Wikipedia pages), allow the user to find a list of ‘contemporaries’ for the person based on BNB data. I could have done this with a bookmarklet (similar to other projects I’ve done), but a recent conversation with @orangeaurochs on Twitter had put me in mind of writing a browser extension/plugin instead – and especially in this case where a bookmarklet would require the user to already know there was a VIAF URI in the page – it seemed to make sense.
I decided to write a Chrome extension – on a vague notion it probably had a larger installed base of any browser except Internet Explorer – but then later checking Wikipedia stats on browser use showed that Chrome was the most used on Wikipedia at the moment anyway – which is my main use case.
I started to look at the Chrome extension documentation. The ‘Getting Started’ tutorial got me up and running pretty quickly, and soon I had an extension running that worked pretty much like a bookmarklet and displayed a list of names from BNB based on a hardcoded VIAF URI. The extensions are basically a set of Javascript files (with some html/css for display), so if you are familiar with Javascript then once you’ve understood the specific chrome API you should find building an extension quite straightforward.
I then started to look at how I could grab a VIAF URI from the current page in the browser, and only show the extension action when one was found. The documentation suggested this is best handled using the ‘pageAction’ call. A couple of examples (Mappy (zip file with source code) and Page Action by content (zip file with source code)) and the documentation got me started on this.
Initially I struggled to understand the way different parts of the extension communicate with each other – partly because the code examples above don’t use the simplest (or most up to date) approaches (in general there seems to be an issue with the sample extensions sometimes using deprecated approaches). However the ‘Messaging’ documentation is much clearer and up to date.
The other challenge is parsing the XML returned from the SPARQL query – this would be much easier if I used some additional javascript libraries – but I didn’t really want to add a lot of baggage/dependencies to the extension – although I guess many extensions must include libraries like jQuery to simplify specific tasks. While writing this I’ve realised that the BNB SPARQL endpoint supports content negotiation, so it is possible to specify JSON as a response format (using Accept: sparql-results+json as per SPARQL 1.1 specification) – which would probably be simpler and faster – I suspect I’ll re-write shortly to do exactly this.
The result so far is a Chrome extension that displays an icon in the address bar when it detects a VIAF URI in the current page. The extension then tries to retrieve results from the BNB. At the moment failure (which can occur for a variety of reasons) just means a blank display. The speed of the extension leaves something to be desired – which means that sometimes you have to wait quite a while for results to display – which can look like ‘failure’ – I need to add something to show ‘working’ status and a definite message on ‘failure’ for whatever reason.
A working example looks like this:
Each name in the list links to the BNB URI for the person (which results in a readable HTML display in a browser, but often not a huge amount of data). It might be better to link to something else, but I’m not sure what. I could also display more information in the popup – I don’t think the overhead of retrieving additional basic information from the BNB would be that high. I could also do with just generally prettying up the display and putting some information at the top about what is actually being displayed and the common ‘year of birth’ (this latter would be nice as it would allow easy comparison of the BNB data to any date of birth in Wikipedia.
As mentioned, the extension looks for VIAF URIs in the page – so it works with other sources which do this – like WorldCat:
While not doing anything incredibly complicated, I think that it gives one example which starts to answer the question “What to do with Linked Data?” which I proposed and discussed in a previous post, with particular reference to the inclusion of schema.org markup in WorldCat.
You can download the extension ready for installation, or look at/copy the source code from https://github.com/ostephens/contemporaneous
Reading this, it seems possible that VIAF could be more help. Within VIAF’s native XML there are VIAF’s best guess at the most specific birth/death dates, e.g.
<birthDate>1775-12-16</birthDate>
<deathDate>1817-07-18</deathDate>
in http://viaf.org/viaf/102333412/viaf.xml.
If those were explicitly indexed, would it help?
–Th
I think indexing the year at least would be a useful addition – since part of the purpose of dates in these records is to differentiate individuals with the same names then for ‘known person’ searching a year filter makes sense I think.
Of course then I’d need to think of a new search that I can do with SPARQL that you haven’t yet anticipated in VIAF :))