CURL metamorphoses into RLUK

Anne Poulson, the new Executive Director of what was CURL and will shortly be ‘RLUK’ (Research Libraries UK), has been presenting at Imperial this morning about the future direction for CURL. The new Strategic Plan is now on the CURL website:

http://www.curl.ac.uk/about/documents/RLUKStrategicPlan2008-2011.pdf

Definitely some interesting points about what RLUK is going to be doing in the future – one that leapt out at me (probably because it’s what I associate most closely with CURL) is the ‘Resource Discovery and Delivery’ part, which indicates that CURL is eager to work with a wide range of bodies – both commercial and public – to be part of a conversation about the creation of an ‘integrated national discovery to delivery service’. The emphasis was clearly on being part of the conversation, rather than being the body that delivered the service, with the caveat that funding had to exist somewhere to make this happen.

Anyway, Anne is going round CURL members to do the same presentation, but the new RLUK website should be launching shortly, and there is to be a conference in October (22nd-24th, no further details as yet) which will be a very open conference intending to include academics and staff from across library services – sounds like it could be interesting – watch the website for more details…

Q: What’s the difference between a book?

A: One of its formats is both the same

The original joke I remember reading in the Puffin Joke Book as a child (and bizarrely recall that it was used in the Jigsaw TV series to counter some kind of laughing fit affecting all the characters), but it came to mind when thinking about e-books.

Much of this post is based on a presentation put together by two of my staff last year – so thanks to them. The jokes and some of the thoughts are my own…

One of the most problematic things about e-books is that we don’t really understand what we are talking about. There are a few definitions around:

This first one from

Armstrong, Chris and Edwards, Louise and Lonsdale, Ray (2002) Virtually There? E-books in UK academic libraries. Program : electronic library and information systems 36(4):pp. 216-227.

“any piece of electronic text regardless of size or composition (a digital object), but excluding journal publications made electronically (or optically) for any device (handheld or desk-bound) that includes a screen”

– so basically any electronic text that isn’t a journal (but what’s a journal?).

Another one from JISC e-collections:

“an e-book can be a PDF file, interactive website or interactive database”

What about non-interactive websites? Do the ascii text Project Gutenberg files count?

For statistical purposes, SCONUL says (http://www.sconul.ac.uk/groups/performance_improvement/papers/SCONULguidance.doc):

“Use the E-measures Definitions Table [http://www.sconul.ac.uk/groups/performance_improvement/papers/definitionstable.doc] provided to determine whether to count titles within an e-book collection here or to make a single entry under 2k databases.

The distinction is based on the International Standard which includes ‘directories, encyclopaedias, dictionaries, statistical tables and figures’ as ‘databases …usually consulted for specific pieces of information rather than read consecutively’. Please use these definitions even where they do not agree with the practice in your library. The purpose is to ensure that as far as possible all libraries are counting in the same way. You may wish to maintain separately within the library a count of the number of e-books within databases entered in 2k, but this is not required for the SCONUL return. ”

To be honest, I guess we’d encounter the same problem if we tried to define a ‘book’ rather than just an ‘e-book’, and it’s easy to criticise others’ attempts while I shy away from attempting to do the same. However, it creates a problem in terms of having a common language we agree on to talk about these things. For what it’s worth I like the way the SCONUL definition suggests the important difference is how you access the information rather than how the thing fits together.

I’ve just come out of a meeting of UniProc (a purchasing consortium) with colleagues from Oxford University Library and UCL to talk about purchasing e-books. Obviously agreeing what we meant when we said ‘e-book’ was relevant, as if we are going to talk to suppliers, we have to be clear about what we want to buy. We agreed (I think!) that we were talking about:

  • Current titles
  • Available as a printed book
  • Not ‘reference’
  • Purchase model not subscription model

Is this a useful working definition?  It seems slightly odd to say ‘available as a printed book’ – but to some extent I feel this is key. If someone offered us an e-resource tomorrow that was ‘born digital’ is there any chance we would regard it as an e-book? My guess is not…

This also excludes collections like XRefer, ECO, EEBO, Oxford Reference Collection, Encyclopaedia Britannica etc. Reference works have always been different in terms of the way they are utilised to other books, and once you lose the physical constraints then the differences are more apparent than the similarities. This brings to mind an argument David Weinberger makes in ‘Everything is Miscellaneous’, which says ‘the natural unit of music is the track’ – arguing the concept of an ‘Album’ was driven by economic rather than artistic reasons. I suspect this is slightly mis-reasoned (see Nicholas Carr’s critique), but I also think there is some truth. Although I don’t think there is a ‘natural unit’ of music or of writing, I think some things can be regarded as ‘indivisible’ and some not.

A pop song is often written as a standalone piece, and although there are stand out ‘pop’ albums, in many cases the album is simply a collection of songs. A symphony is clearly designed as a single work, even though it will typically be made up of several ‘movements’, which are usually treated as individual tracks. Although there is no doubt pleasure can be obtained from listening to individual movements, the symphony is a richer piece and ‘makes sense’ as a cohesive whole (and the same is true of some pop albums and other genres – it’s not unique to Classical music). As an aside, as I digitised my Classical music collection I realised that ‘albums’ were a pain in the neck for classical music – after copying to iTunes I then changed the ‘Album’ information to reflect the piece rather than the physical album.

So I would argue a novel is ‘indivisible’ – although many novels are formed as a series of chapters, these chapters do not stand alone. Novels are probably the clearest ‘book’ example here – we can recognise an e-novel because it is cohesive. Even if you dump the physical format you can’t sensibly divide them further.

Other books (many academic books follow this pattern) are constructed of chapters which could be read in isolation and still make sense (indeed, from my memories of my academic studies this is how I used books to write essays – I would read the relevant chapters rather than whole books). Once you leave the physical format, I’d argue these individual chapters become the ‘atomic’ unit.

For reference books this goes further – the individual entries are the ‘atomic’ unit.

So my conclusion for the moment is we can only define e-books in terms of printed books – nothing else makes sense, and anything ‘born digital’ needs to be described as something else (generally we use ‘e-resource’ which is not very helpful!).

Any advance on this – comments and alternatives welcome…


The original joke is

Q: What’s the difference between a duck?

A: One of its legs is both the same.

UKRR Consultation Meeting – 19th February 2008

At the British Library today there was a UKRR Consultation meeting. I wasn’t at the previous consultation meeting last year, but by all accounts it provoked heated discussion. Today’s meeting seemed quite calm.

I should probably declare an interest in that the UKRR project is based at Imperial, and Deborah Shorley (my boss) is the Project Leader.

UKRR is described at http://www.curl.ac.uk/projects/CollaborativeStorage/About.htm, but in summary it is an attempt to coordinate use of collaborative storage/retention of research material (focusing on journals) across UKHE. The project has two phases: Phase 1 is to develop a prototype UKRR with 6 ‘early adopters’, and Phase 2 is intended to open participation up to all research libraries wishing to participate.

The meeting started with an introduction from Deborah, followed by presentations from various participants in the ‘Phase 1’ project, which comes to an end later this year. UKRR is now in the process of making a bid to HEFCE for a second phase, for c.£7m.

Jon Purcell – Director of Library Services, St Andrews

The first presentation was from St Andrews, who have been part of the ‘Phase 1’ project.

St Andrews went for easy wins – Science and Medicine journals, rather than Arts; Abstracts and Indexes etc. There was some ‘academic angst’ – this tended to come from specific individuals, although also from some departments. Jon noted that this ‘angst’ should not be underestimated by those looking to be part of UKRR (or what I think he termed ‘Collaborative Managed Retention’), and that open dialogue with those concerned helped. However, the process took longer, and was more staff-intensive and expensive, than expected. The checking process took much longer due to poor records in the St Andrews catalogue as well as in SunCat.

Good things about UKRR:

  • Allows open discussions of issues – not a library secret
  • Advocacy Toolkit enormously helpful
  • Embedded collaborative storage into SCURL – a new lease of life for CASS (Scottish Collaborative store)
  • St Andrews academics ‘trust’ the British Library
  • Allowed discussion of ‘Access’ vs ‘Holdings’

Things to remember:

  • St Andrews have no option – completely full, no store
  • Trusted digital repositories
  • In perpetuity access
  • Russell Group Libraries involved – involvement makes scheme credible

St Andrews will be joining UKRR:

  • Expediency and necessity
  • Reaping the benefits of de-duplication
  • Initial financial projections OK
  • In St Andrews’ interests to preserve Document Supply service
  • Gives opportunity of partnership with BL [rather than just customer]
  • Things are changing – difficult to predict where we will be in 5 years’ time, but St Andrews sees UKRR as part of the picture

“The best way to predict the future is to invent it”

UKRR Funding Bid and Business Model Update – Mat Pfleger (Head of Sales and Marketing, BL)

Three main areas of the UKRR model:

  • Access (a.k.a. Document Supply)
  • Storage
  • Collaborative Collection Management (a.k.a. De-duplication)

In terms of costs to UKHE Institutions, there are two parts:

  • Cost of ongoing Access/Document Supply
  • Cost of joint storage at BL

These are separate costs, and are dealt with separately below. I think it is important to draw a distinction between the Document Supply issues and the UKRR Subscription: these are not explicitly linked. The BL is committed to running its Document Supply on a cost recovery basis – this will happen no matter what the progress with UKRR. What the BL is doing is looking at offering some enhanced terms of service to UKRR members (e.g. 24 hour turnaround on requests).

Proposed new model:

  • Remodelled following significant feedback
  • Represents a more balanced bid (than the previous proposals)
  • Great chance of success (?)
  • Accessible Solution

Access

Document Supply Current Position:

  • Demand has dropped since 2001 from 3.8m items p.a. to 1.6m items p.a.
  • However, Infrastructure costs remain
  • Some cost reductions and efficiencies have been made, but still operating below cost recovery to UKHE
  • BL has to try to move to full cost recovery in the next 2 years

The BL would need to increase prices by 30/35% (articles/loans) to get to a cost recovery model. The BL Board has now approved a 10/12% increase for supply to UKHE from August 2008 – but this is just a stepping stone.

Options:

  • Continue with transactional model
  • Move to subscription model

BL models suggest that a subscription model will result in smaller price increases (8-18% vs 28-34%). The suggestion for the subscription model is that there would be options to subscribe at different ‘volumes’ depending on the number of requests you expect to make – but each year you can change as appropriate. BL will be running focus groups around current proposals. The earliest possible date for moving to a subscription model would be August 2009.

Storage

Proposed that UKHE contribution should cover c.20% of storage costs (of total £1.3m) – substantial drop from previous models. Annual fee for storage would depend on JISC Bands, and would be a 5 year commitment. This would be additional to proposed base subscription or transactional access for document supply services. The proposed costs were given as:

JISC Band    Annual Fee    5 Year Commitment
A            £10k          £50k
B            £7.5k         £37.5k
C            £5k           £25k

N.B. There was a Q&A around these ‘JISC Bands’ (see below) that established these don’t refer directly to the usual JISC Bands that run from A-K, but rather a simplified banding system (similar to that being adopted by EThOS)

In Phase 1 UKRR has been able to establish some costs for de-duplication – £26.16 per metre for an HEI (also a BL cost) – and this is the amount that would be (on average) available to an HEI taking part in UKRR. Universities taking up the UKRR Subscription would get a one-off cash benefit for enabling de-duplication, but would also clearly make ongoing savings in terms of space – UKRR has put costings against this, and suggests substantial savings would be made based on the cost of space. If the UKRR is to be sustainable, the costs of the store need to be met – and the more members of UKRR, the more secure its future. If UKRR Phase 2 does not go ahead, then there will be a piecemeal approach, which is less likely to be sustainable in the long term.
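
As a rough illustration of how the figures above fit together, here is a minimal Python sketch. The band fees and the £26.16 per metre average are the ones quoted in the presentation; the function names and the 400 metre example are purely illustrative assumptions. Because institutions get back their actual de-duplication costs rather than a flat rate, the second figure should be read only as an estimate.

```python
from decimal import Decimal

# Annual UKRR storage fees by simplified band, as quoted in the presentation (GBP).
ANNUAL_FEE = {"A": Decimal("10000"), "B": Decimal("7500"), "C": Decimal("5000")}
COMMITMENT_YEARS = 5

# Average de-duplication cost per metre of shelving established in Phase 1 (GBP).
AVERAGE_DEDUP_RATE = Decimal("26.16")


def storage_commitment(band: str) -> Decimal:
    """Total storage fee over the five-year commitment for a band."""
    return ANNUAL_FEE[band] * COMMITMENT_YEARS


def estimated_dedup_payment(metres: Decimal) -> Decimal:
    """Estimate from the average rate; actual costs are what is reimbursed."""
    return (metres * AVERAGE_DEDUP_RATE).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for band in ANNUAL_FEE:
        print(band, storage_commitment(band))
    # Hypothetical example: 400 metres of journals withdrawn.
    print(estimated_dedup_payment(Decimal("400")))
```

On these assumptions, a Band B institution withdrawing 400 metres of journals would set a five-year storage commitment of £37.5k against an estimated one-off de-duplication payment of a little over £10k, before counting any savings on space.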

Jean Sykes – LSE

  • Strength of Print collection is a key part of LSE library – attracts external users and funding
  • No immediate space problem (although will be by 2015)
  • Have put themselves forward as a reserve copy holder

However:

  • No library can provide for all researchers’ needs
  • Vision is compelling
  • Collaborative approach + funds for disposal
  • Every library/university needs UKRR to deliver good service to its researchers

LSE groups and committees have discussed UKRR. A library group is drafting disposal criteria for academic approval. LSE is intending to take part in UKRR phase 2.

Questions/Concerns

There was a long period available for Q&A and comments, and the below represents my attempt to capture it. If you asked or answered a question and have any corrections, please leave a comment on this post, and I’ll make corrections. I haven’t captured the full discussion, but I think I’ve got the key points.

Q: Is anyone looking at Arts and Humanities material?

A: Yes – not so much, but Southampton and Birmingham are (the latter looking at very, very low use material that isn’t on open shelves). Possibly finding that serials are less controversial than monographs in Arts and Humanities.

The overall response to this question is that the opposition to disposal is not as problematic as we might fear. Although it does take effort, it is also ‘doable’. There were several anecdotes indicating that resistance can be overcome, and that a move to alternative models of journal access is not as difficult to achieve as you might expect – often resistance comes from very specific departments or even specific academics. There were also a few anecdotes indicating that you need to know where you aren’t going to win, and pursue other areas.

Q: Do de-duplication funds apply over multiple copies?

A: Yes – e.g. Imperial Medical libraries

Q: How would the Doc Supply subscription model apply to organisations with multiple Doc Supply accounts (question from Cambridge, which manages Doc Supply at a collegiate level as well as at the University library)?

A: No assumptions from BL that Cambridge would have to have a single account.

Q: New model looks very much more affordable than previously. When is phase 2 expected to start, and how can libraries sign-up to take part?

A: Phase 1 now extended to the end of August 2008. Surplus funding from phase 1 to be used to bring in some extra partners. Expecting to bid for phase 2 by end of March 2008, and expect to hear the outcome in September 2008. Post Easter 2008 there will be a call to institutions to indicate their interest in signing up to phase 2.

Q: Is de-duplication cost fixed, or amount against which you bid?

A: Costs given in presentations are averages, you get back the actual cost of de-duplication, not a fixed amount. However, overall the fund is fixed, based on expectations of the amount of de-duplication across the sector by UKRR project. Several institutions indicating their interest in signing up to phase 2. Also appreciation from the floor of the current model, and the transparency of the current proposals.

Q: How do we know where the reserve copies are?

A: In phase 1 not possible to build necessary systems to support this kind of information sharing (currently just on spreadsheets). There is now a bid to HEFCE shared services for a feasibility study to see what would be required in terms of systems to help manage this information. Feasibility study due to deliver end March 2008. Phase 1 has shown how sketchy journal holdings often are, and the need for much more detailed holdings information – down to volume/issue level.

Q: Is there any question of looking at monographs?

A: There is clearly a need to look at monographs, but at the moment journals is where UKRR is focussing, and no suggestion/timescale for monographs – and hope that many lessons learnt from dealing with journals will help inform future discussions around monographs.

Q: When will we know if bid to HEFCE has been successful?

A: Should know by July

Q: Does UKRR include Abstracts and Indexes?

A: Yes

Q: Call for expression of interest in joining end of Phase 1 mentions standards, including environmental issues and bibliographic issues?

A: Disposal under UKRR requires material to be shredded and recycled – so you have to use a company that can show they are capable of doing this. For bibliographic information, you have to have enough information to be able to identify and match journals (e.g. ISSN as well as Title). To join Phase 1, UKRR are looking for libraries who are ready to go – already having titles, already had discussions with academics etc.
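
The bibliographic requirement (ISSN as well as title) can be pictured with a small Python sketch. The ISSN check-digit rule used here is the standard one; the record layout, the function names and the sample data are hypothetical, and this is not a description of how UKRR or SunCat actually match records.

```python
from __future__ import annotations

import re


def normalise_issn(raw: str) -> str | None:
    """Return the ISSN as 'NNNN-NNNC' if its check digit is valid, else None."""
    chars = re.sub(r"[^0-9Xx]", "", raw).upper()
    if not re.fullmatch(r"\d{7}[\dX]", chars):
        return None
    # Weighted sum of the first seven digits, weights 8 down to 2.
    total = sum(int(d) * w for d, w in zip(chars[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return f"{chars[:4]}-{chars[4:]}" if chars[7] == expected else None


def normalise_title(title: str) -> str:
    """Lower-case the title and collapse punctuation and spacing."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", title.lower()).split())


def match_holdings(ours: list[dict], theirs: list[dict]) -> list[tuple[dict, dict]]:
    """Pair records that share a valid ISSN and the same normalised title."""
    index = {}
    for rec in theirs:
        issn = normalise_issn(rec["issn"])
        if issn:
            index[(issn, normalise_title(rec["title"]))] = rec
    pairs = []
    for rec in ours:
        issn = normalise_issn(rec["issn"])
        key = (issn, normalise_title(rec["title"]))
        if issn and key in index:
            pairs.append((rec, index[key]))
    return pairs
```

Requiring both keys is deliberately strict: a record with a mistyped ISSN or a variant title simply fails to match, which is the kind of case that would need checking by hand.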

Q: Is there a danger that the infrastructure is not in place for phase 2, and will never catch-up?

A: BL has some plans in place for a system to help manage information integrated into their existing systems – 6 month build period. Phase 2 will be ‘phased’ so, not everything all at once. Once the feasibility study around the system is finished in March, then the picture should be clearer.

Q: Phase 2 has £6m going into ‘system improvement’ (50% from HEFCE, 50% from BL) – what is this going to deliver?

A: Improvements to Doc Delivery systems – which would not happen without this project. This will improve delivery times (24 hours) and allow a number of smaller benefits (e.g. branding of doc supply for institutions). However, the main aim is making infrastructure sustainable. Timeline – 2 year program, working with BL Technical team – looking for a project plan that delivers early benefits in this 2 year program.

Q: What processes are in place to decide whether the BL moves forward with a subscription or transactional model?

A: BL looking to set up focus groups across all customers (not just UKHE or UKRR) for consultation on models proposed in today’s presentation. Look to implement a cost recovery model (whether sub or transactional) by August 2009.

Q: There was a lack of clarity regarding costings in regard to costing groups (the presentation refers to ‘JISC Bandings’ but then only refers to ‘A’, ‘B’ and ‘C’). Can this be clarified? [I think there was some confusion between the UKRR subscription model and the proposed document supply subscription model]

A: The UKRR subscription model in the presentation today described 3 levels referred to as ‘JISC Bands’ and labelled A, B and C. This is meant to be a simplification along the lines of EThOS rather than the JISC Bands A-K used in other areas. BL thinking about the subscription model suggests that there are about 15 ‘volume’ groups represented across their customers, but they would like to reduce this in terms of subscriptions groups/bands – but until they have done a proper consultation they cannot say if this is possible. The BL will hope to complete their consultation by July 2008.

Q: If we take part in Phase 1, will we have to subscribe to UKRR to get access to deduplication funds?

A: No

Q: If we take part in Phase 2, do we have to subscribe to UKRR to get access to deduplication funds?

A: Yes

Q: If we join UKRR are we restricted in what we can dispose of?

A: Only where you have agreed to keep titles. The point of UKRR is to allow coordination of our effort. When titles are being held by only a few members UKRR would put out a call for members who will hold the reserve copy. Even where you have agreed to keep the reserve copy then there is the possibility of bringing this back and UKRR will look for an alternative site to take this responsibility.

Closing words: UKRR is for researchers in the UK, not for the libraries. We need to take a broad collaborative view. Digitisation is not an option for entire print legacy – it may not be a problem for ever, but not a practical option over reasonable timescales. Although we cannot predict the future, we have to do something, and this is a good shot!

N.B. Jean Crawford will take over as UKRR Project Manager from Nicola Wright in March 2008.

Integrating a Repository into the Academic workflow

I mentioned in my last post that at Imperial we were doing what we could to make the actual repository invisible to the academics. We did this by exploiting the existing college-wide ‘publications’ system that many academics were already regularly using. For further details on this, you can see the paper written by Richard Jones and Fereshteh Afshari – in our repository of course – but you don’t need to know that 😉

R.I.P.ositories

As Universities are investing more in building/supporting/maintaining Digital Object repositories – specifically for OA purposes – I’m also starting to see quite a bit of ‘anti-repository’ sentiment on various blogs.

Perhaps the earliest post I saw in this area was The importance of being open by Andy Powell, in which he pondered the difference between the JISC IE approach and the Web 2.0 approach. For those not familiar with the JISC IE approach, I’d recommend looking at Andy’s presentation on Item Banks and the JISC IE, specifically slide 8.
I made a comment in response to Andy’s post suggesting that "the IE encourages the idea of a closed community rather than integrating with the wider world" – and looking at the post and IE diagram again, I think this is a definite problem – note that the ‘presentation’ layer of the IE diagram doesn’t have a straightforward ‘open web’ or similar – all via portals or systems.

Interestingly as early as 2003, Clifford Lynch said "a university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members" and also noted "An institutional repository is not simply a fixed set of software and hardware" (ARL : A Bimonthly Report, no. 226 February 2003) – but much of the activity around repositories seems to have focussed around the software options rather than the services.

In a later post (The R Word) Andy says "The important point, at least as far as open access is concerned, is not that such papers are deposited into a repository but that they are made freely available on the Web." – this starts to get to the heart of the matter I think. In the post The Repository Roadmap – are we headed in the right direction? Andy takes this further, saying "One of the things I want to argue in the presentation (though I know that this is something that Rachel, my roadmap co-author, strongly disagrees with) is that, from the perspective of consumers, repositories are just Web sites.  Somehow, it almost feels like heresy to say so – I don’t know why!?"

At the CRIG Unconference Paul Walk from UKOLN was quoted as saying “Wouldn’t it be great if the outcome of this unconference was that repositories were just wrong?” and although he made it clear that this was "a sarcastic response" (Repositories get my vote), the remark obviously struck a chord with more than one participant.

To bring us up to date Andy blogged about his presentation at the VALA 2008 conference (Repositories thru the looking glass), pushing three key points – which I paraphrase here:

  • "…our current preoccupation with the building and filling of ‘repositories’ (particularly ‘institutional repositories’) rather than the act of surfacing scholarly material on the Web means that we are focusing on the means rather than the end (open access)…"
  • "…our focus on the ‘institution’ as the home of repository services is not aligned with the social networks used by scholars, meaning that we will find it very difficult to build tools that are compelling to those people we want to use them…"
  • "…the ‘service oriented’ approaches that we have tended to adopt in standards like the OAI-PMH, SRW/SRU and OpenURL sit uncomfortably with the ‘resource oriented’ approach of the Web architecture and the Semantic Web…"

Paul Walk responds to this – with much agreement and some slight disagreements, and there have been quite a few tweets around supporting Andy’s basic points. I’m not altogether sure I quite get the differentiation between ‘service oriented’ and ‘resource oriented’, but luckily Andy promises to expand on this at some point…

So – what’s my take, as Project Director for the Digital Repository at Imperial College (Spiral) (which we are formally launching next month)?

Well, firstly it’s ironic that the popular repository software was built with OAI in mind, and this was (at least partially) intended to make the ‘invisible web’ available to search engines. The problem with this approach seems to have been that it built a secondary web of data sources which talk to each other via protocols not widely adopted outside the immediate community.
I suspect that within the repository community there is some tendency to think in a silo mentality – a repository is a container you put stuff in, and it leads you to think that you need an ‘interface’ for everyone to access the stuff in the container.
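
To make the ‘secondary web’ point a bit more concrete, here is a minimal sketch of what asking a repository for its records over OAI-PMH looks like. The verb names and the oai_dc metadata prefix come from the OAI-PMH specification; the base URL and the helper function name are my own assumptions for illustration, not any particular repository’s setup.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; a real repository
# publishes its own OAI-PMH base URL.
BASE_URL = "https://repository.example.org/oai"


def build_oai_request(base_url, verb, **arguments):
    """Return the GET URL for an OAI-PMH request.

    `verb` is one of the protocol's verbs (e.g. Identify, ListRecords);
    `arguments` are the extra query parameters that verb takes.
    """
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Ask for all records as unqualified Dublin Core (the oai_dc prefix).
url = build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(url)
```

The response to a request like this is an XML document meant for a harvester rather than a page a person (or a general-purpose search engine) would naturally stumble across by following links, which is rather the point being made above.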

On the other hand, the criticism seems to overlook some of the work done in providing interfaces that are more ‘native web’ than this – for example, the e-prints.org call for plug-ins – to do things like expose repository entries as RSS feeds etc.
I believe that one of the problems is that repositories aren’t (usually) just ‘webpages’ – they contain what is essentially ‘print’ material in an electronic format. Flickr has been used as an example of a ‘repository’ which integrates more happily with the web – but I’m not convinced it has been any more successful at exposing the material in Flickr than say arXiv is for scholarly papers (some very quick Google searches for images (image search or standard Google search) seem to indicate that Flickr doesn’t feature particularly heavily in the results) – Amazon is more successful, and Wikipedia is particularly successful at getting into Google search results of course – but certainly the latter is ‘born digital’ and part of the warp and weft of the web rather than discrete documents being ‘added’ into the web. This needs further thought – is it possible to really ‘integrate’ PDFs into the web? Perhaps we need a shift to true ‘born digital’ publishing for this to be successful.

I believe (and I’m pretty sure that here I’m violently agreeing with at least much of the previous writing) that we need to focus on exposing repository content to the web. It may be that the systems we currently use don’t aid this enough, but it’s as much an attitude as anything else. At Imperial I’m keen that we see the academic’s ‘Professional Web Pages’ as the route to the repository (in the near future there will be links from these pages to the full text articles in the repository where available). We’ve also gone as far as we can to ‘hide’ the repository from the users – ideally the academics don’t need to know we have a repository, they are just ‘uploading’ their full-text papers.
At the same time, what we need is a system that helps us administer the workflow around the delivery of digital objects in a corporate environment, but that is invisible to those not involved in the administration – and that’s what I want out of a ‘repository’ – so, for me, the Repository is dead, long live the repository.

Blogged with Flock

SCOAPing High Energy Physics

On Wednesday afternoon I attended a meeting about SCOAP3 (Sponsoring Consortium for Open Access Publishing in Particle Physics). The event was mainly attended by librarians, although there were a couple of real researchers present as well (it was a shame there weren’t more – it would have been nice to see more people from the HEP research community there).

SCOAP3 is an attempt to experiment with an alternative model for peer-reviewed publication, specifically in the area of High Energy Physics (HEP). Essentially the core of the model is that we (the community) should pay for the ‘value added’ parts of the publication process – specifically the peer-review service – while dispensing with the traditional model of ‘journal subscriptions’. The idea is that publishers could tender for the business of supplying the peer-review service, and that the money currently spent on subscriptions would be redirected via SCOAP3 to pay for this.

The reason for choosing HEP as a starting point is that it is a small community, already much given to collaboration, and 80% of HEP research is published in just 6 journals – so only a very small set of publications to deal with. Also, 90% of HEP research is already available OA via repositories (mainly arXiv I guess), and this is how most researchers access the research. Finally, HEP has a long history of depending on pre-prints – circulating printed copies of pre-prints at the expense of the community well before OA (or possibly even the Internet) was thought of. So in fact, the existing journals really are basically a way of ensuring peer-review still happens, much more than being a primary means of distributing research.

The meeting was essentially a set of presentations from those involved in SCOAP3 (including Prof. Ken Peach, Jens Vigen and Salvatore Mele) – all the presenters came across as enthusiastic and committed.

At the moment what SCOAP3 is looking for is ‘expressions of interest’ from the libraries/institutions who currently subscribe to the relevant HEP publications. In the longer term, these libraries/institutions will be asked to redirect their expenditure from the journal subs, to SCOAP3, who will in turn pay the relevant publishers to provide the peer review service. This is all hung round with caveats – it has to be affordable (i.e. not cost more than the current model), libraries would have to get money back where HEP journals are part of ‘big deals’ so that they can redirect the expenditure etc.

The presentations were followed by some discussion, but this was a bit circular because essentially I think everyone can see there are all kinds of potential problems and what-ifs, but all SCOAP3 are looking for at the moment is for the interested parties to say ‘hmm, sounds interesting, and if you can pull it off we’ll give it a go’.

Probably the main concern raised in the discussion was sustainability. Specifically, if you don’t charge for either publication or access, what is to stop institutions deciding that they don’t want to contribute to the peer-review costs? There was a definite feeling from some present that the libraries would find themselves pressured into cutting these payments to save money – essentially leaving a smaller pool of institutions to cover the costs of the peer-review process. To be honest it seems to me that they could almost certainly already do this (and maybe some places have?): with 90% of the research in arXiv or other repositories, and the claims that this is how most researchers access it, paying for a subscription to the journal is actually just a way of keeping the whole thing going. However, there is no doubt that making this explicit raises the likelihood that someone will say ‘why are we paying for this?’

The next stage is for SCOAP3 to contact the librarians at the relevant institutions with a request that they sign an ‘expression of interest’ – I suspect this is really not an issue. Despite the fact there were many questions about the model and how it would work, at the moment the level of commitment SCOAP3 is looking for is really very low.

Imperial College is very interested, and there is no doubt we will be signing an ‘expression of interest’. We’ll just have to see what happens after that. Once SCOAP3 has managed to get expressions of interest from the relevant parties (and they still have a little way to go on this – the biggest thing being the meeting in the USA next month), it can start the process of issuing a tender to publishers – this is the only way the actual cost can be established.

Overall SCOAP3 is an exciting initiative – it was referred to several times during the day as an ‘experiment’ – and I think this is right. Of course, experiments don’t always turn out how you hope – but you always learn something…

Yahoo! Announces Support for OpenID

Yahoo! Announces Support for OpenID – quite a big announcement for OpenID fans. This starts in beta on 30th January – Plaxo will be one of the first services that will allow you to use the OpenID aspect of your Yahoo account to authenticate.

For those who haven’t come across OpenID, it is a way of using a single ‘identity’ across a number of services, with the control of the OpenID resting largely with the user, rather than with an organisation.

The Official OpenID site has a better explanation than I have time for here…
