This presentation (the last of the day) is by Sam Pepler, who works at a data centre.
The OJIMS project is:
- Overlay Journal Infrastructure for Meteorological Science
- JISC and NERC funded
- Looking specifically at a ‘data journal’ rather than traditional publication
- Looking to evaluate business models for overlay journals
- Creating an open access subject based repository for meteorology etc.
The University of Leeds will lead development of a dataset review policy with the Royal Meteorological Society (RMetS).
A ‘data journal’ is a journal that links published documents with the data that the publication uses – cf. CLADDIER, another JISC-funded project.
What are the benefits of a data journal?
- Extend the value of peer-review from papers to data, to provide assurance that data documentation meets the necessary scientific standards
- Metadata standards
- Independently understandable
- Re-useable
- N.B. about quality of data documentation – not about quality of data set (i.e. “can you use it”, not “is it useful”)
- Provide an overview of the quality and applications of data, enabling it to be used more easily and appropriately in research and applications
- adding independent quality statements about usefulness
- Provide recognition of the work of collecting and describing data
- High quality, reusable data is not presently a citable resource
- The writers of papers do not necessarily acknowledge those who collected the data
Why make an overlay journal?
- Data already in ‘a repository’ – just needs some independent review
- Because data is bulky, compound and complex – not easy to copy (possibly not as ‘self contained’ as traditional published paper?)
MetRep is a subject based repository for meteorological sciences. This was seen as filling a gap in the market – there is no store for some of the items they want to store. Examples of MetRep items are:
- Paper from ‘Weather’
- Set of pictures illustrating cloud forms (e.g. teaching aid)
- Report documenting a file format for climate models
- Weather balloon data
- Recording of an interview with ministers about climate change
- IPCC reports
- Logo for a research programme
Some of these could, however, sit in existing repositories – Institutional Repository, JORUM, websites, etc.
Perhaps MetRep should be an overlay repository? What does it mean to say an item is ‘in the repository’?
So – what is proposed is:
- Establishing an ‘overlay document’
- Metadata about the overlay document
- Review process information
- Discovery metadata for the reference document
- Reference to document (referenceable via a resolvable id in a trusted repository)
- The ‘review process information’ consists of
- Version of document in review cycle
- Submitted
- In review
- Published
- Public comments
- Description of review process
- Digital signature?
- Metadata about the overlay document would contain
- Author (of the overlay, not the referenced document)
- Other DC (Dublin Core) fields
- Discovery metadata for the referenced document and Reference to document
- DC metadata harvested from document (not sure if he means from the document, or from the metadata associated with the document?)
- Resolvable reference to document
- Other identifiers for document
The overlay repository would hold overlay documents pointing to either documents or data.
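To make the proposed structure a bit more concrete, here is a minimal sketch of an overlay document as a set of Python data classes. This is my own illustration, not something shown in the talk: the class names, field names, the `next_stage` helper and the example identifier are all assumptions, and the only parts taken from the presentation are the groupings (metadata about the overlay, review process information, discovery metadata plus a resolvable reference) and the three review stages.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class ReviewStage(Enum):
    """The three review-cycle stages listed in the talk."""
    SUBMITTED = "submitted"
    IN_REVIEW = "in review"
    PUBLISHED = "published"


@dataclass
class ReviewInfo:
    """'Review process information' for an overlay document."""
    stage: ReviewStage
    process_description: str
    public_comments: List[str] = field(default_factory=list)
    # The talk raised a digital signature only as an open question.
    digital_signature: Optional[str] = None


@dataclass
class ReferencedItem:
    """Discovery metadata and reference for the reviewed document or data."""
    resolvable_id: str                 # resolvable id in a trusted repository
    dc_metadata: Dict[str, str]        # Dublin Core fields harvested for the item
    other_identifiers: List[str] = field(default_factory=list)


@dataclass
class OverlayDocument:
    """A document about another document; the two remain distinct."""
    author: str                        # author of the overlay, not of the target
    dc_metadata: Dict[str, str]        # other Dublin Core fields for the overlay
    review: ReviewInfo
    target: ReferencedItem


def next_stage(stage: ReviewStage) -> ReviewStage:
    """Return the following stage, staying at PUBLISHED once reached (assumed rule)."""
    order = list(ReviewStage)          # Enum members iterate in definition order
    position = order.index(stage)
    return order[min(position + 1, len(order) - 1)]


# Hypothetical example: an overlay reviewing a weather balloon dataset.
overlay = OverlayDocument(
    author="Overlay Author (hypothetical)",
    dc_metadata={"title": "Review of a weather balloon dataset"},
    review=ReviewInfo(
        stage=ReviewStage.SUBMITTED,
        process_description="Two independent reviewers (assumed policy)",
    ),
    target=ReferencedItem(
        resolvable_id="https://repository.example.org/id/dataset-123",
        dc_metadata={"title": "Weather balloon data",
                     "creator": "Data Collector (hypothetical)"},
    ),
)
overlay.review.stage = next_stage(overlay.review.stage)
```

Keeping the referenced item as a separate object holding only an identifier and harvested metadata reflects the point that the data itself stays where it already lives; the overlay only points at it.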
The advantages he sees in this approach:
- It is clear that the ‘overlay’ is a document about another document – the two items are distinct and self-contained
- Authorship for the referenced and referencing document are allowed to be different – others can submit a document for review
- The overlay document has the same meaning as a stand-alone item – you can take it out of the repository context, and it is still meaningful
- Review mechanisms and repositories do not need adapting to deal with these items
- You can review a private document/data set – answers the ‘is this thing worth buying?’ question
Disadvantages:
- Authentication issues – might be able to ‘fake’ items?
- What if the author does not wish for document to be reviewed?
Implementation:
- Atom XML representation (mention of OAI-ORE here)
- Already a popular format with many tools
- Need a tool to create the records
- Need a web rendering method
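As a rough idea of what ‘a tool to create the records’ might produce, the sketch below builds a single Atom entry for an overlay document using Python's standard `xml.etree.ElementTree`. It is only an assumption about how such a record could look: the talk did not show a schema, this is plain Atom rather than an OAI-ORE resource map, and the function name, the use of a `category` element for the review stage, the scheme URL and all example values are hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def overlay_to_atom(overlay_id, title, overlay_author, updated,
                    target_url, target_title, review_stage):
    """Serialise one overlay document as an Atom entry (illustrative layout)."""
    ET.register_namespace("atom", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")

    # Elements that Atom requires on every entry.
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = overlay_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated

    # Author of the overlay, who may differ from the author of the target.
    author = ET.SubElement(entry, f"{{{ATOM_NS}}}author")
    ET.SubElement(author, f"{{{ATOM_NS}}}name").text = overlay_author

    # Resolvable reference to the reviewed document or data set.
    ET.SubElement(entry, f"{{{ATOM_NS}}}link",
                  {"rel": "related", "href": target_url, "title": target_title})

    # Review stage carried as a category; the scheme URL is a placeholder.
    ET.SubElement(entry, f"{{{ATOM_NS}}}category",
                  {"scheme": "https://example.org/review-stage",
                   "term": review_stage})

    return ET.tostring(entry, encoding="unicode")


# Hypothetical values throughout.
xml_text = overlay_to_atom(
    overlay_id="https://overlay.example.org/id/overlay-1",
    title="Review of a weather balloon dataset",
    overlay_author="Overlay Author (hypothetical)",
    updated="2008-01-01T00:00:00Z",
    target_url="https://repository.example.org/id/dataset-123",
    target_title="Weather balloon data",
    review_stage="in review",
)
print(xml_text)
```

Because the output is ordinary Atom, existing feed readers and libraries could at least parse it, which seems to be the appeal of choosing a format that already has many tools.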
Trusting repositories:
- More than resolvable identifiers – need to believe the object is preserved
- Need to know what preservation means for complex objects
- Repositories need to have sound footing – but there are no absolute guarantees
Somewhere along the line I’ve lost the point of what we are trying to achieve with this approach – Sam is now summarising, so hopefully this will help:
- OJIMS is about widening review processes beyond papers
- This means storing a wider range of objects – hence MetRep
- Data is a good example of valued material that is not recognised in a formal manner – hence ‘data journal’
- Lots of repositories are already storing the things – hence ‘overlay repository’
- … didn’t get the last couple of points
Overall seems to be about a way of recording the ‘review process’ alongside the actual object being reviewed.