This is our update for the end of the thirteenth month of KAPTUR.
WP1: Project Management
- The whole Project team met on the 13th November at The Glasgow School of Art.
- Over the last month we have been managing the challenge of two of the four Project Officers resigning from the project. John Murtagh was part-time at University of the Arts London (UAL) and has successfully applied for a full-time role at the University of East London working on their RDM training project (starting on 26th November). Tahani Nadim has been awarded her PhD and has accepted a post-doc position at another institution which will begin in the New Year; interviews with internal candidates are scheduled for December.
- On 14th November the Project Manager met with colleagues at the UAL, including John’s replacement, Sarah Mahurter, Manager of the University Archives and Special Collections Centre. Betty Woessner, Research Systems and Data Manager, will work with the DCC on the Institutional Engagement project.
WP3: Technical Infrastructure
- The Technical Manager attended the JISCMRD programme event, 24th-25th October 2012, Nottingham. It was an opportunity to share the technical work that we have been piloting and also to learn from other projects. Following a presentation from Richard Jones, representing the DataFlow project, and a practical hands-on workshop, there was no resolution to the fact that DataStage is unable to connect with EPrints.
- The Technical Manager has created a test instance of CKAN, as this appears to be a way forward with a stronger case for long-term sustainability as well as building on the work of University of Lincoln’s Orbital project.
- University of the Arts London have reported that their policy does not need to be approved by the Academic Board, so this completes their delivery of WP4: http://www.arts.ac.uk/research/data-management/
- University for the Creative Arts and Goldsmiths, University of London have had their draft policies approved at the same level as UAL; however, these now need to go on to their Academic Boards in January for final approval.
- The Glasgow School of Art have revised their timescale for the policy due to the recruitment of two key staff who they want to feed into the policy; this is now expected to be approved at their Research and Knowledge Exchange Committee meeting in February. Academic Board approval is not required.
- The four policies will be made available through DCC in due course (UAL’s policy is already available via the link above).
WP5: Training and Support
- The first KAPTUR training workshop was held at UAL on Monday 19th November, with support from Marieke Guy and Joy Davidson from the DCC (due to the Institutional Engagement work). Further details and a list of attendees are available here: http://ualrdm-eorg.eventbrite.co.uk/ Presentations are available online here: http://slidesha.re/QTrHcs http://slidesha.re/SnzvBL http://slidesha.re/QnwQIq
- The further three KAPTUR training workshops are scheduled as follows: 27th November (Goldsmiths) with follow-up in January; 30th November (GSA) with follow-up in January; 16th January (UCA).
- Feedback is being gathered from participants in each workshop as well as from the Project Officers themselves; this will then lead to refinements of the KAPTUR training plan.
- The materials used as well as the training plan will be reviewed, re-purposed and re-packaged for use in common Virtual Learning Environments and also for deposit to JORUM. This will form the KAPTUR toolkits.
WP6: Evaluation and Sustainability
- Two of the four case studies have been completed to very good draft stage. The UAL and Goldsmiths Project Officers were asked to focus on this aspect of the project ahead of schedule in order to capture their knowledge before they leave. Their successors will make any adjustments required.
- The new UAL Project Officer and the Project Manager are attending the JISCMRD Benefits programme event in Bristol, 29th-30th November.
- Both the IDCC13 paper and poster proposals were successful.
- The Technical Manager presented at the JISCMRD programme event on 24th October, Nottingham (Carlos’ presentation). The Project Manager also presented a poster (available with audio explanation here) and was part of the Selecting and Appraising Research Data session on 25th October (blog post).
- Jacqueline Cooke attended the RDM Training workshop on 26th October (blog post).
- Anne Spalding attended the DataCite workshop ‘Managing Sensitive Data’ on 29th October (blog post).
- Carlos Silva attended the RDM Forum ‘Shaping the infrastructure’, 14th-15th November (blog post).
With thanks to Anne Spalding, KAPTUR Project Officer, University for the Creative Arts, for the following account of DataCite’s Managing sensitive data workshop, The British Library, London, 29th October 2012.
On Monday 29th October I attended my first DataCite workshop; this particular workshop is the third in a series. Slides from this and previous workshops are available via The British Library Datasets web pages.
During the morning session there were four presentations followed after lunch by a workshop where four groups focussed on data management scenarios. Feedback from the workshops and a general discussion rounded off the day.
The first speaker, Veerle Van den Eynden, spoke about managing sensitive data from the UK Data Archive’s experience. She explained in broad terms the legal aspects and also the role that research ethics, data archives and repositories play in the management of research data.
Jonathan Tedds from the BRISSkit project spoke of managing medical and personal data. As part of the project a survey of 3000 staff was conducted in 2010 regarding their own use and re-use of research data. In due course a summary of their findings will be available as part of the project outcomes. Jonathan emphasised the need to make the process of depositing data more engaging for researchers. Jonathan mentioned work in managing research data undertaken by the University of Virginia Library.
From UKOLN, Cathy Pink gave a very interesting presentation on working with commercial partners as part of the Research 360 project. One focus of the project is on the issues and challenges that arise from private sector partnerships and research collaborations. Cathy illustrated the different collaboration agreements that are in place at Bath University. Another important aspect of citing and discovering research data is the use of metadata and Cathy cited the work of Sally Rumsey ‘Just Enough Metadata’.
The final presentation was given by Brian Mathews of the Science and Technology Facilities Council (STFC). Brian’s talk focussed on some issues in research ethics arising from data sharing, and on the fact that we are working in a political environment. He referred to the Opportunities for Data Exchange (ODE) and a paper entitled ‘Ten Tales of Drivers and Barriers in Data Sharing’.
One of the main discussion points emerging from the workshops and feedback was the use of Digital Object Identifiers (DOIs). A particular issue was assigning a DOI to a single object which could change over time, and how to note this: is another DOI required? Could an umbrella DOI be assigned for the whole object but somehow allow for changes? Solutions for handling this might depend on work practices within institutions.
This event provided me with a further insight into the complexities of managing research data. The variety of perspectives also demonstrated that we are all grappling with the same issues but might well adopt different solutions depending on the institutional environment.
This is our update for the end of the twelfth month of KAPTUR; we are just past the two-thirds mark! For an overview of the past year, please visit the KAPTUR Prezi.
WP1: Project Management
- The Project team have been in contact by telephone and email; four colleagues will be attending the JISCMRD Programme meeting this week in Nottingham.
WP3: Technical Infrastructure
- The Technical Manager has written a blog post with more detail here: Working in Stages with DataStage and Figshare
- The project partners have provided feedback on KAPTUR’s DataStage pilot site.
- The four policies are going through several rounds of committees and are on schedule; this has been the focus of the past month.
- In addition the University of the Arts London’s draft policy is available online: http://www.arts.ac.uk/research/data-management/
WP5: Training and Support
- The KAPTUR training plan is now publicly available.
- The Pinterest links have been linked to via UAL’s RDM pages and DCC’s Marieke Guy’s excellent blog post on The value of video in getting the RDM message across
- The GSA Project Officer taught MRes students about research terminology covering research data and promoting the KAPTUR project; this will feed into our training materials. Blog post about this: Getting to grips with research terminology
- The Project Officers have been in contact with their Research Offices to arrange a half-day training session for Research Office staff and Librarians in order to pilot the KAPTUR training materials.
WP6: Evaluation and Sustainability
- The Project Officers have received a short Word document and model costings template (Excel) and will be piloting this within their own institutions.
- Detailed case study templates have been created and shared with the Project Officers. The case studies will be presented at the end-of-project conference on Wednesday 6th March 2013.
- The Project Director and Project Manager met with DataCite at The British Library to discuss the licence and other aspects of using DataCite.
- The UAL Project Officer’s presentation to Library staff is available on SlideShare.
Last week’s DataCite workshop was a really good opportunity to ask questions about DataCite at The British Library, how to mint a DOI (Digital Object Identifier), and to discuss challenges with citing research data.
The day started with a challenge to the presenters – what is data? This discussion had echoes of KAPTUR’s own research question – what is visual arts research data? (Environmental Assessment report). It seems almost impossible to define research data due to its diversity, but a working definition is obviously necessary; a good example is from University of Bristol’s Glossary.
The British Library’s Head of Scientific, Technical & Medical Information, Lee-Ann Coleman, spoke about the importance of making research data available, mentioning examples including the virologist Ilaria Capua, who opened up worldwide access to avian flu virus data sequences, and the open-data journal GigaScience’s research into E. coli. ISO 26324:2012, a recent standard for DOIs, was also mentioned. Garfield’s 15 reasons ‘when/why to cite?’ was a useful point of reference too:
- Paying homage to pioneers.
- Giving credit for related work (homage to peers).
- Identifying methodology, equipment etc.
- Providing background reading.
- Correcting one’s own work.
- Correcting the work of others.
- Criticizing previous work.
- Substantiating claims.
- Alerting researchers to forthcoming work.
- Providing leads to poorly disseminated, poorly indexed, or uncited work.
- Authenticating data and classes of fact – physical constants, etc.
- Identifying original publications in which an idea or concept was discussed.
- Identifying the original publication describing an eponymic [sic] concept or term as, e.g., Hodgkin’s disease, Pareto’s Law, Friedel-Crafts Reaction, etc.
- Disclaiming work or ideas of others (negative claims).
- Disputing priority claims of others (negative homage).
Garfield, E., 1996. When to Cite. In: Library Quarterly 66 (4), 449-458. Available from: http://www.garfield.library.upenn.edu/papers/libquart66(4)p449y1996.pdf [Accessed 25 May 2012].
What is DataCite?
Elizabeth Newbold provided an introduction to DataCite. It is a not-for-profit international registration agency for DOIs to facilitate the citing of research data. Founded in December 2009, it consists of a Managing Agent (currently the German National Library of Science and Technology (TIB)) and regional Members. In the UK The British Library is the regional Member, which then works with ‘Data Clients’ such as the UK Data Archive amongst other data centres and repositories. DOIs are assigned between the regional Member (e.g. The British Library) and their Data Clients (e.g. UK Data Archive), i.e. on an institution-to-institution basis. If an individual researcher wants a DOI then they need to contact the appropriate Data Client for their subject discipline; a list of some existing and potential future Data Clients is maintained on the DataCite website. Data Clients must fulfil a number of requirements and pay an annual fee to The British Library.
Some of the requirements for Data Clients:
- DOIs must resolve to a publicly accessible landing page even if the data itself is not open; the landing page can be an existing set of Web pages with the Data Client’s style so long as it is updated to include the DataCite information.
- Mandatory metadata fields: 4 fields (5 if you include the DOI itself) – these should be subject discipline agnostic: http://schema.datacite.org/
- The mandatory metadata must be freely available for discovery purposes, specifically under a Creative Commons CC0 licence; there was some interesting discussion around this and some issues to be resolved.
- Data Clients should have a formal data preservation plan (this may include disposal policies and so on); an operational service level agreement (SLA); and a clear intention in a mission statement to preserve and maintain the DOIs; this could include reference to an EPSRC Roadmap. Action: DataCite will share a draft SLA with the attendees.
How to mint a DOI – case study
Louise Corti of the UK Data Archive provided a very useful mini-case study and I’ll link to her presentation here when it is available. As data providers the UK Data Archive want to use citations to improve resource access and discovery. It was really interesting to hear how DOIs are affected by changes to the research data. At the UK Data Archive minor changes (e.g. a spelling mistake or typo) are documented in their Change Log but the DOI version number stays the same; major changes (such as an updated dataset) are documented in the Change Log field and the DOI is also given a new version number at the end. Challenges for the future include citing parts or fragments of research data, and also issues around describing relationships between data. Look out for a forthcoming UK Data Archive and ESRC brochure on citing data, aimed at the Social Science community.
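The minor/major rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the UK Data Archive’s actual system: the `Dataset` class, the `record_change` method, the `10.1234/example` DOI prefix and the `-N` version suffix format are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical dataset record with a versioned DOI and a change log."""
    doi_base: str                      # e.g. "10.1234/example" (made-up DOI)
    version: int = 1
    change_log: list = field(default_factory=list)

    @property
    def doi(self) -> str:
        # Assumed format: the version number is appended to the end of the DOI.
        return f"{self.doi_base}-{self.version}"

    def record_change(self, description: str, major: bool) -> str:
        """Log every change; only a major change gets a new version number."""
        self.change_log.append(description)
        if major:
            self.version += 1
        return self.doi


ds = Dataset("10.1234/example")
ds.record_change("Corrected a typo in the documentation", major=False)
ds.record_change("Added an updated dataset", major=True)
print(ds.doi, ds.change_log)
```

Both changes end up in the change log, but only the second one alters the DOI that a researcher would cite.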
How to mint a DOI – the technical bit
An illuminating presentation from Ed Zukowski described the following components of the DataCite systems:
The Data Client will be provided with information from the regional Member in order to make use of the Metadata Store and the facility to mint DOIs; technical knowledge is required to use the API for bulk registration. For minting one DOI an XML file is required with at least the four mandatory fields of metadata using the DataCite Schema.
The user will resolve a DOI (e.g. using a system such as http://dx.doi.org/) through the Global Handle Registry; this includes information from the Handle Server hosted by the DataCite Managing Agent. Resolving a DOI takes the user to a landing page, and statistics are collected about how many times a DOI has been resolved.
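To give a feel for what such an XML file contains, here is a minimal Python sketch that checks for the mandatory fields and builds a bare record. The element names reflect my reading of the DataCite Schema (identifier, creator, title, publisher, publicationYear); the `build_record` function and the sample values are hypothetical, and the schema namespace and any optional fields are deliberately left out, so please check http://schema.datacite.org/ before using anything like this for a real registration.

```python
import xml.etree.ElementTree as ET

# The DOI itself plus the four mandatory descriptive fields.
MANDATORY = ("identifier", "creator", "title", "publisher", "publicationYear")


def build_record(meta: dict) -> str:
    """Return a bare DataCite-style XML record, or raise if a field is missing."""
    missing = [name for name in MANDATORY if not meta.get(name)]
    if missing:
        raise ValueError("Missing mandatory fields: " + ", ".join(missing))

    resource = ET.Element("resource")  # namespace omitted in this sketch
    ident = ET.SubElement(resource, "identifier", identifierType="DOI")
    ident.text = meta["identifier"]
    creator = ET.SubElement(ET.SubElement(resource, "creators"), "creator")
    ET.SubElement(creator, "creatorName").text = meta["creator"]
    titles = ET.SubElement(resource, "titles")
    ET.SubElement(titles, "title").text = meta["title"]
    ET.SubElement(resource, "publisher").text = meta["publisher"]
    ET.SubElement(resource, "publicationYear").text = str(meta["publicationYear"])
    return ET.tostring(resource, encoding="unicode")


# All values below are invented for illustration.
print(build_record({
    "identifier": "10.1234/example-1",
    "creator": "Example, Researcher",
    "title": "Sketchbook scans",
    "publisher": "Example Art School",
    "publicationYear": 2012,
}))
```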
There is a free search of existing DataCite DOIs. From the top right of the Search page select ‘Options’ and ‘enable’ the Filter Preview, then when you do a search it is possible to filter by individual regional Member (‘allocator’) and Data Client (‘datacentre’).
The OAI-PMH Data Provider is available here: http://oai.datacite.org/
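An OAI-PMH request is just a URL with a `verb` and a few parameters, so a harvester’s first step can be sketched as below. `ListRecords` and `oai_dc` are standard OAI-PMH terms; the `oai_request_url` helper is hypothetical, and the `/oai` endpoint path in the example is my assumption rather than something confirmed at the workshop.

```python
from urllib.parse import urlencode


def oai_request_url(base_url: str, verb: str, **params: str) -> str:
    """Build an OAI-PMH request URL; the verb always goes first in the query."""
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


# Assumed endpoint path; check the Data Provider page for the real one.
url = oai_request_url("http://oai.datacite.org/oai",
                      "ListRecords", metadataPrefix="oai_dc")
print(url)
```

The resulting URL could then be fetched with any HTTP client to receive an XML list of records.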
http://data.datacite.org/ – provides two ways of exposing metadata held in the Metadata Store:
- HTML links i.e. hyperlinks in a standard Web browser.
- HTTP Content Negotiation – ‘I say what I want and in what priority’ e.g. ‘I want a PDF version of the research data but if there is an HTML version I’ll take it’. If a PDF version is available, content negotiation will take you straight to the PDF rather than to the landing page.
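In HTTP terms, ‘what I want and in what priority’ is expressed with an `Accept` header, where each media type can carry a quality value `q` between 0 and 1. The sketch below only prepares such a request using Python’s standard library and does not send it; the `negotiation_request` helper, the made-up DOI and the assumption that the DOI is simply appended to http://data.datacite.org/ are mine.

```python
from urllib.request import Request


def negotiation_request(doi: str, preferences: list) -> Request:
    """Prepare (but do not send) a request stating preferred formats.

    preferences is a list of (media_type, q) pairs, highest priority first.
    """
    accept = ", ".join(f"{media_type};q={q}" for media_type, q in preferences)
    # Assumed URL pattern: service address followed by the DOI.
    return Request(f"http://data.datacite.org/{doi}", headers={"Accept": accept})


# 'I want a PDF, but if there is an HTML version I'll take it.'
req = negotiation_request("10.1234/example-1",
                          [("application/pdf", 1.0), ("text/html", 0.5)])
print(req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would send it; which formats actually come back depends on what the Data Client has made available.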
A really useful tool to format DOIs into Harvard system citations (and other citation systems) in multiple languages: http://crosscite.org/citeproc/
Breakout groups on challenges with citing research data (some questions):
- Selection process – what about raw data? when does data become citable?
- Why not use DOIs for Ph.D. theses?
- Do you need to mint DOIs before you publish the journal article so you can link to them? – could start minting DOIs at collection level then move into additional specific parts nearer to publication of the journal article?
- A need to define roles and responsibilities.
- What about changes to Data Clients or funding bodies?
- How does versioning work with DOIs? (note UK Data Archive case study above)
- What is a citable unit of research data?
- What about cross-institutional, international, or cross-disciplinary research? Who mints the DOIs?
- A need for DataCite to provide case studies, perhaps with future workshops.
- It is only possible to describe one resource type per DOI (and this is a fixed controlled list e.g. Image, Film, etc) – this may be problematic with visual arts e.g. an exhibition; how do you describe complex relationships?
For cost/charge plans – discuss with the UK regional Member via firstname.lastname@example.org
The next DataCite workshop will be on metadata on Friday 6th July; details will be published online in due course.
Some other links: