Research Data Hack Day in Manchester
Posted: May 9, 2012
The following blog post is by Carlos Silva, Technical Manager for Kaptur:
The Hack Day started with quick presentations from attendees, so that we could find out about each other's projects and interests, pose questions and start assembling teams who shared similar ideas, ambitions and problems.
By the end of the afternoon we had each been allocated a team and a task, and started working on a particular problem.
There were four teams which covered the following topics:
- Stakeholder Driven Metadata
- Dropbox for Institutions
- SWORD 2 protocol and Bit Torrent
- Data collection from research activities
1. Stakeholder Driven Metadata
Using a metadata map, we were trying to map between different schemas, such as Dublin Core, OAI-PMH and that of the British Library.
Looking at this from a user's perspective, users will need to follow a certain workflow, for example using a data management plan (DMP) and so on (N.B. view the Prezi about this).
The team also worked on an example to show different ways of handling DOIs and metadata across schemas: http://homes.ukoln.ac.uk/~ab318/datacite/
I mentioned that the Kaptur project involves creating a model of best practice in the management of visual arts research data, and that using different types of metadata schemas was a problem for some institutions. I also mentioned that researchers in our sector need to handle not only large amounts of data but also different types of data, along with metadata schemas and fields that may not be covered by the default Dublin Core or OAI-PMH schemas.
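As a rough illustration of the mapping problem described above (this is not the team's code), a crosswalk between two schemas can be sketched as a lookup table plus a bucket for fields the target schema does not cover. The field pairs in `DC_TO_DATACITE`, the `crosswalk` helper and the sample record are all assumptions made up for this example, not an official mapping.

```python
# Hypothetical crosswalk: Dublin Core element -> DataCite-style property name.
# These pairs are illustrative assumptions, not an official mapping.
DC_TO_DATACITE = {
    "title": "titles",
    "creator": "creators",
    "publisher": "publisher",
    "date": "publicationYear",
    "identifier": "identifier",
}


def crosswalk(record, mapping):
    """Split a record into mapped fields and fields the mapping does not cover."""
    mapped, unmapped = {}, {}
    for field, value in record.items():
        if field in mapping:
            mapped[mapping[field]] = value
        else:
            unmapped[field] = value
    return mapped, unmapped


# Made-up sample record for a visual arts item.
record = {
    "title": "Studio sketchbook scans",
    "creator": "A. Researcher",
    "medium": "charcoal on paper",  # has no entry in the mapping above
}

mapped, unmapped = crosswalk(record, DC_TO_DATACITE)
print(mapped)
print(unmapped)
```

Keeping the unmapped fields in their own dictionary, rather than dropping them, makes it visible which parts of a record the default schema cannot express.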
Finally there was an unofficial launch of the Journal of Open Research Software: http://openresearchsoftware.metajnl.com
2. Dropbox for Institutions
SparkleShare was mentioned during the presentation, but it was noted that it is too unstable to use in production environments.
A blog post with more information is available here: http://blogs.bath.ac.uk/research360/2012/05/mrd-hack-days-file-backup-sync-and-versioning-or-the-academic-dropbox/
3. SWORD 2 protocol and Bit Torrent
SWORD 2 is a protocol for depositing content and its metadata with a repository.
The issue for this group to discuss was how to enable any type of file to be deposited.
Big deposits can take a long time to transfer; this isn’t a problem in itself, but there are problems around it. For example, partial uploads are possible, but if the transfer is interrupted the repository will not be able to create a record.
Using SWORD and Bit Torrent, the team were trying to tackle the problem by splitting the file into chunks, so that large files can be submitted and uploaded to the server despite interruptions.
The advantages were immediately apparent: the transfer is secure, it can be tracked, and the number of uploads can be limited.
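The chunking idea can be sketched in a few lines. This is a minimal illustration only: it does not use the SWORD 2 or Bit Torrent protocols themselves, and the helper names (`split_into_chunks`, `pending_chunks`, `reassemble`) and the `confirmed` dictionary standing in for the server's state are assumptions made for the example. Each chunk carries a SHA-1 digest so that, after an interruption, only the chunks the server has not confirmed need to be sent again.

```python
import hashlib


def split_into_chunks(data: bytes, chunk_size: int):
    """Return a list of (index, sha1_hex, chunk_bytes) tuples."""
    chunks = []
    for offset in range(0, len(data), chunk_size):
        piece = data[offset:offset + chunk_size]
        digest = hashlib.sha1(piece).hexdigest()
        chunks.append((offset // chunk_size, digest, piece))
    return chunks


def pending_chunks(chunks, confirmed):
    """Chunks whose digest the server has not confirmed for that index."""
    return [c for c in chunks if confirmed.get(c[0]) != c[1]]


def reassemble(chunks):
    """Join chunks back together in index order."""
    ordered = sorted(chunks, key=lambda c: c[0])
    return b"".join(piece for _, _, piece in ordered)


payload = bytes(range(256)) * 40           # 10,240 bytes of stand-in data
chunks = split_into_chunks(payload, 4096)  # two full chunks and one partial

# Pretend the transfer was interrupted after chunk 0 was confirmed.
confirmed = {0: chunks[0][1]}
todo = pending_chunks(chunks, confirmed)

print(len(chunks), [index for index, _, _ in todo])
print(reassemble(chunks) == payload)
```

Tracking confirmation by chunk index rather than by digest alone avoids treating two identical chunks as one.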
This project won support for further development and will receive two days of work paid for by JISC.
4. Data collection from research activities
The concept was straightforward: when people start to upload content, information will come not only from the users, but also from the actual file itself.
The team attempted to build an API to do this; however, more time was needed to complete it.
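To give a flavour of what "information from the file itself" could mean, here is a small sketch using only the Python standard library. It is not the team's API; the `describe_file` function and the choice of fields are assumptions for illustration.

```python
import hashlib
import mimetypes
import os
import tempfile


def describe_file(path):
    """Collect metadata that can be read from the file without asking the user."""
    stat = os.stat(path)
    mime_type, _encoding = mimetypes.guess_type(path)  # guessed from the extension
    with open(path, "rb") as handle:
        checksum = hashlib.sha256(handle.read()).hexdigest()
    return {
        "filename": os.path.basename(path),
        "size_bytes": stat.st_size,
        "modified": stat.st_mtime,
        "mime_type": mime_type,
        "sha256": checksum,
    }


# Demonstrate on a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "wb") as handle:
        handle.write(b"hello world")
    info = describe_file(path)

print(info["filename"], info["size_bytes"], info["mime_type"])
```

A real deposit tool would probably read the file in blocks rather than all at once, and might also inspect embedded metadata for specific file formats.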
Ultimately the project was intended to be a very big feed that reports everything that has been done around a record, such as visits by a researcher, modifications to the file and anything else to do with the record, so that the System Admin could gather all that information to create reports.
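A feed like this could be reduced to a report with very little code. The sketch below is purely illustrative: the `Event` structure, its field names and the sample events are invented for the example and do not describe what the team built.

```python
from collections import Counter
from typing import NamedTuple


class Event(NamedTuple):
    """One hypothetical entry in the activity feed."""
    record_id: str
    kind: str   # e.g. "visit" or "modification"
    actor: str


def summarise(feed):
    """Count events per (record, kind) pair for an admin report."""
    return Counter((event.record_id, event.kind) for event in feed)


# Made-up feed entries.
feed = [
    Event("rec-1", "visit", "researcher-a"),
    Event("rec-1", "visit", "researcher-b"),
    Event("rec-1", "modification", "researcher-a"),
    Event("rec-2", "visit", "researcher-a"),
]

report = summarise(feed)
for (record_id, kind), count in sorted(report.items()):
    print(record_id, kind, count)
```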