CKAN4RDM workshop
Posted: March 2, 2013
With thanks to Carlos Silva, KAPTUR Technical Manager, for the following blog post.
On 18th February I attended a workshop led by the JISC-funded Orbital project to gather information about the open source software CKAN and how it could be used to support research data management in the academic sector.
The workshop started with a presentation from Mark Wainwright (community co-ordinator for the Open Knowledge Foundation) on the latest release of CKAN, its origins and potential in the academic community.
One of the big advantages of using CKAN is that the ‘core’ system is surrounded by APIs, making it flexible enough to accommodate different user and institutional needs. This means that the core software can be updated without affecting the APIs, and without external code having to be adapted to fit the core software.
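To give a feel for what working through the API rather than the core looks like, here is a minimal Python sketch of a dataset search against a CKAN site. It assumes a CKAN 2.x instance exposing the Action API at /api/3/action/; the site address and the helper names are hypothetical and were not shown at the workshop.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, query, rows=10):
    """Return the Action API URL for a dataset search on a CKAN site."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def search_datasets(base_url, query, rows=10):
    """Call package_search and return the list of matching dataset records."""
    with urlopen(build_search_url(base_url, query, rows)) as response:
        payload = json.load(response)
    # Action API responses carry a "success" flag alongside the result.
    if not payload.get("success"):
        raise RuntimeError("CKAN reported a failed request")
    return payload["result"]["results"]


if __name__ == "__main__":
    # Hypothetical site address, used only to show the URL that is built.
    print(build_search_url("https://ckan.example.org", "research data"))
```

Because external code like this only talks to the API, it should in principle keep working when the core software behind it is upgraded.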
Another important feature that looks promising is the ability of CKAN not only to harvest other CKAN databases, but also to search other types of repositories such as EPrints and DSpace. The mechanism covers a range of sources: not only EPrints and DSpace, but also geospatial servers, web catalogues and other HTML index pages.
In terms of sustainability, CKAN has been developed over the last six years, so it is now relatively mature, with an extensive and very streamlined workflow for adding features, fixing bugs and enhancing the core services. The latest version, 2.0 (recently released as Beta), promises to be an exciting release with more visually enhanced tools, an improved groups feature, customisable metadata and a rich search experience based on Apache Solr.
The workshop continued with a presentation from the data.bris project at the University of Bristol. It is amazing to note that each Principal Investigator can apply for up to 5TB of free storage, securely backed up for 20 years!
Academics receive a mapped network drive which they can access and use to deposit content; however, managing research data requires additional features. The data.bris project was therefore interested in CKAN because of its flexibility, data access controls (the ability to have private datasets), organisation schema, ability to share with external researchers and search engine.
In the future, the University of Bristol is considering two instances of CKAN, one for a public read-only catalogue of research data publications and another for controlled access (which would include teaching and other types of data).
The third presentation was from Orbital; Project Manager Joss Winn provided a virtual tour of the latest tools developed by the project. They have connected CKAN to other systems, including their EPrints repository and various departmental databases, such as an awards management system.
The Orbital setup allows their researchers to keep different types of data in a central place; this includes the policies, profiles, publications and analytics information from specific outputs, making the most of the CKAN software.
The demonstration included the software created to enable deposit of data from CKAN to their EPrints repository, something we have been anticipating for the last few months and an exciting development for the sector. Orbital have released the code through GitHub, which in theory should work with CKAN version 1.7. The functionality enables CKAN to submit the metadata to EPrints using the SWORD2 protocol, but not the actual files themselves; instead, a link is added to EPrints which points back to the files deposited in CKAN.
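For readers unfamiliar with SWORD2, a metadata-only deposit is typically an Atom entry carrying Dublin Core terms. The Python sketch below builds such an entry with a link back to the dataset held in CKAN. It is an illustration only, not Orbital's code: the function name, the choice of fields and the use of an Atom link with rel="related" for the link back are all assumptions.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Use readable prefixes when the entry is serialised.
ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("dcterms", DCTERMS_NS)


def build_deposit_entry(title, creator, abstract, dataset_url):
    """Build a metadata-only Atom entry describing a dataset held in CKAN.

    Hypothetical helper: the fields and the link relation are assumptions.
    """
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{DCTERMS_NS}}}creator").text = creator
    ET.SubElement(entry, f"{{{DCTERMS_NS}}}abstract").text = abstract
    # No files are included: the entry only points back to CKAN.
    ET.SubElement(
        entry,
        f"{{{ATOM_NS}}}link",
        {"rel": "related", "href": dataset_url},
    )
    return ET.tostring(entry, encoding="unicode")


if __name__ == "__main__":
    print(
        build_deposit_entry(
            "Soil samples 2012",
            "A. Researcher",
            "Example abstract for a dataset.",
            "https://ckan.example.org/dataset/soil-samples-2012",
        )
    )
```

In a real deposit this entry would be sent to the repository's SWORD2 collection address with the appropriate credentials; those details depend on the local EPrints configuration and are left out here.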
The Orbital team are proposing a two-year roadmap to their senior management team to take responsibility for this project, carry it forward and embed it further into the University of Lincoln’s infrastructure.
During the group discussion session, workshop participants suggested a comprehensive list of about 80 tools, features, amendments and requests that we would like to see as part of a new version of CKAN (a Google Docs spreadsheet is available: http://lncn.eu/mxz2). Again in groups, we carried out a gap analysis of the specific items requested, and a CKAN expert was available to answer any questions.
As an academic community we found that there were lots of similar challenges which should be easier to address collaboratively.
From the visual arts community perspective, although CKAN can’t currently address all the requirements from our user requirements list (PDF), there is scope for further development, and this is continuing in the right direction.