Photos from the KAPTUR conference at RIBA
Posted: March 7, 2013 Filed under: events | Tags: jiscmrd
KAPTUR conference, London, March 2013
Posted: March 7, 2013 Filed under: events | Tags: CKAN, EPrints, jiscmrd, RDM policies, technical analysis, ukdcc
CKAN4RDM workshop
Posted: March 2, 2013 Filed under: events, technical | Tags: CKAN, jiscmrd, Orbital
With thanks to Carlos Silva, KAPTUR Technical Manager, for the following blog post.
On 18th February I attended a workshop led by the JISC-funded Orbital project to gather information about the open source software CKAN and how it could be used to support research data management in the academic sector.
The workshop started with a presentation from Mark Wainwright (community co-ordinator for the Open Knowledge Foundation) on the latest release of CKAN, its origins and potential in the academic community.
One of the big advantages of using CKAN is that the ‘core’ system is surrounded by APIs, which makes it flexible enough to accommodate different user and institutional needs. This means that the core software can be updated without affecting the APIs or having to adapt external code to fit with the core software.
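As a rough illustration of what working through the APIs rather than the core might look like, the sketch below builds a request to CKAN's Action API and reads dataset titles out of a response. The host ckan.example.org, the sample response and the helper names are all hypothetical; only the package_search action and the general shape of its JSON reply are taken from CKAN's documented API.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, query, rows=10):
    """Return the URL for a CKAN Action API package_search call."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Extract dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    # Fall back to the dataset's machine name when no title is set.
    return [d.get("title") or d.get("name") for d in payload["result"]["results"]]


# Hand-written sample in the shape package_search returns (not real data).
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [{"name": "studio-sketches", "title": "Studio sketches"}],
    },
})

# Hypothetical host, used only to show the URL layout.
print(build_search_url("https://ckan.example.org/", "visual arts"))
print(dataset_titles(SAMPLE_RESPONSE))
```

Because external code like this only depends on the URL layout and the JSON structure, it should keep working when the core is upgraded, provided the API version itself stays stable.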
Another promising feature is CKAN’s ability not only to harvest other CKAN databases but also to search other types of repository, such as EPrints and DSpace. The mechanism developed covers a range of sources: not only EPrints and DSpace, but also geospatial servers, web catalogues and other HTML index pages.
In terms of sustainability, CKAN has been developed over the last 6 years, so it is relatively mature now with an extensive and very streamlined workflow process to add features, fix bugs and enhance the core services. The latest version 2.0 (recently released as Beta) promises to be an exciting release with more visually enhanced tools, improved groups feature, customisable metadata and a rich search experience based on their Apache Solr search.
The workshop continued with a presentation from the data.bris project at the University of Bristol. It is amazing to note that each Principal Investigator can apply for up to 5TB of storage, free of charge and securely backed up for 20 years!
Academics receive a mapped network drive which they can use to deposit content; however, managing research data requires additional features. The data.bris project was therefore interested in CKAN for its flexibility, data access (ability to have private datasets), organisation schema, ability to share with external researchers and the CKAN search engine.
In the future, the University of Bristol is considering two instances of CKAN, one for a public read-only catalogue of research data publications and another for controlled access (which would include teaching and other types of data).
The third presentation was from Orbital; Project Manager Joss Winn provided a virtual tour of the latest tools developed by the project. They have connected CKAN to other systems: to their EPrints repository and also to different departmental databases, such as an awards management system.
The Orbital setup allows their researchers to have different types of data located in a central place. This includes the policies, profiles, publications and analytics information from specific outputs, making the most of the CKAN software.
The demonstration included mention of the software created to enable deposit of data from CKAN to their EPrints repository – something which we have been anticipating for the last few months and which is an exciting development for the sector. Orbital have released the code through GitHub, which in theory should work with CKAN version 1.7. The functionality enables CKAN to submit the metadata to EPrints using the SWORD2 protocol, but not the actual files themselves – instead a link is added to EPrints which points back to the files deposited in CKAN.
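To give a feel for what a metadata-only deposit with a link back might involve, here is a minimal sketch that builds the kind of Atom entry a SWORD2 client can send. This is not the Orbital code: the function name, the example values, the ckan.example.org URL and the choice of rel="related" for the link are all assumptions made for illustration. The use of an Atom entry carrying Dublin Core terms follows the SWORD2 profile.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
DCTERMS = "http://purl.org/dc/terms/"

# Use readable prefixes in the serialised XML.
ET.register_namespace("atom", ATOM)
ET.register_namespace("dcterms", DCTERMS)


def build_metadata_entry(title, creator, abstract, ckan_dataset_url):
    """Build an Atom entry holding metadata plus a link back to CKAN.

    No file content is included: the entry only describes the dataset
    and points at where the files actually live.
    """
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    ET.SubElement(entry, f"{{{DCTERMS}}}creator").text = creator
    ET.SubElement(entry, f"{{{DCTERMS}}}abstract").text = abstract
    # Assumption: the link back to CKAN is expressed as a 'related' link.
    ET.SubElement(
        entry, f"{{{ATOM}}}link", {"rel": "related", "href": ckan_dataset_url}
    )
    return ET.tostring(entry, encoding="unicode")


# Hypothetical values, for illustration only.
print(build_metadata_entry(
    "Studio sketches",
    "A. Researcher",
    "Scans of working sketchbooks.",
    "https://ckan.example.org/dataset/studio-sketches",
))
```

A real client would then POST this document to the repository's SWORD2 collection URL with the content type application/atom+xml;type=entry, using whatever authentication the repository requires.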
The Orbital team are proposing a two year roadmap to their senior management team to take responsibility and carry this project forward and embed it further into the University of Lincoln’s infrastructure.
During the group discussion session, workshop participants suggested a comprehensive list of about 80 tools, features, amendments and requests that we would like to see as part of a new version of CKAN (a Google Docs spreadsheet is available: http://lncn.eu/mxz2). Again in groups, we did a gap analysis for the specific items requested, and a CKAN expert was available to answer any questions.
As an academic community we found that there were lots of similar challenges which should be easier to address collaboratively.
From the visual arts community perspective, although CKAN can’t currently address all the requirements from our user requirements list (PDF), there is scope for further development and this is continuing in the right direction.
Reflections on the 8th International Digital Curation Conference
Posted: January 30, 2013 Filed under: events | Tags: idcc13, jiscmrd, ukdcc
With thanks to Emma Hancox, Assistant Archivist, University of the Arts London for this blog post.
From Tuesday 15th to Wednesday 16th January I attended the 8th International Digital Curation Conference in Amsterdam, entitled ‘Infrastructure, Intelligence, Innovation: driving the Data Science agenda.’ The conference was an invaluable opportunity to learn from the research data management experience of professionals from a range of different countries and backgrounds. Here I will draw on the highlights of most relevance to the KAPTUR project; an overview of the full conference, including presentation slides, is available on the Digital Curation Centre website, as are videos of some of the talks.
Day One: Tuesday 15th January
‘Growing an Institution’s Research Data Management Capability through Strategic Investments in Infrastructure’, Anthony Beitz, Monash eResearch Centre.
The key message I took from this talk was Anthony’s call to ‘adopt, adapt and develop’: in essence, to look at solutions that already exist and develop them. Anthony advocated going out into the research community to see what solutions researchers already use within their communities, as they tend to be more loyal to their research community than to their institution. He also emphasised that a lot of the work has already been done for us; we can use Facebook for marketing, Twitter for customer service and we can adapt a range of open source software to meet our needs.
‘Building Services, Building Communities, Supporting Data Intensive Research’ Patricia Cruse, Director, University of California Curation Centre.
Patricia Cruse emphasised the importance of researcher engagement as early as possible in the digital curation lifecycle. She gave two very useful pieces of advice: ‘start small’ with a simple solution that can be built upwards when more complex problems are met, and employ flexible solutions that can be adapted to diverse situations. UC3 has a number of tools to assist researchers, such as UC3 Merritt (for the management, archiving and sharing of digital content) and the Web Archiving Service, which allows researchers to capture, analyse and archive websites used in the course of their research. More information is available on the UC3 website.
Minute Madness
The minute madness session gave poster demonstrators one minute to encourage delegates to view their posters and vote for them! Many posters represented projects of interest to KAPTUR and I enjoyed wandering around and exploring the display later in the afternoon. Posters of interest included ‘Creating an Online Training Module on Research Data Management for the University of Bath’ (training in research data management is something that KAPTUR project partners will certainly need to consider in the future). Another was the poster for IMEJI, an open source software tool from Germany providing free storage, sharing and metadata creation for audiovisual content, which I can see being of use in a visual arts research data context.
Day Two: Wednesday 16th January
‘Institutional Research Data Management’
On the second day I chose from a programme of parallel sessions. In the morning I learnt about the journeys professionals from the Universities of Bath, Edinburgh, Nottingham and Oxford had been on to create, implement and improve research data management capabilities in their institutions. Amongst much useful information, I learnt that the University of Edinburgh has created MANTRA, an online learning module available under an open licence so it can be rebranded and used by others. Thomas Parsons from the University of Nottingham commented that researchers typically store their data in five places. This emphasised to me the need for research data management training and the value of training modules such as MANTRA. From surveying researchers, James Wilson from the University of Oxford found that types of data he had expected to be in a minority were actually used more frequently than expected. I wondered whether we could also expect this with visual arts research data.
‘Arts and Humanities Research Data’
In the afternoon there was a chance to hear about Arts and Humanities Research Data, and an overview of KAPTUR was given by Carlos Silva from the University for the Creative Arts. Following this, Marieke Guy gave a presentation entitled ‘Pinning it Down: towards a practical definition of ‘Research Data’ for Creative Arts Institutions.’ This talk discussed work done by the DCC in collaboration with UAL to explore the nature of visual arts research data. Marieke reflected on the fact that whilst there is much consensus on research data in the sciences, this is lacking in the visual arts. Research has suggested that arts researchers do not tend to find the term ‘research data’ useful and find ideas such as ‘documenting the research process’ more useful. She suggested that a definition would be useful, but that adopting a scientific vocabulary for the arts can be problematic.
The talks about Arts and Humanities Research Data were the last I was able to attend before I left the conference, and ending on this note proved useful for reflecting on the conference in terms of the KAPTUR project. What I took away from IDCC 2013 was that there is much to be gained from projects at other universities, and that there is a range of existing tools that can be developed and adapted to make life easier. In the visual arts environment, however, we need to continue to think about how research data can be defined, since it doesn’t necessarily fit into the same categories as data at the other universities I heard from at IDCC. We also need to tailor solutions to our own unique context.
RDMF9: Shaping the infrastructure, 14-15 November 2012
Posted: November 28, 2012 Filed under: events, technical | Tags: infrastructure, IT, rdmf9, ukdcc
With thanks to Carlos Silva, KAPTUR Technical Manager, for the following blog post.
The Digital Curation Centre’s (DCC) Research Data Management Forum was held at Madingley Hall, Cambridge from 14th to 15th November 2012; presentations from the event are available online.
“Technology aspirations for research data management”
The take-home message for the day was that IT will need to be more involved with research, and that this collaboration will have an impact on future grants, projects and sustainability.
Jonathan Tedds presented lessons learned from University of Leicester via projects such as the UK Research Data Service (UKRDS) pathfinder study and Halogen as well as from other projects such as Orbital. Jonathan covered ‘top-tips’ to get researchers’ attention and how to develop software as a service through the BRISSkit project (Biomedical Research Infrastructure Software Service kit).
Steve Hitchcock covered lessons learned from DataPool on building RDM repositories. The project was specifically to do with SharePoint and EPrints; however, KAPTUR did get a mention as an example of other projects using EPrints and not re-inventing the wheel. Data Core, an application published in the EPrints Bazaar in July 2012:
“Changes the core metadata and workflow of EPrints to make it more focused for as a dataset repository. The workflow is trimmed for simplicity. The review buffer is removed to give users better control of their data.”
Paul O’Shaughnessy from Queen Mary, University of London, spoke about how their IT services are changing and how different parts of the university needed to be involved in making this happen. The University currently has around 16,000 students. They started an IT transformation programme because their original set-up was not fit for purpose; for example, there were 7 different email systems. After creating a strategic plan for the next 5 years they realised that a third of their funding income comes from research grants, so investing in IT infrastructure to support this was crucial. They were investing 3–4%, whereas other Russell Group universities tend to invest 5–10%. They followed a greenfield approach and stressed the importance of letting staff know that it was not just IT who would need to be involved, and that this was not just another project. An interesting number was that 25% of HSS grant applications were lost because of poor IT sections.
The aim of the Janet brokerage services is to become a community cloud of available resources, by:
- developing frameworks and procurement structures such as DPS to facilitate access to services
- working with DCC and JISC to ensure sensible requirements and priorities
- hoping to get to a conclusion early next year about these services (Janet is currently in talks with Google, AWS and Dropbox; Microsoft Azure will probably follow)
There was a comment about limitations with Dropbox, but also about the possibility that universities may be able to use it in the future and overcome the current issues of storing research data outside the EU.
Other topics and interesting points from the discussion:
- Suggestion that just as there are Faculty Librarians, we should have Faculty IT people.
- Recommendation to negotiate resources with IT; for example, if there is someone with the right skills, try not to use that person to fix printers but for something more productive.
- A Russell Group University mentioned that 1TB of data stored over 30 years will cost close to £25,000.
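To put the last figure in more familiar terms, the quoted cost can be turned into a rough yearly rate. The sketch below assumes the cost is spread evenly over the 30 years, which is my simplification and not something stated at the forum; real storage costs are unlikely to be this linear.

```python
def cost_per_tb_year(total_cost_gbp, terabytes, years):
    """Average cost per terabyte per year, assuming an even spread."""
    return total_cost_gbp / (terabytes * years)


# Figures quoted in the discussion: 1TB, 30 years, close to £25,000.
rate = cost_per_tb_year(25_000, 1, 30)
print(f"about £{rate:.0f} per TB per year")
```

On that assumption the figure works out at a little over £800 per terabyte per year.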
Break-out session on the Engineering and Physical Sciences Research Council (EPSRC)
There was discussion about the research data that they expect projects to make available. They mentioned the importance of gathering all metadata together, of bringing IT together, and of drip-feeding information (for example through OAI, SWORD and other protocols that transfer information and allow metadata to be harvested).
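As one way to picture this drip-feeding, the sketch below shows page-by-page metadata harvesting, assuming that "OAI" here means OAI-PMH. The endpoint repo.example.org, the sample page and the function names are hypothetical; the ListRecords verb, the oai_dc metadata prefix and the resumptionToken mechanism come from the OAI-PMH specification.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (titles, next_token) for one page of a ListRecords response."""
    root = ET.fromstring(xml_text)
    titles = [t.text for t in root.iter(f"{DC}title")]
    token = root.find(f".//{OAI}resumptionToken")
    # An absent or empty token means there are no further pages.
    next_token = token.text if token is not None and token.text else None
    return titles, next_token


# Hand-written sample page (not real data) in the OAI-PMH response shape.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sketchbook scans</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

titles, next_token = parse_page(SAMPLE_PAGE)
print(titles, next_token)
# Hypothetical endpoint: the follow-up request for the next page.
print(list_records_url("https://repo.example.org/oai", next_token))
```

A harvester would repeat the request with each new token until none is returned, so metadata arrives in small batches instead of one large transfer.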
Conclusion
Overall it was a good workshop which provided different points of view but at the same time made me realise that all the institutions are facing similar issues. IT departments will need to work more closely with other departments, in particular the Library and Research Office, in order to secure funding and make sustainable decisions about software.
Finally, IT should take a ‘flexible’ yet intelligent approach. For example, PRINCE2 methods do not fit research projects well, as these projects all change over their duration. The Agile methodology should be used instead, and IT should be expected to know about it and be involved in it.