Taking research skills to the next level: librarians teaching data literacy

I happened to catch a presentation at LIBER 2014 by Don McMillan of the University of Calgary Library that showcased a great example of deep collaboration between a library and academic departments (Developing data literacy competencies to enhance faculty collaborations). At Calgary, science support librarians collaborated with faculty members in genetics and biochemistry to develop instructional sessions in which library staff taught students how to extract data from bioinformatics databases and protein repositories and use it to answer structured questions.
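For a flavour of the kind of exercise involved, here is a minimal sketch of my own (not Calgary's teaching material) that fetches a protein record from a repository and answers a simple structured question about it. The UniProt accession and URL pattern are assumptions for illustration; check the current UniProt documentation before relying on them.

```python
# Hypothetical data literacy exercise: retrieve a protein record in FASTA
# format and report its sequence length. The accession and URL pattern are
# illustrative assumptions, not course material.
import requests

ACCESSION = "P69905"  # human haemoglobin subunit alpha, chosen as an example
url = f"https://www.uniprot.org/uniprot/{ACCESSION}.fasta"

response = requests.get(url, timeout=30)
response.raise_for_status()

# A FASTA record is a single header line followed by the sequence lines.
lines = response.text.strip().splitlines()
header, sequence = lines[0], "".join(lines[1:])

print(header)
print(f"Sequence length: {len(sequence)} residues")
```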

The students gained real-life experience of working with data while the librarians developed domain expertise.  One reason for the success of the programme was that the data skills sessions were fully integrated with the students’ courses and participation gained them credits.  The second was that it built on previous collaboration between the library and the faculties, integrating genetics and biochemistry content into an existing information literacy programme and taking them a step further.

Libraries meeting the challenge of research support

At the LIBER 2014 conference, as so often, one of the most thought-provoking contributions to the discussion on how libraries can develop their workforces came from Professor Sheila Corrall, in this case suggesting a fresh approach to looking at how existing strengths can be mapped to the new challenges of supporting research in innovative ways (Mobilizing Invisible Library Assets for Innovative Research Support in the 2020 Information Landscape). The main challenges are familiar: networked data-driven science, digital humanities, interdisciplinary research, and dealing with policy developments and funding body mandates – open access, data sharing, and research impact. Libraries need to change their offering to fill gaps in research support, moving from “service as support” to a deeper and more collaborative relationship.

She put forward two propositions: the first that libraries should use their “intangible” or invisible assets to gain strategic advantage; the second that they should overextend themselves, undertaking activities that require more than their current capabilities. What she meant by “intangible assets” became clearer when they were broken down into human, relational and structural assets and she looked at how they were used at case sites:

  • Human assets (in library terms this might be expertise in collection development/archives administration, information organization/retrieval know-how, teaching/training abilities, and reference interviewing skills)
  • Relational assets (professional networks, trust and credibility built from previous interactions with researchers, liaison librarians, cross-unit collaborations, e.g. with the Research Office and Computing Services)
  • Structural assets (institutional-level committees/groups that endorsed the library’s role in the research process, a hybrid structure of subject liaison librarians and (new) functional specialists used to provide subject-related support)

This sounds pretty familiar, as does the idea of over-extending the library’s role. The challenges of the previous decade – digital preservation projects, establishing an institutional repository, publishing linked data, digital humanities – all required us to re-deploy existing skills and learn by doing, as well as bringing in staff with new skills. Sheila Corrall argues that “in a dynamic, technology-driven environment, libraries cannot afford to wait until they are completely ready to act”, though she was not advising us to be reckless in the process. We already have some of the assets we will need to meet research support needs, just not all of them.

OAPEN deposit service: is it time to build a central infrastructure for Open Access monographs?

Last week’s JISC workshop on next steps for OAPEN explored the possibility of a European deposit service for OA monographs, potential benefits for participants and institutions, and the features that the latter, particularly libraries, would like to see in such a service.

The OAPEN Library, founded in 2010, has been very successful, attracting 60 publishers so far, with around 2,000 monographs.  On the reasonable assumption that the volume and value of OA monographs will continue to grow and that researchers may be required by funders to comply with OA mandates, the workshop looked ahead at what infrastructure would be needed to support these developments and the role that OAPEN might play at European level.   Areas of potential benefit to libraries in managing higher volumes of OA monographs include:

  • Content aggregation and discovery: OAPEN already aggregates content and promotes discovery, but for libraries it could offer the facility to harvest metadata from a single source in a variety of formats – ONIX, MARC XML, CSV, and MARC 21, through OAI-PMH and FTP – together with metadata conversion and enhancement (adding DOIs, ORCID iDs, and grant information); see the short harvesting sketch after this list
  • Integration with library catalogues and services: including OCLC WorldCat and commercial LMS suppliers
  • Preservation: through the e-Depot at the Koninklijke Bibliotheek and a partner such as CLOCKSS
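Since OAI-PMH is a standard protocol, harvesting from a single aggregation point is straightforward to script. Below is a minimal sketch; the endpoint URL is a placeholder rather than OAPEN’s real address, and oai_dc is simply the baseline metadata format every OAI-PMH repository must expose.

```python
# Minimal OAI-PMH harvesting sketch. The BASE_URL is a placeholder; consult
# OAPEN's documentation for the actual endpoint and for richer formats such
# as ONIX or MARC XML.
import requests
import xml.etree.ElementTree as ET

BASE_URL = "https://example.org/oai"  # placeholder endpoint
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
response = requests.get(BASE_URL, params=params, timeout=30)
response.raise_for_status()

root = ET.fromstring(response.content)
for record in root.iter(f"{OAI}record"):
    identifier = record.find(f"{OAI}header/{OAI}identifier")
    title = record.find(f".//{DC}title")
    if identifier is not None and title is not None:
        print(identifier.text, "|", title.text)

# A production harvester would also follow resumptionToken elements to page
# through large result sets; libraries such as Sickle handle this for you.
```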

One of the most valuable roles that OAPEN might play is in the deposit and publication workflows: ensuring that publishers are aware of funder mandate requirements, supporting the quality assurance process, and facilitating communications between funders, authors, and publishers. A variety of workflows can be supported depending on how funders prefer to work – examples were shown from existing participants including the European and Austrian research councils. A central register of funder requirements can be maintained, avoiding duplication of effort in explaining them to each publisher. Discussions amongst participants revealed a wide variety of workflows in OA publishing, both from experience of journals and monographs, depending on publisher, funder, author and institutional preferences. There was a recognition that the author-publisher relationship is often more complex when dealing with monographs.
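To make the idea of a central register concrete, here is a purely hypothetical sketch of what one entry might record; the field names are my own invention, not an OAPEN specification.

```python
# Hypothetical data model for a central register of funder OA requirements.
# Field names and the example entry are illustrative assumptions only.
from dataclasses import dataclass, field

@dataclass
class FunderRequirement:
    funder: str                      # e.g. a research council's name
    licence: str                     # required licence, e.g. "CC BY"
    embargo_months: int              # 0 means immediate open access
    deposit_formats: list[str] = field(default_factory=lambda: ["PDF"])
    metadata_must_include: list[str] = field(default_factory=list)

register = [
    FunderRequirement(
        funder="Example Research Council",  # illustrative entry only
        licence="CC BY",
        embargo_months=0,
        metadata_must_include=["DOI", "grant number", "ORCID iD"],
    ),
]

def requirements_for(funder_name: str) -> list[FunderRequirement]:
    """Look up a funder's requirements once, rather than explaining them
    separately to every publisher."""
    return [r for r in register if r.funder == funder_name]
```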

A number of institutional participants are interested in taking part at the launch stage in the UK, Netherlands, and Austria.  Possible business models were explored.  Workshop participants supported the idea of a UK pilot with JISC in a lead role without committing to a particular business model.

National Monograph Strategy: scoping the problems

[Image: The Shard. Credit: Ben Griffin. Creative Commons Attribution-Share Alike 3.0 Unported]

I spent last Friday in the shadow of The Shard with an assorted group of librarians, publishers (commercial and Open Access), JISC programme managers and project team, and representatives of RLUK and SCONUL working jointly on one of the strands in the National Monograph Strategy project led by JISC.  For those new to the project it is described as “exploring the potential for a national approach to the collection, preservation, supply and digitisation of scholarly monographs”.   The project blog page at http://monographs.jiscinvolve.org/wp/about-the-project/ provides more background but the three main outputs will be:

  1. A landscape study: A report that provides a coherent picture of the monographs issue.  This is complete and very comprehensive.
  2. The monograph problem: A report defining and assigning value to the problems that need to be addressed by a national monograph strategy.
  3. The monograph solutions: An outline of the possible solutions which could address the problems identified.

Last week’s workshop focussed on identifying the problems.  A number clustered around the unstable nature of publishing: the problem of sustaining monograph publishing, both commercial and OA, but particularly small university presses, when business models are breaking down; fragmentation of the monograph itself and its changing value in the eyes of academics; demand for publishing outstripping supply; pressure from the REF.

Business models also featured in discussion of the problem areas around shared acquisition, licensing, and access. While shared acquisition has the potential to deliver a better return on public investment there are significant challenges – how would we build a collaborative shared model acceptable to all stakeholders, particularly publishers? A true national collection needs to be widely accessible. Licensing and copyright legislation were seen as barriers, although it’s possible to see licensing as part of the solution. National licences for substantial electronic collections already exist and there are models beyond the UK that would be worth exploring, particularly in Scandinavia.

Having seen collaborative collection development initiatives come and go over the years, I put down a marker for sustainability as one of the key problems, both in terms of long-term commitment from the libraries to maintain collections and in terms of preservation. There was still a surprising amount of scepticism about digital preservation, though that seems like a short- to medium-term issue. Sustaining the storage, maintenance, and acquisition of print collections is a costly problem with no easy solutions. In a response to an earlier document from the project John Tuck and I pointed out that not all UK libraries are purchasing only to meet local needs. The legal deposit libraries, those with a national research support role, and the specialised libraries with unique and distinctive collections, take a much wider view. Any funding model to support an NMS has to recognise the long-term costs of sustaining their role, which by implication would be enhanced.

The workshop concluded with participants voting on their personal top three priority problems and explaining why they had been chosen. At a glance, voting was heaviest for defining the scope and purpose of a national monograph strategy and discovering what stakeholders want from it, followed by that hardy perennial, “uncatalogued stuff” (understanding what we have), and the lack of co-ordination in digitisation, where there is a danger of expensive duplication. The absence of reliable methods of knowing what has been digitised and whether it can be re-used was lamented. If we are moving to large-scale supply of digital surrogates, the problem will need to be overcome.

Books as good neighbours: is browsing still useful in the digital age?

The opening address of LIBER 2013 by Professor Peter Strohschneider, President of the Deutsche Forschungsgemeinschaft, was a plea for libraries to remain both central to the university and at the heart of academic experience in the digital age, focussing on what a researcher can actually do in a library. Supporting organisations like the DFG and other funding bodies should ensure that digital developments are driven by academic considerations. The research library provides the context for scholarship – supporting discovery, which needs serendipity. Research is driven by curiosity.

As his argument developed, it sounded increasingly familiar – browsing as fundamental to academic research, existing knowledge ordered to promote discovery – until I recognised it as the guiding principle of the Warburg Institute Library where I worked for many years. “Books as good neighbours.” But is this still valid? Physical collections in research libraries have grown vastly since the Warburg Library was founded in the 1920s and researchers are particularly time-poor. They are increasingly grateful for discovery systems that direct users to the most frequently-used resources. Even 20 years ago serendipity sounded like a luxury for gentleman scholars.

Peter Strohschneider argued that discovery systems and the centrality of browsing, at least for certain disciplines, need not be irreconcilable. The DFG promotes discovery systems and research infrastructures permitting the linking of collections, and recognises that dialogue between scholars is also very important. He concluded that these systems must be constructed to reflect the nature of academic enquiry and disciplinary differences. It is important that these research systems are managed and developed by libraries and academic departments rather than by university computing services.

His second theme was the value of collection building for the long term and the importance of the physical relationship between researcher and object. Materiality and first-hand experience of collections or objects still matter. Both library collections (books, archives) and non-library collections (objects and artefacts) involve sustained collecting, organisation, and curation. Books or objects may not have been used or read immediately on acquisition, but they are retained and preserved. Things change, and they may then be ordered and become a collection with a new function or purpose. He cited the example of Goethe’s library, delineated by time and a person and now in the Anna Amalia Bibliothek, which allows Goethe and his era to be studied.

Research Data Management: where are libraries now?

Are libraries the right place to manage Open research data and, if they are, how much progress have we made towards doing it?  This was a major theme of the 2013 LIBER conference, explored in at least seven presentations and workshops.  The consensus was largely that libraries had a significant role to play, and a number of institutions have started to explore the boundaries of what is possible within libraries – advocacy for Open data, support for metadata creation by researchers, helping to write data management plans, and curating data output from “small science”.  LERU (League of European Research Universities) is developing a roadmap for research data which sets out this list of roles for libraries.  Paul Ayris outlined their work in his presentation and went on to elaborate its implementation at UCL.

Top concerns for libraries included:

  • Research council requirements: What research data management services need to be provided to comply?
  • Infrastructure: How do we make the business case for investment?
  • What roles and responsibilities are right for libraries?
  • Are we recruiting and training the right staff? Are librarians capable of developing the technical skills required?
  • If we recruit or train data librarians, how do they gain an understanding of research practice? Should they be embedded in research groups (and possibly become isolated) or in the library?

Liz Lyon (UKOLN and DCC), in her plenary presentation, Roadmaps, Roles and Re-engineering: Developing Data Informatics Capability in Libraries, provided the clearest guidance, as always, on where to start. Looking at the implications of the EPSRC policy on research data as an exemplar, she outlined the requirements, and some models and tools that we might use: the example of the business case written for her own university (Bath), the DCC simplified data management plans checklist, and Bath’s own DMP guidance and template.

She went on to look specifically at institutional data publication services, which could be library based – data curation in the repository, cataloguing and discovery, citation, and metrics – and at source material for skills development. Data librarians are few and far between but training opportunities are beginning to develop. Liz Lyon highlighted the new Immersive Informatics pilot course at the University of Bath, co-developed with the University of Melbourne.

Geoffrey Boulton (University of Edinburgh), author of the Royal Society’s policy paper on Science as an Open Enterprise, outlined in his presentation, A Revolution in Open Science: Open Data and the Role of Libraries, the benefits of open data and sharing (identifying fraud in science, public accountability, economic benefits, rapid response to medical emergencies, crowd-sourcing, citizen science), but stressed that the data needs to be intelligible. He was slightly sceptical about whether libraries could take on the challenge, perhaps to provoke us into building it into our strategies. Universities need to take responsibility for the data they produce and have to be proactive, not just compliant. They need strategies, e.g. in the library, and management processes.

Funding the infrastructure and services, and making them interoperable when data management responsibility lies with multiple organisations, remains the most significant issue. Carlos Morais Pires (DG Connect, European Commission), Enabling Data-Intensive Science through Advanced Data e-Infrastructures and Services, spoke on the EC’s Horizon 2020 programme and consultation, which was an outcome of the Riding the Wave report. Its vision was “data e-infrastructure that supports seamless access, use, re-use, and trust of data. In a sense, the physical and technical infrastructure becomes invisible and the data themselves become the infrastructure – a valuable asset on which science, technology, the economy and society can advance”.

The implementation of an interoperable data infrastructure is envisaged as:

(a) data generators: research projects, big research infrastructures, installations or medium-sized laboratories, simulation centres, surveys, or individual researchers
(b) discipline-specific data service providers, providing data and workflows as a service
(c) providers of generic common data services (computing centres, libraries)
(d) researchers as users, using the data for science and engineering
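
As a toy illustration of how these four layers divide responsibilities, the sketch below models each role as an interface; the names mirror (a)–(d) and are my own shorthand, not part of any Horizon 2020 specification.

```python
# Illustrative-only model of the four-layer data infrastructure. None of
# these names come from the Horizon 2020 consultation documents.
from typing import Protocol

class DataGenerator(Protocol):      # (a) projects, labs, surveys
    def produce(self) -> dict: ...

class DisciplineService(Protocol):  # (b) domain-specific data and workflows
    def curate(self, raw: dict) -> dict: ...

class CommonService(Protocol):      # (c) computing centres, libraries
    def store(self, curated: dict) -> str: ...  # returns an identifier

class Researcher(Protocol):         # (d) end users of the data
    def analyse(self, identifier: str) -> None: ...
```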

There will be funding opportunities in the Horizon 2020 programme. It is currently out for consultation and has received responses from research centres (e.g. CERN), individual universities, and library organisations including LIBER, as well as the League of European Research Universities (LERU).

Further RDM-related presentations explored cost modelling for the storage of print and digital collections (Darryl K. Mead, National Library of Scotland), a European policy framework for open access to research data (Susan Reilly, LIBER), initiatives to upskill IT and research support staff for RDM in France (Marie-Christine Jacquemot-Perbal, Inist-CNRS), and collaborative RDM projects in the MLA sector in Denmark.

All presentations can be found on the LIBER 2013 web site.