Research data and publications, learning and teaching materials and software are all increasingly 'open', but what does this really mean, and where could it lead?
In the report we look at the areas of research and education that are being transformed by open practices, and explain key concepts around licensing of publications, educational resources, data and software. We also look at some examples of where open approaches have been particularly successful, and consider the implications for institutions of moving to a culture where open practices are the default mode of operation.
We provide an overview of Jisc services and support for open science and open educational practices, and suggest some possible future strategic directions for Jisc.
In many areas of research and education, it is becoming normal working practice to openly share information such as research outputs and teaching materials. In research it is becoming common for funders to mandate this open sharing in the interests of transparency and maximising uptake and reuse.
Historically, copyright would have been used to restrict this sharing and repurposing, but open copyright licenses have been developed that permit a range of reuse and remixing options.
Here are some of the common forms that this sharing can take in research and education:
- Open access [1] research publications
- Open data - eg data in tables and figures in a research paper, the raw data that was processed to come up with them, or institutional administrative data
- Open source software - the code, workflows and software pipelines that research increasingly depends on
- Open educational practices, including open educational resources (OERs) - sharing teaching materials and teaching practice openly to enhance the learning experiences of students
- Open science and open research more generally - openness through the entire research process, from the creation of ideas through to practical approaches such as electronic lab notebooks that researchers use to record the details of their experiments, and use of collaborative online tools to develop funding proposals and research papers
Key to all of these is having a shared understanding of the opportunities that exist for a given resource to be accessed, used, modified and shared. The openness of a resource is best viewed as a spectrum or perhaps a series of spectra. For example, the Open Access Spectrum Evaluation Tool quantitatively scores journals’ degrees of openness on a range of factors including readers’ rights, reuse and machine readability to arrive at an overall ‘openness score’.
The Open Data Institute’s Data Spectrum diagram shows how a similar approach can be taken around data. The FAIR Principles describe how data can be made findable, accessible, interoperable and reusable.
Open content is often described in terms of the '5R' activities that its license permits [2]:
- Retain - the right to make, own, and control copies of the content (eg, download, duplicate, store, and manage)
- Reuse - the right to use the content in a wide range of ways (eg, in a class, in a study group, on a website, in a video)
- Revise - the right to adapt, adjust, modify, or alter the content itself (eg, translate the content into another language)
- Remix - the right to combine the original or revised content with other material to create something new (eg, incorporate the content into a mashup)
- Redistribute - the right to share copies of the original content, your revisions, or your remixes with others (eg, give a copy of the content to a friend)
The popular Creative Commons licenses are described below.
Readers should also note that there are other open licenses for specific purposes such as the UK’s Open Government License, and that similar licenses exist for software, such as the GNU General Public License (GPL) and the Mozilla Public License (MPL).
Some of the world’s most widely used software is open source, such as the core of the Android operating system which powers over 1.4 billion phones and tablets. Android in turn is built on the Linux kernel [3], which also powers the vast majority of Internet services and of the top 500 supercomputers.
Open access to research
Open access to academic research, including research outputs such as journal articles and monographs, allows the benefits of high quality academic research to be realised. And where we might once have viewed research data, software and source code as underlying the published paper, increasingly these are being seen as first class research outputs in themselves.
Openness allows for reusability, driving up the speed and utility of research and allowing for a more immediate contribution to human knowledge and thus wider society and the economy.
A number of open access research publication models have been explored, with the most commonly used being:
- Green: an open access copy of the publication, or pre-print of a paper accepted for publication, is archived in an institutional or subject specific repository
- Gold: an open access copy of the publication is made available via the publisher’s website, often in return for an article processing charge being paid
Read more about types of OA publishing in our introductory guide to open access.
The Open Library of the Humanities is an example of an alternative model, where operating costs are met by an international consortium of libraries, and authors do not have to pay an article processing charge.
Research funders such as Research Councils UK (RCUK), the Wellcome Trust and the Gates Foundation now require that outputs be open access, with some mandating particular licenses and open access policies - for example the length of any embargo that will give the publisher exclusivity over a publication before it becomes open access.
The UK Scholarly Communications License (UK-SCL) and Model Policy (pdf) has been developed to simplify compliance with funder mandates and to address some of the issues that have arisen with rights ownership of scholarly outputs. Over 70 institutions and UK sector bodies have collaborated on the UK-SCL, and there is also considerable international interest in it.
Open peer review, open journals and the University Press
We are seeing growing interest in open peer review mechanisms. Perhaps in the future most publications will be deposited in pre-print archives like arXiv.org or F1000Research for immediate sharing and peer review, allowing for speedier and more accessible scholarly publishing.
Global academic publishing giant Elsevier has started to explore the value of pre-print sharing through its acquisition of the Social Science Research Network (SSRN).
In a similar vein, many collaborative research proposals and papers are already developed using online tools like Overleaf, Authorea, Microsoft Office 365 and Google Apps for Education. Perhaps in the future the norm will be to openly develop research ideas in order to solicit potential collaborators and also for peer review purposes - this approach has been pioneered by the Journal of Research Ideas and Outcomes (RIO Journal), launched in 2015.
Institutions will need to reflect on how they balance collaboration with competition in research proposals, projects and papers.
We have also seen a resurgence of interest in the university press, using modern online distribution mechanisms such as the system developed by the Open Journal Systems project. Institutions could run these as in-house services, or have the software hosted on their behalf. In a parallel move, the Wellcome Trust also recently announced its own online platform for research outputs that it has funded.
Institutions may wish to consider whether it is in their interests to take a more active role in publishing.
Privacy, confidentiality and commercial sensitivity in research data
Some classes of information are clearly sensitive, for example where national security or personally identifying information is involved, or where data arising from a research project has potential commercial applications. Examples of sensitive data might include personal medical information or in-depth ethnographic interviews.
Whilst our base assumption is now that research data will be shared openly, there will always be important exceptions. The RCUK Concordat on Open Research Data states that data should be ‘as open as possible, as closed as necessary’. For other research publications such as papers and book chapters these issues will have to be addressed independently of whether the resulting output is to be made open access.
Institutions may wish to check that research projects’ data management plans have robust protections for sensitive data, eg privacy and confidentiality.
Following on from the UK’s Hargreaves Review, the law was changed to explicitly create copyright exceptions for non-commercial research and private study, and to create a text and data mining exception for non-commercial research. Recent years have also seen the creation of new platforms like ContentMine, an open source text and data mining engine.
There is huge potential for automated text and data mining of research outputs such as papers, software and data. However, in many cases this will be difficult as these materials are not typically structured so as to facilitate machine processing and data extraction.
Domain specific pre-print repositories
Open working practices are already well established in several research domains. For example the arXiv.org site has been in operation since 1991 and now provides access to over a million pre-prints in particle physics, mathematics, computer science, quantitative biology, finance and statistics. In many other disciplines it is still early days, with projects to set up the ChemRxiv for Chemistry and engrXiv for Engineering announced in Summer 2016.
Institutions may be able to accelerate the adoption of pre-prints in new research domains by bringing researchers from those domains together with experienced pre-print advocates from other research domains.
Infrastructure supporting open access
With individual researchers and institutions’ reputations increasingly dependent on their research publication record, we have seen widespread interest in the Open Researcher and Contributor ID (ORCID) initiative and the Digital Object Identifier (DOI) system. These provide staff and their research outputs with unique IDs that are not dependent on information which may change over time, such as their institutional affiliation or the URL of a journal or institutional repository. Without developments such as DOIs and ORCID, the ‘link rot’ we see when websites are updated or decommissioned would pose a serious threat to the scientific record.
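As an illustration of how such identifiers can be handled in practice, the following is a minimal sketch rather than an official implementation. It assumes the check character scheme that ORCID documents for its iDs (ISO 7064 MOD 11-2) and the doi.org resolver. The function names are hypothetical, and the DOI shown is used purely as an example.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length and check character of an iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15].upper()


def doi_url(doi: str) -> str:
    """Build a persistent link that resolves a DOI through the doi.org proxy."""
    return "https://doi.org/" + doi


# Example iD taken from ORCID's own documentation; the DOI is illustrative only.
print(is_valid_orcid("0000-0002-1825-0097"))
print(doi_url("10.1000/182"))
```

A check like this only confirms that an identifier is well formed. It does not confirm that the iD or DOI has actually been registered.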
Researchers at Los Alamos National Laboratory and the University of Edinburgh tested a sample of 400,000 papers from arXiv.org as part of the Hiberlink project. They found that as many as 30% of the ‘http://’ web links no longer functioned, and that 65% of the remaining links pointed to resources that were not robustly archived.
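To see what these two figures imply when taken together, here is a rough back-of-envelope calculation. It assumes that both percentages describe the same population of links and treats the 'as many as 30%' figure as an upper bound, so the result is indicative only. The variable names are our own.

```python
dead_share = 0.30                  # links that no longer functioned (upper bound)
unarchived_share_of_rest = 0.65    # share of the remaining links not robustly archived

working_share = 1 - dead_share
working_but_unarchived = working_share * unarchived_share_of_rest

# Links that are either already dead or at risk because no robust archive exists
at_risk = dead_share + working_but_unarchived
# Links that still work and are also robustly archived
robust = working_share - working_but_unarchived

print(f"Dead or not robustly archived: {at_risk:.1%}")
print(f"Working and robustly archived: {robust:.1%}")
```

On these assumptions, roughly three quarters of the links would be either broken or vulnerable, leaving about a quarter both working and robustly archived.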
Institutions may wish to promote initiatives like DOI and ORCID in new staff and research student inductions and orientation training.
While global identifier registries, like those that support DOIs and ORCID, tend to be set up from the start with appropriate business models, there are other extremely well-used global services supporting open research that started out as projects, and that face a challenge sustaining themselves. Examples include SHERPA-RoMEO, a global registry of journal open access policies, and the Directory of Open Access Journals (DOAJ). Both of these are central to open access policies, and well-used by researchers and support professionals.
Institutions, and their representative organisations, may have a role in sustaining key services such as ORCID, SHERPA-RoMEO and DOAJ.
Where publications are either not available as open access, or under publisher embargo, other less formal techniques have sprung up. These include digitally enhanced versions of the time honoured technique of asking for a courtesy copy, such as the iCanHazPDF [4] social media hashtag or the 'copy request button', as described in this paper from the Eprints team. Many researchers also share materials and collaborate via professional networks such as ResearchGate (12 million users), Academia.edu (50 million users) and the UK’s own Piirus.
On another level entirely is the Sci-Hub site, which maintains illicit copies of over 60 million research papers. This article in Science reviews the Sci-Hub usage statistics and suggests that there may be a wide range of reasons for interest in the site - from institutions cancelling journal subscriptions to save money, to adjunct staff who do not have full access to institutional resources due to the casual nature of their relationship with the institution. There is also some evidence that Sci-Hub is being used due to its convenience and one-stop-shop nature.
Institutions may wish to provide researchers with advice and guidance on informal sharing, eg of pre-prints and courtesy copies.
The Creative Commons licenses have been widely embraced online far beyond academia, with over 38 million media files in the Wikimedia Commons alone.
Search engines often support filters for content tagged as open. For example, Google’s advanced search lets you specify license terms such as 'free to use, share or modify'. Dedicated media sharing and marketplace sites also often include filters for Creative Commons and public domain content, for example Flickr and 500px. Some sites provide only Creative Commons-licensed content, such as Wikimedia Commons and FreeSound.
There is a huge amount of freely reusable content available if you know where and how to look for it. Institutions may wish to emphasize this positive message in student and staff inductions.
For educators, the OER Commons provides a global clearinghouse for open educational resources, with over 50,000 resources at all levels from primary schools to secondary and tertiary/higher education. There are several other well populated sites providing OERs, including MERLOT, which hosts resources specifically designed for higher education.
Advising lecturing staff on potential OER content for use in their courses would be a natural extension of the role of technology-enhanced learning teams at institutions.
University and college staff will inevitably feel a strong sense of ownership over their work. This can lead to 'data hoarding' - for example in order to maximise the number of papers published before releasing the underlying data and thereby avoid being scooped by another researcher.
Whereas funders are increasingly mandating open access and open data for research, no equivalent mandate exists in the UK for teaching and learning materials at present.
Perhaps in the fullness of time we will see something like the US Trade Adjustment Assistance Community College and Career Training (TAACCCT) initiative in the UK. This supports community colleges to work with employers to develop training programmes that meet industry needs, with the requirement that material developed with TAACCCT funding is made available under a Creative Commons license.
Whilst there is no UK mandate equivalent to TAACCCT, there is nothing to prevent institutions from collaborating on areas of common interest. For example the FE Sussex consortium of colleges worked with us to develop a library of shared health and safety resources.
One of the key criticisms of a bottom-up publishing model driven by institutions and individuals is that publishers add significant value through activities such as copy editing, proofreading and marketing that may be difficult for what is often a volunteer workforce to replicate. However, there is some evidence that a sustainable approach to open textbooks is possible. Creative Commons-licensed textbooks have been a major trend in Canadian and US Higher Education, through initiatives like the BCcampus Open Textbook Project.
Mercy College worked with Lumen Learning to deliver an OER-based Algebra course which saw savings of over $125,000 per year in textbook costs, and student pass rates improving from 48% to 69%.
These findings are backed up by a landmark study of the impact of OER materials on the learning outcomes of 16,000 post-secondary students in the United States, which found that:
"In three key measures of student success - course completion, final grade of C- or higher, course grade - students whose faculty chose OER generally performed as well or better than students whose faculty assigned commercial textbooks."
Notable UK initiatives in this area are the computer science curriculum for schools (pdf) from the British Computer Society’s Computing At School group, and the open textbook for Key Stage 3 computing on the Wikibooks open textbooks site.
Open textbooks have huge potential for institutions in terms of supporting widening participation and access to further and higher education.
We will look at them further below when we consider what Jisc could do next.
Open access to research
We have a number of well established services supporting the take-up of open access, including:
- SHERPA RoMEO - publisher copyright and self-archiving policies
- SHERPA Juliet - research funder open access policies
- SHERPA FACT - check whether a journal meets funder compliance requirements
- SHERPA REF - check whether a journal meets Research Excellence Framework (REF) open access compliance requirements
- CORE - search metadata or full text of over 70 million items of open access content from around the world
We also worked with the Higher Education Funding Council for England (HEFCE) and RCUK to develop the RIOXX metadata profile, now in use by over 60 institutional repositories in the UK. This helps institutions to use consistent funder and project/grant identifiers, which in turn aids in tracking of research outputs.
Our IRUS-UK service aggregates institutional repository usage statistics to give a high level national view and facilitate benchmarking between institutions.
We have negotiated agreements with major publishers such as John Wiley and Sons and Taylor & Francis to offset the costs of article processing charges against institutions’ subscriptions. We also worked with Springer to develop the Springer Compact agreement. This gives researchers the ability to publish articles in over 1,600 Springer journals without cost or administrative barriers, by combining open access publishing and subscription access into a single annual fee. Our Monitor UK and Monitor Local services have tools to help institutions track and report on their open access publishing activities.
Improved data practices
Data about citations within academic literature are often used (as described in James Wilsdon’s The Metric Tide report) as a proxy for the quality of research, but the underlying data and algorithms are not openly available. This means that data quality is uncertain, and it is difficult to be clear that it is being used in a reliable or replicable way.
As a way of investigating means to address this issue, we supported the startup phase of the Open Citation Project, whose Open Citations Corpus now contains some 750,000 citation links harvested from research literature. These are available as open linked data using the Resource Description Framework (RDF) standard. We are also working with the semantometrics project from the Open University, which is exploring a full text based approach to analysing the 'value' of a publication, based on semantic similarity between it and papers published subsequently.
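To give a sense of what a citation link looks like as open linked data, the sketch below writes one citation as an RDF triple in N-Triples syntax. It assumes the `cites` property from the CiTO citation ontology and uses made-up DOIs under a test prefix. The helper name is hypothetical, and the Open Citations Corpus's actual data model may well differ.

```python
# CiTO's "cites" property, used here as the predicate of the triple (assumption)
CITO_CITES = "http://purl.org/spar/cito/cites"


def citation_triple(citing_doi: str, cited_doi: str) -> str:
    """Return a single N-Triples line: <citing paper> cito:cites <cited paper> ."""
    subject = f"<https://doi.org/{citing_doi}>"
    predicate = f"<{CITO_CITES}>"
    obj = f"<https://doi.org/{cited_doi}>"
    return f"{subject} {predicate} {obj} ."


# Hypothetical DOIs, for illustration only
print(citation_triple("10.5555/example.citing", "10.5555/example.cited"))
```

Because each link is a self-contained statement about two identified papers, links harvested from different sources can be merged and queried together.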
UK Data Service partnership
We are a key partner in the UK Data Service, which provides unified access to the UK’s largest collection of social, economic and population data resources. The service, funded by the Economic and Social Research Council (ESRC) has over 6,000 datasets including census data, government surveys, longitudinal studies and business microdata.
Research data sharing and discovery
With institutions now being required by many funders to make the data from their research available as open data, we have been developing a research data shared service. This will save institutions the time and expense of setting up their own research data management facility. It will give researchers the ability to both store their data and make details of it, known as 'metadata', available to aid with discovery.
In parallel we have been developing a prototype research data discovery service. This is a 'one stop shop' for research data that participating institutions have made available. We believe that this new service will significantly simplify the discovery and reuse of research data.
Research data policy, standardisation and best practice
Institutions have reported difficulties in understanding and complying with multiple funder data policies. We continue to support initiatives to clarify and standardise funder policies, most recently via the RCUK Concordat on Open Research Data.
Our role in promoting consensus in research data practice is also demonstrated in our report, directions for research data management (pdf), which brings together representative bodies from research support functions.
We, alongside the Digital Curation Centre, have played an important role in the development of research data management practice.
Equipment data sharing
We have also worked with the Engineering and Physical Sciences Research Council (EPSRC) and the University of Southampton to develop equipment.data, a national equipment sharing portal. This site gathers open data about high value capital equipment available for sharing from over 40 universities and research institutes. At the time of writing, the site made details of over 10,000 items of equipment available.
With most items being valued at £20,000 or over, equipment.data helps researchers to discover and share over £200m worth of equipment.
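The headline figure can be checked with a quick calculation. This is only a rough estimate: it assumes that every listed item is worth the £20,000 threshold, whereas the text says only that most items are valued at that level or above. The variable names are ours.

```python
items_listed = 10_000              # items of equipment described on the site
assumed_value_per_item = 20_000    # GBP, the threshold quoted for most items

estimated_total = items_listed * assumed_value_per_item

print(f"Estimated value of listed equipment: £{estimated_total / 1_000_000:.0f}m")
```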
Sharing equipment both reduces time to science through reuse of existing facilities and supports institutional efficiency initiatives. The equipment.data site in turn feeds into the Konfer brokerage service from the National Centre for Universities and Business, connecting academia and industry.
Our new Jisc app and resource store, the successor to Jorum, supports sharing of open educational resources alongside commercially licensed material, and includes a large proportion of Jorum content - such as material developed by the projects chartered under our interactive learning resources for skills programme.
We have also supported the development of OERs and the capability/capacity building required to fully exploit them.
Alongside the Higher Education Academy, we supported a three-year UK open educational resources (UKOER) programme to investigate the opportunities that OER offered UK academics, institutions and subject groups. This led to a huge variety of activity - described in the UKOER programme evaluation - and drew together a large and vibrant UKOER community, which now runs an annual OER conference, supported by the Association for Learning Technology (ALT).
UKOER also proved to be an incubator for the first massive open online course (MOOC) offered in the UK - Phonar, initially based at Coventry University. The project evaluation report (pdf) examines the context and conditions that led to this innovation, and offers guidance to others using open practices to drive innovation in delivery.
Read our quick guide to enhancing your online learning provision.
Beyond the pdf
In spite of the extent to which digital technologies are used in teaching and learning and research, we still tend to think in terms of printed publications, handouts of lecture notes and so on.
For example, research papers are still typically shared in Adobe’s Portable Document Format (pdf). This is not useful to someone who wishes to remix or reuse particular elements of the document, such as a figure or the data that was used to create it.
Whilst some publishers now offer 'enhanced pdfs' via services like ReadCube, we believe that a massive opportunity exists to rethink the whole concept of a publication. This would allow us to exploit the power and capability of digital technologies to facilitate reuse not just of data, but also of whole scientific workflows.
Support for open textbooks
We have seen research above which indicates that students may be significantly less likely to drop out of their course if using OER textbooks - suggesting that the high cost of textbooks may be a factor in retention. The OER ecosystem was also identified as one of the top ten strategic technologies impacting on higher education in a 2016 Gartner report. However, the platforms and tools that exist at present are not especially well-suited to the discovery and remixing of OER content.
At Jisc we are well-placed to facilitate the discovery and adaption of quality open content through our new app and resource store. This is the ideal place to showcase UK relevant high quality open textbooks, alongside their OER building blocks and commercially licensed content such as our e-books for FE service. By rating and reviewing resources, users will be able to support peer-led discovery and develop collaborative projects to adapt or update key resources. Teachers and learners may well still require hard copy of open textbooks, and a national print-on-demand agreement would be a natural next step to enable this.
We should also keep in mind that the web was originally intended to support collaborative annotation of content, and indeed the very first web browser included this capability. Latter day annotation tools like hypothes.is may well give the readers of open textbooks an effective way to share their insights into the text.
Living in the open
Whilst the open science and open educational practices we have described in this report are well established in some subject domains, they are still new and potentially threatening and disruptive to many people. To help their staff adapt to these new ways of working, institutions have tended to create initiatives around specific aspects such as open access and research data management.
We believe there is a need to address the wider cultural shift by ensuring that both institutional leaders and practitioners fully understand the implications of open practices - for example, the positive citation impact of open access publications. This would be a natural extension of our current work to help institutions build and sustain their digital capability.
It is also important that the UK continues to play an active role in international standards setting and collaboration through participation in initiatives such as FORCE11, Learning Registry and OpenAIRE.
We hope you have found this report useful and would love to hear your thoughts on our three big ideas.
Please do get in touch with us to discuss, using the contact details below.
- 1 Open access (OA) means making research publications freely available so anyone can benefit from reading and using research. Read more in our introduction to open access https://www.jisc.ac.uk/guides/an-introduction-to-open-access
- 2 This material is based on original writing by David Wiley, which was published freely under a Creative Commons Attribution 4.0 license
- 3 Linux kernel definition on Wikipedia https://en.wikipedia.org/wiki/Linux_kernel
- 4 #ICanHazPDF is a hashtag used on Twitter to request access to academic journal articles which are behind paywalls. Read more on Wikipedia https://via.hypothes.is/https://en.wikipedia.org/wiki/ICanHazPDF