Computer code (or software) is a series of instructions written in a human-readable language, usually stored in a text file. In the Digital Preservation Coalition's (DPC) list of digitally endangered species, research software is classified as critically endangered. It is easy to understand why: software is often treated as a means to an end rather than as part of the research itself. Yet software can also be a research output and, as such, it should be managed, curated and preserved.
What computer code should be saved?
Not all code written within a project is worth preserving, so decide early which software merits keeping or sharing with other researchers. A few questions you can ask yourself:
- Is there motivation for preserving the software?
- Do the predicted benefit(s) exceed the predicted cost(s)?
- Are the necessary capability and capacity available?
If your computer code is in a public GitHub repository, the Software Heritage archive takes care of saving it automatically. On their website, you can check whether they have already archived your work.
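The same check can be scripted. The following is a minimal sketch, assuming the Software Heritage public Web API exposes an origin lookup at `/api/1/origin/<repository URL>/get/` and answers with HTTP 404 when the repository is unknown. The function names are illustrative, not part of any official client, and the endpoint should be confirmed against the current API documentation before relying on it.

```python
import urllib.error
import urllib.request

# Assumed base URL of the Software Heritage public Web API.
SWH_API = "https://archive.softwareheritage.org/api/1"


def origin_lookup_url(repo_url: str) -> str:
    """Build the lookup URL for a repository (an "origin" in archive terms)."""
    # Assumption: the repository URL is placed in the path as-is.
    return f"{SWH_API}/origin/{repo_url.strip()}/get/"


def is_archived(repo_url: str, timeout: float = 10.0) -> bool:
    """Return True if the archive knows the repository, False on HTTP 404."""
    request = urllib.request.Request(
        origin_lookup_url(repo_url),
        headers={"Accept": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        # Other errors (rate limiting, server faults) are not a "no".
        raise
```

Calling `is_archived("https://github.com/<owner>/<repo>")` with your own repository URL performs the network request. Any error other than 404 is re-raised, so a temporary failure is not mistaken for "not archived".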
Dealing with computer code during a research project
When planning and conducting research using software, consider the following:
- Treat computer code like any other output of your research. It should be part of your research data management plan or have a tailored management plan.
- Prepare documentation and metadata for your computer code. This includes not only inline comments, but also dependencies and their versions. Documentation should be clear enough for others to replicate the features of your code.
- Share your computer code like you would any other research output. You can share it via a repository (e.g. GitHub) or in a journal. We highlight the Journal of Open Research Software (JORS, Open Access) by the Software Sustainability Institute, as well as their list of suitable alternatives by field. Alternatively, the Code Ocean platform lets researchers share and run the code they publish along with data and articles.
- Computer code should have a URL or a DOI (digital object identifier). Always include it when citing the code, together with the version you used. Note that code deposited on Code Ocean receives a DOI. Similarly, it is possible to obtain a DOI for computer code on GitHub using Zenodo.
- Apply a suitable licence. Computer code differs from research data, so licences designed specifically for software exist (for example MIT, Apache 2.0 or the GNU GPL).
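Two of the points above, recording dependencies with their versions and making the code citable with a version and DOI, can be partly automated. The sketch below is one possible approach, not a prescribed workflow: it uses Python's standard `importlib.metadata` to pin installed package versions and renders a minimal citation file in the Citation File Format (`CITATION.cff`). The function names, the project title, the author and the DOI in the usage note are hypothetical placeholders.

```python
import json
from importlib import metadata


def pinned_dependencies(names):
    """Return 'name==version' lines for the packages listed in `names`."""
    pins = []
    for name in sorted(names, key=str.lower):
        try:
            pins.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Keep a visible note rather than silently dropping the package.
            pins.append(f"# {name}: not installed in this environment")
    return pins


def citation_cff(title, version, doi, released, licence, authors):
    """Render a minimal CITATION.cff document as text.

    `authors` is a list of (family_name, given_name) tuples and
    `released` is a date string in YYYY-MM-DD form.
    """
    quote = json.dumps  # JSON string quoting is also valid YAML quoting
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {quote(title)}",
        f"version: {quote(version)}",
        f"doi: {quote(doi)}",
        f"date-released: {released}",
        f"license: {quote(licence)}",
        "authors:",
    ]
    for family, given in authors:
        lines.append(f"  - family-names: {quote(family)}")
        lines.append(f"    given-names: {quote(given)}")
    return "\n".join(lines) + "\n"
```

For example, `citation_cff("Example Analysis Toolkit", "1.2.0", "10.5281/zenodo.0000000", "2024-01-31", "MIT", [("Doe", "Jane")])` returns text that can be saved as `CITATION.cff` in the repository root. The output of `pinned_dependencies` can be written to a `requirements.txt` file stored alongside the code.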
Since software is a complex field, many of its aspects fall outside the scope of this toolkit. However, in 2018, the Software Sustainability Institute developed a set of complementary guides covering the main aspects of depositing software into digital repositories. In addition, they list several purposes, benefits and scenarios for sharing software.
Further reading
- Software Sustainability Institute - How does software fit into EPSRC’s research data policy?
- Software Sustainability Institute - How to cite and describe software
- The Internet Archive Software Collection
- Software Carpentry - Data management (video lecture)
- Nature Neuroscience - Toward standard practices for sharing computer code and programs in neuroscience
- Open Working - Workshop Report: Software Reproducibility – How to put it into practice?
- Software Sustainability Institute - Digital preservation and curation - the danger of overlooking software
- Software Sustainability Institute - Top tips