The web is an accommodating and forgiving place. All manner of information is stored on web servers all over the world, providing instant global access to web pages.
It is such a convenient channel of communication that it is easy to forget that it can be a pretty cranky place when you start to follow links, particularly if you try to find older material. All too often you find yourself at a dead end.
The ‘404 error message’ will probably be familiar to almost all web users and signals that the page or the file you want is no longer associated with the web address (URL) displayed in the web browser. Although the neighbouring pages may be accessible and functioning, the error message indicates that the file you are trying to access has been moved or deleted and there is no forwarding address. This phenomenon is referred to as ‘link rot’.
Link rot is a big problem
Like rotten timbers in a house, it can be a sign of neglect or under-investment. It is inconvenient for the user who is then unable to get to the resources they want. It can also give the impression that the organisations involved are not paying sufficient attention to the integrity of the links. So link rot is not only a practical problem but also a reputational issue.
…it’s having a big effect…
Universities and colleges were amongst the first types of organisations to start engaging with the web, and over the years some web servers have come to be used as file stores or repositories. Used this way, these servers are relied upon to provide long-term access to information about research projects or learning resources. However, this is not a good solution for the long-term management of such resources. When a project finishes and staff move on, content can get shifted around or deleted, and this is when link rot sets in.
…but universities and colleges can protect themselves…
The best way of tackling the problem is to make sure that different types of resources are managed according to pre-established policies and that there is infrastructure in place to store and archive them appropriately. It is the job of the institutional repository or digital asset management system (not the web server) to store documents for the long term.
If a website has to be redesigned or restructured, care should be taken not to break old links. A series of “redirects” should be set up to tell the web browser that the requested page is now to be found at a new location. As a first step, it is easy to check for dead links on a website. The Online Broken Link Checker is very easy to use and the free version will check up to 3,000 pages in one scan.
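For a short list of known addresses, the same kind of check can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the Online Broken Link Checker works: the function names `link_status` and `find_dead_links` are hypothetical, only the standard library is used, and a `HEAD` request is assumed to be enough (some servers reject `HEAD`, in which case a `GET` would be needed).

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def link_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if the server cannot be reached.

    Redirects are followed automatically, so a page that has been moved
    with a working redirect reports the status of its new location.
    """
    request = Request(url, method="HEAD",
                      headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error such as 404 (not found).
        return err.code
    except URLError:
        # No answer at all: unknown host, refused connection, timeout.
        return None


def find_dead_links(urls, check=link_status):
    """Return (url, status) pairs for links that look broken.

    A link counts as broken when there was no response or when the
    status code is 400 or above. `check` can be swapped for testing.
    """
    dead = []
    for url in urls:
        status = check(url)
        if status is None or status >= 400:
            dead.append((url, status))
    return dead
```

Calling `find_dead_links(["https://example.org/old-page"])` would return an empty list if the page answers normally, or a list containing the address and its status code (for example 404) if it does not.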
…and work is being done to mitigate it
The process of depositing items into a well-managed digital environment should include assigning a “persistent identifier”. There are a number of schemes in place that can be used to uniquely identify digital objects and provide stable addresses for them over time, for example the Digital Object Identifier (DOI) system. Collaboration with national or local web archiving initiatives can help, as can redirects to an archived copy of a resource if it is no longer available on a “live” website.
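To make the idea of a stable address concrete, the sketch below turns a DOI into a link through the public resolver at doi.org, and builds a Wayback Machine address for a page that has disappeared from the live web. The function names are hypothetical, and the Wayback Machine address pattern is an assumption based on how its public links are normally formed rather than anything specified in this article.

```python
from urllib.parse import quote


def doi_to_url(doi):
    """Build a resolver link for a DOI such as '10.1000/182'.

    Common prefixes are stripped first so that 'doi:10.1000/182' and
    'https://doi.org/10.1000/182' both give the same result.
    """
    doi = doi.strip()
    for prefix in ("doi:", "https://doi.org/", "http://dx.doi.org/"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
    # Keep the slash between prefix and suffix; escape anything unusual.
    return "https://doi.org/" + quote(doi, safe="/")


def archived_copy_url(url, timestamp=None):
    """Build a Wayback Machine address for url (assumed address pattern).

    timestamp is an optional string of digits such as '2015' or
    '20150401120000'; without it the archive chooses a capture itself.
    """
    base = "https://web.archive.org/web/"
    if timestamp:
        return base + timestamp + "/" + url
    return base + url
```

A redirect on the old address could then point at the value returned by `archived_copy_url`, so that visitors following an old link reach the archived copy rather than a 404 page.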
This article originally featured in issue 40 of Jisc Inform (via the Wayback Machine).