Some of the risks of storing information in the cloud are covered elsewhere in this resource. Issues relating to the ongoing protection of personal data under the Data Protection Act have tended to be at the forefront of people’s considerations, but it is important to recognise that this is not the only potential challenge. Here Steve Bailey summarises some of the issues from a preservation/retrieval and research perspective, and the topic is covered further in the accompanying video.
Our summary of risks of cloud computing also looks at some of the practical problems the cloud may pose in relation to freedom of information (FOI) requests, not least because they serve as an example of some of the more deep-seated shifts in the relationship between the organisation and the information it creates that moving to the cloud represents. At the heart of compliance with the FOI Act is the ability to identify and access all the information pertinent to the request in question.
This often proves challenging even when the institution physically holds the information in question, spread as it may be between a mixture of systems and filing structures: some central, some held within local departments or even by individual members of staff. In theory at least, however, the institution in the pre-cloud world possesses the ability to manage and search for information centrally, and in recent years there has been a general drift within institutions towards such centralisation. This provides the means to undertake comprehensive searches from one place across the entire institution (or at least most of it) to identify and retrieve all the information pertaining to subject X, Y or Z.
As we have already seen, ‘the cloud’ covers a multitude of possible configurations, services and suppliers, and it is highly unlikely that an institution will restrict its data storage to just one. Indeed, the business model for many such service providers is to specialise in storing and managing one particular type of information: YouTube for video clips, Flickr for photos and so on, each requiring different accounts (possibly many within the same institution) and each providing different ways to describe and search for information.
In this heterogeneous world there simply isn’t the means to search centrally for ‘all information we have relating to subject X’. Indeed, it is highly unlikely that the institution will even know that a particular department or individual within it has chosen to create and store information pertinent to an FOI request somewhere in the cloud.
This same heterogeneity threatens to cause problems across the entire information management landscape, one which strives to establish a consistency of approach across information wherever and however it is stored. Fundamental tenets of good information and records management such as the consistent application of access controls and the management of the retention and deletion of records will be made vastly more difficult to achieve across disparate, unconnected systems and providers.
This separation of information into format-specific silos also threatens to have wider and more profound implications in the future. For as long as there have been historic records, archives and other trusted repositories have sought to arrange and store records according to the original order in which they were created. In doing so we preserve their provenance and allow future researchers to reconstruct the narrative record as it was at the time the records were created. In essence this means management by the record’s subject, not its format. This approach is visible in historic collections (for example in the collected papers of a famous alumnus) and also in modern records management (where we would look to arrange all committee papers in a series, for example, rather than looking to arrange all Microsoft Word documents together).
But the logic of the cloud threatens to undo this established order, with emails being stored by one service provider, documents by another and multimedia resources by a multitude of others, with nothing to connect these disparate threads together. Many of our institutions are old, some ancient. Many have historical collections and archives of national, if not international, importance stretching back many centuries, all carefully arranged according to these tried and tested theories of provenance and original order.
But if we embrace the cloud as fully as many would like us to, it is debatable indeed whether future generations will be able to benefit from these riches in the same way that researchers can today. Of course we appreciate that such a long view may not be at the forefront of most managers’ minds when immediate operational decisions need to be made, but all the same it is perhaps worth at least pausing for a moment or two to consider the (potentially unintended) legacy of our decisions before the die is cast.
Spanning this divide between problems of the future and the priorities of today comes the issue of digital preservation and ongoing access to information stored in the cloud. The challenges implicit in ensuring ongoing access to digital information in a changing world are never simple, but are compounded further when left to a third party. At the very least it is worth being certain that the service providers you choose in the cloud have the skills and inclination to continue to store your information for as long as you require, perhaps even indefinitely. It is also worth bearing in mind during such discussions that a historian’s or archivist’s view of what constitutes ‘historic’ is likely to be very different from that held within the notoriously short-term IT industry.
Finally, it is worth noting that for centuries we have handed over our most important historic records to the care of trusted, expert and non-commercial custodians. As things currently stand in the cloud, this is a role that we would be handing over to a range of often small, transitory commercial businesses. What should happen if and when they disappear (potentially with little or no prior warning) should be of concern not only to the researcher of tomorrow, but also to the manager of today.