"The biggest challenge has been – and still is – the issue of data consistency. During the course of the project it has become apparent that there is no centralised, formal documentation of the different sources of data. The knowledge about data is dispersed across different people in different departments"
The Open University
The conventional wisdom when it comes to implementing information systems tends to be that you should seek to resolve all information/data quality issues before you consider implementing any new technologies. Indeed, this is the logic underpinning stage two of our BI implementation model – that it is sensible to address questions of data governance, ownership and quality at an early stage of your project, and certainly prior to installing any technology.
In an ideal world this advice still holds true, and it certainly makes sense to surface and resolve as many of these issues as you can, to provide as solid a foundation for your BI solution as possible. However, the experience of those who have implemented BI systems also suggests the need for a more pragmatic approach, one which acknowledges that no data will ever be perfect, and that holding back until every data problem is resolved could derail the project for years.
Furthermore, experience suggests that few things are as powerful at surfacing long-suspected, or even known but ignored, data quality issues as a BI project, which rapidly exposes any such deficiencies to the full glare of senior management scrutiny.
1. Data ownership is often complex. A successful BI project resolves any data ownership issues early in the project
Virtually every one of the Jisc-funded projects encountered issues surrounding data ownership. Historically, in many organisations, where a data item was used for a single purpose its ownership was treated as a non-question, or simply irrelevant.
However, as information silos are progressively broken down, users of a particular data item need to know who to approach for access to the data, and who can take remedial action in the event of problems with it. All the projects that confronted this challenge secured timely access to their data through a variety of strategies.
2. The meaning of a data item is often unclear and lacking in rigour; make sure that all data definitions are understood and agreed
A related but different question that several projects had to address was the precise definition of a data item; ie what information does this item hold, and what is the range of valid values for it? In one example, the Open University project needed to understand the range of codes associated with a student withdrawing from a course, and discovered not only that some of the differences between values were extremely subtle, but that different course administrators within the organisation interpreted the values differently.
As a result, two students withdrawing from different programmes for exactly the same reason could potentially appear in the student record system with different withdrawal codes. Often these differences in interpretation were for good reasons and within the ‘local’ operation made sense; it was only from an organisation-wide perspective that the problems arose.
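The inconsistency described above can be sketched in code. The sketch below assumes a mapping from department-specific codes to an agreed organisation-wide coding frame; all the codes, departments and labels are invented for illustration, not the Open University's actual values.

```python
# Hypothetical illustration: the same real-world withdrawal reason may sit
# behind different codes in different departments, and the same code may
# mean different things locally. A shared mapping resolves both problems.
LOCAL_TO_CANONICAL = {
    ("arts", "W1"): "WITHDRAWN_PERSONAL",
    ("science", "W4"): "WITHDRAWN_PERSONAL",   # same reason, different local code
    ("arts", "W2"): "WITHDRAWN_ACADEMIC",
    ("science", "W2"): "WITHDRAWN_FINANCIAL",  # same code, different local meaning
}

def canonical_code(department: str, local_code: str) -> str:
    """Resolve a department-specific code to the agreed canonical value."""
    try:
        return LOCAL_TO_CANONICAL[(department, local_code)]
    except KeyError:
        # Surface unmapped codes rather than silently guessing.
        raise ValueError(f"No agreed mapping for {department!r}/{local_code!r}")

print(canonical_code("arts", "W1"))     # WITHDRAWN_PERSONAL
print(canonical_code("science", "W4"))  # WITHDRAWN_PERSONAL
```

The point of the sketch is that the 'local' interpretations remain visible and documented; only the organisation-wide view is unified.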
Both the University of Bolton and the University of Liverpool responded to this challenge by instigating a formal data definition process which sought to establish a data dictionary recording details of ownership, semantics and coding frames. Rather than attempting the highly ambitious and high-risk strategy of creating an all-encompassing data dictionary up front, Liverpool opted for an ongoing process which requires every item to be documented when it is encountered for the first time.
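The incremental 'document on first encounter' process might look something like the following sketch. The entry fields (owner, definition, coding frame) come from the description above; the class names and API are assumptions for illustration, not Liverpool's actual tooling.

```python
# A sketch of an incrementally built data dictionary: an item must be
# documented the first time it is encountered, and is reused thereafter.
from dataclasses import dataclass, field

@dataclass
class DataItem:
    name: str                  # e.g. "withdrawal_code"
    owner: str                 # who to approach about this item
    definition: str            # agreed semantics
    coding_frame: dict = field(default_factory=dict)  # valid values

class DataDictionary:
    def __init__(self):
        self._items = {}

    def ensure_documented(self, name: str, **details) -> DataItem:
        """Record an item on first encounter; return the entry thereafter."""
        if name not in self._items:
            if not details:
                # First encounter with no documentation supplied: fail loudly.
                raise KeyError(f"{name!r} is undocumented - capture it now")
            self._items[name] = DataItem(name=name, **details)
        return self._items[name]

dd = DataDictionary()
item = dd.ensure_documented(
    "withdrawal_code",
    owner="Student Records",
    definition="Reason a student left a module before completion",
    coding_frame={"W1": "Personal", "W2": "Academic"},
)
print(item.owner)  # Student Records
```

The design choice mirrors Liverpool's pragmatism: the dictionary only ever grows as far as the data the project actually touches.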
3. For a key performance indicator (KPI) to be successfully adopted it is necessary that there is agreement on how it is derived
In addition to the above situations, the University of Central Lancashire was surprised to be confronted with disagreements amongst managers on the fundamental question of how the KPIs they were seeking to monitor were defined in terms of the data held by the University's corporate systems. Addressing this issue was challenging, and required the project to adopt a bottom-up approach to developing each KPI, establishing the base data sources and detailed rules for processing the data.
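A minimal sketch of that bottom-up approach follows: the KPI is expressed directly as a documented rule over base records, so disagreements can be settled by debating the rule rather than the resulting number. The 'retention rate' KPI, the data and the rule are all invented for illustration, not UCLan's actual indicators.

```python
# Hypothetical base records drawn from a corporate system.
students = [
    {"id": 1, "status": "completed"},
    {"id": 2, "status": "withdrawn"},
    {"id": 3, "status": "completed"},
    {"id": 4, "status": "transferred"},  # agreed rule: transfers count as retained
]

def retention_rate(records) -> float:
    """Agreed processing rule: retained = any status except 'withdrawn'."""
    retained = sum(1 for r in records if r["status"] != "withdrawn")
    return retained / len(records)

print(retention_rate(students))  # 0.75
```

Making the rule explicit in this way is what turns a contested headline figure into something the data sources can actually support.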
4. Wherever possible try to have an agreed ‘single point of truth’
Many projects used variations of the term ‘a single point of truth’ to establish which data source was authoritative, and could therefore be used to build their predictive models and other outputs. While this process was, for the most part, consensual, the Liverpool approach showed that ultimately a robust mechanism is needed to establish a definitive source on which trusted models can be built.
5. Metadata can be an issue
For the University of Huddersfield the challenges of data definition were slightly different. Their strategy was to build a presentation layer which was completely independent of the data being passed to it. Consequently they required a ‘data definition language’ which defined the objects being passed to the front-end and which could then render the objects as XML. The development of this mechanism formed a significant part of their project.
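The idea of a presentation layer that is independent of the data passed to it can be illustrated as follows: any flat record can be rendered to XML given only a simple definition of its fields. This is a sketch of the general technique, not Huddersfield's actual data definition language; the object and field names are invented.

```python
# Render an arbitrary record as XML, driven entirely by a field definition.
# The front end depends only on the definition, never on the source system.
import xml.etree.ElementTree as ET

def render(object_name: str, definition: list, record: dict) -> str:
    """Render `record` as XML, emitting only the fields in `definition`."""
    root = ET.Element(object_name)
    for field_name in definition:
        child = ET.SubElement(root, field_name)
        child.text = str(record.get(field_name, ""))
    return ET.tostring(root, encoding="unicode")

definition = ["id", "title", "enrolled"]
record = {"id": 101, "title": "Data Governance", "enrolled": 42}
print(render("module", definition, record))
# <module><id>101</id><title>Data Governance</title><enrolled>42</enrolled></module>
```

Because the renderer knows nothing beyond the definition, new objects can be passed to the front end without changing the presentation layer itself.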
The University of Sheffield had to deal with datasets from sources such as data.gov.uk. The project found that the data available was generally usable, but that consistency from one dataset to the next presented challenges, and that the metadata describing the datasets was not completely consistent.
6. Ensure that data structures are flexible enough to adapt in a changing environment
The University of East London encountered a significant issue when there was a major restructure of academic departments part of the way through the project which resulted in data having to be recoded. In retrospect the project felt that they could and should have anticipated such an event and started with a data structure which was adaptable enough to cope with such an eventuality.
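One way to build in the adaptability the project wished it had started with is to record organisational structure with effective dates (a simple slowly changing dimension), so that a restructure adds rows rather than forcing historical records to be recoded. The sketch below assumes invented unit and department names and dates; it is one possible design, not what East London actually implemented.

```python
# Effective-dated mapping of units to departments: a restructure appends a
# new row and closes the old one, leaving historical records untouched.
from datetime import date

# (unit, department, valid_from, valid_to) - open-ended rows use None.
DEPT_HISTORY = [
    ("computing", "School of Technology", date(2005, 1, 1), date(2011, 7, 31)),
    ("computing", "School of Engineering", date(2011, 8, 1), None),
]

def department_on(unit: str, when: date) -> str:
    """Return the department a unit belonged to on a given date."""
    for u, dept, start, end in DEPT_HISTORY:
        if u == unit and start <= when and (end is None or when <= end):
            return dept
    raise LookupError(f"No department recorded for {unit!r} on {when}")

print(department_on("computing", date(2010, 6, 1)))  # School of Technology
print(department_on("computing", date(2012, 6, 1)))  # School of Engineering
```

With this shape, reports before and after the restructure both resolve correctly, and no recoding exercise is needed.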