Project workflow
This section describes the overall workflow for the project, bringing together several elements from previous sections of this report. It includes a project workflow diagram; a work plan, showing the project’s main activities and work packages (WP) on a Gantt chart; and a detailed description of the work to be undertaken within each work package.
The following diagram illustrates the overall workflow proposed for this project, from pamphlet selection to discovery by users. The major work packages (WP4-8) are also indicated.
Figure 4.1 Project workflow
The Gantt chart outlines the likely timing of the main activities and work packages (WP) proposed for this project. This timetable would be confirmed in the Project Plan to be prepared under WP2. See work packages for a description of each work package in detail.
Figure 4.2 Proposed work plan
See the work plan
This section provides more detail on each work package, including its timeframe, leader, objectives, main outputs, and a discussion of particular issues.
WP1 pre-project activities
Timeframe: Oct-Dec 2006
Objective: To enable the project to begin promptly in January 2007.
Lead: Southampton
Main outputs: Staff in post and management groups constituted for a January project start. Libraries have identified IPR issues and begun any clearance activities.
Should the bid succeed, some work would be initiated immediately: (a) staffing secondments and recruitment would be organised; (b) the project management groups (described in WP2) would be constituted; and (c) library partners would be asked to identify materials that may be in copyright and begin any clearance required.
WP2 project management activities
Timeframe: Jan 2007-Dec 2008
Objective: To ensure that all the Work Packages in the project are managed coherently and that all the project outputs are delivered within agreed deadlines and budgets.
Lead: Southampton
Main outputs: detailed Project Plan; MoUs between partners; liaison with and between all partners; regular reports and other documentation; attendance at JISC programme meetings; overall QA and risks monitoring; editing of the project web site; dissemination programme.
This scoping study has provided much of the groundwork for the project and would help inform a more complete and detailed Project Plan. Other early documentation would include Memoranda of Understanding (MoUs) between partners and a project website.
The project would conform to the JISC’s Project Management Guidelines and closely follow the PRINCE2 methodology. The core Project Team would comprise a Project Manager (0.5 – to be seconded from TASI at the University of Bristol); a Technical Project Manager (0.5 – the current manager of BOPCRIS); and a Project Officer (1.0 – also from BOPCRIS). The Project Manager would take responsibility for overall project monitoring, risk management, reporting, liaison and dissemination. The Technical Project Manager would be responsible for managing the production processes, quality assurance, and liaison with other partners over technical standards. The Project Officer would receive the pamphlets and track them through the BOPCRIS production system, and would also maintain the database used to aid the selection and de-duplication of pamphlets. A Software Developer (0.5) would also be employed to undertake WP3 development and provide support for other packages.
This core team would be supported by two groups: (a) the Project Management Group, which would include the Project Director (chair), project managers, and representation from among the partners and JISC, and would meet regularly and as required to oversee the project and manage any exceptional circumstances; and (b) the Project Steering Group, which would offer strategic oversight. The Steering Group would be chaired externally and would comprise the Project Director and managers, representation from the partners, JISC, research councils, and senior members of the research, teaching, and information communities. The Steering Group would meet at the beginning, middle, and near the end of the project. It would provide advice and contribute to maintaining a high level of visibility for the project within the UK and internationally.
WP3 development
Timeframe: Jan 2007-Aug 2007
Objective: To make adjustments to existing systems or develop new systems to support the major work packages: i.e. WP4-8.
Lead: Various
Main outputs: adjustments to the BOPCRIS production system, JSTOR delivery system, and Copac database; project database for libraries.
A key aspect of this project is its use of existing infrastructure, including production systems, preservation and delivery systems, and discovery services. This avoids the considerable expense and delay of developing new systems. Some development work, however, would be required to ensure an efficient workflow, improve conformity to emerging standards, deliver a new kind of collection (JSTOR), and enable the linkages that will provide effective resource discovery (MIMAS). This work would take place at or near the beginning of the project, timed to ensure that each system is ready by the time the data reaches that stage of the workflow. The only completely new system required would be the database used by libraries to log pamphlets, note their condition, and check for duplicates. Data from library OPACs would be used to help populate this database.
WP4 selection and preparation
Timeframe: Jan 2007-Sep 2008
Objective: To efficiently select, prepare and transport pamphlets from and back to libraries.
Lead: Southampton with libraries
Main outputs: delivery of selected materials to BOPCRIS for digitisation
The selection and flow of materials from seven libraries to Southampton requires careful management. This study has explored the issues by assessing the volume and condition of the pamphlets (both will affect scanning time) and the extent of duplication across the primary partner collections. It also discussed with libraries any issues that would affect the scheduling of collections (e.g. whether a collection should be sent all at once or in batches). As a result of these investigations we have presented a possible schedule. The final timetable would be agreed with libraries at the beginning of the project.
A database would be created (WP3) and used to enable libraries to check material already sent and log any missing, fragile or damaged items. For five libraries, whole collections have been chosen. Selections will be made from the other two (Bristol and the LSE) to complement these and address feedback from users (see under WP6). Selection strategy has outlined the selection and de-selection criteria for this project.
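The logging and duplicate check described above could be sketched roughly as follows. This is an illustration only and not the project's database design: the table layout, the condition values, the library names, and the crude title-plus-year match key are all assumptions made for the example, and a real system would draw on fuller OPAC data.

```python
import sqlite3


def normalise(title: str, year: int) -> str:
    """Build a crude match key (an assumption): lowercase letters and digits of the title, plus year."""
    cleaned = "".join(ch for ch in title.lower() if ch.isalnum())
    return f"{cleaned}:{year}"


def open_log() -> sqlite3.Connection:
    """Create an in-memory log; the schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE pamphlet (
               id INTEGER PRIMARY KEY,
               library TEXT NOT NULL,
               title TEXT NOT NULL,
               year INTEGER NOT NULL,
               condition TEXT NOT NULL,
               match_key TEXT NOT NULL
           )"""
    )
    return conn


def log_pamphlet(conn, library, title, year, condition="good"):
    """Record one item and return the other libraries that already logged a likely duplicate.

    condition is free text here, e.g. 'good', 'fragile', 'damaged' or 'missing'.
    """
    key = normalise(title, year)
    rows = conn.execute(
        "SELECT DISTINCT library FROM pamphlet "
        "WHERE match_key = ? AND library != ? ORDER BY library",
        (key, library),
    ).fetchall()
    conn.execute(
        "INSERT INTO pamphlet (library, title, year, condition, match_key) "
        "VALUES (?, ?, ?, ?, ?)",
        (library, title, year, condition, key),
    )
    return [row[0] for row in rows]


conn = open_log()
log_pamphlet(conn, "Library A", "The Corn Laws: A Letter", 1843)
# A second library logging the same title and year is told about the earlier entry.
duplicates = log_pamphlet(conn, "Library B", "the corn laws; a letter", 1843, "fragile")
```

A match key this simple would miss variant titles and editions, so in practice it could only flag candidates for a person to confirm.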
WP5 production
Timeframe: Mar 2007-Nov 2008
Objective: To create high-quality digital images, metadata and OCR.
Lead: BOPCRIS
Main outputs: Datasets comprising standards-compliant images, metadata and OCR text.
Digital datasets has described the methodology, technical specifications and workflow for the project. These were discussed between BOPCRIS, JSTOR and the author in the course of the scoping study, with several different approaches to capture, metadata, OCR and Quality Assurance (QA) considered. This discussion was informed by test images, metadata, and OCR text generated by BOPCRIS from a sample collection supplied by the University of Bristol.
As described in Digital datasets, BOPCRIS would take responsibility for the digital capture; generation of technical, preservation and structural metadata; OCR; and initial QA work. Texts would be captured bitonally, or in greyscale or colour where relevant (e.g. pages with prints, maps, or annotations). Standards-compliant XML-based metadata would be generated from the BOPCRIS production system along with OCR text. Once ready, the data would be transferred to JSTOR in batches using external hard drives and network transfer (FTP).
WP6 delivery
Timeframe: Apr 2007-Feb 2009
Objective: To effectively deliver the collection to users.
Lead: BOPCRIS
Main outputs: Online collection of ca. 23,000 digital pamphlets.
An advantage of using an existing delivery platform is that the collection can be quickly made available to users. JSTOR would release the collection in batches as the content is ready. As a result, this project would hope to be delivering digital content to users within its first year (see Gantt chart above). This would provide an opportunity to gain information from users to help inform some of the later selection of pamphlets being undertaken by the LSE and Bristol.
Once the BOPCRIS datasets are received, JSTOR would apply their own quality assurance processes to the image and OCR files, checking an average of 10% of each. They would work closely with BOPCRIS to address any issues discovered. Once approved, an archival dataset would be preserved (WP7) and the data transformed (images resized and metadata mapped) to create a delivery dataset for incorporation into JSTOR’s systems. JSTOR would also generate URLs and Digital Object Identifiers (DOIs), passing these with the corresponding library and record IDs to MIMAS for incorporation into Copac and distribution to libraries (WP8). In addition, JSTOR would take responsibility for facilitating pathways into the collection through its linking arrangements with organisations such as Google, the History Cooperative and RePEc, and its participation in CrossRef. The full OCR text of the pamphlets would be exposed to Google’s indexing spider, enabling pamphlets to be found via a standard Google web search.
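As an illustration of the 10% check, the following minimal sketch draws a random sample from one batch of files. It is not JSTOR's actual procedure: the function name, the file names, and the choice of simple random sampling with at least one file per batch are assumptions made for the example.

```python
import random


def qa_sample(files, fraction=0.10, seed=None):
    """Pick a random subset of a batch for manual checking.

    At least one file is chosen from any non-empty batch; passing a seed
    makes the selection repeatable.
    """
    if not files:
        return []
    rng = random.Random(seed)
    size = max(1, round(len(files) * fraction))
    return sorted(rng.sample(files, size))


# Hypothetical batch of 200 page images: 20 are selected for checking.
batch = [f"page_{i:04d}.tif" for i in range(200)]
to_check = qa_sample(batch, seed=1)
```

Sampling each batch separately keeps the proportion checked close to 10% per batch, whereas a looser policy could vary the rate between batches and still average 10% overall.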
WP7 preservation
Timeframe: Apr 2007-
Objective: To ensure the long-term preservation (including future upgrades and migrations as technology changes) and accessibility of the material.
Lead: JSTOR
Main outputs: Archive of image, metadata and OCR datasets.
JSTOR would receive from BOPCRIS a very rich dataset, including large archival images and standards-compliant metadata, with accompanying full text. This dataset would be archived by JSTOR and made available to contributing libraries or the JISC upon request.
WP8 linking
Timeframe: Apr 2007-
Objective: To achieve linking from Copac, from collection descriptions on the RSLP Project website, and from the individual OPACs of libraries holding the pamphlets.
Lead: MIMAS
Main outputs: Hyperlinked records.
MIMAS would take the URLs and DOIs supplied by JSTOR and use them to update the CURL and Copac databases and make available links or duplicate records to partner libraries so they can update their own catalogues. MIMAS would develop software to generate these additional records and also (as far as is possible) identify all records in the database describing the same item so that all relevant libraries could be informed. Any library participating in CURL/Copac would be able to download a record or link to items held in the pamphlet collection. Collection-based searching would also be provided by MIMAS via the Guide to 19th Century Pamphlets hosted on the CURL website. This would enable users to limit their search to a particular collection or, if they prefer, to just the digitised content of that collection.
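The linking step might look something like the sketch below. It is a hypothetical illustration rather than the software MIMAS would develop: the record structure, the `match_key` field used to group records describing the same item, and the example URL are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CatalogueRecord:
    library: str
    record_id: str
    match_key: str  # hypothetical key shared by records describing the same item
    links: list = field(default_factory=list)


def apply_links(records, supplied):
    """Attach supplied URLs to catalogue records.

    supplied is an iterable of (library, record_id, url) tuples. Each URL is
    added to the record it names and to every other record sharing that
    record's match key. Returns a dict mapping each library to the record
    IDs updated, so that every relevant library can be informed.
    """
    by_id = {(r.library, r.record_id): r for r in records}
    by_key = defaultdict(list)
    for r in records:
        by_key[r.match_key].append(r)

    updated = defaultdict(list)
    for library, record_id, url in supplied:
        source = by_id.get((library, record_id))
        if source is None:
            continue  # unknown record: left for manual follow-up
        for rec in by_key[source.match_key]:
            if url not in rec.links:
                rec.links.append(url)
                updated[rec.library].append(rec.record_id)
    return dict(updated)


records = [
    CatalogueRecord("Library A", "a1", "k1"),
    CatalogueRecord("Library B", "b7", "k1"),  # same item as a1
    CatalogueRecord("Library B", "b8", "k2"),
]
notified = apply_links(records, [("Library A", "a1", "https://example.org/pamphlet/1")])
```

Here the link supplied for Library A's record is also attached to Library B's record of the same item, while the unrelated record is left untouched.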
WP9 marketing and dissemination
Timeframe: Jan 2007-Dec 2008
Objective: To generate wide interest in the project and wide usage of its collection.
Lead: Southampton
Main outputs: Website, publicity materials, presentations, learning resources, digitisation toolkit, launch and dissemination event.
The Project Manager would take responsibility for this work package, creating or commissioning webpages and publicity materials, writing papers, making presentations, and coordinating an event in the summer of 2008. This event would be likely to take the form of a one-day seminar and to include a formal launch of the collection (which by this stage would include a significant amount of content). The Project Manager would also (a) prepare an online ‘toolkit’ or suite of resources to assist with the future selection, digitisation, delivery and preservation of pamphlet literature, and (b) commission a Research Officer to create a set of e-learning resources to encourage the use of the resource within teaching. These e-learning resources would include additional web content for the Guide to 19th Century Pamphlets and a sample learning package for deposit within the JORUM repository.
WP10 evaluation
Timeframe: Oct-Dec 2008
Objective: To commission an external evaluation study whose assessment and recommendations will be incorporated in the Final Report.
Lead: Southampton
Main outputs: Evaluation Report.