Session notes: Beyond text

Speaker: Richard Jones, Imperial College

Complex Objects and e-theses, and one as the other

“Complex” and “compound” are used interchangeably

The sorts of things that a user can do with them:-

  • Capture
  • Storage
  • Discovery
  • Dissemination
  • Interpretation

A Compound PhD thesis could contain:-

  • Text, chapters, PDF, html
  • Multimedia
  • Journal paper
  • Metadata
  • Software package
  • Ref to ORE project   

Boundaries are therefore arbitrary and complicated

4 notable considerations:-

  • Boundary
  • Object internal file structure
  • Registries and APIs
  • Serialisation and interoperability

Boundary

Capture – store or link to
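
One way to picture the store-or-link question is as a data structure in which each component of a compound thesis is either held in the repository or only linked to. The following is a minimal Python sketch under that assumption; the class and field names (Component, CompoundThesis, stored, location) and the example paths and URLs are hypothetical illustrations, not something presented in the session.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """One part of a compound thesis (hypothetical model)."""
    label: str       # e.g. "chapters (PDF)", "software package"
    location: str    # local path if stored, URL if only linked
    stored: bool     # True: held in the repository; False: linked to only


@dataclass
class CompoundThesis:
    title: str
    components: List[Component] = field(default_factory=list)

    def inside_boundary(self) -> List[str]:
        """Labels of components the repository itself holds."""
        return [c.label for c in self.components if c.stored]

    def outside_boundary(self) -> List[str]:
        """Labels of components that are only referenced by link."""
        return [c.label for c in self.components if not c.stored]


# Example values are made up for illustration only.
thesis = CompoundThesis(
    title="Example thesis",
    components=[
        Component("chapters (PDF)", "thesis/chapters.pdf", stored=True),
        Component("metadata", "thesis/metadata.xml", stored=True),
        Component("journal paper", "https://example.org/paper", stored=False),
        Component("software package", "https://example.org/code", stored=False),
    ],
)

print(thesis.inside_boundary())
print(thesis.outside_boundary())
```

Where the line is drawn between stored and linked components is a per-thesis decision in this sketch, which is one reading of why the boundaries were described as arbitrary.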

Structure

Capture – understand semantics

Devolved interfaces

Dissemination

Registries and APIs

Capture – file formats obtained

Metadata formats defined

Serialisation and interoperability

Capture – web services

Storage – preservation – metadata, format registries

Discovery – web services, harvesters, aggregators

Dissemination – web services, transfer formats, content packaging

 

Speaker: John MacColl, University of Edinburgh, Project StORe

E-Theses and Data Linking

Findings:-

Disciplines differ - more evident in non-text form

Open data is a good idea

Data is uncurated

Librarians are seen as having no role – some researchers were surprised that we are interested

Some scientists are uncomfortable with our use of the word “data”

Researcher Behaviour is typically:-

  • Would like to be able to use other people’s data
  • I can see all sorts of advantages
  • I don’t always want other people seeing mine
  • Worry about loss of competitive advantage
  • Don’t have time to do it properly
  • Will do if it’s a condition of grant

The Data Lifecycle:-

  • Not commonly understood concept
  • Should subsume research outputs
  • Requires curation
  • Sharing and reuse are good things but not the responsibility of researchers themselves

Researchers vs PhD students

Needs for PhD candidates and researchers are the same

But the Academy treats them differently

Library should not?

The PhD student infrastructure environment is a good place to introduce good practice

Disciplinary Differences

Astronomers well provided for

Environmentalists quite well

Social scientists less so

Medicine and bioscience good

Humanists not well (the loss of AHDS will make this worse)

Institutions

Need to understand the deficit

Provide data repositories as well as output repositories

Provide the experts = curators

Middleware, citation standards

JISC and research councils need to invest in skilled workers

It is impossible to provide curatorial expertise for every institution to the level needed

Need national expertise infrastructure

Join-up with library schools

A Peripatetic role for trained staff?

The environment needs a skills change to stay ahead of the pace of technological change

 

Panel Discussion

This is very complex stuff when the majority of institutions are struggling to set up repositories; isn’t it a big challenge to get this going?

If repositories can incorporate this function when they are set up, it will be easier

Agreed, though, that not all institutions have the expertise to do this development work

Comment: the sector needs to accept that data is subject to constant change, and that objects need to be reloaded

How have projects treated associated software in a thesis?

They tend to be bundled together

Are there arguments about what constitutes a thesis?

As long as there is parallel digital and print this is going to be a problem.
