Session notes: Beyond text
Speaker: Richard Jones, Imperial College
Complex Objects and e-theses, and one as the other
The terms "complex" and "compound" are used interchangeably
The sorts of things that a user can do with them:-
- Capture
- Storage
- Discovery
- Dissemination
- Interpretation
A compound PhD thesis could contain:-
- Text, chapters, PDF, HTML
- Multimedia
- Journal paper
- Metadata
- Software package
- Ref to ORE project
Boundaries are therefore arbitrary and complicated
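The boundary question (store a component or only link to it) can be illustrated with a small data structure. This is a minimal sketch for these notes, not anything shown in the talk: the class names, field names, file paths and URL are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    """One part of a compound thesis (all names here are illustrative)."""
    label: str                          # e.g. "Thesis text"
    kind: str                           # e.g. "text", "multimedia", "software"
    stored_path: Optional[str] = None   # set when the repository holds a copy
    external_url: Optional[str] = None  # set when the thesis only links out

    def is_inside_boundary(self) -> bool:
        # Assumption: a component is "inside" the object if a copy is stored.
        return self.stored_path is not None


@dataclass
class CompoundThesis:
    title: str
    components: List[Component] = field(default_factory=list)

    def stored(self) -> List[Component]:
        return [c for c in self.components if c.is_inside_boundary()]

    def linked(self) -> List[Component]:
        return [c for c in self.components if not c.is_inside_boundary()]


thesis = CompoundThesis(
    title="Example thesis",
    components=[
        Component("Thesis text", "text", stored_path="thesis/main.pdf"),
        Component("Simulation code", "software", stored_path="thesis/code.zip"),
        Component("Related journal paper", "journal paper",
                  external_url="https://example.org/paper"),
    ],
)

print([c.label for c in thesis.stored()])
print([c.label for c in thesis.linked()])
```

Where the line is drawn is a policy decision; the sketch only shows that the same component can sit on either side of it.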
4 notable considerations:-
- Boundary
- Object internal file structure
- Registries and APIs
- Serialisation and interoperability
Boundary
Capture – store or link to
Structure
Capture – understand semantics
Devolved interfaces
Dissemination
Registries and APIs
Capture – file formats obtained
Metadata formats defined
Serialisation and interoperability
Capture – web services
Storage – preservation – metadata, format registries
Discovery – web services, harvesters, aggregators
Dissemination – web services, transfer formats, content packaging
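As a rough illustration of serialisation for transfer, a compound object can be written out as a manifest that another system reads back. This is only a sketch under stated assumptions: the manifest layout, function names and example values are hypothetical and do not follow any particular packaging standard.

```python
import json
from typing import Dict, List, Tuple


def to_manifest(title: str, parts: List[Tuple[str, str, str]]) -> str:
    """Serialise a title and (label, format, href) tuples to JSON text."""
    manifest = {
        "title": title,
        "parts": [
            {"label": label, "format": fmt, "href": href}
            for label, fmt, href in parts
        ],
    }
    # sort_keys gives a stable serialisation, which helps when comparing copies
    return json.dumps(manifest, indent=2, sort_keys=True)


def from_manifest(text: str) -> Dict:
    """Read a manifest back into a plain dictionary."""
    return json.loads(text)


example_parts = [
    ("Thesis text", "application/pdf", "thesis/main.pdf"),
    ("Interview recording", "audio/mpeg", "thesis/interview.mp3"),
]
manifest_text = to_manifest("Example thesis", example_parts)
round_tripped = from_manifest(manifest_text)
print(round_tripped["parts"][1]["href"])
```

Interoperability in practice depends on both sides agreeing on the format, which is why the notes pair serialisation with registries and transfer formats.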
Speaker: John MacColl, University of Edinburgh, Project StORe
E-theses and data linking
Findings:-
Disciplines differ - more evident in non-text form
Open data is a good idea
Data is uncurated
Librarians are seen as having no role; some researchers were surprised that we are interested
Some scientists are uncomfortable with our use of the word “data”
Researcher behaviour is typically:-
- Would like to be able to use other people’s data
- I can see all sorts of advantages
- I don’t always want other people seeing mine
- Worry about loss of competitive advantage
- Don’t have time to do it properly
- Will do it if it’s a condition of grant
The Data Lifecycle:-
- Not a commonly understood concept
- Should subsume research outputs
- Requires curation
- Sharing and reuse are good things but not the responsibility of researchers themselves
Researchers vs PhD students
The needs of PhD candidates and researchers are the same
But the Academy treats them differently
Perhaps the library should not?
The PhD student infrastructure environment is a good place to introduce good practice
Disciplinary Differences
Astronomers well provided for
Environmentalists quite well
Social scientists less so
Medicine and bioscience good
Humanists not well (the loss of AHDS will make this worse)
Institutions
Need to understand the deficit
Provide data repositories as well as output repositories
Provide the experts = curators
Middleware, citation standards
JISC and research councils need to invest in skilled workers
It is impossible to provide curatorial expertise for every institution to the level needed
Need national expertise infrastructure
Join-up with library schools
A Peripatetic role for trained staff?
The environment needs a skills change to stay ahead of the pace of technological change
Panel Discussion
This is very complex stuff. When the majority of institutions are struggling to set up repositories, is it not a big challenge to get this going?
If repositories can incorporate this function when they are set up, it will be easier
Agreed though that not all institutions have the expertise to do this development work
Comment: the sector needs to accept that data is subject to constant change, and that objects need to be reloaded
How have projects treated associated software in a thesis?
They tend to be bundled together
Are there arguments about what constitutes a thesis?
As long as digital and print versions exist in parallel, this is going to be a problem.