Data-mining and repositories

There has been discussion for a while about the limits on using material from typical repositories.  In the absence of formal user licences, there is an implicit permission to read material – but what about data-mining?

A formal, cautious stance is to take a lowest-common-denominator approach, where the rights to re-use material in a repository are taken as being those of the most restricted piece of content.

Material in a repository is normally pretty heterogeneous with respect to re-use rights.  Where the individual pieces do not have their rights associated with them, pieces with liberal rights cannot be told apart from pieces with restrictive rights.  Material that is truly Open Access is in a minority and will generally result from archiving true Open Access published materials, from BMC for example, where limited named rights are explicitly given to the publisher.  Many more OA published materials are actually restricted in re-use:  many OA publishers are anything but true OA.  The corollary of true OA publishing is that all rights not granted to the publisher are retained by the author and then, presumably, licensed to the repository.

The majority of content is in a different situation from true OA material.  It will be in there as a result of copyright being transferred or exclusively licensed to a publisher, who has then granted back, or allowed retention of, nominated rights.  In such circumstances the author (and by extension the repository, see above) has only these nominated rights, and if data-mining or other forms of re-use are not mentioned explicitly then, strictly, no such right exists.  Some publishers explicitly exclude the right to data-mine the article and so, without being able to identify these, the lowest-common-denominator approach kicks in.
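The lowest-common-denominator rule can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the right names ("read", "data_mine", "redistribute"), the `Item` record and the `effective_rights` function are all hypothetical, and are not drawn from any repository software or licence scheme.  An item with no rights statement is assumed to carry only the implicit permission to read.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

# Hypothetical right names, for illustration only.
IMPLICIT_RIGHTS = frozenset({"read"})


@dataclass(frozen=True)
class Item:
    title: str
    # None means no rights statement is attached to the item.
    rights: Optional[FrozenSet[str]] = None


def effective_rights(items: Iterable[Item]) -> FrozenSet[str]:
    """Rights that can be assumed for the collection as a whole."""
    result = None
    for item in items:
        # Assumption: an unlabelled item grants only the implicit right to read.
        granted = item.rights if item.rights is not None else IMPLICIT_RIGHTS
        # Keep only the rights that every item seen so far grants.
        result = granted if result is None else result & granted
    # An empty collection grants nothing.
    return result if result is not None else frozenset()


repository = [
    Item("True OA article", frozenset({"read", "data_mine", "redistribute"})),
    Item("Self-archived postprint", frozenset({"read"})),
    Item("Item with no rights statement"),
]

print(sorted(effective_rights(repository)))
```

In this sketch a single restricted or unlabelled item is enough to remove data-mining from the rights assumed for the whole collection, however liberal the rest of the content may be.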

The easiest solution for data-mining (and, it could be argued, for open access in general) is for funders to retain blanket rights for data-mining, or for publicly funded research to be placed in the public domain as regards copyright, as is done in the States for works of the federal government.

All this concern, of course, prevents the full potential of open access from being realised: what assumptions can or should be made, and what liability, if any, should be risked in order to get at this potential?

Mendeley have one solution:  do it and see!  They have just announced a competition to mine the articles that authors have put on their Mendeley accounts.  It will be interesting to see how the rights issue will be handled: it may prove to be a model for others to follow.

Bill

About Bill Hubbard
Bill Hubbard is the Director of the Centre for Research Communications (CRC), incorporating the work of SHERPA. Bill has a background in Higher Education and IT; in particular in work aiming to embed IT into university functions and working practices. Previous work has looked at the use of Expert Systems in supporting decision making, designing information systems for managing research funding and a number of years working with the introduction of multimedia into university teaching. Bill's commercial experience includes three years as a project manager in virtual reality applications for communications, installations and broadcast, specialising in virtual heritage environments. Before this he worked as a senior lecturer at De Montfort University, Leicester, leading a BA degree course in Multimedia Design and has been an honorary lecturer in the School of Computing Sciences at the University of East Anglia. Bill speaks widely on open access and related issues - repository network development, institutional integration, cultural change, IPR and Open Access policy development. He is also involved in archaeological and heritage applications of new media and sits on the Channel 4 Award jury for new media archaeology.
