Data-mining and repositories
March 10, 2011
There has been discussion for a while about the limits of using material from typical repositories. In the absence of formal user-licences, there is an implicit permission to read material – but what about data-mining?
A formal, cautious approach is to work to the lowest common denominator, where the rights to re-use material in a repository are taken as being those of the most restricted piece of content.
Material in a repository is normally pretty heterogeneous with respect to re-use rights. And where the individual pieces do not have their rights associated with them, liberal-rights pieces cannot be told apart from restrictive-rights pieces. Material that is truly Open Access is in a minority and will generally result from archiving true Open Access published materials, from BMC for example, where limited named rights are explicitly given to the publisher. Many more OA published materials are actually restricted in re-use: many OA publishers are anything but true OA. The corollary of true OA publishing is that all rights not granted to the publisher are retained by the author and then, presumably, licensed to the repository.
The majority of content is in a different situation from true OA material. It will be in there as a result of copyright being transferred or exclusively licensed to a publisher, who has then granted back, or allowed retention of, nominated rights. In such circumstances the author (and by extension the repository, see above) has only these nominated rights, and if data-mining or other forms of re-use are not mentioned explicitly then, strictly, no such right exists. Some publishers explicitly exclude the right to data-mine the article and so, without being able to identify these, the lowest-common-denominator approach kicks in.
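To make the lowest-common-denominator reasoning concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the right names, the Item record and the repository_rights function are hypothetical, not part of any repository software. It simply treats the rights usable across a repository as the intersection of the rights attached to each piece, and treats a piece with no rights statement as carrying only the implicit permission to read.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

# Hypothetical right names, chosen for illustration only.
READ = "read"
DATA_MINE = "data-mine"
REDISTRIBUTE = "redistribute"

# Assumption: with no formal licence there is only implicit permission to read.
IMPLICIT_RIGHTS: FrozenSet[str] = frozenset({READ})


@dataclass(frozen=True)
class Item:
    title: str
    # None means no rights statement is associated with the item.
    rights: Optional[FrozenSet[str]] = None


def repository_rights(items: Iterable[Item]) -> FrozenSet[str]:
    """Return the rights shared by every item (lowest common denominator)."""
    effective: Optional[FrozenSet[str]] = None
    for item in items:
        # An unlabelled item cannot be told apart from a restrictive one,
        # so it falls back to the implicit rights.
        granted = IMPLICIT_RIGHTS if item.rights is None else item.rights
        effective = granted if effective is None else effective & granted
    # An empty repository grants nothing.
    return effective if effective is not None else frozenset()


if __name__ == "__main__":
    repository = [
        Item("true OA article", frozenset({READ, DATA_MINE, REDISTRIBUTE})),
        Item("publisher-licensed article", frozenset({READ})),
        Item("unlabelled deposit"),
    ]
    print(sorted(repository_rights(repository)))  # ['read']
```

One restrictive or unlabelled item is enough to remove data-mining from the result, which is the problem described above: the liberal pieces are held back by the pieces they cannot be distinguished from.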
The easiest solution for data-mining (and, it could be argued, for open access in general) is for funders to retain blanket rights for data-mining, or for publicly funded research to be placed in the public domain as regards copyright, as is done in the States.
All this concern, of course, restricts the full potential of open access being realised: what assumptions can or should be made, and what liability, if any, should be risked in order to get at this potential?
Mendeley have one solution: do it and see! They have just announced a competition to mine the articles that authors have put on their Mendeley accounts. It will be interesting to see how the rights issue will be handled: it may prove to be a model for others to follow.