Mendeley in WIRED

There is an interesting article on the innovative and rapidly growing Mendeley system in the latest (June 2011) issue of WIRED, which gives some background to the hopes and vision of the senior Mendeley team.

Principal investor Stefan Glaenzer: “We are aiming to make Mendeley the biggest knowledge database on the planet [...] In 19 months we have collected over 67 million articles. It took Thomson Reuters 49 years to come up with 40 million.”

Victor Henning, cofounder and CEO, is noted as explaining that the productivity/collaborative component of Mendeley will be monetised, the unique data aggregation will be monetised, Mendeley will be turned into a content distribution platform and targeted advertising will be introduced for Mendeley’s users.

They seem to have established the user base to support this: a claimed 800,000 users have uploaded seven million research articles (presumably full text, in contrast to the quoted 67 million articles, which are presumably bibliographic records).

What is less clear is what monetisation routes may be built, or indeed recognised, for the producers and copyright holders of the content which is to be distributed, or whether the service itself is repayment enough for the value-added exploitation. Previously, academic authors, and by extension their employing institutions and the funders of their research, have been content to allow commercial exploitation of research articles by publishers. Growing awareness of that exploitation has helped to bolster arguments for open access, so will future commercial exploitation systems find it as easy to be accepted?

One of the key issues, of course, is that traditional publishers have sought to exclusively exploit the material – the basis of subscription-model journals – while Mendeley and others are only using what has been given to them on a freely-reusable basis. This means that they are free to re-use it as they will, make money or not – and if anyone else comes up with a compelling service, then they can get hold of the information too and good luck to them.

Interestingly, as we know from the traditional model, once research dissemination habits have been formed, they tend to become embedded and resistant to change. In this situation, the first to establish a widely used and valued system built on top of freely reusable articles might establish a firm position. Might this happen with Mendeley? Could it be that Mendeley has been in the right place at the right time – as well as giving a service that academics truly value – to become a future dominant underpinning service for research dissemination and re-use?



Call for retention of authors’ rights

The high-ranking JISC Open Access Implementation Group has released a strongly worded statement in support of authors’ retention of publishing rights.

This seems to relate to recent moves by some publishers to try to limit institutional archiving of materials by asking for separate agreements to be reached between the publisher and individual institutions.

Elsevier, in particular, has begun trying to restrict its previously permissive policy allowing authors to archive their own final versions, by saying that if an institution has a systematic deposit mandate for its staff, authors should no longer be allowed to archive their work. See the 1,800 words of their policy guidance which they expect authors to understand and comply with.

This is being done on the basis that Elsevier will still allow authors to archive their work if it is done on a voluntary basis: but if there is a mandate, they will seek to prohibit it.  So authors can if they want to:  but not if they are told to!

Such efforts seem to try to amend policies that have been put in place by funders or institutions “upstream” of the author’s final production of an article for publication and make adherence to these policies a matter of post-hoc negotiation.

In the case of Elsevier, the publisher seems to be seeking to make independent agreements with individual institutions, rather than more open collective agreements: a point raised in the OAIG statement.  Rumours of negotiations with individual institutions so far seem to suggest that Elsevier is seeking to track usage of authors’ articles from institutional repositories and asking institutions, as a condition of allowing archiving, to give them reports of detailed monitoring and use of institutions’ own institutional systems.

Will institutions agree to third-party monitoring of their own internal systems, if that is really what is being requested? It will be interesting to see what results from any negotiations, if any are actually concluded.  The OAIG statement calls for institutions:

“... not to enter into one-to-one negotiations with publishers on self-archiving rights for their staff, and instead to rely on publicly declared rights as shown on the Sherpa-RoMEO website.”



Industrial taskforce urges opening access

A major report by the Council for Industry and Higher Education (CIHE) is urging universities to open access to their knowledge and intellectual property to support and boost UK manufacturing capacity.

The report assesses the UK’s current position in manufacturing: Britain is still the sixth largest manufacturer in the world by output, with manufacturing contributing £131 billion to GDP (13.5%), 75% of business research and development (R&D), 50% of UK exports and ten percent of total employment.

Given the conventional wisdom that the eighties finished off UK manufacturing, this is cheering to read.  However, the UK currently ranks only 17th in competitiveness and is forecast to slide.  The report identifies greater access to innovative IP and cutting-edge research as essential to halt this decline.

From their release:  Simon Bradley, vice-president of EADS, said that gaining greater access to universities’ knowledge, ideas and creativity was vital for manufacturing: “Our Taskforce has found that the simple act of universities opening their vast knowledge banks and providing free access to their intellectual property would have the single biggest impact on accelerating the capability and growth of smart manufacturing in the country.”

This is where open access to articles and data cuts into the “real world” and benefits can be seen outside the research community.

Some sceptical publishers continue to argue against Green OA and for locking down copyright on the grounds of (unproven) economic impacts on their business. Open Access journals, while developing, are still far from the norm: “hybrid” journals continue to charge high fees on top of their continuing subscription costs. The response from much of the publishing world has been to see open access as an additional profit line, or as something to allow by exception, rather than a recognition of a different and new way of working and of OA as playing a part in a far larger working environment.

This report highlights that there is an economic world outside the publishing industry too, and one which is crying out for the benefits of OA.

Given the potential for open access to research to benefit this wider economic picture, as well as collaborative developments between research institutes and industry, restrictive arguments become increasingly untenable. If funders want OA, researchers want OA, institutions want OA and industry wants OA, why are some publishers’ contracts still stopping this from happening?



Data-mining and repositories

There has been discussion for a while about the limits of using material from typical repositories.  In the absence of formal user-licences, there is an implicit permission to read material – but what about data-mining?

A formal, cautious course is to adopt a lowest-common-denominator approach, where the rights to re-use material in a repository are taken as being those of the most restricted piece of content.

Material in a repository is normally pretty heterogeneous with respect to re-use rights.  And where the individual pieces do not have their rights associated with them, pieces with liberal rights cannot be told apart from pieces with restrictive rights.  Material that is truly Open Access is in a minority and will generally result from archiving true Open Access published materials, from BMC for example, where limited named rights are explicitly given to the publisher.  Many more OA published materials are actually restricted in re-use:  many OA publishers are anything but true OA. The corollary of true OA publishing is that all rights not granted to the publisher are retained by the author and then, presumably, licensed to the repository.

The majority of content is in a different situation from true OA material.  It will be in there as a result of copyright being transferred or exclusively licensed to a publisher, who has then granted back, or allowed retention of, nominated rights.  In such circumstances the author (and by extension the repository, see above) has only these nominated rights, and if data-mining or other forms of re-use are not mentioned explicitly, then strictly, no such right exists.  Some publishers explicitly exclude the right to data-mine the article and so, without being able to identify these, the lowest-common-denominator approach kicks in.
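To make the lowest-common-denominator reasoning concrete, here is a minimal sketch in Python. The rights vocabulary, the `Item` class and the sample records are all hypothetical, invented purely for illustration. They do not reflect any real repository schema or publisher licence. The sketch also assumes that an item with no rights statement attached carries only the implicit permission to read.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

# Hypothetical rights vocabulary, for illustration only.
READ = "read"
DATA_MINE = "data-mine"
REDISTRIBUTE = "redistribute"


@dataclass(frozen=True)
class Item:
    title: str
    # None means no rights statement is attached to the item.
    rights: Optional[FrozenSet[str]]


def effective_rights(items: Iterable[Item]) -> FrozenSet[str]:
    """Rights that can safely be assumed for every item in the collection."""
    # Assumption: with no licence attached, only reading is permitted.
    fallback = frozenset({READ})
    result: Optional[FrozenSet[str]] = None
    for item in items:
        granted = item.rights if item.rights is not None else fallback
        # Intersect: a right survives only if every item grants it.
        result = granted if result is None else result & granted
    return result if result is not None else frozenset()


# Made-up sample collection mixing liberal, restricted and unlabelled items.
repository = [
    Item("True OA article", frozenset({READ, DATA_MINE, REDISTRIBUTE})),
    Item("Publisher-licensed postprint", frozenset({READ})),
    Item("Unlabelled deposit", None),
]

print(sorted(effective_rights(repository)))
```

One restricted or unlabelled item is enough to remove data-mining from the rights that can be assumed for the whole collection, which is why attaching rights to individual items matters so much.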

The easiest solution for data-mining (and, it could be argued, for open access in general) is for blanket data-mining rights to be retained by funders, or for publicly funded research to be placed in the public domain as regards copyright, as is done in the States.

All this concern, of course, restricts the full potential of open access being realised: what assumptions can or should be made, what liability, if any, should be risked in order to get at this potential?

Mendeley have one solution:  do it and see!  They have just announced a competition to mine the articles that authors have put on their Mendeley accounts.  It will be interesting to see how the rights issue will be handled: it may prove to be a model for others to follow.


Developing complexity and service response

Following from the release of a major upgrade to RoMEO during Open Access week, the Centre for Research Communications, which runs the SHERPA services RoMEO, JULIET and OpenDOAR, has now launched two User Surveys to gather feedback from the community – one for RoMEO and one for OpenDOAR and ROAR. These surveys are to help prepare for support of an increasingly diverse research communications environment.

As part of RoMEO we have always had a suggestion form for new publishers or for updating information, and an active community contributing suggestions.  However, we wanted to launch the current survey to more formally gather comment, opinion and wishes for the future development of RoMEO as the circumstance of its use changes over time.

The service was originally developed to interpret publishers’ copyright transfer agreements for author self-archiving and we want this to continue as the core of RoMEO.  The system started with a single aim and could interpret, summarise and present information from this single viewpoint.  As time has passed the situation for archiving has grown more complex – and users’ needs have matched this.  The growth of “hybrid” options for journals has made a single interpretation of a journal’s copyright contract impossible. Individual funders have come to agreements with some publishers for Open Access publishing and therefore (sometimes but not always) Open Access archiving rights also apply to publication of work they have funded.  Sometimes individual publishers have recognised and matched the requirements of some (but often not all) funding agencies and for them allow their standard terms to be modified.

All of this gives a far more complex environment for authors to work in and underlines the need for assistance in guiding authors through their options and responsibilities.  It also presents real challenges to RoMEO in providing this. If any service is to be used successfully by end-users, then it has to reflect the users’ needs and fit into their workflow. If one of the current drivers for archiving work is compliance with funders’ mandates, then these need to be represented and permissions summarised.  However, many mandates have a focus on OA publication, rather than archiving.  Given the number of funding agencies and the complexity of their requirements (summarised in and linked to JULIET) as these apply to every publisher, the original fairly clear RoMEO interface became quite crowded.  The upgrade from last week has attempted to deal with this by allowing “single funder views” of the data, as it were, but the diversity of possible approaches to the data remains. We are aware that, in practice, archiving in an institutional repository takes its place within a suite of options that needs to be presented with clarity and simplicity.

This is a reflection of a larger picture – how will this look in future?  What is being developed in practice within institutions to deal with the requirements of funders, authors and publishers?  From a strategic point of view, what can services like RoMEO give in support of wider access to information?

We have also released a survey for OpenDOAR and ROAR. These services, run by the CRC and University of Southampton respectively, share some aspects of work in analysing the world’s repositories, but exist as separate services with individual aims. ROAR has a focus on quantitative and statistical analysis of repositories and their holdings; OpenDOAR has a focus on qualitative analysis and policy and standards development.  Each of the services has healthy feedback from its users, but again, we wanted to more formally gather comments from the community on the services as they will be used in a more diverse picture of repositories.

Here too the situation has become more complex over the years that they have been in operation.  While the original aim and distinctive difference for open access repositories was that anyone could access the full text, for many repositories this has been overridden by conflicting needs, so that for some the great majority of their content is merely metadata. Many, and probably most, repositories accept metadata-only entries, maybe driven by a wish to display high numbers of records irrespective of full-text links; or because the repository is used for internal purposes that require no more than metadata; or because there is the hope that at some point in the future, there will be enough staff resource to chase down the full-text.

Whatever the reason, the decision to accept metadata is a significant one. It means that many searches of open access repositories now end in a bibliographic entry with no access to the full-text article, or simply a link to it held on the publisher’s website.  For the researcher looking for material, this effectively undercuts the rationale for searching repositories in the first place. It is hard for any advocate to engage researchers with open access as a distinctive and different service when the full-text content is not there.

Having said that, some drivers for the adoption of institutional repositories now seem sufficiently strong (at least, to some institutions) to match the original idea of full-text access. The use of the repository as an enhanced research publications database is one example: others include it as an administrative system for projected REF needs; other requirements may be met by full-text access on-campus, even if restricted off-campus.

While the continued growth of metadata-only records remains a significant challenge for advocates and the future use of the repository network, here too, developments take their place within a wider and more complex environment of different use, structure and purpose of repositories.

Again, one of the original distinctions was that the repositories should be openly accessible. The fact that many repositories are set up as closed access in some way (registration & password systems, even subscriptions or pay-per-view) but identify themselves as open access was one of the drivers in the establishment of OpenDOAR, with a policy of a human accessing each repository and sampling holdings to check that what was being claimed was true.  Since the start of OpenDOAR we have rejected between 25% and 33% of candidate repositories because they are out of scope – no full text at all, not open, test sites, junk data etc.

The range of material held in repositories has grown to include research data, learning objects, varieties of grey literature, specialist collections, and others. Some of this content brings with it understandable restrictions on access while at the same time being appropriate for a repository-like collection and (partial) exposure.

Combined with the variety of purposes that repositories are accumulating, this means that repositories of different “flavours” now take their place in a more complex, interesting and ultimately more rewarding environment. While I believe that we cannot afford to lose sight of the key goal of access to full-text research, services offered by OpenDOAR and ROAR (among many others) have to change to reflect this and allow the diverse requirements of the users of repository content and the diverse basis for repositories to be reflected in the service they provide.

The level to which this happens with these services is a reflection of the larger question.  To what extent should we all continue to press for the original OA vision if this is at the expense of the easy growth of some alternatives (metadata repositories, partial access etc)? Should future development in the field be guided by what has proved popular and practical so far, if this fails to address the original goal of full-text open access and original method and goal of author-engagement and self-archiving? Do we set goals that are the natural extension of what we see developing, or aim for the more robust and clear vision that was articulated in Budapest and elsewhere?

Have your say in how some of the support services in this developing environment will themselves develop. Do contribute to the RoMEO survey and the OpenDOAR and ROAR survey.  We will be interested to see your thoughts.


Cross-linking between repositories

A thread on JISC-Repositories this week has been discussing whether to delete repository records when an academic leaves.  This set me thinking about such policies in general and how the interaction of different policies between repositories may affect access or collections in the long run. It is an example, I think, of the way that institutional repositories work best when seen as a network of interdependent and collaborative nodes that can be driven by their own needs but produce a more general collective system.

Our policy in Nottingham is that we see our repository as a collection of material that is produced by our staff.  Therefore, when a member of staff leaves, we do not delete their items, as these are a record of their research production while they were here.

More than that, authors should not expect such deletion even upon their request, except in very unusual circumstances.  If repositories are to be used as trusted sources of information, the stability of the records they hold is very important.

If authors have put material into the repository which includes their “back-catalogue” produced at previous institutions, then that is fine too: we will accept it and keep it.  Strictly, they did not produce this material while they were employed at Nottingham, but if it is not openly accessible elsewhere, why not take it?  It might be slightly anomalous to hold this material but if it opens access to research information, that’s the basis of what it’s all about.

I think there is a transition period here, while academics adopt the idea of depositing material.  I think it’s likely that academics will put their back-catalogue to date into the first major repository that they use in earnest, if they have the right versions available.  Thereafter, as this material should be kept safe and accessible, they can always link back to it.  In other words, once they have deposited their back-catalogue, they are unlikely to want to do it at every subsequent institution they move to, as long as they know it will be safe and that they can link to it.  There is an advocacy theme here to help researchers understand that repositories are linked and that the repository – and repository network – will serve them throughout their career.

For a newly arrived member of staff with material in a previous institution’s repository, it all depends on the new institution’s collection policy whether the institution would prefer them to just deposit outputs they produce from that time on; deposit all their own material again; or create a virtual full record of outputs by copying the metadata and linking back to full-text in the previous repository(ies). This will depend in turn on whether the previous repositories are trusted to match the new institution’s own terms for access and preservation.

Maybe if the material is held on a repository without long-term assurance of durability — maybe on a commercial service — and if the institution’s repository works on a level which cannot be matched, then there would be a rationale for holding a local copy of the full-text.  This may be held and exposed, or possibly be held in reserve in case of external service failure. Otherwise, simply linking back to the full-text held on the previous repository seems most practical if a full record is required.

If the previous repository is trusted to provide the same level of service in access, preservation, and stability, then it does not really matter which URL or repository underlies the “click for full text” link.  Academics can compile their list of publications and draw from the different institutions at which they have worked: repositories can hold their own copy of metadata records and link to external trusted repositories; and as far as the reader-user is concerned it’s still “search for paper — find — click — get full-text paper”.
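As a rough illustration of how the “click for full text” link might be resolved under this approach, here is a small Python sketch. The `Record` fields, the repository name and the example.org URLs are hypothetical, made up for this example, and the rule shown (prefer a local copy, otherwise link out only to a trusted repository) is just one possible reading of the policy described above.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Record:
    title: str
    local_url: Optional[str] = None            # full text held in our own repository
    external_url: Optional[str] = None         # full text held elsewhere
    external_repository: Optional[str] = None  # name of the repository holding it


def full_text_link(record: Record, trusted: Set[str]) -> Optional[str]:
    """Return the URL behind "click for full text", or None for metadata only."""
    if record.local_url:
        return record.local_url
    # Link out only when the holding repository is on our trusted list.
    if record.external_url and record.external_repository in trusted:
        return record.external_url
    return None


# Hypothetical trusted list and records.
trusted = {"Previous University Repository"}

local = Record("Paper A", local_url="https://repository.example.org/a.pdf")
linked = Record(
    "Paper B",
    external_url="https://previous.example.org/b.pdf",
    external_repository="Previous University Repository",
)
untrusted = Record(
    "Paper C",
    external_url="https://commercial.example.org/c.pdf",
    external_repository="Some Commercial Service",
)

for rec in (local, linked, untrusted):
    print(rec.title, "->", full_text_link(rec, trusted))
```

In this sketch the third record falls back to metadata only, which is the case where holding a local copy in reserve would make sense.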

This kind of pragmatic approach may well mean that some duplicates (metadata record and/or full-text) get into the system by being held at more than one location.  Duplication/ close-to-duplication will have to become a non-issue. I cannot see that duplication can be completely avoided in future: it already happens.  As such, handling close and exact duplicates is an issue we cannot avoid and must solve in some way as it inevitably arises. That is not to say that the publisher’s version will automatically become the “official” record in the way that it tends to be used now. We do not know how versions/ variants/ dynamic developments of papers will be used and regarded by researchers: we are just at the start of a period of change in research communications. Therefore if a process offers solutions and benefits, associated risks of duplication are not sufficient to dismiss the process as impractical.
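One simple way of spotting exact and close duplicates across repositories is to compare records on a normalised form of the title. The Python sketch below is only that, a sketch: the repository names and titles are invented, and normalising the title alone is an assumption that a real service would need to supplement with authors, dates or identifiers.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def title_key(title: str) -> str:
    """Lower-case the title and drop punctuation and extra spacing."""
    return " ".join(re.findall(r"[a-z0-9]+", title.lower()))


def group_duplicates(records: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each normalised title held in more than one place to its holders."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for repository, title in records:
        groups[title_key(title)].append(repository)
    return {key: holders for key, holders in groups.items() if len(holders) > 1}


# Hypothetical (repository, title) pairs.
records = [
    ("Repository A", "Open Access: a review"),
    ("Repository B", "Open access - A Review."),
    ("Repository A", "Another paper"),
]

print(group_duplicates(records))
```

Grouping in this way flags candidates rather than deciding anything: which copy, if any, is treated as the “official” one remains a policy question.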

After all, what is the alternative?  If as repository managers we start deleting records when folks leave and have to create/import/ask the academic for a complete set of their outputs when they arrive at a new institution, I think we, and significantly the users of open access research, will very quickly get into a situation where we lose track of what is where.

Even if we try to create policies or systems to replace an old link (to the now-deleted full-text), with a new link (to the full-text in the new repository), I cannot see this working seamlessly and things will get lost.  In addition I think that subsequent moves by the author would create daisy chains of onward references which would be very fragile.

While the use of repository references in citation of materials relates to research practice and so is for resolution between researchers rather than between ourselves, I don’t think we should deliberately disrupt longer-term references to material. Rather, I would see the system building on existing stable records and all institutional repositories able to play a part in the system-wide provision of information as stable sources.

Therefore, I would suggest that repositories should continue to hold staff items after they have left, as this helps fulfil their role as institutional assets and records. Repositories can accept an academic’s back-catalogue, even if it has not been produced at the institution, as being anomalous but in line with our joint overall aim of providing access to research information. Adopting standard practices will help reassure each institution that other repositories can be trusted with access and curation and allow stable cross-linking. Once a repository has material openly accessible, then, given matching service levels, the whole system supports linking to that material, with no need for additional, duplicate copies except to meet possible local needs. Overall, repositories can follow their institutional self-interest and still create a robust networked system.