Publisher and Institutional Repository Usage Statistics (PIRUS 2)

Yesterday I attended a seminar – Counting Individual Article Usage – which reported on the results of the JISC-funded PIRUS 2 Project.

It was a full day, with many interesting speakers. In the morning the talks focused on the project itself, while afternoon talks covered the bigger picture.

All About PIRUS 2

  • Hazel Woodward started things off by setting the stage and providing us with the aims and objectives of the PIRUS 2 project. These can be found here. In short, the project looked at the viability of creating a system that can bring together article-level usage (download) data from publishers and repositories in a standardised format.
  • Peter Shepherd from COUNTER then gave a review of the organisational, economic, and political issues involved with the project. Cost allocation hasn’t been explored fully, but currently publishers would be expected to bear the brunt of the costs, with repositories also contributing. Politically, many issues remain (one being whether publishers and repositories are actually willing to provide their own data, and willing to pay for such a service).
  • Paul Needham then took us through the technical side, and showed us that, yes, it is technically feasible to collect, consolidate, and standardise “download event” usage data from a number of different providers.
  • Ed Pentz from CrossRef then talked about the importance (and relevance to PIRUS) of DOIs, and also described ORCID (Open Researcher and Contributor ID).
  • Paul Smith from ABCe spoke about their possible role as auditor for PIRUS.
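
To make the "collect, consolidate, and standardise" idea a little more concrete, here is a minimal sketch in Python. It is not the PIRUS 2 implementation: the DownloadEvent record, the provider labels, and the monthly granularity are all assumptions made for illustration. It only shows why a shared identifier such as the DOI matters when merging download events from several providers.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DownloadEvent:
    """One download, as a provider might report it (hypothetical record shape)."""
    doi: str      # article identifier, in whatever spelling the provider uses
    source: str   # hypothetical provider label, e.g. "publisher-a"
    month: str    # reporting period as "YYYY-MM"


def normalise_doi(raw: str) -> str:
    """Reduce common DOI spellings to a bare, lowercase form."""
    doi = raw.strip().lower()  # DOIs are case-insensitive
    prefixes = (
        "https://doi.org/", "http://doi.org/",
        "https://dx.doi.org/", "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi


def consolidate(events):
    """Count downloads per (DOI, month), regardless of which provider reported them."""
    totals = Counter()
    for event in events:
        totals[(normalise_doi(event.doi), event.month)] += 1
    return totals
```

Without the normalisation step, the same article reported as "doi:10.1000/ABC" by a publisher and as "https://doi.org/10.1000/abc" by a repository would be counted as two different articles.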

The bigger picture

  • Mark Patterson from PLoS then gave an interesting talk describing some of the new alternative impact metrics (some of which PLoS now provides). He cited people such as Jason Priem (see alt-metrics: a manifesto) and commented that changing the focus from journal to article would change the publication process.
  • Gregg Gordon from SSRN also spoke of alternative methods to measure usage, and also noted the importance of context when thinking about usage.
  • Daniel Beuke from OA Statistik then gave a review of their project (very similar to PIRUS), which is based in Germany. It would be interesting to see how these two teams could work together more closely. These projects (along with SURE) have worked together under the Knowledge Exchange’s Open Access Usage Statistics work (see here for their work on international interoperability guidelines).
  • Ross MacIntyre then spoke about the Journal Usage Statistics Portal, another JISC-supported project.
  • Paul Needham then gave us a demonstration of the functioning PIRUS database and we closed the day with a panel discussion.

Unfortunately, I felt not enough emphasis was placed on demonstrating the usefulness of PIRUS 2 and the data that it could potentially generate. The political side of the discussion would also have been very interesting to delve into further.

Interesting things that kept popping up:

  • The importance of standardising author and researcher names (ORCID)
  • The need for a metadata description standard (e.g. whether the paper is peer reviewed)
  • And the need for all publishers to use DOIs

Some of the questions I’m still thinking about:

  • Are publishers really willing to share this data?
  • What can a publisher really gain from this type of collation of usage data? And a repository?
  • To make it most useful, everyone would need to contribute (and have access?). What would be the competitive advantage of having access to this data if everyone has access?
  • We now know it is technically feasible, but is it economically and politically feasible?
  • Are we ready to place value on these alternative metrics of usage (i.e. not Journal Impact Factor)? Who says we are ready? Are institutions ready? Will this usage data count as impact in the REF?
  • What about other places people put articles – personal web pages, institutional web pages, etc. – could this data be included?
  • What about including data from the downloads of briefing papers, working papers, and preprints? Doesn’t usage of these also signify impact?

Springer’s Realtime

It seems that traditional publishers may finally be beginning to catch up with the capabilities of the internet, and are actually collecting and sharing metrics for the journals and articles they publish.

Springer has recently released Realtime, which aggregates download data from Springer journal articles and book chapters and displays them with pretty pictures (graphs, tag clouds, maps, and icons).

You can look up download data by journal (I looked up Analytical and Bioanalytical Chemistry) and it will show a graph of the number of downloads from the journal over time (up to 90 days). It also lists the most downloaded articles from the journal (with number of downloads displayed), and if you sit on the page for a while (a short while in the case of this journal) you will see which article was most recently downloaded (this is the “Realtime” part). They also display tag clouds of the most frequently used keywords of the most recently downloaded articles, a feed of the latest items downloaded, and an icon display that shows downloads as they happen.
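
As a rough illustration of what sits behind displays like these, the sketch below computes a "most downloaded" list over a 90-day window and the keyword counts that could weight a tag cloud. This is not Springer's implementation: the (timestamp, title, keywords) tuple shape and both function names are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def top_articles(downloads, now, days=90, n=3):
    """Return the n most-downloaded titles within the last `days` days."""
    cutoff = now - timedelta(days=days)
    counts = Counter(title for when, title, _ in downloads if when >= cutoff)
    return counts.most_common(n)


def keyword_counts(downloads, latest=100):
    """Count keywords across the `latest` most recent downloads (tag-cloud weights)."""
    recent = sorted(downloads, key=lambda d: d[0], reverse=True)[:latest]
    return Counter(kw for _, _, keywords in recent for kw in keywords)
```

Anything older than the window simply drops out of the counts, which is why an article that is not among the current most downloaded cannot be looked up this way.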

Springer states that,

[t]he goal of this service is to provide the scientific community with valuable information about how the literature is being used “right now”.

Some of this information is definitely valuable (download counts for articles), and some of it is merely fun and pretty (the icon display). The real question is whether they will provide download counts for individual articles on an ongoing, long-term basis. Currently you can only look back 90 days, and you can’t search for individual articles: you can only see download counts for your article (or a particular article) if it happens to be one of the most downloaded. So for this to actually be useful to authors and the institutions they come from, Springer will have to give us a little more, but it is a step in the right direction.