Journal articles are the currency of scholarly research. As a result, we, as a community, use sophisticated methods to gauge the impact of research and measure the attention it receives by analyzing article citations, page views and downloads, and social media metrics. While imprecise, these metrics offer a way to identify relationships and better understand relative impact. One of the many challenges with these efforts is that scholarly research comprises a much larger and richer set of outputs beyond traditional publications, foremost among them research data. To track and report the reach of research data, we must build and maintain new methods for collecting metrics on complex research data. Our project will build the metrics infrastructure required to elevate data to a first-class research output.

In 2014, members of this proposal group were involved in an NSF EAGER research grant entitled "Making Data Count: Developing a Data Metrics Pilot." That effort surveyed scholars and publishers to determine which metrics and approaches would offer the most value to the research community. We spent one year researching the priorities of the community, exploring how ideas common to article-level metrics (ALM) could be translated to conventions in data-level metrics (DLM), and building a prototype DLM service. We determined that the community values data citation, data usage, and data download statistics more than social media metrics. Based on this research, the project partners went a step further and isolated the gaps in existing data metrics efforts:

  • no community-driven standards for data usage statistics;
  • no open-source tools to collect usage statistics according to those standards;
  • and no central place to store, index, and access data usage statistics together with other DLM, in particular data citations.

This project proposes to fill these gaps by engaging in the following activities:

  1. We will work with COUNTER to develop and publish code-of-practice recommendations for how data usage should be measured and reported.
  2. We will deploy a central online DLM hub, based on the Lagotto software, for acquiring, managing, and presenting these metrics.
  3. We will integrate new data sources and clients of aggregated metrics to serve as exemplars for integrating data repositories and discovery platforms into a robust DLM ecosystem.
  4. We will encourage the growth and uptake of DLM through an engaged stakeholder community that will advocate for, grow, and help sustain DLM services.

As a result of these activities, the community will finally have the infrastructure needed to build relationships and better understand the relative impact of research data.