I am trying to keep this short. You might remember my recent blog post on data sharing. I basically wanted to point out that data acquisition can be an art in its own right. It can take months of planning, applying for permits, securing funding, coming up with an elegant sampling design, and finding what you were looking for. This is usually followed by weeks in the laboratory extracting the right molecules and preparing them for sequencing. Some people are really good at this and they should get credit for it. How can we make this happen?
Good data makes good science. Science is communicated through articles. Once we communicate our results through articles we should make the data public. A. Murat Eren recently wrote a blog post on why data sharing is so important. Please read it. I hope we all agree on making the data public once we submit our articles, so that reviewers can do their job. Let other people play around with your data and let them use it to find new results. But I am still dissatisfied with the way we give each other credit. The currency is publications. So let me ask this bold question: Does producing fantastic data justify being a co-author on a future paper?
The International Committee of Medical Journal Editors recommends that authorship be based on the following four criteria:
- Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND
- Drafting the work or revising it critically for important intellectual content; AND
- Final approval of the version to be published; AND
- Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Does that mean I should invite data producers to be co-authors, and if they fulfill all four requirements, they will get the chance to contribute to future papers based on their data?
Can we just properly cite all the good work in the bibliography?
Some articles are based on thousands of sources and it would not be practical to write a paper with a thousand references. So how do we give credit to the people who work hard to produce good public data?
As Meren phrases it, maybe we just don’t. He makes the analogy to being a parent. At some point you have to let your children go and you won’t take credit for their accomplishments away from home.
Maybe we could at least mention the data producers we depend on in the acknowledgements section? The same recommendations say: ‘Contributors who meet fewer than all four of the above criteria for authorship should not be listed as authors, but they should be acknowledged.’ And as @phylogenomics pointed out one year ago, they should be acknowledged with their ORCID. This would allow us to compute summary statistics on acknowledgements sections. You can read the Twitter moment here. Would we then add a new statistic to our CVs giving the number of times we were acknowledged for good data? Sounds silly to me. Moreover, I would not approve if people used the acknowledgements section to reduce the number of references, or even worse, reduce the number of co-authors!
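To make the idea of acknowledgement statistics concrete, here is a minimal sketch of how one could count ORCID mentions, assuming the acknowledgements sections are already available as plain text. The function name `count_acknowledged_orcids` and the example sentences are hypothetical, and the second iD is a made-up placeholder; the only thing taken from ORCID itself is the iD format (four groups of four characters, the last of which may be an `X`).

```python
import re
from collections import Counter

# ORCID iDs have 16 characters in four hyphen-separated groups;
# the final character is a checksum and may be "X".
ORCID_PATTERN = re.compile(r"\b\d{4}-\d{4}-\d{4}-\d{3}[\dX]\b")


def count_acknowledged_orcids(acknowledgements):
    """Count in how many acknowledgements sections each ORCID iD appears."""
    counts = Counter()
    for text in acknowledgements:
        # Use a set so an iD repeated within one paper is counted once.
        counts.update(set(ORCID_PATTERN.findall(text)))
    return counts


# Hypothetical acknowledgements sections (assumption: plain text input).
sections = [
    "We thank J. Carberry (0000-0002-1825-0097) for the public dataset.",
    "Data by 0000-0002-1825-0097 and 0000-0001-2345-678X made this possible.",
]
counts = count_acknowledged_orcids(sections)
for orcid, n in counts.most_common():
    print(orcid, n)
```

This only matches the pattern of an iD; it does not verify the checksum or check that the iD belongs to a registered person.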
It seems to me that acknowledging people for producing public data is not feasible, although everybody would profit from good data. Still, it would be nice to give credit for creating fantastic data, not just any useful data.