During the 2020 COVID-19 shutdown, one of us (S.M.), a graduate student at the time, was asked to write a short manuscript using a previous laboratory member’s unpublished data. Thinking it would be a quick and easy pandemic project that would lead to a publication and a chapter in his dissertation, he happily took on the project. But the happiness didn’t last.
After weeks of trudging through poorly documented code and data files, S.M. realized that much of the raw data he needed were missing. V.A., a postdoc in the lab, was recruited to help with the project. Together, we were ultimately able to locate some of the lost data, but they were spread across two other hard drives and saved under file names that were inconsistent with the original data set. We spent the following year repeating experiments, but some data could not be re-collected, because the animal models used in the original project were long gone. Two years later, the paper remains unpublished and has no clear narrative because of inconsistencies and gaps in the data. Sadly, such stories of data loss seem common in academia.
As longer-term lab members who have borne, and continue to bear, the brunt of picking up older projects, we think these problems arise from a misplaced assignment of responsibility. Academia heaps most of the burden of documentation and data storage onto individuals, rather than onto the lab as a whole. At the same time, individuals receive little, if any, instruction in how to properly document and store their data. But labs can mitigate data loss by following three simple suggestions.
Set lab-wide standards
The optimal practices for documenting and storing data differ from team to team. Our lab focuses on electrophysiology, so our data consist mostly of large data files of raw neural traces. In the example described above, our biggest frustrations were the lack of a centralized data repository and inconsistent file names. Since that experience, we have created a dedicated folder on the lab server where team members are expected to upload their raw data, and we have established a uniform naming standard for files: ‘User_YYYY_MM_DD_Exp#’.
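A naming standard is easier to keep when files are checked against it at upload time. The following is a minimal sketch in Python, not our lab's actual tooling. It assumes that ‘User’ is a string of letters, that the date is zero-padded and that ‘Exp#’ means ‘Exp’ followed by a number; the function names build_name and parse_name are hypothetical.

```python
import re
from datetime import date

# Assumed reading of 'User_YYYY_MM_DD_Exp#':
# letters-only user name, zero-padded date, 'Exp' plus an experiment number.
NAME_PATTERN = re.compile(
    r"^(?P<user>[A-Za-z]+)_(?P<year>\d{4})_(?P<month>\d{2})_(?P<day>\d{2})_Exp(?P<exp>\d+)$"
)


def build_name(user: str, day: date, exp: int) -> str:
    """Return a file stem (no extension) that follows the naming standard."""
    return f"{user}_{day:%Y_%m_%d}_Exp{exp}"


def parse_name(stem: str):
    """Return the parts of a conforming file stem, or None if it does not conform."""
    match = NAME_PATTERN.match(stem)
    if match is None:
        return None
    try:
        # Reject impossible dates such as month 13.
        day = date(int(match["year"]), int(match["month"]), int(match["day"]))
    except ValueError:
        return None
    return {"user": match["user"], "date": day, "experiment": int(match["exp"])}
```

A check like this could run before a file is accepted into the shared folder, so that a name such as ‘data_final_v2’ is flagged while its owner is still around to fix it.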
We were also hampered by a lack of documentation for the data that we did find. Such documentation is needed to determine experimental parameters, such as sampling rates, stimulation frequencies and stimulus intensities. To alleviate this issue, we have developed and implemented a semi-automatic software routine that extracts this information from each experiment and saves it on the server alongside the raw data. These standards reflect our experience of what works best for our lab; other labs will have their own best practices and standards. Whatever you do, consult with those actually running the experiments to ensure that your proposed standards match their experimental and analytical needs.
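One common way to keep parameters next to the raw data is a small ‘sidecar’ file saved beside each recording. The sketch below illustrates that general idea in Python; it is not the routine described above. The field names, the ‘.meta.json’ suffix and the function write_sidecar are all assumptions made for the example, and a real routine would read the values from the acquisition software rather than take them as arguments.

```python
import json
from pathlib import Path

# Hypothetical field names for the parameters mentioned in the text.
REQUIRED_FIELDS = ("sampling_rate_hz", "stimulation_frequency_hz", "stimulus_intensity")


def write_sidecar(raw_file: Path, params: dict) -> Path:
    """Save experimental parameters as JSON next to the raw data file."""
    missing = [field for field in REQUIRED_FIELDS if field not in params]
    if missing:
        # Refuse to write incomplete documentation instead of failing silently.
        raise ValueError(f"missing parameters: {', '.join(missing)}")
    # Append a suffix to the full file name so the sidecar never overwrites the data.
    sidecar = raw_file.with_name(raw_file.name + ".meta.json")
    sidecar.write_text(json.dumps(params, indent=2, sort_keys=True))
    return sidecar
```

Because the sidecar shares the recording's file name, anyone who finds the data years later finds the parameters with it.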
Manage personnel transitions
As the most senior members in our lab, we have helped many researchers come and go. Through that experience, we have learnt that it is important to implement clear onboarding — and, more importantly, offboarding — procedures to retain institutional memory.
Onboarding is relatively easy: get the new member acclimatized to the lab, train them on protocols and standards and help them to get their project off the ground. Offboarding is harder, yet it is at least as important. The difficulty lies in getting exiting lab members to properly document and pass on their projects, because they are often either rushing to finish or too disengaged to care. The principal investigator (PI) needs to set the expectation that exiting lab members will devote time to wrapping up projects. Much of this could be achieved through a well-defined exit interview with the PI and perhaps other lab members who will inherit unfinished projects. During this meeting, all notes, data and essential materials are handed over, and the current status and next steps of projects are discussed.
Alternatively, you could do what one of us (S.M.) did when he was getting ready to leave his previous lab. Around the time of his exit date, he was chosen to present a talk at his lab group’s annual retreat. Preparing for his presentation inadvertently required him to tick off most of the to-do-list items: organizing data and notes, summarizing results and background material, and describing next steps.
Other labs could implement something similar as a standard practice, for instance by requiring outgoing members to give an exit presentation of what they have accomplished during their tenure one or two weeks before their departure. Leaving an organized project makes it easier for others to pick it up and complete it.
Maintain institutional knowledge
Even if lab members document everything that they’re supposed to, the effort is wasted if their notes and data cannot be retrieved later. The solution is to establish a well-organized digital archive of documents, data, reagents and other relevant materials for each project. Long-term data-storage solutions are plentiful. Although finding one that fits your lab’s needs is not trivial, it’s worth it in the long run. We recommend checking with your institution to see what resources it can provide or recommend.
Of equal importance is putting long-term personnel, or even the PI, in charge of data storage. Only personnel who have been in the lab for a long time (and who are likely to stay for a long time) can know what research projects have been started and where project materials and data might reside. They can ensure that these data are maintained over the long term.
It takes a lab to fight data loss. Individual researchers are rightfully responsible for keeping good notes that others can follow. But the lab as a whole needs to accept greater responsibility for setting standards, archiving notes and data and staying apprised of what has been tried, what has succeeded, and what has failed so that the data and knowledge are not forgotten. We feel that the steps outlined here will result in a more productive lab that will avoid many of the headaches we experienced. But don’t be afraid to ask others about their experiences while developing your own plan for preserving data. Your future team members — and future selves — will undoubtedly appreciate it.