A Curated Collection of Data Management Resources
By Crystal Lewis in resources
September 14, 2022
Below are resources for those wanting to learn more about research data management. I have organized resources by type (papers, guides, slides, etc.). There are MANY more resources than the ones listed below, but I have narrowed them down to those that I have found most helpful in understanding general best practices in data management and how to implement them in a research workflow.
Resources were last updated on 2023/02/19
Guides
These are guides that are helpful for understanding data management in the context of the entire research process.
Author | Title |
---|---|
CESSDA | Data Management Expert Guide |
Crystal Lewis | Data Management in Large-Scale Education Research |
ELIXIR | Research Data Management Kit |
J-PAL | Research Resources |
ICPSR | Guide to Social Science Data Preparation and Archiving |
Institute of Education Sciences | Sharing Study Data: A Guide for Education Researchers |
Johns Hopkins Institute for Clinical and Translational Research (ICTR) | Best Practices for Research Data Management |
Reynolds, T., Schatschneider, C. & Logan, J. | The Basics of Data Management |
The Turing Way | Guide |
Papers
These are papers written about research data management practices and workflows.
Checklists
These are checklists to help researchers plan for the data management process through the research life cycle.
Author | Title |
---|---|
Crystal Lewis | Checklists |
Harvard Longwood Research Data Management | Research Data Management Checklist |
Kristin Briney | Data Management Plan Checklist |
Stanford Medicine Lane Medical Library | Data Management Checklist |
UK Data Service | Checklist |
Slides
These are slides from presentations on the research data management life cycle and best practices to implement.
Author | Title |
---|---|
Logan, J. | Data Management and Data Management Plans |
Logan, J. | Data Sharing and Data Shared |
All slides available from Kristin Briney | Kristin Briney Slide Share |
Other Resources
These are other types of resources, ranging from blog posts to podcasts, that also provide excellent data management content.
Resource Type | Title |
---|---|
Blog | Kristin Briney Blog |
Blog | Teague Henry: Strings Not Factors Blog |
Glossaries | FORRT Glossary; Open Science Training Handbook Glossary; Data Flow Toolkit Processes; UCSF HRPP Definitions; Cornell University Glossary of data management terms |
Podcast | Within & Between |
Podcast | IDEA: Improving Data Engagement and Advocacy |
Syllabus | Data Management for Psychological Science: A Crowdsourced Syllabus |
Workshop Materials | SREE Data Management for Data Sharing Workshop |
Organizations
These are organizations, many working in the area of open science and reproducible research, who often have searchable databases that can be used to find resources around data management, among other topics.
Organization | Resource Hub |
---|---|
CESSDA | Training Resources |
DataONE | Data Management Skillbuilding Hub |
FORRT | Curated Resources |
FOSTER | Resources |
OER Commons - Hubs | Open Scholarship Knowledge Base |
Search "data management" in repository databases for great resources | OSF; Zenodo; figshare; LDbase |
UK Data Service | Learning Hub |
University Librarians (this is just a small sampling!) | University of New Hampshire Library; Stanford Medicine Lane Medical Library; University of Pittsburgh Library System |
Data Cleaning Resources
Oftentimes people are interested in resources specific to developing good data cleaning workflows. While the resources above are about overall data management practices, below are some resources specific to structuring and organizing your data files, as well as common rules and steps for cleaning and validating your data in preparation for sharing and analysis. Although I advocate using code for data cleaning, and I am an avid #rstats user, I recognize that researchers use different tools and I want to provide resources that meet people where they are. Therefore, the resources below are included because I appreciate the general data cleaning process they describe, even where specific tools are mentioned. In a future blog post I plan to write about setting up a reproducible, reliable, and automated data cleaning workflow in R for those who are interested.
*Note: The entries with asterisks are cross-references from earlier in this post. They just fit in both areas.
Author | Title |
---|---|
ACAPS | Data Cleaning |
Borer, E., Seabloom, E., Jones, M. & Schildhauer, M.* | Some Simple Guidelines for Effective Data Management |
Broman, K. & Woo, K. | Data Organization in Spreadsheets |
Crystal Lewis* | Data Cleaning Plan |
Ellis, S. & Leek, J. | How to Share Data for Collaboration |
Hubbard, A. | Data Cleaning in Mathematics Education Research: The Overlooked Methodological Step |
Innovations for Poverty Action | Reproducible Research: Best Practices for Data and Code Management |
Innovations for Poverty Action | Cleaning Guide |
J-PAL | Data cleaning and management |
Karl Broman | Steps toward reproducible research |
Karl Broman | Data Cleaning Principles: Talk for csvconf |
Kline, et al. | Psych-DS: A Technical Specification for Psychological Datasets |
Reynolds, T., Schatschneider, C. & Logan, J.* | The Basics of Data Management |