Data-Intensive Research: Cite a Dataset

This guide covers data services and resources available to the WCMC community.

Featured Resource

Data Citation

Citing data encourages the replication of scientific results, improves research standards, and gives proper credit to data producers. Data citation helps to: 

  • Enable easy reuse and verification of data
  • Allow the impact of data to be tracked
  • Create a scholarly structure that recognises and rewards data producers 

(Source: DataCite)

Guidelines for citing datasets.
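
As a rough illustration of what a dataset citation can contain, the sketch below assembles one from its parts. It assumes the pattern "Creator (Year): Title. Version. Publisher. Identifier", which resembles the format DataCite recommends; the guidelines linked above and the target journal's style take precedence. The class name `DatasetCitation`, its fields, and every value in the example (author, title, repository, DOI) are hypothetical placeholders, not a real dataset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """Hypothetical container for the core elements of a dataset citation."""
    creator: str
    year: int
    title: str
    publisher: str
    identifier: str  # ideally a persistent identifier such as a DOI
    version: Optional[str] = None

    def format(self) -> str:
        # Assumed pattern: Creator (Year): Title. Version. Publisher. Identifier
        parts = [f"{self.creator} ({self.year}): {self.title}."]
        if self.version:
            parts.append(f"Version {self.version}.")
        parts.append(f"{self.publisher}.")
        parts.append(self.identifier)
        return " ".join(parts)


# Placeholder values for illustration only; this is not a real dataset.
example = DatasetCitation(
    creator="Doe, J.",
    year=2020,
    title="Example Hospital Discharge Dataset",
    publisher="Example Repository",
    identifier="https://doi.org/10.0000/example",
    version="1.0",
)
print(example.format())
```

Keeping the elements as separate fields, rather than one free-text string, makes it easier to reorder them for a different citation style.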

Contact the library for additional help with data citation.

Find and Store Data

General Data Sources

  • Re3Data Most comprehensive list to date of research data repositories with downloadable datasets. Browse by subject, content type, and country
  • Data.gov As a priority Open Government Initiative, Data.gov aims to increase the ability of the public to easily find, download, and use datasets that are generated and held by the Federal Government. 
  • Open Data Sites via Data.gov Complete list of open data sites organized geographically. Full list can be downloaded as a CSV or Excel file.
  • US Census Data Access Tools Data downloads, tools, and resources provided by the US Census Bureau
  • The World Bank Data Catalog Listing of available World Bank datasets, including databases, pre-formatted tables, reports, and other resources.
  • Dryad International repository of data underlying scientific and medical publications.
  • FigShare Research repository where users can make all of their research outputs available in a citable, shareable, and discoverable manner.

Healthcare Specific Data Sources

  • NHDS National Hospital Discharge Survey conducted annually from 1965-2010. National probability survey designed to meet the need for information on characteristics of inpatients discharged from non-Federal short-stay hospitals in the United States.
  • SEER Data Cancer data provided by the Surveillance, Epidemiology, and End Results Program including incidence and population data associated by age, sex, race, year of diagnosis, and geographic areas. 
  • VirtualRDC @ Cornell Access to synthetic data constructed to statistically approximate the data available within the secure and restricted access environment of the Census Bureau's Research Data Centers (RDCs)
  • HealthData.gov Datasets and statistics. Site managed by the U.S. Department of Health & Human Services
  • CDC Data and Statistics Downloadable datasets searchable by topic provided by the Centers for Disease Control and Prevention
  • CDC Public-Use Data Files Downloadable public-use data files provided by the National Center for Health Statistics through the Centers for Disease Control and Prevention's (CDC) FTP file server. Access data sets, documentation, and questionnaires from NCHS surveys and data collection systems.
  • World Health Organization Data & Statistics WHO's portal providing access to data and analyses for monitoring global health.
  • ClinicalTrials.gov Registry and results database of publicly and privately supported clinical studies of human participants conducted around the world
