Data-Intensive Research: Data Analysis & Visualization

This guide covers data services and resources available to the WCMC community.

Data Analysis and Visualization

Explore the tools and resources on this page to find new ways of both understanding and presenting your data. Many of them are free or open source; others require a purchased license once a trial period ends.

Tools

Data Preprocessing

  • OpenRefine - Free data preprocessing tool for cleaning up messy data
  • Tabula - Free tool for extracting data tables from PDFs into CSV format

Data Analysis & Visualization

  • WCM Library Scientific Software Hub - Information on, and access to, software that is licensed to Weill Cornell Medicine faculty, staff, and students or is freely available
  • Power BI - Data visualization and business analytics platform from Microsoft
  • CU IT Available Software
  • CISER Computing Accounts - Apply for remote access to use statistical software and other applications on the research servers from your computer. Service provided by the Cornell Institute for Social and Economic Research.
  • R - Free environment for statistical computing and graphics built around the R language. Available for Windows, Mac, and UNIX platforms. Consider also installing RStudio
  • Python - Programming language for analysis and other applications that lets you work quickly and integrate systems effectively. See also the Anaconda Distribution of Python for access to extras like the IPython Notebook
  • SPSS - Software package used for statistical analysis and predictive analytics.
  • Stata - General use statistical software package
  • SAS - Integrated system of software products focused on analytics. Available in the Library Computer Lab during periods when courses using the software are taught there
  • ArcGIS - Geographic Information System (GIS) for working with maps and geographic data.
  • WEKA - Free machine learning algorithm suite for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Developed by the University of Waikato
  • D3.js - Open source JavaScript library for manipulating documents based on data.
  • Tableau - Family of interactive data visualization products with free one month trials available. Tableau Public is freely available. Tableau On Demand Training and Tutorials are available to demonstrate different aspects of the Tableau interface.
  • Sci2 - A modular toolset designed for the study of science. It supports temporal, geospatial, topical, and network analysis and visualization of datasets at individual, local, and global levels.
  • Gephi - Open source graph visualization and manipulation software

Helpful Resources

Data Analysis

  • General Analysis
    • Guide to Resources for Data Analysis - This guide from Northwestern University lists resources for data analysis, organized by software package and level of proficiency.
    • Data Analysis Examples - From the UCLA Institute for Digital Research and Education. Examples use Stata, SAS, SPSS, MPlus, and R
  • R Specific Resources
    • RFUN - Comprehensive, easy-to-follow tutorials
  • SAS Specific Resources
  • SPSS Specific Resources
    • IBM SPSS Tutorials - Tutorials and case study examples to walk you through how to work with SPSS
    • Learning SPSS Guide - Resources provided by the Institute for Digital Research and Education at UCLA

Data Visualization 

Additional Training

  • LinkedIn Learning - Comprehensive learning modules covering a wide variety of data-related topics
  • CISER Workshops - Downloadable computing workshops provided by the Cornell Institute for Social and Economic Research
  • Data Analysis - Free online course periodically offered through Coursera.org, taught by the Johns Hopkins Bloomberg School of Public Health