Data science, machine learning, deep learning, and artificial intelligence are among the most in-demand skills today and can lead to well-paid careers. Harvard University offers free data science and AI courses on the online learning platform edX. This article covers 15 popular data science courses from Harvard University, along with their length, level, and learning outcomes.
Free Data Science Courses from Harvard University
Statistics and R
Length – 4 Weeks
Level – Intermediate
You will learn
- Random variables
- Distributions
- Inference: p-values and confidence intervals
- Exploratory data analysis
- Non-parametric statistics
Data Science: Linear Regression
Length – 8 Weeks
Level – Introductory
You will learn
- How linear regression was originally developed by Galton
- What confounding is and how to detect it
- How to examine the relationships between variables by implementing linear regression in R
To learn more about data science, read our blog post: What is data science?
Data Science: R Basics
Length – 8 Weeks
Level – Introductory
You will learn
- Basic R syntax
- Foundational R programming concepts such as data types, vector arithmetic, and indexing
- How to perform operations in R including sorting, data wrangling using dplyr, and making plots
Data Science: Visualization (using R)
Length – 8 Weeks
Level – Introductory
You will learn
- Data visualization principles
- How to communicate data-driven findings
- How to use ggplot2 to create custom plots
- The weaknesses of several widely used plots and why you should avoid them
Causal Diagrams: Draw Your Assumptions Before Your Conclusions
Length – 9 Weeks
Level – Introductory
You will learn
- Translating expert knowledge into a causal diagram
- Drawing causal diagrams under different assumptions
- Using causal diagrams to identify common biases
- Using causal diagrams to guide data analysis
Principles, Statistical and Computational Tools for Reproducible Data Science
Length – 8 Weeks
Level – Intermediate
You will learn
- A series of concepts, thought patterns, analysis paradigms, and computational and statistical tools
- Fundamentals of reproducible science using case studies that illustrate various practices
- Key elements for ensuring data provenance and reproducible experimental design
- Statistical methods for reproducible data analysis
- Computational tools for reproducible data analysis and version control (Git/GitHub, Emacs/RStudio/Spyder)
- Tools for reproducible data (data repositories/Dataverse), reproducible dynamic report generation (R Markdown/R Notebook/Jupyter/Pandoc), and workflows
- How to develop new methods and tools for reproducible research and reporting
- How to write your own reproducible paper
Statistical Inference and Modeling for High-throughput Experiments
Length – 4 Weeks
Level – Intermediate
You will learn
- Organizing high-throughput data
- Multiple comparison problem
- Family-wise error rates
- False Discovery Rate
- Error Rate Control procedures
- Bonferroni Correction
- q-values
- Statistical Modeling
- Hierarchical Models and the basics of Bayesian Statistics
- Exploratory Data Analysis for high-throughput data
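Two of the topics listed above, the Bonferroni correction and false discovery rate control, can be illustrated in a few lines. The course itself works in R; the following is only a minimal Python sketch, and the function names and the p-values are made up for illustration. It assumes the Benjamini-Hochberg step-up procedure as the false discovery rate method, which the course listing does not specify.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a hypothesis when p <= alpha / m (controls the family-wise error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    # Reject the hypotheses with the largest_k smallest p-values.
    reject = [False] * m
    for idx in order[:largest_k]:
        reject[idx] = True
    return reject


# Hypothetical p-values from ten tests (illustrative only, not real data).
example_p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

bonferroni = bonferroni_reject(example_p_values)
bh = benjamini_hochberg_reject(example_p_values)
print("Bonferroni rejections:", sum(bonferroni))
print("Benjamini-Hochberg rejections:", sum(bh))
```

Bonferroni compares every p-value against the same strict threshold, while Benjamini-Hochberg lets the threshold grow with the rank of the p-value, so it typically rejects at least as many hypotheses.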
Advanced Bioconductor
Length – 5 Weeks
Level – Advanced
You will learn
- Static and interactive visualization of genomic data
- Reproducible analysis methods
- Memory-sparing representations of genomic assays
- Working with multiomic cancer experiments
- Targeted interrogation of cloud-scale genomic archives
Data Science: Capstone
Length – 2 Weeks
Level – Introductory
You will learn
- How to apply the knowledge base and skills learned throughout the series to a real-world problem
- How to independently work on a data analysis project
Introduction to Bioconductor
Length – 5 Weeks
Level – Intermediate
You will learn
- What we measure with high-throughput technologies and why
- Introduction to high-throughput technologies
- Next Generation Sequencing
- Microarrays
- Preprocessing and Normalization
- The Bioconductor Genomic Ranges Utilities
- Genomic Annotation
Data Science: Probability
Length – 8 Weeks
Level – Introductory
You will learn
- Important concepts in probability theory including random variables and independence
- How to perform a Monte Carlo simulation
- The meaning of expected values and standard errors and how to compute them in R
- The importance of the Central Limit Theorem
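To make the idea of a Monte Carlo simulation concrete, here is a minimal sketch. The course teaches this in R; Python is used here purely for illustration, and the dice example, the seed, and the function names are assumptions rather than course material. It estimates the probability that two fair dice sum to seven and compares the estimate with the exact value and its standard error.

```python
import random


def estimate_probability(event, n_trials, rng):
    """Estimate P(event) as the fraction of simulated trials in which it occurs."""
    hits = sum(1 for _ in range(n_trials) if event(rng))
    return hits / n_trials


def two_dice_sum_to_seven(rng):
    """Simulate one roll of two fair dice and report whether they sum to 7."""
    return rng.randint(1, 6) + rng.randint(1, 6) == 7


n_trials = 100_000
rng = random.Random(2024)  # fixed seed so the run is repeatable
estimate = estimate_probability(two_dice_sum_to_seven, n_trials, rng)

exact = 6 / 36  # six of the 36 equally likely outcomes sum to 7
# Standard error of a sample proportion: sqrt(p * (1 - p) / n).
standard_error = (exact * (1 - exact) / n_trials) ** 0.5

print(f"estimate = {estimate:.4f}, exact = {exact:.4f}, SE = {standard_error:.4f}")
```

By the Central Limit Theorem the estimate is approximately normally distributed around the exact value, so it should usually land within two or three standard errors of it.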
Data Science: Inference and Modeling
Length – 8 Weeks
Level – Introductory
You will learn
- The concepts needed to define estimates and margins of error (populations, parameters, estimates, and standard errors) in order to make predictions from data
- How to use models to aggregate data from different sources
- The very basics of Bayesian statistics and predictive modeling
Data Science: Wrangling
Length – 8 Weeks
Level – Introductory
You will learn
- Importing data into R from different file formats
- Web scraping
- Tidying data using the tidyverse to better facilitate analysis
- String processing with regular expressions (regex)
- Wrangling data using dplyr
- How to work with dates and times stored in different formats
- Text mining
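String processing with regular expressions is easiest to grasp through a small example. The course does this in R; the sketch below uses Python only for illustration, and the height-parsing task, the pattern, and the function name are assumptions rather than course material. It converts free-text heights such as 5'7" into inches and returns None for entries that do not match.

```python
import re

# One digit of feet, an apostrophe, one or two digits of inches,
# and an optional closing double quote; whitespace is tolerated throughout.
HEIGHT_PATTERN = re.compile(r"""^\s*(\d)\s*'\s*(\d{1,2})\s*"?\s*$""")


def height_to_inches(text):
    """Return the height in inches, or None if the text is not in feet'inches form."""
    match = HEIGHT_PATTERN.match(text)
    if match is None:
        return None
    feet = int(match.group(1))
    inches = int(match.group(2))
    return feet * 12 + inches


# Hypothetical raw survey entries (illustrative only).
raw_entries = ["5'7\"", "6' 1", "5'10", "tall"]
parsed = [height_to_inches(entry) for entry in raw_entries]
print(parsed)
```

Returning None for unmatched entries keeps the problem rows visible, so they can be inspected and handled with additional patterns instead of being silently dropped.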
Data Science: Productivity Tools
Length – 8 Weeks
Level – Introductory
You will learn
- Using Unix/Linux to manage your file system
- Performing version control with Git
- Starting a repository on GitHub
- Leveraging the many useful features provided by RStudio
Data Science: Machine Learning
Length – 8 Weeks
Level – Introductory
You will learn
- The basics of machine learning
- How to perform cross-validation to avoid overtraining
- Several popular machine learning algorithms
- How to build a recommendation system
- What regularization is and why it is useful
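Cross-validation and regularization fit together naturally: the first is a way to choose how strong the second should be. The course covers these in R; what follows is a minimal Python sketch under stated assumptions. The one-variable, no-intercept ridge regression, the simulated data, the candidate penalties, and all function names are illustrative choices, not the course's own method.

```python
import random


def fit_ridge_slope(xs, ys, penalty):
    """Slope of a no-intercept ridge regression: sum(x*y) / (sum(x^2) + penalty)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for fold in range(k):
        size = n // k + (1 if fold < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_mse(xs, ys, penalty, k=5):
    """Average squared prediction error on held-out folds."""
    n = len(xs)
    total_squared_error = 0.0
    for test_idx in k_fold_indices(n, k):
        held_out = set(test_idx)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        slope = fit_ridge_slope(train_x, train_y, penalty)
        # Score the model only on data it did not see during fitting.
        total_squared_error += sum((ys[i] - slope * xs[i]) ** 2 for i in test_idx)
    return total_squared_error / n


# Simulated data (illustrative only): true slope 2.0 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(60)]
ys = [2.0 * x + rng.gauss(0, 0.5) for x in xs]

penalties = [0.0, 0.1, 1.0, 10.0, 100.0]
scores = {penalty: cross_validated_mse(xs, ys, penalty) for penalty in penalties}
best_penalty = min(scores, key=scores.get)
print("best penalty:", best_penalty)
```

Because every fold is scored on observations the model never saw during fitting, a penalty that merely memorizes the training data gains no advantage, which is how cross-validation guards against overtraining.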