Data Science 1: Introduction to Data Science
Subject & Catalog Number
STAT 109A
Course Information
Description
Data Science 1 is the first half of a one-year introduction to data science. The course focuses on the analysis of messy, real-life data to make predictions using statistical and machine learning methods. The material integrates the five key facets of a data-driven investigation: (1) data collection – data wrangling and cleaning to obtain a suitable dataset; (2) data management – accessing data quickly and reliably; (3) exploratory data analysis – generating hypotheses and building intuition; (4) prediction or statistical learning – developing and applying models such as linear and logistic regression, k-nearest neighbors, decision trees, and probabilistic approaches based on Bayes’ rule; and (5) communication – summarizing results through visualization, storytelling, and interpretable summaries.
This is the first part of a two-course sequence. The curriculum builds throughout the academic year, and students are strongly encouraged to enroll in both the fall and spring courses within the same academic year.
Course Notes
Only one of the following can be taken for credit: STAT 109A, STAT 121A, CS 109A, AC 209A.
Available for Harvard Cross Registration
NOTE: This course requires additional sections, which are selected during enrollment.