Modeling Data
Follow these Khan Academy modules to learn about variable relationships, distributions, and study design. Check out our Getting Started page for a review of categorical and quantitative variables.
- Modeling data distributions: Calculate percentiles and z-scores to assess model distributions.
- Exploring bivariate numerical data: Use scatter plots to explore the relationship between quantitative variables.
- Statistical study design: Learn about different types of studies and their sampling and data collection methods.
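As a quick preview of the first module, here is a minimal Python sketch of a z-score and a percentile rank. The `scores` list is made-up illustration data, and the helper names `z_score` and `percentile_rank` are our own, not taken from the Khan Academy material. The percentile rank uses the "at or below" convention, which is only one of several in common use.

```python
from statistics import NormalDist, mean, pstdev

# Hypothetical exam scores, used only for illustration.
scores = [62, 70, 74, 78, 81, 85, 90]


def z_score(x, data):
    """Number of population standard deviations x lies from the mean of data."""
    return (x - mean(data)) / pstdev(data)


def percentile_rank(x, data):
    """Percentage of values in data that are at or below x."""
    return 100 * sum(value <= x for value in data) / len(data)


z = z_score(85, scores)
# Percentile implied by a normal model with the same mean and spread.
normal_model_percentile = 100 * NormalDist().cdf(z)

print(f"z-score of 85: {z:.2f}")
print(f"empirical percentile rank: {percentile_rank(85, scores):.1f}")
print(f"normal-model percentile: {normal_model_percentile:.1f}")
```

Comparing the empirical percentile rank with the normal-model percentile is one rough way to see how well a normal model fits the data.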
Probability and Sampling
These modules focus on probability topics and sampling distributions.
- Probability basics: Learn about permutations, combinations, and more.
- Probability basics continued: Learn how to use the law of total probability.
- Random variables: Calculate probabilities and expected values of random variables.
- Sampling distributions: Learn about sample proportions and sample means.
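The ideas named in these modules can be tried out in a few lines of Python. The sketch below is only an illustration: the two-machine defect scenario and the helper names `total_probability` and `expected_value` are assumptions made for the example, not part of the Khan Academy modules. Exact fractions are used so the results are easy to check by hand.

```python
from fractions import Fraction
from math import comb, perm


def total_probability(partition):
    """Law of total probability: sum of P(B_i) * P(A | B_i).

    partition is a list of (P(B_i), P(A | B_i)) pairs whose P(B_i) sum to 1.
    """
    if sum(p_b for p_b, _ in partition) != 1:
        raise ValueError("the P(B_i) values must sum to 1")
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in partition)


def expected_value(pmf):
    """Expected value of a discrete random variable given as {value: probability}."""
    return sum(value * prob for value, prob in pmf.items())


# Hypothetical scenario: machine 1 makes 3/5 of all parts with a 1/50 defect
# rate, and machine 2 makes the remaining 2/5 with a 1/20 defect rate.
p_defect = total_probability([
    (Fraction(3, 5), Fraction(1, 50)),
    (Fraction(2, 5), Fraction(1, 20)),
])

# A fair six-sided die as a random variable.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print("ordered picks of 2 from 5:", perm(5, 2))
print("unordered picks of 2 from 5:", comb(5, 2))
print("P(defect):", p_defect)
print("E[die roll]:", expected_value(die))
```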
Hypothesis Testing
Learn how to test if your data support your hypotheses.
- Confidence intervals: Understand the purpose of confidence intervals for means and proportions.
- Hypothesis testing: Learn how to test for statistical significance.
- Confidence intervals and hypothesis testing for bivariates: Apply confidence intervals and significance tests to two-sample comparisons.
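To make the idea of a confidence interval concrete, here is a minimal sketch for a sample mean. It assumes a normal (z) approximation, which is reasonable for large samples; for small samples a t-based interval would be wider. The function name `mean_confidence_interval` and the `sample` values are hypothetical and used only for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(data, confidence=0.95):
    """Approximate confidence interval for the mean, using a z critical value."""
    n = len(data)
    center = mean(data)
    standard_error = stdev(data) / sqrt(n)  # stdev is the sample standard deviation
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * standard_error
    return center - margin, center + margin


# Made-up measurements, repeated to give a moderately large sample.
sample = [10, 12, 14, 16, 18] * 8

low, high = mean_confidence_interval(sample)
print(f"95% interval for the mean: ({low:.2f}, {high:.2f})")
```

Raising `confidence` to 0.99 widens the interval, which reflects the trade-off between confidence level and precision.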
Intermediate Data Science
Follow these IBM OpenDS4All modules to get started with data representation and machine learning. Each module contains lecture slides, sample code in Jupyter Notebook, and homework problems. IBM transferred the management of the OpenDS4All GitHub repository to the NEBDHub in June 2023.
- Data Acquisition and Wrangling: Slides and Jupyter Notebook
- Data Representation and Modeling: Slides and Jupyter Notebook
- Unsupervised Machine Learning: Slides and Jupyter Notebook
- Supervised Machine Learning: Slides and Jupyter Notebook
Also check out the UC Berkeley course Data 8: Foundations of Data Science, which covers computational and programming skills, inferential thinking, and privacy and study design.
Join the open Big Data in Education course (from the University of Pennsylvania and Teachers College, Columbia University) to learn methods and strategies for using large-scale educational data to improve education and make discoveries about learning.
Stay Connected with Us
Email us at nsdc@nebigdatahub.org with any inquiries or questions.
Some ways to stay connected with the NSDC community:
- Join our Slack channel
- Follow us on Twitter, Instagram, or LinkedIn
- Subscribe to the Northeast Hub YouTube channel
- Sign up for our NSDC mailing list
- Check out the REAL Volunteer Program for more collaboration opportunities