National Student Data Corps Video Library

Check out these videos on data science topics based on IBM’s OpenDS4All curriculum and presented by Columbia University Master’s students.

Welcome to the NSDC Video Library, where you can watch videos on data science topics based on IBM’s OpenDS4All curriculum, as well as student-created SQL and R educational materials, data science use cases, and more, presented by data enthusiasts from around the world. IBM transferred the management of the OpenDS4All GitHub repository to the NEBDHub in June 2023.

If you’re interested in creating video content leveraging IBM’s OpenDS4All content, please email the NSDC HQ team.

Introduction to Data Science

With Yucen Wang and Varalika Mahajan

Watch these videos to learn more about the difference between data science, data analytics, and data engineering. Become familiar with topics including data science models, knowledge graphs, and additional data science applications.

What is Data Science? - Part 1

What is Data Science? - Part 2

Introduction to Data Science

Data Science Ethics

With Abhishek Sinha, Varalika Mahajan, and Rahulraj Singh

These videos provide a framework for the important topic of ethics in the collection and usage of data. Watch these videos to learn more about privacy, transparency, consent, explainability and fairness in data science, and walk through a use case using the Breast Cancer Wisconsin (Diagnostic) Data Set in Part 1 of the AI Explainability series.

Data Science Ethics - Part 1

Data Science Ethics - Part 2

AI Explainability - Part 1

AI Explainability - Part 2

AI Fairness - Part 1

AI Fairness - Part 2

Data Acquisition & Wrangling

With Varalika Mahajan, Renyin Zhang, and Sanket Bhandari

Learn more about structured and unstructured data, and practice acquiring, extracting, cleaning, plotting and grouping data from a dataset with real-world examples along the way.

Data Types

Data Acquisition and Wrangling - Part 1

Data Acquisition and Wrangling - Part 2

Data Acquisition and Wrangling - Part 3

Data Wrangling Jupyter Notebook - Part 1

Data Wrangling Jupyter Notebook - Part 2

Data Integration

With Stephanie Guo, Lylybell Teran, and Varalika Mahajan

Familiarize yourself with the process of data integration, including breakdowns of the most common data quality issues, feature selection approaches, and partitional clustering and hierarchical clustering methods. Learn how to detect inconsistencies, find duplicates, and handle outliers within your dataset.

Introduction to Data Cleaning

Feature Selection

Data Clustering

Data Visualization & Modeling

With Rahul Singh and Varalika Mahajan

Review how visual interfaces, knowledge graphs, and entity-relationship modeling can help analyze datasets and illustrate algorithmic performances, and practice your skills with a COVID Case Study.

Information Visualization and Visual Analytics

Data Exploration and Visualization – A COVID Case Study

Data Representation and Modeling

An Introduction to R

With Dashansh Prajapati

Familiarize yourself with R, a programming language commonly used for statistical analysis. Throughout this video you will learn about RStudio’s Source Pane, Console Pane, Environment/History Pane, and more.

An Introduction to R

An Introduction to Structured Query Language (SQL)

Content created by Hoang Luong and presented by Gabriella Qi

Discover relational databases and relational database management systems (RDBMS) including MySQL. Learn more about common operators and practice your skills with examples along the way.

An Introduction to SQL - Part 1

An Introduction to SQL - Part 2

An Introduction to SQL - Part 3

An Introduction to SQL - Part 4

An Introduction to SQL - Part 5

Supervised Machine Learning

With Tomislav Galjanic

Watch these videos to learn more about supervised machine learning, including topics such as classifiers, decision trees, and random forests.

Supervised Machine Learning - Part 1

Supervised Machine Learning - Part 2

Supervised Machine Learning - Part 3

Supervised Machine Learning - Part 4

Supervised Machine Learning - Part 5

Linear & Logistic Regression Presentation

Linear & Logistic Regression Jupyter Notebook

Regression Analysis

Artificial Intelligence

With Lylybell Teran and Sneha Dahiya

Watch these videos to learn how to solve machine learning and artificial intelligence problems with the use of artificial and convolutional neural networks.

Artificial Neural Networks - Part 1

Artificial Neural Networks - Part 2

Artificial Neural Networks - Part 3

Convolutional Neural Networks - Part 1

Convolutional Neural Networks - Part 2

Deep Learning

With Sanket Bhandari

Learn more about artificial intelligence through deep learning methods and models. 

Transfer Learning

NSDC MasterClass Video Series

The NSDC MasterClass Video Series showcases experts who share how data science tools and techniques are being leveraged in various domains. Future episodes may highlight the intersection of data science and healthcare, finance, athletics, entertainment, technology, education, public policy, and more.

An Introduction to AI in Precision Oncology

Generative Artificial Intelligence

Data for Fame, Fortune & Championships

When Not to Use Machine Learning

Data Science Use Cases – from the AI for Social Good Fall 2020 Symposium

In 2020, the Association for the Advancement of Artificial Intelligence’s AI for Social Good Fall Symposium featured student and researcher presentations on the role AI can play in data science for social good initiatives.

Recent developments in the availability of big data and computational power are continuing to revolutionize several domains, opening up new opportunities and challenges. In this symposium, we highlight two specific themes, humanitarian relief and healthcare, where AI could be used for social good to achieve the United Nations (UN) sustainable development goals in those areas, which touch every aspect of human, social, and economic development. We expect the symposium to identify the critical needs and pathways for responsible AI solutions for achieving the sustainable development goals, which demand holistic thinking on optimizing the trade-off between automation benefits and their potential side effects.

Health Care Misinformation: An AI Challenge for Low-Resource Languages

Robust Lock-Down Optimization for COVID-19 Policy Guidance

Socioeconomic and Geographic Variations that Impact the Spread of Malaria

Asymptotic Cross-Entropy Weighting and Guided-Loss in Supervised Hierarchical Setting using Deep Attention Network

Clean Water: How the AI community can contribute to accessing water sources in developing countries

Measuring and Visualizing Social Distancing Using Deep Learning and 3D Computer Vision

Artificial Intelligence and Resource Allocation in Health Care: The Process-Outcome Divide in Perspectives on Moral Decision-Making

Two-Step Framework for Parkinson’s Disease Classification: Multiple One-Way ANOVA on Speech Features and Decision Trees

Check out the NSDC Educator Central and NSDC Learner Central for more data science resources.

Stay Connected with Us

Email us with any inquiries or questions.
