NEBDHub & NSDC Transportation Data Science Projects

Explorer Transportation Data Science Project

Project Description

Welcome to the NEBDHub & NSDC Transportation Data Science Project (TDSP)!

Using a New York City OpenData transportation dataset, learners will apply data science tools and techniques to develop data-driven insights into how roads can be made safer for all.

Participants will learn to assess potential bias in the data by comparing neighborhoods or zip codes, an important element of data science ethics. They will also create relevant data visualizations and graphs to communicate what the data shows, and develop analytical models that can be deployed in the real world.
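To give a sense of what comparing zip codes can look like, here is a minimal Python sketch using pandas. It is only an illustration under stated assumptions, not the project's prescribed method: the rows are made up, the column names "ZIP CODE" and "NUMBER OF PERSONS INJURED" are assumed to match the dataset, and the helper summarize_by_zip is a hypothetical name introduced here. The idea is to count crashes per zip code while keeping records with a missing zip code visible, since uneven missingness is one way bias can enter an analysis.

```python
import pandas as pd

# Made-up sample rows for illustration only; the column names are assumed
# to mirror the dataset's "ZIP CODE" and "NUMBER OF PERSONS INJURED" fields.
crashes = pd.DataFrame({
    "ZIP CODE": ["10001", "10001", "11201", None, "11201", "10001", None, "10453"],
    "NUMBER OF PERSONS INJURED": [0, 1, 2, 0, 0, 3, 1, 0],
})


def summarize_by_zip(df):
    """Hypothetical helper: crash and injury totals per zip code.

    Records without a zip code are grouped under "MISSING" instead of
    being dropped, so the amount of unlocated data stays visible.
    """
    labeled = df.copy()
    labeled["ZIP CODE"] = labeled["ZIP CODE"].fillna("MISSING")
    summary = (
        labeled.groupby("ZIP CODE")["NUMBER OF PERSONS INJURED"]
        .agg(crashes="size", injured="sum")
    )
    # Share of all recorded crashes that fall in each group.
    summary["share_of_crashes"] = summary["crashes"] / summary["crashes"].sum()
    return summary.sort_values("crashes", ascending=False)


summary = summarize_by_zip(crashes)
print(summary)
```

In this toy sample, a quarter of the records have no zip code. On real data, a large or uneven "MISSING" share would be worth investigating before drawing conclusions about which neighborhoods are safer.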

Review the TDSP Overview to learn more.


NYC OpenData Motor Vehicle Collisions – Crashes Dataset

Relevant Skills You May Apply

There are no prerequisites for the Explorer TDSP, although some basic knowledge of data analysis and Python can be helpful! We will guide and support you throughout this project.

Skills You May Gain

Data Cleaning, Python Programming, Time Series Analysis, Geospatial Analysis, Data Visualization, and Critical Thinking Skills

Total Time

This project consists of six milestones, each of which may take 1 to 5 hours to complete, depending on your level of experience. The project as a whole is designed to be completed within 6 to 8 weeks, or less.

If you are beginning the Explorer TDSP in February 2024, your project will be due in April 2024.

You do not need to complete this project in one sitting. Feel free to take breaks!


Milestone 1: Data Preparation

Milestone 2: Data Ethics, Pre-Processing, and Exploration

Milestone 3: Time Series Analysis

Milestone 4: Geospatial Analysis

Milestone 5: Self-Guided Research Question

Milestone 6: Virtual Poster Board Creation: Data Storytelling


At the end of this project, you will submit your Google Colab Notebook and a one-page virtual poster board that displays your final visualizations and insights. Your submission will go to the Northeast Big Data Innovation Hub and National Student Data Corps (NSDC) HQ team at Columbia University and to the U.S. Department of Transportation/Federal Highway Administration (DOT/FHWA). It will showcase your data analysis skills and your ability to communicate findings effectively.