SQL Basics for Epidemiological Research Project



Project Description

This SQL project is designed for beginners who want to practice writing SQL queries against a real-world dataset. You will work with the COVID-19 data collection maintained by Our World in Data, which is updated daily and covers confirmed cases, deaths, and testing, along with other variables of potential interest. The main goal is to help you become familiar with SQL syntax and basic data analysis tasks. The project provides detailed explanations of SQL queries and concepts, together with interactive coding examples and challenges.


Dataset

This project uses the Our World in Data COVID-19 dataset, available on Kaggle.


Relevant Skills You May Apply

Basic SQL knowledge


Skills You May Gain

Introductory to intermediate SQL syntax and queries


Total Time

Approximately 10 hours (1 to 2 weeks)


Milestones

Milestone 1: Libraries and Database Setup
Milestone 2: Loading the Dataset
Milestone 3: Basic SQL Queries
Milestone 4: Filtering Data
Milestone 5: Aggregating Data
Milestone 6: Grouping Data
Milestone 7: Sorting Data
Milestone 8: Joining Tables
Milestone 9: Subqueries
Milestone 10: Window Functions
Milestone 11: Data Analysis and Visualization
Milestone 12: Closing Connection
Milestone 13: Summary of Basic SQL Commands
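
To give a feel for how the milestones above fit together, here is a minimal sketch of the workflow in a Python notebook. It assumes SQLite through Python's built-in sqlite3 module, which the project description does not specify, and it uses two tiny hypothetical tables (covid and population) whose names, columns, and values are invented for illustration and are not taken from the Our World in Data dataset. The window function step needs SQLite 3.25 or newer.

```python
import sqlite3

# Milestones 1-2: open a database and load data.
# ASSUMPTION: an in-memory SQLite database stands in for the real setup.
# All table names, column names, and values below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covid (location TEXT, date TEXT, new_cases INTEGER, new_deaths INTEGER)"
)
conn.executemany(
    "INSERT INTO covid VALUES (?, ?, ?, ?)",
    [
        ("A", "2021-01-01", 100, 2),
        ("A", "2021-01-02", 150, 3),
        ("A", "2021-01-03", 120, 1),
        ("B", "2021-01-01", 40, 0),
        ("B", "2021-01-02", 60, 1),
    ],
)
conn.execute("CREATE TABLE population (location TEXT, people INTEGER)")
conn.executemany("INSERT INTO population VALUES (?, ?)", [("A", 1000), ("B", 500)])

# Milestones 3-4: a basic SELECT with a WHERE filter.
high_days = conn.execute(
    "SELECT location, date, new_cases FROM covid WHERE new_cases >= 100 ORDER BY date"
).fetchall()

# Milestones 5-7: aggregate with SUM, group by location, sort by the total.
totals = conn.execute(
    """
    SELECT location, SUM(new_cases) AS total_cases
    FROM covid
    GROUP BY location
    ORDER BY total_cases DESC
    """
).fetchall()

# Milestone 8: join the case counts to the population table.
per_person = conn.execute(
    """
    SELECT c.location, SUM(c.new_cases) * 1.0 / p.people AS cases_per_person
    FROM covid AS c
    JOIN population AS p ON p.location = c.location
    GROUP BY c.location, p.people
    ORDER BY c.location
    """
).fetchall()

# Milestone 9: a subquery that keeps days above the overall daily average.
above_avg = conn.execute(
    """
    SELECT location, date, new_cases
    FROM covid
    WHERE new_cases > (SELECT AVG(new_cases) FROM covid)
    ORDER BY location, date
    """
).fetchall()

# Milestone 10: a window function computing a running total per location.
running = conn.execute(
    """
    SELECT location, date,
           SUM(new_cases) OVER (PARTITION BY location ORDER BY date) AS running_cases
    FROM covid
    ORDER BY location, date
    """
).fetchall()

# Milestone 12: close the connection when finished.
conn.close()

print(totals)
print(running)
```

In the actual project, the toy INSERT statements would be replaced by loading the downloaded dataset, and the query results would feed the analysis and visualization in Milestone 11.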


Deliverables

Deliverables include a project report highlighting new skills gained and an interactive Python notebook (Jupyter/Google Colab).