
SQL Basics for Epidemiological Research
Project Description
This SQL project is designed for beginners who want to practice writing SQL queries on a real-world dataset. You will work with COVID-19 data maintained by Our World in Data, which is updated daily and covers confirmed cases, deaths, and testing, along with other variables of potential interest. The main goal is to help you become familiar with SQL syntax and basic data analysis tasks. The project provides detailed explanations of SQL queries and concepts, along with interactive coding examples and challenges.
Dataset
This project uses the Our World in Data COVID-19 dataset, available on Kaggle.
Relevant Skills You May Apply
Basic SQL knowledge
Skills You May Gain
Introductory to intermediate SQL syntax and queries
Total Time
Approximately 10 hours (1 to 2 weeks)
Milestones
Milestone 1: Libraries and Database Setup
Milestone 2: Loading the Dataset
Milestone 3: Basic SQL Queries
Milestone 4: Filtering Data
Milestone 5: Aggregating Data
Milestone 6: Grouping Data
Milestone 7: Sorting Data
Milestone 8: Joining Tables
Milestone 9: Subqueries
Milestone 10: Window Functions
Milestone 11: Data Analysis and Visualization
Milestone 12: Closing Connection
Milestone 13: Summary of Basic SQL Commands
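To give a feel for what several of these milestones involve, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration under stated assumptions, not the project's actual notebook: the table name covid, the column names (location, date, new_cases, new_deaths), the function name run_demo, and the sample rows are all hypothetical stand-ins for the real dataset. The window function query also assumes the SQLite library bundled with your Python is version 3.25 or later.

```python
import sqlite3

# Hypothetical sample rows (location, date, new_cases, new_deaths).
# These are made-up values for illustration, not real figures.
SAMPLE_ROWS = [
    ("Aland", "2021-01-01", 100, 2),
    ("Aland", "2021-01-02", 150, 3),
    ("Aland", "2021-01-03", 120, 1),
    ("Borduria", "2021-01-01", 40, 0),
    ("Borduria", "2021-01-02", 60, 1),
]


def run_demo():
    # Database setup: an in-memory database keeps the sketch self-contained.
    conn = sqlite3.connect(":memory:")
    try:
        cur = conn.cursor()

        # Loading the dataset: create a table and insert the sample rows.
        cur.execute(
            "CREATE TABLE covid ("
            "location TEXT, date TEXT, new_cases INTEGER, new_deaths INTEGER)"
        )
        cur.executemany("INSERT INTO covid VALUES (?, ?, ?, ?)", SAMPLE_ROWS)

        # Filtering: keep only days with more than 100 new cases.
        filtered = cur.execute(
            "SELECT location, date, new_cases FROM covid "
            "WHERE new_cases > 100 ORDER BY date"
        ).fetchall()

        # Aggregating, grouping, and sorting: total cases per location,
        # largest total first.
        totals = cur.execute(
            "SELECT location, SUM(new_cases) AS total_cases FROM covid "
            "GROUP BY location ORDER BY total_cases DESC"
        ).fetchall()

        # Window function: running total of cases within each location.
        running = cur.execute(
            "SELECT location, date, "
            "SUM(new_cases) OVER (PARTITION BY location ORDER BY date) "
            "AS running_cases "
            "FROM covid ORDER BY location, date"
        ).fetchall()

        return filtered, totals, running
    finally:
        # Closing the connection releases the database resources.
        conn.close()


if __name__ == "__main__":
    for label, rows in zip(("filtered", "totals", "running"), run_demo()):
        print(label, rows)
```

In a notebook, the same queries could be run against a table loaded from the downloaded dataset file instead of the hand-written sample rows; the query text itself would stay largely the same.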
Deliverables
Deliverables include a project report highlighting new skills gained and an interactive Python notebook (Jupyter/Google Colab).