Data Essentials & Hypothesis Building Project
Project Description
It is the first day of your new job at Acme Insurance, an American medical insurance company. You have been asked to do some preliminary research on the impact of Long COVID on healthcare in the U.S. This project has 10 milestones, with several Tasks to complete along the way. It should take you no more than 40 hours to finish (about 3 hours per week during a 12-week semester).
Together, we’ll clean a Long COVID dataset, learn about the ethics of data science research, do some preliminary analysis, and develop a working hypothesis about the impact of Long COVID. We’ll build some simple data visualizations, test our hypothesis, write up a short conclusion, and prepare a brief presentation. Your project should prepare you to share your research with your colleagues (Milestone 10, optional). At the end of the project, you will submit a completed version of this document (Tasks are highlighted in green and Milestones in purple), a spreadsheet of your work in Excel or Google Sheets, and any visualization materials you develop. Refer to the CIC Slack Channel for project updates and to request support at any time.
Learn more about the COVID Information Commons (CIC) Student Working Group, including future projects and examples of submitted student projects on the CIC website.
Dataset
CDC Long COVID Household Pulse Survey
Relevant Skills You May Apply
There are no prerequisites for this data science project!
Skills You May Gain
Data cleaning, data analysis, data science ethics, data visualization, scientific communications and presentations, Excel (pivot tables, formulas, etc.), scientific research
Total Time
Approximately 40 hours
Milestones
Milestone 1: Preliminary Research
Milestone 2: Finding & Sourcing Quality Data
Milestone 3: Diving into the Documentation & Identifying Bias
Milestone 4: The Basics of Data Prep & Cleaning
Milestone 5: Exploratory Data Analysis
Milestone 6: Continued Analysis
Milestone 7: Testing our Hypothesis
Milestone 8: Summarizing our Conclusions
Milestone 9: Visualizing our Findings
Milestone 10: Sharing our Insights
Deliverables
Deliverables include a final data visualization, a written summary of your research insights, and an optional presentation of your findings.