DS4All Workshop: Computer Science for All

Guest post by Catherine Cramer, Founder and Principal, Woods Hole Institute.

The Northeast Big Data Innovation Hub has supported the work of DS4All since 2015. While this work has several strands, the focus on data science in K-12 education remains prominent.

The most recent effort was a two-day workshop in January 2020, in which 41 data science experts, practitioners, educators and researchers gathered at Cornell Tech on Roosevelt Island in New York City. The goal of the workshop was to identify successes, challenges, and solutions for improving the use of data science in formal learning settings in general, and in high school specifically. Through brainstorming in a variety of formats (talks, panels, breakouts, demonstrations and group discussion) and then articulating concrete ideas, the group determined pathways for bringing data science into high school, along with a number of open questions.

The workshop included four elements: 

1) identifying the gaps in data science education within computer science

2) a series of presentations from tool and software developers and learning researchers describing validated practices

3) collaborative brainstorming sessions to draft a fundamental set of current resources, challenges and solutions

4) authoring proposal ideas to articulate the path forward


Among the group’s conclusions is that a grasp of data science is vital for people to be fully prepared for the world of work, as well as to be enlightened twenty-first-century members of society. While there was much discussion of the difference between data literacy and data science and the roles they play in education, there was consensus that data literacy is closely aligned with ethics, security and privacy, and with the ability to be empowered by data. Data science, on the other hand, was more clearly defined as the ability to use tools and skills to gather, process and analyze data for a variety of purposes, including advanced applications such as artificial intelligence.

The role of data science and/or data literacy in computer science pedagogy was not fully resolved. Significantly, though, the group agreed that data literacy and/or data science in some form needs a bigger role in teaching and learning practice across disciplines, wherever it fits the need and enhances learning of those topics. How data science fits directly into high school computer science courses remained unclear, as did the value of a stand-alone course in data science. One area of clarity was the recognition that a lack of access to data science skills could contribute to existing inequities in high school education.

The group agreed unanimously that addressing the needs of teachers is key to making headway in the further development of applications of data science at the high school level. This includes bringing teachers fully into the development process and providing ongoing support and community building. Additionally, addressing issues of equity and accessibility emerged as being a top priority as these frameworks are built out. The participants felt strongly that we are at a pivotal moment in narrowing the data divide by providing data science educational assets and support throughout public school systems nationally.

Next Steps

While many tools, resources and approaches to data science education are available, they are not part of a cohesive whole, nor are they entirely inclusive. The findings from this workshop suggest that a true coordination of effort is needed, as well as a thorough analysis of the learning ecosystem in which data science fits. This means knowing which programs and resources have already been developed and tested, knowing which programs and outcomes appear in prior grades, and clarifying what is happening at the undergraduate level.

To be equitable, opportunities for data science education must be available to all students. This requires curriculum and resource development that is inclusive of stakeholders, tools that are co-designed with teachers, and ongoing training and support for teachers. Ensuring equity and accessibility also means offering data science across the high school curriculum rather than exclusively through computer science.

This will likely require:

  • Development of a consensus high school data science framework that can align to curricula, standards, and scope and sequence; that can be adopted by policymakers and made available and supported at the federal, state and local levels; and that provides clear pathways into undergraduate-level data science education.
  • Close work with teachers to identify where and how data science and data literacy fit. This includes initiating participatory design practices that directly involve educators and curriculum developers, alongside tool developers, data scientists, and learning scientists, in the co-creation of accessible and adaptable data science curricula, tools and resources for the classroom.
  • Development of a new model to knit together existing data science curricula, tools, programs, games, and data sources into a unified whole. 

In sum, while significant strides have been made, a concurrent top-down and bottom-up approach is needed to accelerate the convergence of data science with teaching and learning, and to make that convergence scalable and sustainable, with a true and lasting impact on our students.

As an outcome of the workshop, DS4All submitted a proposal to NSF in the Computer Science for All program, in partnership with Northeast Hub members University of Pittsburgh and Gulf of Maine Research Institute. The proposal is focused on participatory design work with high school teachers in order to bring data science into their classrooms. Stay tuned for the next chapter in this work!

For more information, you can download the workshop report HERE

This effort is supported by the National Science Foundation under Award Number 1922898 to the New York Hall of Science. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.