This post highlights one of the up-and-coming data science graduate students who participated in the Northeast Big Data Innovation Hub’s “Young Innovators” program this year. This program and others like it contribute to the Northeast Hub’s mission to build public-private partnerships to address high-priority societal challenges with data-driven solutions. The Northeast Hub gratefully acknowledges the support of the Computing Community Consortium for this program.
Before he was a data scientist, Kenneth Graves was a high school English teacher. “I’ve always wanted to bridge the gap between the research community and schools,” the Teachers College doctoral student says. “I’m really compelled by this space where tech, education leadership and social justice come together.”
At the same time, he knows that using tech in education is not plug-and-play; it requires thoughtful implementation. He gives an example: “I used to teach at a lower-income public school that had provided a laptop for every student,” he says, “but there wasn’t any coordinated tech support for the teachers.”
Hoping to influence educational policy for the better, Graves spent this summer in the Northeast Big Data Innovation Hub’s Young Innovators program, using data science to draw meaningful insights from surveys of New York City computer science students. “There was an opportunity to explore some really in-depth questions there,” he notes.
The data Graves worked with had been collected by the NYC Foundation for Computer Science Education (CSNYC) as part of the Obama Administration’s Computer Science for All (CS4All) initiative. Over 60 school districts have committed to expanding computer science education opportunities, impacting over 4 million students. New York City has pursued an especially ambitious plan: within 10 years, every public school student, from kindergarten through high school, will learn computer science, empowering them to think and solve problems in new ways.
CSNYC partnered with the Department of Education on this initiative. Their efforts included collecting data on program efficacy, in the form of surveys of NYC public school students who participated in the initiative’s first year. Graves and his advisor, Dr. Alex Bowers of Columbia University, then worked with CSNYC for three months, analyzing this pilot data.
“My first time working with data that has real implications,” Graves says, grinning. “We wanted to tell a story about CS4All with the data. What worked? What didn’t? What can we learn from this first year that can be used to enhance the program and outcomes for students? It was a great, head-first experience!”
Bowers and Graves ran a latent class analysis on the survey data, and used hierarchical clustering methods to visualize the data as a heat map. The results, Graves says, were immediately apparent: “Looking at this heat map we generated, you could clearly see the students clustering into five groups – five different types of students, who responded to the computer science program in very distinct ways.”
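For readers curious what the clustering-and-heat-map step can look like in practice, here is a minimal sketch. It is not the code Graves and Bowers used, and it leaves out the latent class analysis entirely. It assumes made-up survey data (three hypothetical response profiles on four invented 1-to-5 items, rather than the five groups found in the real data), uses a simple average-linkage hierarchical clustering written from scratch, and prints a text "heat map" with respondents reordered by cluster. All function and variable names are illustrative.

```python
import random


def euclidean(a, b):
    """Straight-line distance between two response vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(rows, n_clusters):
    """Agglomerative (bottom-up) hierarchical clustering, average linkage.

    Every respondent starts in their own cluster. The two clusters whose
    members are closest on average are merged repeatedly until n_clusters
    remain. Returns a list of lists of row indices.
    """
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None  # (average distance, index i, index j)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(euclidean(rows[a], rows[b])
                            for a in clusters[i] for b in clusters[j])
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def make_synthetic_survey(seed=0, per_group=6):
    """Build fake survey rows (ASSUMPTION: not real CS4All data)."""
    rng = random.Random(seed)
    # Hypothetical mean answers on four invented Likert items (1-5).
    profiles = [(5, 5, 5, 5), (1, 1, 1, 1), (5, 1, 5, 1)]
    labelled = []
    for label, profile in enumerate(profiles):
        for _ in range(per_group):
            # Add a little noise, then clip back into the 1-5 range.
            answers = [min(5, max(1, m + rng.choice([-1, 0, 0, 1])))
                       for m in profile]
            labelled.append((answers, label))
    rng.shuffle(labelled)  # respondents arrive in no particular order
    rows = [answers for answers, _ in labelled]
    truth = [label for _, label in labelled]
    return rows, truth


SHADES = " .:*#"  # one character per Likert value, light (1) to dark (5)


def text_heatmap(rows, clusters):
    """Render rows grouped by cluster, one shaded character per answer."""
    lines = []
    for k, members in enumerate(clusters):
        for i in members:
            cells = "".join(SHADES[v - 1] for v in rows[i])
            lines.append(f"group {k} | {cells}")
    return "\n".join(lines)


rows, truth = make_synthetic_survey()
clusters = average_linkage(rows, n_clusters=3)
heatmap = text_heatmap(rows, clusters)
print(heatmap)
```

Once the rows are reordered by cluster, respondents with similar answer patterns sit next to each other, which is what makes groups visible in a heat map. A real analysis would more likely rely on an established library, for example SciPy's `scipy.cluster.hierarchy` module for the clustering and a plotting library for the heat map.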
The five types of students, according to the data, were:
1. Enthusiasts – Graves describes these as “the future CS majors, the kids going to the lab in their free time and exploring CS outside of the classroom.”
2. Engaged students – These students might not be interested in computer science for its own sake, but they were curious about the bigger picture, and how the program might ultimately benefit them.
3. Bookish students – These are the academically motivated students, the high achievers across the board. They see computer science as another opportunity to excel, but ultimately are not pursuing it outside of the classroom. “This is the kid who’s like a Division I athlete,” Graves says. “They’ve got the skill set, but they’re not going to go pro.”
4. Idlers – These students might not like their teacher or class, but believe the subject is important. There’s a tension for them between the material and the presentation.
5. Disinterested students – Those who indicate no interest in computer science. Graves notes that these students reported their teachers did not make learning computer science seem interesting or appealing.
Having this kind of insight about participating students creates an opportunity to fine-tune the initiative toward addressing each type’s needs more effectively over the next 9 years. It also allows for a much more nuanced view of student engagement. “Disinterest and interest are not two distinct categories,” Graves says. “Based on this data, it’s clearly not enough to say only that a student is interested or not interested in computer science. You have to go deeper to examine the nuances of their disinterest or interest, and what you find should guide how policy is created and implemented.”
He notes there is more that can be done. “No demographic data was collected with the surveys, for example,” he says. “Are African Americans more likely to cluster to a particular type? We don’t know.” However, the analysis provides a vital first step toward a more granular understanding. Graves is sharing his code for this analysis with CSNYC for further exploration of student data down the road.
The most important outcome, for Graves, was bringing these insights on students to light. “We were able to help clarify the work of the R&D team by telling a story with their data,” he says. “That story will have a real impact on how students in New York learn computer science.”
He is also grateful for the opportunity to partner with CSNYC and the Department of Education on a substantive project. “They have just done tremendous work in launching and collecting the data,” he says. “It was a great opportunity to prioritize the use of large-scale data, rather than anecdotes. And there is a dearth of research using advanced statistical methods in the computer science education community,” he adds. “It was an amazing opportunity for me to enter this space as an early career researcher.”
Ultimately, Graves hopes to delve deeper into education research using data science. “Schools use and interpret and perceive data in different ways than the research community does,” he says. “I want to traverse both arenas, and get them to talk more with each other.”