Success Stories

The Northeast Hub is a community convener, collaboration hub, and catalyst for data science innovation in the Northeast Region. The Hub amplifies the community's successes and shares credit widely to encourage collaboration and mutual success in data science endeavors.

Success stories highlight community activities that have accomplished significant project goals. These include outcomes that demonstrate the value and insights delivered by project activities, resources that the community can leverage for broader impact, and requests for collaboration that could lead to new partnerships, insights, and publications.


Using a data-driven approach to study health disparities and secular trends in the chemical and individual exposome in the NHANES (National Health and Nutrition Examination Surveys)

Guest post by Chirag Patel, Harvard Medical School. This Success Story is a report on the results of the Northeast Big Data Innovation Hub’s 2021 Seed Fund program. This research project considered the health challenges posed by environmental hazards across the U.S., with a particular focus on the health disparities […]

Improving Data Integrity Awareness in HPC Datasets using Sparsity Profiles

Guest post by Dr. Seung Woo Son, Associate Professor, University of Massachusetts, Lowell. This Success Story is a report on the results of one of the awards in the Northeast Big Data Innovation Hub’s 2021 Seed Fund program. As scientists conduct analyses that rely on large-scale simulations to achieve breakthroughs […]

Pala Students Win ‘Best New Team’ Award at 2022 DataJam

Supercomputers can complete tasks so impressive that they have outpaced science fiction movies. However, they do more than that: they can provide inspiration to the next generation of scientists. One supercomputing center inspiring youth is the San Diego Supercomputer Center, or SDSC, at the University of California San Diego, which provides […]

Using Data Science To Study Environmental Racism, Justice, And Policy

Guest post by Dr. Aunshul Rege, Temple University. This Success Story is a report on the results of the Northeast Big Data Innovation Hub’s 2020 Seed Fund program. This project examined environmental injustice using a qualitative criminological lens. The project surveyed known case studies of environmental injustice in the United […]

Data Literacy as an Enabler to Broaden the Participation Of Underrepresented Minorities in STEM Careers

Guest post by Dr. Babak D. Beheshti, New York Institute of Technology. This Success Story is a report on the results of the Northeast Big Data Innovation Hub’s 2020 Seed Fund program. The objective of this project was to expand data literacy and broaden the participation of underrepresented minorities and […]

Researchers from NYU Tandon release 3-D data tracking human interactions outside of coronavirus hotspots

Study to set groundwork to build machine learning models that rapidly analyze how a virus spreads. In April, when New York City was under a strict lockdown, a team of 16 student researchers from New York University’s Tandon School of Engineering commenced a National Science Foundation Rapid Response Research (RAPID) […]

CUAHSI has been selected to be the Coordinating Hub for the Critical Zone (CZ) Collaborative Network

Guest post by Jerad Bales, Executive Director, CUAHSI. The Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI), located in Cambridge, MA, has been selected to be the Coordinating Hub for the National Science Foundation’s Critical Zone (CZ) Collaborative Network. The 5-year cooperative agreement became effective September 1, 2020. The […]

Water Data and Software Services to Support Discovery, Reproducibility, and Collaboration in the Water-Resources Domain and Beyond

Guest post by Emily Clark, Project Manager, CUAHSI. The mission of the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) is to enable interdisciplinary collaboration in the water sciences, provide critical cyberinfrastructure, and promote water science education at all levels. CUAHSI’s services can be especially useful in […]


Data-driven workshops help teachers understand and engage with students

Guest post by Ivon Arroyo, Associate Professor in the College of Education and the College of Computer Science at the University of Massachusetts Amherst. The Big Data for Education Spoke’s professional development workshops have empowered teachers to leverage data to identify and answer a variety of pedagogical questions about their […]

Massive online open course teaches machine learning and data mining for education research

Guest post by Ryan Baker, Associate Professor in the Graduate School of Education at the University of Pennsylvania. The Northeast Big Data for Education Spoke has conducted considerable outreach on methods for data science for educational data sets. Workshops have been conducted in New York City, Buffalo, Philadelphia, Pittsburgh, and […]


ASSISTments Longitudinal Data Competition challenges participants to determine correlation between early mathematics education and STEM careers

Guest post by Ryan Baker, Associate Professor in the Graduate School of Education at the University of Pennsylvania. The ASSISTments Longitudinal Data Competition invited data scientists around the world to participate in a competition around the analysis of student data. Data from middle school student use of a popular online […]

Data Science for All: NE Hub workshop explores teaching data science to high schoolers

Data science is expanding rapidly in undergraduate education, but at the K-12 level, few schools have integrated this critical subject into their curricula. Many questions must be answered first: how should data science be taught to high schoolers? As a standalone course, or integrated throughout other courses? What level of […]


Largest-ever cohort of U.S. twins fuels new BD Spoke study

Studying the causes of disease is essential to medical research. However, the discussion is sometimes framed, misleadingly, as ‘nature vs. nurture’—is your condition the result of your genetics or your environment? Generally, the answer is both. But to what degree? A new study in Nature Genetics explores this question for […]

Funding Awarded for First Round of NEBDIH-Sponsored Big Data Workshops

As part of our mission to address high-priority challenges with data-driven solutions, the Northeast Big Data Innovation Hub put out a call for workshop proposals this spring. We sought to support community-driven workshops that are designed to plan and develop Big Data projects, and are delighted to announce our first […]


UMass Amherst, WPI, Penn announce winners of Northeast big data competition

Winning Schemes for Predicting Student Interest in Science. AMHERST, Mass. – After a year-long, global data-mining competition, organizers today awarded the top three winning teams from Hong Kong, Japan and Michigan at the National Science Foundation’s (NSF) Northeast Big […]


Big Data in Education: News and Competition

How can big data help predict student outcomes? Ryan Baker (U Penn) and Neil Heffernan (Worcester Polytechnic Institute) of our Big Data for Education Spoke hope to do just that via the Longitudinal Educational Big Data Competition. Using carefully de-identified, real-world educational data, participants will predict whether 172 students in validation and […]


“Enabling Seamless Data Sharing in Industry and Academia” Workshop Report Released

Data sharing challenges are extensive in cases involving industry and academia, and highlight the need for sharable, adaptable solutions. To that end, a report summarizing the proceedings and outputs of 2016’s Northeast Hub workshop on Data Sharing has recently been published. The workshop convened data science practitioners to […]

Telling a Story with Data: Young Innovator Kenneth Graves

This post highlights one of the up-and-coming data science graduate students who participated in the Northeast Big Data Innovation Hub’s “Young Innovators” program this year. This program and others like it contribute to the Northeast Hub’s mission to build public-private partnerships to address high-priority societal challenges with data-driven solutions. The […]


NEBDIH Data Sharing Workshop a Success: “I wish I could have gone to this workshop two years ago!”

On September 29th and 30th, stakeholders from across the Northeast and beyond joined the Hub at Drexel University in Philadelphia for “Enabling Seamless Data Sharing in Industry and Academia,” a cross-sector workshop put on by our community to tackle the challenges of sharing data head-on. In short “TED talk”-style presentations and […]