The Hub Builds Data Science Partnerships to Address Societal Challenges
The Northeast Big Data Innovation Hub has been awarded a four-year, $4 million grant from the National Science Foundation (NSF) to continue its work in building cross-sector data science partnerships that address societal challenges, spur economic development, and accelerate big data innovation. The Northeast Hub was launched in 2015 with a grant from the NSF. This second round of funding validates the efforts of the Hub to harness the data revolution for social impact.

During its first three years of operations, the Northeast Hub built a network of more than 200 organizations throughout the Northeastern United States and beyond. Its activities fell under eight priority areas, ranging from data sharing to responsible data science. Initiatives have included collaborations across a number of application areas – for example, to develop the first-ever exposome data warehouse integrating environmental exposure and clinical data for large-scale health research – as well as efforts to develop resources helpful across application areas, such as a licensing model and ecosystem for data sharing.

“We’re very excited to expand these activities,” said René Bastón, the Hub’s executive director. “For our next phase, we’ll be coordinating projects that build on the insights we gained during our first three years – enhancing data science capacity for underserved institutions and emphasizing translational data science.”
The Northeast Big Data Innovation Hub is one of four Big Data Regional Innovation Hubs, a national network of academic, industry, government, and nonprofit partners, supported by the NSF. Each of the four Hubs received additional funding for a total investment of $16 million.
“By catalyzing partnerships that integrate academic researchers into the fabric of communities across the U.S.,” said Beth Plale, one of the National Science Foundation program directors managing the Big Data Hubs awards, “we can accelerate and deepen the impact of basic research on a range of societal issues.”

Under this new round of funding, the Northeast Hub will place an emphasis on mission-driven projects that coordinate and stimulate translational data science. For example, the Hub will work with its stakeholders on aggregating and helping to develop best practices for responsible data science; creating frameworks for data fluency; fostering better management of data security and privacy; integrating health data from traditional and novel sources; improving education through big data; and reducing barriers for data sharing within and between different sectors.
As a new service to the community, each Big Data Hub will maintain a seed fund for translational data science as part of its project budget. This fund will provide grants to pilot early feasibility studies for innovative new solutions to grand challenges of importance to the region.
The Northeast Hub will also continue to collaborate with its six Big Data Spokes, project collaborations that focus on topics of specific interest to its region. The most recent Big Data Spokes were launched in 2018: a collaborative platform for computational social science, data-driven discovery and rational design in chemistry, and a series of community workshops addressing data integration of the ecological long tail.
Hosted by Columbia University’s Data Science Institute, the Northeast Hub will be coordinated by Principal Investigator Jeannette M. Wing (Columbia University), Executive Director and Co-Principal Investigator René Bastón (Columbia University), Co-Principal Investigator James Hendler (Rensselaer Polytechnic Institute), Co-Principal Investigator Vasant Honavar (Penn State), and Co-Principal Investigator Andrew McCallum (University of Massachusetts at Amherst).

“The Big Data Hub has built an extensive network of data science experts and stakeholders from academia, industry and local government across the Northeast,” said Wing. “Its coordinated efforts address important societal challenges, such as healthcare and education, faced by the region, from urban to rural environments. The new NSF grant will allow us to expand this work in two ways: first, by addressing cross-cutting themes on data privacy and data ethics, to ensure positive social impact; and second, by coordinating with the three other regional hubs toward a national network of data science institutions.”
Each of the four Big Data Hubs is located in one of the four U.S. census regions (Northeast, South, Midwest, and West), and serves as a thought leader and convening force on social and economic challenges unique to its region – for example, fresh water in the West, agriculture in the Midwest, coastal flooding in the South, and aging urban infrastructure in the Northeast. Beyond their regional focus, the Big Data Hubs will act as a single national body as needed to respond to issues that cross regions, such as the evolution of U.S. transportation infrastructure and workforce development. Embarking on the next phase of growth and national coordination, the Hubs will host an All Hubs All Hands community data science meeting open to the public as a signature event in 2020.

“Developing innovative, effective solutions to grand challenges requires linking scientists and engineers with local communities,” said Jim Kurose, Assistant Director for Computer and Information Science and Engineering at the National Science Foundation, which funded this award. “The Big Data Hubs provide the glue to achieve those links, bringing together teams of data science researchers with cities, municipalities and anchor institutions.”
For press inquiries or to connect directly with our team, please email Katie Naum at: