Northeast Student Data Corps



The Northeast Big Data Innovation Hub Seed Fund is designed to promote collaboration and support the cross-pollination of tools, data, and ideas across disciplines and sectors including academia, industry, government, and communities. Funding provided through this program is intended to support the northeast region and align with the Major Goals and Focus Areas of the Northeast Big Data Hub.

The Northeast Big Data Hub is delighted to invite applications from the community to serve on the Founding Committee of the Northeast Student Data Corps (NSDC). The NSDC is a community-developed initiative that will teach data science fundamentals to students across the northeastern United States, with a special focus on underserved institutions and students. The Founding Committee is being formed to build community insights and perspectives into the program from the start.

View and download an insight paper on the Northeast Student Data Corps

Starting in 2020, the Founding Committee will be responsible for planning, designing, and launching Northeast Student Data Corps activities in partnership with leadership of the Northeast Big Data Hub. This is an opportunity to contribute to the creation of a groundbreaking and inclusive new program in data science. A modest honorarium is offered to Founding Committee members in recognition of their service to the community and as part of the Northeast Big Data Hub’s seed fund program.

Applications will be accepted through the submission form below in two rounds:

  • First Round Deadline: August 31, 2020
  • Second Round Deadline: October 1, 2020

Apply Here

If you are unable to use Google Forms to apply, contact us for assistance.

Watch our recorded webinar from Monday, August 10th, in which we provide an overview of the program and answer applicants’ questions:

Watch the Webinar on YouTube | Download Webinar PPT Slides

For more information about applying, see the sections below. Contact us with any questions.



Northeast Hub Goals

The Northeast Hub is a community convener, collaboration hub, and catalyst for data science innovation in the Northeast Region. The Hub amplifies successes of the community and shares credit across the community to encourage collaboration and mutual success in data science endeavors.

The goals of the Northeast Hub are to:

  • Build collaborations to address real-world challenges through translational data science approaches
  • Foster innovation and scale endeavors that reflect regional interests and align with national priorities related to data science
  • Support and promote representative community engagement and impact across all Hub activities
  • Increase data science capacity and talent, emphasizing underserved communities

Learn more about Northeast Big Data Hub Focus Areas below.


The Northeast Big Data Hub welcomes applications from undergraduate and graduate students; postdoctoral researchers; faculty; and leaders from academia, industry, government, and non-profit organizations based in the Northeastern United States, defined as Pennsylvania, New Jersey, New York, Connecticut, Rhode Island, Massachusetts, Vermont, New Hampshire, and Maine.

Colleagues from historically black colleges and universities, minority-serving institutions, Hispanic-serving institutions and others from underrepresented groups are strongly encouraged to apply.

Application Process

For consideration, all NSDC Founding Committee applications must include:

  1. Individual’s name
  2. Institution
  3. Role
  4. Statement of interest in joining the Founding Committee [up to 200 words]
  5. Areas of Expertise [checklist]
  6. Overview of Areas of Expertise [up to 500 words]
  7. Thoughts on how the Founding Committee might plan and launch one or more of the three activities the Northeast Student Data Corps will engage in [up to 500 words]
  8. Resume or CV [up to 2 pages]
  9. For undergraduate students, graduate students, and postdocs: contact information of a faculty member or supervisor who can provide a reference

To apply to join the Northeast Student Data Corps Founding Committee, please submit your proposal via Google Forms.

Contact us if you are unable to use Google Forms.


Proposals will be considered in two rounds of awards, subject to the availability of funds. Each proposal submitted in the first round will receive a decision of award or no award, or will remain in consideration for the second round. Proposals that receive a negative decision in the first round are ineligible for resubmission in the second round.

  • First Round Deadline: August 31, 2020
  • Second Round Deadline: October 1, 2020

Review Process and Selection Criteria

The Northeast Big Data Innovation Hub Seed Fund Steering Committee (SFSC) will review applications. The SFSC roster is available at:

Each proposal will be reviewed by at least three members of the SFSC. The SFSC will follow NSF’s Conflicts of Interest policy during the review process.

Service Requirements

Founding Committee members will be expected to carry out their service of planning, designing, and launching the Northeast Student Data Corps in conjunction with Northeast Big Data Hub leadership, starting in fall 2020. The Northeast Student Data Corps is expected to launch initial activities by 2021.


Learn More About Our Focus Areas

View the Northeast Big Data Innovation Hub’s website to learn more about our impact in regional focus areas, or view the information below if your interest lies in a particular focus area:


The objective of this focus area is to help advance data sharing, acquisition, integration, analysis and resulting insights in the pursuit of improved public health and health outcomes. This includes:

  1. Enabling data sharing, acquisition and integration to develop knowledge and insight in the pursuit of improved health outcomes, including alternative health data sources such as environmental factors, social media and mobile health
  2. Supporting the development and deployment of advanced analytics, including causal discovery and reasoning, artificial intelligence, and machine learning in biomedicine and health care
  3. Enabling data science collaborations in the support of precision medicine, including advanced decision support leveraging data to deliver customized and personalized knowledge, insight and recommendations


The objective of this focus area is to facilitate activities and collaborations focused on improving the understanding of data ecosystems in urban and rural environments, and the potential application of data analytics and responsible data science practices to benefit rural and urban communities, citizens, and the environment. This includes:

  1. Enabling the acquisition and use of interoperable and accessible data sets across rural and urban communities, for data-driven community and environmental planning and stewardship
  2. In collaboration with the Health focus area, determining the impact of urbanicity on health and other measures of well-being, for humans and their environment
  3. Creating data science innovations to envision and enable sustainable environments, including collaborations incorporating data science, environmental science, and geosciences broadly


The objective of this focus area is to develop recommendations and best practices for Responsible Data Science (RDS) by design, with security, privacy and ethical use of data and data science methods. This includes:

  1. Enabling the development and dissemination of responsible data science best practices, such as building and deploying integrative data equity systems and incorporating ethical and legal norms, with attention to security, privacy, and ethics, in all stages of the data science life cycle
  2. Engaging with IEEE standards development activities for the TIPPSS framework (Trust, Identity, Privacy, Protection, Safety and Security) and its use in clinical Internet of Things, connected healthcare, and smart and connected communities
  3. Collaborating with cybersecurity researchers and domain scientists to determine their needs and challenges regarding trustworthy data and data-driven cybersecurity, privacy, and ethics


The objective of this focus area is to work with the community to guide the development and dissemination of data science approaches and resources for education, from PreK-12 to higher education and industry, with the goal of improving data literacy and educational outcomes. This includes:

  1. Amplifying the success and reach of data science education and data literacy activities, from virtual and in-person workshops to Massive Open Online Courses (MOOCs), to empower educators and researchers to improve data science practices and reach a broad community
  2. Enabling community and educational institutions across the region to deliver data science education to underserved constituencies, extending the success of the Data Science for All workshops, and perhaps adding domain-specific data science and data carpentry workshops
  3. Increasing community collaboration to benefit from and contribute to the Big Data for Education Spoke best practices, including innovations in intelligent tutor pedagogy, and collaborative analyses of Big Data for Education MOOC data