On September 29th and 30th, stakeholders from across the Northeast and beyond joined the Hub at Drexel University in Philadelphia for “Enabling Seamless Data Sharing in Industry and Academia,” a cross-sector workshop put on by our community to tackle the challenges of sharing data head-on. In short “TED talk”-style presentations and collaborative breakout sessions, participants from industry, academia, government and non-profit organizations articulated the challenges they’ve encountered in sharing data; presented real-life case studies of data sharing failures and successes; identified lessons learned and best practices; and dreamed big on possible solutions to these challenges, both technical and non-technical.
Among the themes that emerged were:
- The need to incentivize data sharing
- The challenge of effectively managing complex regulations governing data sharing
- Trust and security as the essential foundation for data stewardship, along with ensuring that all stakeholders and administrators have the knowledge they need
- Solutions ranging from a common license template with pre-defined clauses, to machine-actionable enforcement, to automated compliance tracking, and more
A report derived from notes taken at the workshop is forthcoming; in addition, a Slack channel has been established to support continued conversation among the workshop’s 50+ vocal participants. As speaker Amen Ra Mashariki put it, “I wish I could have gone to this workshop two years ago.”
The workshop boasted speakers from Comcast, Experian, IMS Health, GE, the New York City Mayor’s Office of Data Analytics, MIT, Drexel, Penn State, and more. A full speaker list is available here (PDF).
Data Sharing is one of the Northeast Hub’s “Rings” – fundamental data science priority areas that span all vertical “Spokes.” Workshop organizers Jane Greenberg, Sam Madden, and Tim Kraska are PIs on the National Science Foundation award “A Licensing Model and Ecosystem for Data Sharing.” As described in their abstract, the goals for their grant include: “(1) Creating a licensing model for data that facilitates sharing data that is not necessarily open or free between different organizations and (2) Developing a prototype data sharing software platform, ShareDB, which enforces the terms and restrictions of the developed licenses.” The workshop represented the first step toward these larger goals.
We gratefully acknowledge the support provided by the Computing Community Consortium and Drexel University’s Metadata Research Center for this workshop. Special thanks to workshop organizers Jane Greenberg, Sam Madden, Tim Kraska, and Florence Hudson, as well as all our amazing speakers and participants!
Contact Katie Naum with any questions about this post.