Guest Post: Jeremy Prasad
A recording of this event is available at the Northeast Big Data Hub’s YouTube channel.
On August 19th, 2022, the National Student Data Corps (NSDC) hosted its ninth Data Science Panel, designed to showcase various careers in data science. The panel featured four established data scientists from vastly different sectors of the industry, ranging from financial analytics to bioinformatics. Emily Rothenberg, NSDC Program Coordinator, and Florence Hudson, Executive Director of the Northeast Big Data Innovation Hub, moderated the discussion.
The National Student Data Corps is operated by the Northeast Big Data Innovation Hub (NEBDHub), which is funded by the National Science Foundation and uses its platform to fuel data science innovation. The NSDC is the Hub’s premier program within the Education and Data Literacy Focus Area. It is part of a broader, community-driven initiative to teach data science fundamentals to students worldwide, with an emphasis on underserved institutions and communities.
The panel consisted of four professional data scientists with distinguished backgrounds in their respective fields: Mayank Varia, Associate Professor and researcher at Boston University studying theoretical and applied cryptography; Steve Horne, Chief Data & Strategy Officer at Bridge; Vivien Bonazzi, Biomedical Data Scientist at Deloitte; and Yu Yu, Director of Data Science at BlackRock.
The discussion opened with each panelist describing their career journey in data science. Mayank began by mentioning his academic background and research interests. His cryptography-oriented research focuses on designing systems that let people interact with encrypted data and extract meaningful information in aggregate form while preserving anonymity, using a technique called secure multiparty computation, or secure MPC.
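For readers curious about what secure MPC looks like in practice, here is a minimal sketch of one classic building block, additive secret sharing, used to compute a sum without any party revealing its own number. This is an illustration only and was not presented at the panel; it is not a description of Mayank's systems. The function names, the modulus, and the sample values are all hypothetical, and real deployments add networking, authentication, and protections against dishonest parties.

```python
import secrets

# Hypothetical modulus for illustration; all arithmetic is done modulo this value.
MODULUS = 2**61 - 1


def share(value, n_parties, modulus=MODULUS):
    """Split a private value into n_parties random-looking shares that sum to it."""
    # The first n-1 shares are uniformly random.
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares add up to the value (mod modulus).
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_sum(private_values, modulus=MODULUS):
    """Simulate parties jointly computing a sum from shares only."""
    n = len(private_values)
    # Each party splits its own value and sends one share to every party.
    all_shares = [share(v, n, modulus) for v in private_values]
    # Party j adds up the j-th share it received from each party.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Only the partial sums are published; combining them yields the total.
    return sum(partial_sums) % modulus


if __name__ == "__main__":
    salaries = [52000, 61000, 58000]  # made-up private inputs
    print(secure_sum(salaries))
```

Each individual share is a random number that reveals nothing on its own; only the combined total is recovered, which is the "aggregate" result described above.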
Vivien then continued the conversation by explaining her own career journey in data science. She has a longstanding academic history in biomedical sciences but was also fascinated by computers at an early age. Her work as a biomedical data scientist allows her to blend aspects of data science and genomics to contribute to the field of computational biology, also known as bioinformatics.
Next, Steve explained his extensive background in computers and financial analytics. While still in college, he started working for Dun & Bradstreet and stayed with the company for 28 years, creating statistical algorithms. Steve then moved on to work for several other commercial data science companies, helping businesses understand their audiences in the marketplace.
Finally, Yu rounded out the answers by explaining how difficult it can be to connect academic research to impactful work in industry. That difficulty prompted her to move from being an assistant professor of marketing to working at various financial institutions as a data scientist. All the panelists come from contrasting backgrounds, emphasizing the broad scope that data science encompasses and illustrating how one’s career path is not always linear.
Emily kicked off the next topic by asking about mentorship and educational experiences that helped shape each panelist’s career path. Vivien stressed that one’s academic background is tremendously important in her field. She also expressed the importance of having a mentor to guide young professionals, citing her own formative experience when just starting out in her career. Steve then carried on the conversation by sharing his experience with a mentor who allowed him to take over some classes as a graduate student. Yu and Mayank both emphasized the importance of practical, hands-on approaches to growth, offering up several online resources, including Coursera, LeetCode and HackerRank, in addition to programs designed to provide opportunities to underserved communities such as Girls Who Code. Emily and Florence also highlighted some of the Hub’s resources, adding the REAL Volunteer Program and the Learner Central to the mix.
When asked which technical tools are frequently used in data science professions, most panelists stated that SQL, Python and R are common languages used in their roles. Vivien and Steve also mentioned that in more advanced data-oriented roles, machine learning tools such as TensorFlow and PyTorch are employed. By contrast, when asked which non-technical soft skills are most important on the job, all panelists emphasized that clearly articulating one’s ideas and findings to a broad, general audience is pivotal to advancing one’s career. Vivien expressly mentioned the following sentiment: “not understanding how to code does not automatically disqualify people from making impactful decisions; it is vital to convey technical ideas to non-technical audiences effectively.”
Data science is disrupting the current digital landscape and shaping the technology of tomorrow. Emily asked Mayank and Vivien how data science will innovate and adapt within the next five to 10 years. Both panelists shared their views of data science’s implications in the context of other professions. Vivien discussed the growing importance of protecting privacy rights as biomedical science evolves, which makes room for bioethicists to bridge the gap between technology practices and ordinary citizens’ data. Mayank made similar observations, posing the question, “How does data science connect to the rest of the world?”
The final question of the panel discussion asked how students can get involved in data science with little to no experience using relevant tools. Yu answered by urging students to take advantage of online resources such as Kaggle and other problem-solving platforms. Steve and Mayank went on to emphasize how important it is to get started early and gain hands-on, practical experience in any field. Finally, Vivien wrapped up the discussion by referencing Francis Crick’s Gossip Test, which suggests that the topics you gossip about are the ones you are truly interested in.
After concluding the panel discussion, Emily shared some of the NSDC’s resources available to students interested in the world of data science, including the Learner Central, Educator Central, NSDC Video Library, Chapter Central, and the Inaugural NSDC Data Science Symposium, among others.
Then, Emily opened up the Q&A section by asking the panelists some audience-generated questions. The first question asked the panelists to explain something they know now that they wish they had known earlier. Yu reflected on her career experience, saying that one’s professional journey is not always straightforward. Vivien remarked that hindsight is always 20/20, explaining that she followed what she was interested in, which allowed her to perform exciting work throughout her career.
The next question in the Q&A session asked whether panelists ever felt unprepared for a role they applied for. All panelists generally agreed that they had felt that way at some point in their career; Vivien went as far as saying she felt that way all the time, which helps keep her sharp.
The final question of the afternoon asked panelists about opportunities for people interested specifically in the storytelling aspect of data science. Mayank answered that journalism is becoming an increasingly popular field within data science as large corporations push to become more transparent about their handling of data. Additionally, he noted that ethics and data science would only become more intertwined as the public’s reliance on data grows, requiring more professionals who can translate technical insights into terms policymakers and the general public can understand.
With that, Emily and Florence wrapped up the panel by thanking each panelist for their contributions, as well as the people on the NSDC HQ team and student volunteers.
Stay Connected!
Find more information about the National Student Data Corps on our website
Connect with us and fellow data science enthusiasts by joining the NSDC Slack community
Sign up for the NSDC newsletter
Follow the Hub on Twitter at @nebigdatahub and the National Student Data Corps at @data_corps
Follow the Hub on Instagram and LinkedIn
Start or Join an NSDC Chapter
Email us at contact@nebigdatahub.org with any questions or comments