NSDC Data Science Project – Sentiment Analysis


Sentiment Analysis of Movie Reviews

Project Description

This project introduces students to a range of skills as they build a sentiment analysis model that labels a given review as positive, negative, or neutral. Sentiment analysis draws on both Natural Language Processing (NLP) and Machine Learning (ML): the text must first be represented in a machine-understandable format before it can be classified and its sentiment extracted. We will also cover visualizations and deploying models in the real world.
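As a rough illustration of what a "machine-understandable format" can look like, the sketch below turns each review into a bag-of-words count vector. The two reviews, the `tokenize` and `bag_of_words` helper names, and the choice of bag-of-words itself are assumptions made for this example; the project may use a different representation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


# Made-up reviews, for illustration only (not taken from the IMDb data).
reviews = ["A great, great movie!", "A boring movie."]

# The vocabulary is every distinct token seen across all reviews.
vocabulary = sorted({tok for review in reviews for tok in tokenize(review)})

# One count vector per review, aligned with the vocabulary order.
vectors = [bag_of_words(review, vocabulary) for review in reviews]

print(vocabulary)
print(vectors)
```

Each review becomes a list of numbers of the same length, which is the kind of input a classifier can work with.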


Dataset

Internet Movie Database (IMDb) Movie Reviews (.csv file)

Relevant Skills You May Apply

Basic Python Programming and NLP understanding

Skills You May Gain

Data Cleaning & Pre-Processing, Data Visualization, Machine Learning Models and NLP Techniques

Total Time

4-6 weeks (2-3 hours per week per person in each team)


Milestone 1: Set up Python Notebook, Read Comma-separated Values (CSV) file, Basic Data Pre-Processing and Cleaning (steps will be outlined)
Milestone 2: More Advanced Data Pre-Processing (Tokenizing, Stemming, etc.)
Milestone 3: Building the Machine Learning Classifier
Milestone 4: More Machine Learning Classifiers and Evaluation Metrics & Visualizations
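
The four milestones above can be sketched roughly as follows, using only the Python standard library. Everything specific here is an assumption for illustration: the column names `review` and `sentiment`, the made-up rows, the crude suffix-stripping `stem` function (not a real stemming algorithm), and the choice of a hand-written Naive Bayes classifier. The actual project steps may rely on libraries and models not shown here.

```python
import csv
import io
import math
import re
from collections import Counter, defaultdict

# Stand-in for the IMDb CSV file; column names and rows are assumptions.
CSV_TEXT = """review,sentiment
"I loved this movie<br />Great acting!",positive
"Loved the story, amazing acting",positive
"Terrible plot and boring acting",negative
"I hated it. Boring and terrible!",negative
"""


def clean(text):
    """Milestone 1: strip HTML tags, lowercase, keep letters only."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"[^a-z\s]", " ", text.lower())


def stem(token):
    """Milestone 2: crude suffix stripping, a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Clean, tokenize on whitespace, and stem each token."""
    return [stem(tok) for tok in clean(text).split()]


def train(rows):
    """Milestone 3: collect the counts for a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in rows:
        label_counts[label] += 1
        word_counts[label].update(preprocess(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    tokens = [tok for tok in preprocess(text) if tok in vocab]
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total)
        for tok in tokens:
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log((word_counts[label][tok] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, rows):
    """Milestone 4: fraction of rows whose predicted label is correct."""
    correct = sum(predict(model, text) == label for text, label in rows)
    return correct / len(rows)


train_rows = [(row["review"], row["sentiment"])
              for row in csv.DictReader(io.StringIO(CSV_TEXT))]
model = train(train_rows)

# Made-up held-out reviews, kept separate from the training rows.
test_rows = [("Amazing story, great acting", "positive"),
             ("Boring plot, terrible", "negative")]
print(predict(model, "An amazing story"))
print(accuracy(model, test_rows))
```

The sketch uses two labels for brevity, but nothing in it is specific to two classes, so a neutral label could be added as a third value in the `sentiment` column. Accuracy is measured on reviews the model has not seen during training, since a score computed on the training rows would overstate how well the classifier generalizes.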


Deliverables include a project report highlighting new skills gained and an interactive Python notebook (Jupyter/Google Colab).