I am a graduate student at UC Berkeley pursuing a Master's in Industrial Engineering and Operations Research (concentration in Data Science and Machine Learning). I have more than 4 years of work experience as a Business Data Analyst and Quantitative Research Assistant in the information technology and financial services industries.

In my free time, I enjoy hanging out with friends, singing and playing guitar, reading books (and Quora), playing tennis and travelling.

Feel free to contact me about positions relating to Data Science & Analytics at anh.nnguyen@berkeley.edu.

My Portfolio


COMMODITY PRICE FORECASTING FOR SUPPLY CHAIN OPTIMIZATION

Forecasting the prices of metals listed on the commodity market from their historical prices, using several time-series prediction models.

Tools: Python, Time Series Models (ARIMA), Linear Regression Models (Ridge, Lasso), XGBoost, GRU


DIABETES PREDICTION WITH MACHINE LEARNING MODELS

Trained machine learning models and compared their performance under two missing-data imputation methods: mean imputation and guess matrix.

Tools: Python, Scikit-Learn, Logistic Regression, Random Forest Classifier, AdaBoost, Perceptron.
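
As an illustration of the mean imputation mentioned above, here is a minimal sketch in plain Python. It is not the project's code: the function name `impute_column_means` and the toy glucose/BMI values are hypothetical, and missing entries are assumed to be encoded as `None`.

```python
from statistics import mean


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column.

    Assumes every column has at least one observed value; otherwise
    statistics.mean raises StatisticsError.
    """
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        col_means.append(mean(observed))
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Toy data made up for illustration: columns are (glucose, BMI).
data = [[148.0, 33.6], [None, 26.6], [183.0, None], [89.0, 28.1]]
filled = impute_column_means(data)
```

Each missing value is filled with the average of the values actually observed in its column, so the imputed dataset keeps the same column means as the observed data.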


SENTIMENT ANALYSIS ON IMDB MOVIE REVIEWS

Performed sentiment analysis on IMDB movie reviews using unigram and bigram settings, and compared model performance with and without stemming and lemmatization.

Tools: Python, Scikit-Learn, Random Forest Classifier, Stemming, Lemmatizing.
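
To make the unigram and bigram settings concrete, the sketch below builds bag-of-n-gram counts in plain Python. It is only an illustration: the helper names `ngrams` and `bag_of_ngrams` are hypothetical, tokenization is a simple whitespace split, and the bigram setting is assumed to mean unigrams plus bigrams, which the project description does not specify.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_ngrams(text, use_bigrams=False):
    """Count unigrams, and optionally bigrams, in a lowercased review."""
    tokens = text.lower().split()
    features = ngrams(tokens, 1)
    if use_bigrams:
        # Assumption: the bigram setting adds bigrams on top of unigrams.
        features += ngrams(tokens, 2)
    return Counter(features)


review = "not a good movie"  # made-up example review
unigram_counts = bag_of_ngrams(review)
bigram_counts = bag_of_ngrams(review, use_bigrams=True)
```

In the unigram setting, "good" is counted on its own, while the bigram setting also keeps pairs of adjacent words such as "not a" and "a good", which preserves some of the local word order.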


BUILDING HEATING LOAD PREDICTION WITH MACHINE LEARNING MODELS

Predicted building heating load with machine learning techniques, including Linear Regression, Logistic Classification, and Feature Scaling (Unity-Based Normalization).

Tools: Python, Scikit-Learn, Feature Scaling, Linear Regression, Logistic Classification Model.
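
Unity-based normalization is commonly understood as min-max scaling, which rescales each feature to the range [0, 1] via (x - min) / (max - min). The sketch below shows that formula in plain Python; the function name `min_max_scale` and the sample values are hypothetical and not taken from the project.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up feature values for illustration (e.g. a surface area in square meters).
surface_areas = [514.5, 661.5, 808.5]
scaled = min_max_scale(surface_areas)
```

Scaling features this way puts them on a comparable range before fitting, which matters most for models that are sensitive to feature magnitude.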


HANDWRITTEN DIGITS RECOGNITION USING TENSORFLOW

Using the MNIST database of handwritten digits, built a machine learning model to recognize handwritten digits. With TensorFlow, the model was trained by having it "look" at thousands of examples, and its accuracy was then checked against the test data.

Tools: Python, Scikit-Learn, TensorFlow, Vanilla Dense Neural Network (Vanilla DNN)


ANALYSIS OF PRESIDENTIAL DEBATES WITH WEB SCRAPING AND TEXT MANIPULATION

Used web scraping and text manipulation to analyze Presidential Debates from 1960 to 2012.

Tools: Python, Web Scraping with BeautifulSoup


OPTIMIZATION AND SENSITIVITY ANALYSIS FOR HOUSING PLAN

Optimizing the mix of product types to maximize the company’s net profits using a linear programming (LP) model, and performing sensitivity analysis on the constraints and variables, along with business plans and recommendations.

Tools: AMPL (A Mathematical Programming Language), Linear Programming Model, Mixed Integer Linear Programming Model, LaTeX