Anna Nguyen

PROGRAMMING AND CODING SKILLS

Python, SQL (MySQL, PostgreSQL), R, Stata, SPSS

MACHINE LEARNING

Linear and Logistic Regression, Classification and Regression Trees, Random Forest, XGBoost, K-means Clustering, Time Series Models (ARIMA), NLP, Bootstrapping, K-fold Cross-Validation

DATA VISUALIZATION

Tableau, Python (Matplotlib, Seaborn, Altair, Plotly, Bokeh)

STATISTICS

Hypothesis Testing, A/B Testing, Design of Experiments, MLE, Probability Distributions, Bayes' Theorem

BUSINESS ACUMEN

A comprehensive understanding of business goals and the industry, used to identify the problems that matter for sustaining and growing the business and to explore new business opportunities

COMMUNICATION SKILLS

A storyteller -- presenting insights and interesting patterns to business executives in a clear and concise manner

OTHERS

Project Management, Problem-Solving, Attention to Detail, Creativity, Teamwork, Curiosity, Critical Thinking, Open-Mindedness

My Portfolio

COMMODITY PRICE FORECASTING FOR SUPPLY CHAIN OPTIMIZATION

Forecast metal prices listed on the commodity market from historical prices, using several time-series prediction models.

Tools: Python, Time Series Models (ARIMA), Linear Regression Models (Ridge, Lasso), XGBoost, GRU

DIABETES PREDICTION WITH MACHINE LEARNING MODELS

Trained and compared the performance of machine learning models under two different missing-data imputation methods: mean imputation and guess matrix.

Tools: Python, Scikit-Learn, Logistic Regression, Random Forest Classifier, AdaBoost, Perceptron.

SENTIMENT ANALYSIS ON IMDB MOVIE REVIEWS

Performed sentiment analysis on IMDB movie reviews using unigram and bigram settings, and compared model performance with and without stemming and lemmatization.

Tools: Python, Scikit-Learn, Random Forest Classifier, Stemming, Lemmatization.

BUILDING HEATING LOAD PREDICTION WITH MACHINE LEARNING MODELS

Predicted building heating load with machine learning techniques, using regression and classification models (Linear Regression, Logistic Classification) together with Feature Scaling (Unity-Based Normalization).

Tools: Python, Scikit-Learn, Feature Scaling, Linear Regression, Logistic Classification Model.

HANDWRITTEN DIGITS RECOGNITION USING TENSORFLOW

Using the MNIST database of handwritten digits, built a machine learning model to recognize handwritten digits. The model was trained in TensorFlow by having it "look" at thousands of examples, and its accuracy was then checked against the test data.

Tools: Python, Scikit-Learn, TensorFlow, Vanilla Dense Neural Network (Vanilla DNN)

ANALYSIS OF PRESIDENTIAL DEBATES WITH WEB SCRAPING AND TEXT MANIPULATION

Used web scraping and text manipulation to analyze presidential debates from 1960 to 2012.

Tools: Python, Web Scraping with BeautifulSoup

OPTIMIZATION AND SENSITIVITY ANALYSIS FOR HOUSING PLAN

Optimized across different product types to maximize the company’s net profit using a linear programming (LP) model, and performed sensitivity analysis on the constraints and variables, accompanied by business plans and recommendations.

Tools: AMPL (A Mathematical Programming Language), Linear Programming Model, Mixed Integer Linear Programming Model, LaTeX

I'm