A data scientist and management accountant by profession. Associate Member of the Chartered Institute of Management Accountants. Interests include machine learning, statistical inference, cost accounting and fundraising for non-profits, and interactive data simulations and design.
Notable Projects:
Predicting Mortality using Air Quality and Other Features
Regression techniques to isolate the effect of air pollution on US mortality broadly caused by ischaemic heart disease and COPD, while also predicting mortality counts by county and year.
Tools used: Poisson GLM, Linear Regression, Random Forest Regression, K-Means Clustering
Natural Language Processing with Shakespearean Text
Sentiment analysis of Shakespeare’s great protagonists and antagonists, along with topic modeling for each act in the five-act structure.
Tools used: t-SNE, Pandas DataFrames, Python Dictionaries, Word2Vec, TextBlob, SVD and PCA
Prediction of General State of Happiness
Predicting an individual’s general state of happiness from the General Social Survey (GSS, 1973–2016) using classification techniques.
Tools used: Logistic Regression, Random Forest & Decision Tree classifiers, SVMs, Naive Bayes classifiers and Pandas DataFrames
Predicting Gross Movie Revenue
Predicting a movie’s total worldwide gross revenue using features such as best actor/director nominations, budget, IMDb rating, etc.
Tools used: Linear Regression with interaction features and log transformations, Pandas DataFrames
EDA on MTA Turnstile Data
Exploration of foot traffic at various subway stations in the NYC metro area to find optimal fundraising locations.
Tools used: Pandas DataFrames, Matplotlib, Seaborn, Geopy
Contact me
Connect with me via LinkedIn: https://www.linkedin.com/in/jitsen/