Hi! Thanks for dropping by my website.

I’m Daphne, a data scientist based in Sydney. I love all things data, and this website is where I document my personal data projects and share my thoughts on anything and everything data science.

As a data scientist, I use Python, statistics, and machine learning to develop predictive models. To generate, extract, and manipulate the data I use to train my models, I rely on one or a combination of the following: SQL, web scraping, NumPy, and pandas (the latter two for data manipulation). I use Python plotting libraries like Matplotlib, seaborn, and Plotly for visualisation, and good ol' Tableau to create dashboards!

Recent Projects

Below are some of the projects I’ve completed.

Tennis Win Prediction App

Developed a model to predict a tennis player’s win probability based on the competing players’ match performance statistics and player characteristics. Used an ensemble technique to improve performance. Deployed the model as an app using Streamlit (streamlit.io).

Job Title & Salary Prediction

Scraped a local job search platform for data science job ads. Used natural language processing techniques to develop a model that predicts job title and salary band based on the job ad details. Compared the performance of two types of vectorizers and determined the best vectorizer-model combination. Also compared the performance of supervised and unsupervised models.

Property Sale Price Model

Used the Ames Housing dataset to develop a regression model that predicts property sale price based on non-renovatable property characteristics. Determined how the property price is affected by its renovatable features. Used classification techniques to identify factors that result in an abnormal sale.

