People Analytics – Attrition Predictions

According to the U.S. Bureau of Labor Statistics, employees today stay with their company for 4.5 years on average. Turnover hurts an organization’s financials and morale, considering the amount of time spent on training. Can management learn from past attrition and reduce turnover? The answer is yes. We will build some predictive models using the fictional IBM data set, which contains 1470 employee attrition records.

This post is part of a series of people analytics experiments I am putting together:

Continue reading “People Analytics – Attrition Predictions”

Simple Skill-based Job Recommendation Engine

What are the most in-demand skills for data scientists? Python, R, SQL, and the list goes on and on. There are many surveys and reports that show some good statistics on popular data skills. In this post, I am going to gather first-hand information by scraping data science jobs from indeed.ca, analyze the top skills required by employers, and make job recommendations by matching skills from a resume to posted jobs. It will be fun!
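To give a feel for the matching step, here is a minimal sketch of how skills from a resume could be compared against job postings. It is not the code from the full post: the skill vocabulary, the sample job descriptions, the function names `extract_skills` and `recommend`, and the coverage score (the share of a job's skills that the resume also lists) are all assumptions made for illustration.

```python
import re

# Hypothetical skill vocabulary; a real one would come from the scraped postings.
SKILLS = {"python", "r", "sql", "spark", "tableau", "excel"}


def extract_skills(text, vocabulary=SKILLS):
    """Return the vocabulary skills that appear as whole tokens in the text."""
    tokens = set(re.findall(r"[a-z0-9+#]+", text.lower()))
    return vocabulary & tokens


def recommend(resume_text, jobs, top_n=3):
    """Rank jobs by the fraction of their required skills found in the resume."""
    resume_skills = extract_skills(resume_text)
    scored = []
    for title, description in jobs.items():
        job_skills = extract_skills(description)
        if not job_skills:
            continue  # nothing to match against
        coverage = len(resume_skills & job_skills) / len(job_skills)
        scored.append((coverage, title))
    # Highest coverage first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(title, round(score, 2)) for score, title in scored[:top_n]]


# Made-up postings and resume, for illustration only.
jobs = {
    "Data Analyst": "Strong SQL and Excel skills; Tableau is a plus.",
    "Data Scientist": "Python, R, and SQL required. Spark experience preferred.",
    "BI Developer": "Tableau dashboards and SQL reporting.",
}
resume = "Five years of Python and SQL, some Tableau."
recommendations = recommend(resume, jobs)
print(recommendations)
```

Scoring by coverage favours postings whose requirements the resume already meets; a symmetric measure such as Jaccard similarity would be an equally reasonable choice.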

Quick summary of the project workflow:

Workflow

Continue reading “Simple Skill-based Job Recommendation Engine”

How many bikes to be shared in Vancouver NEXT WEEK – Part 2

This is Part 2 of building predictive models on Vancouver bike share. Part 1 is here. The Python code can be found on my GitHub.

Model Training

The training dataset contains hourly bike rentals for each day from 01/01/2017 to 07/24/2018.

Two tree-based ensemble models were trained: Random Forest (RF) and Gradient Boosted Trees (GBM). They are well known for delivering strong performance and efficiency on noisy datasets. However, tuning their hyperparameters so that they do not overfit can be challenging.
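As a rough illustration of this setup, the sketch below trains both model types with scikit-learn on synthetic hourly data and compares training error with held-out error, which is one simple way to spot overfitting. The data, the two features (hour of day and day of week) and every hyperparameter value are assumptions chosen for the example; they are not the features or settings used in this project.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in for hourly rentals: 60 days with a daily cycle plus noise.
rng = np.random.default_rng(0)
hours = np.arange(24 * 60)
hour_of_day = hours % 24
day_of_week = (hours // 24) % 7
X = np.column_stack([hour_of_day, day_of_week])
y = 20 + 15 * np.sin((hour_of_day - 6) / 24 * 2 * np.pi) + rng.normal(0, 3, hours.size)

# Time-ordered split: train on the first 80% of hours, test on the rest.
split = int(0.8 * hours.size)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Illustrative settings that limit model complexity to reduce overfitting.
models = {
    "RF": RandomForestRegressor(
        n_estimators=50, max_depth=6, min_samples_leaf=5, random_state=0
    ),
    "GBM": GradientBoostingRegressor(
        n_estimators=100, learning_rate=0.05, max_depth=3, subsample=0.8, random_state=0
    ),
}

errors = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    errors[name] = {
        "train": mean_absolute_error(y_train, model.predict(X_train)),
        "test": mean_absolute_error(y_test, model.predict(X_test)),
    }
    print(name, errors[name])
```

A test error far above the training error suggests the model is memorising noise; shallower trees, larger leaves, or a smaller learning rate are the usual knobs to turn in that case.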

Continue reading “How many bikes to be shared in Vancouver NEXT WEEK – Part 2”

How many bikes to be shared in Vancouver NEXT WEEK – Part 1

Despite worldwide debates on the benefits and challenges of bike sharing, Vancouver launched its own bike share program, Mobi sponsored by Shaw Go, in summer 2016. The first bike share appeared in Amsterdam in the 1960s and was then introduced to other big European cities. It has been popularized in China over the last decade: 13 of the world’s 15 biggest bike share programs are in China.

I like bike share programs because they are convenient and help protect the environment. So I decided to look into Vancouver’s historical bike share data, hoping to find some trends and patterns. Thanks to Mobi, which makes its bike usage data available, predictive models can be built to forecast future rides.

Quick summary of the project workflow:

Project workflow

Continue reading “How many bikes to be shared in Vancouver NEXT WEEK – Part 1”

How much time will it take to cross the border at Peace Arch?

Living in Vancouver, it is so convenient to drive across the border and have some fun on the other side. However, if waiting in long border-crossing lines and getting stuck for almost an hour gives you a headache, you are not alone. We all know the basic strategies for the best and worst days and hours to cross: for example, avoid long weekends and Christmas week, arrive at the border early, etc. A crystal ball that could tell us our wait time at the border ahead of time would be just fantastic! Well, I decided to take a swing at it and make a crystal ball by building a machine learning model.

Below is a quick summary of the workflow on this mini project.

Project workflow

Continue reading “How much time will it take to cross the border at Peace Arch?”