John MacAdam

Using Data Science to Make Better Predictions

June 21, 2022

Recently I attempted to answer a fairly common question: Given this historical data, when should we expect to reach the next milestone?

One of the companies I support has a product that impacts a lot of people. Each day this product reaches roughly 100,000 new users. I was given the past two years of data and asked, "When will we reach 300 million users?" My instinct was to answer the question using a spreadsheet, so I started with Excel:
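For readers who want the arithmetic behind a spreadsheet estimate like this, here is a minimal sketch in Python of a straight-line extrapolation. It assumes a constant daily rate with no seasonality. The starting total of 250 million and the use of the post date as the starting day are hypothetical values chosen for illustration; the actual figures are not given here.

```python
from datetime import date, timedelta
import math


def days_to_milestone(current_total, daily_rate, target):
    """Return the number of days needed to reach target at a constant daily rate."""
    if daily_rate <= 0:
        raise ValueError("daily_rate must be positive")
    remaining = target - current_total
    if remaining <= 0:
        return 0  # milestone already reached
    # Round up: a partial day still means the milestone lands on that day.
    return math.ceil(remaining / daily_rate)


# Hypothetical starting total: the real figure is not stated in the post.
current_total = 250_000_000
daily_rate = 100_000      # roughly 100,000 new users per day
target = 300_000_000      # the milestone in question

# Hypothetical start date: the post date is used as "today".
start = date(2022, 6, 21)

days = days_to_milestone(current_total, daily_rate, target)
eta = start + timedelta(days=days)
print(f"{days} days from {start.isoformat()}: {eta.isoformat()}")
```

The estimate depends entirely on the assumed daily rate. It ignores weekly and yearly patterns in the data, which is the gap a forecasting model is meant to close.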

I could have stopped here. This guess was likely sufficient. However, I wanted to try answering the same question with a machine learning model, for a couple of reasons:

My first experience building a machine learning model

I spent some time looking for an approach that would fit this type of data and question. I landed on Prophet, a tool built for forecasting time series data in which seasonal effects (yearly, weekly, daily, holidays, etc.) are factored into the trends and predictions. Perfect.

Prophet models can be built in either Python or R. I opted to give R a shot (I have always wanted a reason to try it), so I followed the Prophet R API documentation: