Demand Forecasting – Online Advertisements

Objective: Given an ad campaign’s targeting constraints, ad slots, start date, and end date, we want to forecast the available inventory for that line item. If the ad slots in question are targeted by other active campaigns during the same period, the forecasting system should take that competing demand into account.

Procedure:

To forecast demand effectively we can use the two-phase approach defined below:

1. Forecast overall inventory from a particular ad slot
2. Project the proportion of the forecasted overall inventory available for the new campaign given some active campaigns targeting the ad slot.
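The second phase can be sketched in a few lines. This is only an illustration under strong assumptions that are mine, not part of the system described here: competing campaigns are assumed to deliver a fixed number of impressions per day, and everything they book is simply subtracted from the forecast. The function name and the tuple layout of `bookings` are hypothetical.

```python
from datetime import date, timedelta

def available_inventory(forecast, bookings, start, end):
    """Impressions left for a new campaign between start and end (inclusive).

    forecast: dict mapping date -> forecasted impressions for the ad slot (phase 1)
    bookings: list of (start, end, daily_impressions) tuples for active campaigns
              on the same slot (hypothetical structure, even delivery assumed)
    """
    total = 0.0
    day = start
    while day <= end:
        booked = sum(rate for (s, e, rate) in bookings if s <= day <= e)
        # Phase 2: whatever competing campaigns consume is not available.
        total += max(0.0, forecast.get(day, 0.0) - booked)
        day += timedelta(days=1)
    return total

# Five days of forecast, one competing campaign active on two of them
forecast = {date(2014, 10, 17) + timedelta(days=i): 1000.0 for i in range(5)}
bookings = [(date(2014, 10, 18), date(2014, 10, 19), 300.0)]
left = available_inventory(forecast, bookings, date(2014, 10, 17), date(2014, 10, 21))
print(left)
```

A real system would also have to respect campaign priorities and partially overlapping targeting, which this sketch ignores.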

Trend Detection: In this step we analyze the global trend, e.g. whether the number of requests is increasing linearly and, if so, at what rate.
Seasonal Behavior: This includes weekly and monthly fluctuations in inventory, particularly for e-commerce publishers. This step captures the monthly seasonal fluctuations and includes them in the overall projected inventory.
Noise Removal: Here we smooth the series by removing the inherent noise in the data.
Impulse Handling: Impulses are unprecedented, sometimes unexplained, sudden rises in inventory which, if not handled properly, lead to over-projection of inventory. Huge peaks (impulses) are often observed in the page-view time series when a publisher introduces a new feature, when a production issue occurs (e.g. one that causes unserved requests), or after an unpredictable event (in the case of a news site). Our aim is to detect that these impulses are not part of the long-term trend and to ignore them while projecting available inventory.
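One simple heuristic for the impulse handling step is a rolling median filter (often called a Hampel filter): a point that lies far from the median of its neighbours is replaced by that median. This is a sketch of one possible heuristic, not necessarily the one used in the experiments below; the window size and threshold are assumed values.

```python
from statistics import median

def suppress_impulses(series, window=7, n_sigmas=3.0):
    """Replace points that deviate strongly from the local median by that median."""
    half = window // 2
    cleaned = list(series)
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        neighbours = series[lo:hi]
        med = median(neighbours)
        # Median absolute deviation, scaled to approximate a standard deviation
        mad = 1.4826 * median(abs(x - med) for x in neighbours)
        if mad > 0 and abs(series[i] - med) > n_sigmas * mad:
            cleaned[i] = med
    return cleaned

# One impulse (500) in an otherwise stable page-view series
page_views = [100, 102, 98, 101, 500, 99, 103, 100, 97, 102]
cleaned = suppress_impulses(page_views)
print(cleaned)
```

Because the median is robust to a single extreme value, the spike does not distort the estimate it is compared against, which is the main reason to prefer it over a rolling mean here.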

To accomplish the goals above, the research phase will involve, but is not restricted to, testing the performance of time series algorithms such as:

Holt-Winters
Kalman Filters (State Space Models)
Elastic Smooth Season Filtering
Discrete Fourier Transforms based forecasting techniques
ARIMA
Regression models for global trend detection
Heuristic techniques (e.g. moving average) to handle impulses and noise

Experiments and Observations

I did some descriptive analysis of the impression delivery plots for Ad Id X from publisher XX, using data we fetched from the Google DFP API. A weekly pattern is visible in the data by inspection alone. In theory, to find the weekly pattern (weekly seasonal behavior) a month is taken as one period of the time series, and to find the monthly pattern (monthly seasonal behavior) a year is taken as one period. Analyzing yearly data reveals the global trend, i.e. whether the overall impression rate of a site is increasing or not. However, since we only have data from July 2013 to August 2014, we cannot infer anything statistically relevant about the monthly seasonal behavior or the trend, but we can clearly see the weekly seasonal behavior.

2014 monthly impressions plot – Jan(01) to Oct(10)

Initial Experimentation

We trained, tested, and tuned the model using XX (publisher) DFP impression data. Once done, we compared the daily forecasted impression values with the actual delivered impressions. We will update the model parameters on a daily basis so as to incorporate new insights from the data.

To create an initial baseline model for benchmark performance statistics, we zeroed in on a variation of the Holt-Winters forecasting algorithm with the additional feature of multi-season forecasting: along with daily patterns within a week (e.g. learning that more impressions are received on weekends than on weekdays), it also adjusts for monthly seasonality. Since we did not have even two years of data, the algorithm will not be very accurate when dealing with monthly variations (though it should still do far better than naive approaches, e.g. Google DFP’s approach of using just the last 28 days of data without considering seasonal trends), but I hope it will at least learn a good amount from the weekly data and distinguish weekday from weekend impression delivery patterns.

Overall Features of the algorithm

Trend Detection: Finding whether overall impression delivery is increasing or decreasing over time.
Monthly Seasonal Decomposition: Finding seasonal traffic behavior, e.g. if traffic is much higher at year end, using this information for a better forecast at the following year’s end.
Weekly Seasonal Component: Learning from the fluctuating daily traffic on which particular days of the week more traffic is received.
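For reference, here is a minimal sketch of the textbook additive Holt-Winters recursions with a single seasonal period (weekly, period 7). It is not the multi-season variation used in the experiments; the smoothing parameters and the initialisation from the first two seasons are assumptions chosen for illustration.

```python
def holt_winters_additive(y, period, alpha=0.3, beta=0.05, gamma=0.2, horizon=7):
    """Additive Holt-Winters: level + trend + one seasonal component.

    Returns (one-step-ahead fitted values, forecasts for the next `horizon` steps).
    """
    if len(y) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    # Simple initialisation from the means of the first two seasons
    first = sum(y[:period]) / period
    second = sum(y[period:2 * period]) / period
    trend = (second - first) / period
    mid = (period - 1) / 2
    level = first - (mid + 1) * trend          # level just before the first point
    season = [y[i] - (first + (i - mid) * trend) for i in range(period)]

    fitted = []
    for t, obs in enumerate(y):
        s = season[t % period]
        fitted.append(level + trend + s)       # forecast made before seeing obs
        new_level = alpha * (obs - s) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        season[t % period] = gamma * (obs - new_level) + (1 - gamma) * s
        level = new_level

    n = len(y)
    forecasts = [level + (h + 1) * trend + season[(n + h) % period]
                 for h in range(horizon)]
    return fitted, forecasts

# Toy series: linear growth plus a fixed weekly pattern
pattern = [-30, -20, -10, 0, 10, 20, 30]
y = [100 + 2 * t + pattern[t % 7] for t in range(28)]
fitted, forecasts = holt_winters_additive(y, period=7, horizon=3)
print(forecasts)
```

On real impression data the parameters alpha, beta and gamma would normally be tuned, for example by minimising the one-step-ahead RMSE on held-out days.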

Experiment 1.1 : Holt Winters Additive – 12*7 Seasons : In this prototype we tried to learn the global trend and the seasonal patterns (one pattern for each day of the week, Monday to Sunday, in each month, so 12*7 distinct seasons in total). Since we did not have many data points to train 84 distinct seasons (only 4 data points for each season), the results have significant variance, but we still capture the overall seasonal trends, the day-wise and monthly shifts, and the overall increasing global trend.
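To make the 12*7 indexing concrete, the sketch below maps each date to one of the 84 seasons and counts how many observations each season receives over the training window of 07 July 2013 to 15 Oct 2014. The particular index formula is my assumption; any one-to-one mapping of (month, weekday) pairs would do.

```python
from collections import Counter
from datetime import date, timedelta

def season_index(day):
    """Map a date to one of 12 * 7 = 84 seasons: (month, day of week)."""
    return (day.month - 1) * 7 + day.weekday()   # weekday(): Monday = 0

# Count how many training observations fall into each season
start, end = date(2013, 7, 7), date(2014, 10, 15)
counts = Counter()
day = start
while day <= end:
    counts[season_index(day)] += 1
    day += timedelta(days=1)

print(len(counts), min(counts.values()), max(counts.values()))
```

With so few observations per season, each seasonal estimate is effectively an average of a handful of days, which is consistent with the high variance seen in this experiment.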

The plot below shows the predicted (one day ahead) and actual impressions. The training data spans 07 July 2013 to 15 Oct 2014.

[Plot: one-day-ahead predicted vs. actual impressions]

Experiment 1.2 : Holt Winters Additive – 12 Month Seasons

In this approach we smooth out the weekly fluctuations by taking a centered moving average with period 7.

In this manner we remove the weekend-related noise from the data and can concentrate on learning the monthly behavior. The RMSE on the test data was reduced significantly with this approach; the caveat is that errors will be large if we want to predict #impressions for one particular day. The algorithm will, however, perform well if we want the overall #impressions for a time period.
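The smoothing step can be sketched as follows. Leaving the first and last three days undefined is an assumption of this sketch; other edge treatments (shrinking windows, padding) are equally possible.

```python
def centered_moving_average(series, window=7):
    """Centered moving average for an odd window; edges without a full window are None."""
    if window % 2 == 0:
        raise ValueError("this sketch assumes an odd window")
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        out[i] = sum(series[i - half:i + half + 1]) / window
    return out

# Two weeks with a weekend bump on top of a flat weekday level of 100
daily = [100, 100, 100, 100, 100, 170, 170] * 2
smoothed = centered_moving_average(daily)
print(smoothed)
```

Every 7-day window contains each day of the week exactly once, so the weekday/weekend pattern averages out and only the slower movement remains. This is also why the smoothed series says little about any single day.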


test data = overall impressions served from October 17, 2014 to October 27, 2014 = 346246

smoothed value of test data = 372105


Issues:

COVARIATE SHIFT: If the seasonal behavior of impressions for some ad unit changes completely, the algorithm has trouble adjusting to the abrupt change.


Results of modeling level of series

[Screenshots: results of modeling the level of the series]

predicted value. I will try to work around to find some sub optimal value and use it to draw CIs.

Generalized Additive Model (GAM)

Extracts from “Forecasting at Scale”, Sean J. Taylor et al.

[Screenshot: extract from the paper]

The paper above is a very good study of how to model a time series as a GAM with change point detection to tackle covariate shift.

It further explores the following notions for modeling the trend:

A saturating nonlinear growth function
Linear trend with change points
Automatic Changepoint Selection
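The two trend shapes can be sketched directly. The piecewise linear form follows the paper's idea of a slope that changes by some amount at each changepoint, with the offset corrected so the trend stays continuous. The function and parameter names below are mine, and the logistic form is shown with a constant capacity for simplicity. Automatic changepoint selection, which the paper handles by placing a sparse prior on the slope changes, is not implemented here; the changepoints are simply given.

```python
import math

def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Linear trend whose slope changes by deltas[j] at changepoints[j].

    The offset is adjusted at each changepoint so the trend stays continuous.
    """
    slope, offset = k, m
    for s, d in zip(changepoints, deltas):
        if t >= s:
            slope += d
            offset -= s * d      # keeps the two segments joined at t = s
    return slope * t + offset

def logistic_trend(t, capacity, k, m):
    """Saturating growth: approaches `capacity` as t grows, half-way at t = m."""
    return capacity / (1.0 + math.exp(-k * (t - m)))

# Slope 2 until t = 10, then slope 1
for t in (5, 10, 12):
    print(t, piecewise_linear_trend(t, 2.0, 10.0, [10.0], [-1.0]))
print(logistic_trend(0.0, 1000.0, 0.5, 0.0))
```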

For modeling the seasonal component the paper uses a Fourier series to provide a flexible model of periodic effects.
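A Fourier-series seasonal component is a weighted sum of sine and cosine terms at multiples of the base frequency. The sketch below builds those terms for a given period P and order N and evaluates the series for some coefficients; in a fitted model the coefficients are estimated from data, whereas here they are arbitrary example values.

```python
import math

def fourier_features(t, period, order):
    """Return [cos(2*pi*1*t/P), sin(2*pi*1*t/P), ..., cos(2*pi*N*t/P), sin(2*pi*N*t/P)]."""
    feats = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        feats.extend([math.cos(angle), math.sin(angle)])
    return feats

def seasonal_component(t, period, coeffs):
    """Evaluate the Fourier series for coefficients (a_1, b_1, ..., a_N, b_N)."""
    order = len(coeffs) // 2
    return sum(c * f for c, f in zip(coeffs, fourier_features(t, period, order)))

# Weekly seasonality (P = 7) of order 3 with example coefficients
coeffs = [10.0, -4.0, 2.5, 1.0, -0.5, 0.3]
print([round(seasonal_component(t, 7, coeffs), 3) for t in range(7)])
```

A higher order N lets the seasonal curve change more quickly within a period, at the cost of a higher risk of overfitting.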


The third component models holidays and events, to account for calendar effects (spikes and drops).
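Taken together, the three components give the additive decomposition used in the paper. In the paper's notation, with g the trend, s the periodic seasonal component, h the holiday effects and the last term the error not explained by the model:

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
```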

The methodology above has been implemented in Facebook’s Prophet library. In the next section I show its use on a toy data set.

Data Set:

[Screenshot: toy data set]

Advantages of Facebook Prophet

The formulation is flexible: we can easily accommodate seasonality with multiple periods and different assumptions about trends.
Unlike with ARIMA models, the time series measurements need not be regularly spaced and we do not need to interpolate missing values to fit.
Automatic Bayesian change point detection
We don’t need to explicitly split the time series into level and seasonality components in a linear fashion
Models Level/Trend via a non-linear growth function
Periodic Seasonality modelled via Fourier Series
Holiday Modeling: Incorporates a list of holidays into the model in a straightforward way, assuming that the effects of holidays are independent.
Can incorporate changes in seasonal behavior, e.g. if we used to observe peaks on Monday but this behavior changes, Prophet can detect the change point and incorporate the effect in the seasonality component.

Fitting the model and forecasting

[Screenshot: fitting the model and forecasting]
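Since the original code survives only as a screenshot, here is a minimal sketch of what fitting and forecasting with Prophet typically looks like. It is not the exact code used for this post: the toy series, the helper names and the 30-day horizon are all assumptions of mine. Prophet expects a data frame with a `ds` (date) column and a `y` (value) column.

```python
from datetime import date, timedelta

def make_toy_rows(n_days=120, start=date(2014, 1, 1)):
    """Hypothetical toy series: linear growth plus a weekend bump."""
    rows = []
    for i in range(n_days):
        day = start + timedelta(days=i)
        weekend = 50.0 if day.weekday() >= 5 else 0.0
        rows.append({"ds": day.isoformat(), "y": 200.0 + 1.5 * i + weekend})
    return rows

def fit_and_forecast(rows, horizon_days=30):
    """Fit Prophet on rows with 'ds' and 'y' keys and forecast `horizon_days` ahead."""
    # Imported lazily so the helper above works without these packages installed.
    import pandas as pd
    from prophet import Prophet   # older releases were published as `fbprophet`

    df = pd.DataFrame(rows)
    model = Prophet(weekly_seasonality=True, yearly_seasonality=False)
    model.fit(df)
    future = model.make_future_dataframe(periods=horizon_days)
    forecast = model.predict(future)   # includes yhat, yhat_lower, yhat_upper
    return model, forecast

rows = make_toy_rows()
```

Calling `model, forecast = fit_and_forecast(rows)` returns the fitted model and a data frame of predictions; `model.plot(forecast)` and `model.plot_components(forecast)` then draw the forecast and the individual trend and seasonality components.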

Extracting Components:

[Screenshot: extracted components]

The attached PDF (generated from an IPython notebook) presents a comprehensive analysis of the toy time series used above. In it I explore various ways of modeling the components of a time series and highlight their pros and cons.

Timeseries Analysis IPython Notebook(pdf)
