CMSC 178DA | Week 10 — Session 1

Predicting the Future with Data

Time Series Fundamentals

Department of Computer Science

University of the Philippines Cebu

Lecture 19: Fundamentals & Smoothing

Every Quarter, $8.9 Billion Flows into the Philippines

OFW remittances follow the same seasonal pattern year after year.

~8% of Philippine GDP
2.2M OFWs abroad
Dec peak every year

This repeating pattern is a time series. Today we learn to analyze it.

Agenda

Session 1 Objectives

Components

Decompose time series into trend, seasonality, and residuals.

Stationarity

Test and transform data for forecasting readiness using ADF and differencing.

Smoothing

Apply moving average and exponential smoothing methods to extract signal from noise.

Running Example

Meet Our Data: OFW Remittances

We'll use monthly Philippine OFW remittance data throughout both sessions. Real BSP data — small enough to trace by hand, big enough to show real patterns.

Three Patterns to Spot

  • Trend (long-term direction of the data) — remittances grow ~4% per year
  • Seasonality (repeating pattern at fixed intervals) — December peak (Christmas!)
  • Noise (random, unpredictable variation) — random monthly fluctuation
Challenge

Look at the table. Can you guess what January 2024 will be? That's exactly what our algorithms will learn to do.

Month | Remittances (B USD) | Pattern
Jan 2023 | 7.8 |
Feb 2023 | 7.5 |
Mar 2023 | 8.0 |
Apr 2023 | 7.9 |
May 2023 | 8.2 |
Jun 2023 | 8.1 |
Jul 2023 | 8.4 |
Aug 2023 | 8.3 |
Sep 2023 | 8.5 |
Oct 2023 | 8.7 | ↗ trend
Nov 2023 | 8.9 | ↗ trend
Dec 2023 | 9.8 | ↑ seasonal peak!

Source: BSP (Bangko Sentral ng Pilipinas). Values illustrative.

Part I

What Makes Time Unique

Unlike cross-sectional data, time series carries memory. Today depends on yesterday.

This section covers time series structure, decomposition, and resampling.

Video Resource

Common Time Series Patterns

Watch this 5-minute overview before we dive into each pattern. Source: DeepLearning.AI

0:00 — Trend

Moore's Law upward trend

0:30 — Seasonality

Weekend dips on dev sites

1:30 — Autocorrelation

Memory, lags, innovations

3:00 — Non-Stationary

Behavior changes over time

Part I · Fundamentals

Time Series Data Has Memory

A time series is a sequence of data points ordered by time — each observation may depend on previous ones.

Key Characteristics

  • Temporal dependence — today affects tomorrow
  • Trend — long-term direction
  • Seasonality — repeating patterns
  • Noise — random variation
Philippine OFW remittances monthly data showing trend and seasonal December peaks
Part I · Fundamentals

Four Components Hide Inside Every Series

Decomposition = splitting a series into its building blocks. Every time series is a mix of these four:

1. Trend (T)

The long-term direction — is the series going up, down, or flat over years?

2. Seasonality (S)

Repeating patterns at fixed intervals — December peaks, weekend dips, summer surges.

3. Cyclical (C)

Rise and fall without fixed period — business cycles, economic booms/busts (years-long waves).

4. Residual / Noise (R)

Random leftover after removing the other three — unpredictable, ideally small.

Additive

$Y_t = T_t + S_t + C_t + R_t$

Constant seasonal swing (+₱500M every Dec)

Multiplicative

$Y_t = T_t \times S_t \times C_t \times R_t$

Growing seasonal swing (+15% every Dec)

Time series with annotated trend, seasonal, cyclical, and noise components
Baseline Method

Naive Forecast: The Simplest Prediction

Rule: Tomorrow's value = today's value. That's it. The simplest possible forecast — and every other method must beat this to be useful.

Algorithm: Naive Forecast
for each time t:
    ŷₜ = yₜ₋₁

Golden Rule

"If your fancy model can't beat naive, throw it away." — Every forecasting textbook

Chart: actual monthly remittances (B$), Jan–Jun, against the naive forecast (ŷₜ = yₜ₋₁) and its error. MAE = 0.26 B$. This is our baseline to beat.

The naive forecast always lags 1 step behind — it misses every move.
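
To make the baseline concrete, here is a minimal pure-Python sketch of the naive rule applied to the first six months of the illustrative table. The helper names `naive_forecast` and `mae` are made up for this example.

```python
# Illustrative monthly remittances (B USD), Jan-Jun 2023, from the running example.
y = [7.8, 7.5, 8.0, 7.9, 8.2, 8.1]

def naive_forecast(series):
    """The forecast for each step after the first is the previous observation."""
    return series[:-1]

def mae(actual, forecast):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

forecast = naive_forecast(y)   # predictions for Feb..Jun
actual = y[1:]                 # what actually happened in Feb..Jun
baseline_mae = mae(actual, forecast)
print(f"Naive MAE: {baseline_mae:.2f} B USD")
```

The five absolute errors are 0.3, 0.5, 0.1, 0.3 and 0.1, so the average matches the 0.26 B$ baseline quoted on the slide.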

Part I · Fundamentals Interactive

Decomposition Reveals the Hidden Structure

What to Look For: In Additive mode, the Seasonal panel shows peaks of equal height throughout. The Residual should look like random noise near zero.
Model Comparison: Additive (Y=T+S+R): seasonal peaks stay the same size. Multiplicative (Y=T×S×R): peaks grow with the trend. Toggle between them — watch how the Seasonal and Residual panels change.
Part I · How It Works

Decomposition: Step by Step

Step 1: Extract Trend (T)

Smooth the series with a centered moving average of window = period.

$T_t = \frac{1}{m}\sum_{j=-k}^{k} Y_{t+j}$

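For period $m$ = 12: average a 12-month window centered on each point. The formula above is exact for odd windows ($m = 2k+1$); because 12 is even, the usual fix is a 2×12 moving average (13 months, with half weight on the two end months). This removes seasonality and keeps only the long-term direction.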

Step 2: Extract Seasonality (S)

Detrend first, then average all same-month values.

$D_t = Y_t - T_t$ (detrended)
$S_{\text{month}} = \frac{1}{n}\sum D_t$ for all Jans, all Febs, …

Example: average all December detrended values → $S_{\text{Dec}}$ = +0.5 (always a peak).

Step 3: What's Left = Residual (R)

Subtract trend and seasonality from the original.

Additive:
$R_t = Y_t - T_t - S_t$
Multiplicative:
$R_t = \frac{Y_t}{T_t \times S_t}$

If the decomposition is good, R should look like random noise near zero (additive) or near 1 (multiplicative).
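
The three steps can be traced in a short NumPy sketch. It is a simplified illustration of classical additive decomposition, not the exact statsmodels implementation; the function name `decompose_additive` and the synthetic series are assumptions made for this example.

```python
import numpy as np

def decompose_additive(y, period=12):
    """Classical additive decomposition: returns (trend, seasonal, residual)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Step 1: centered moving average. For an even period, use the 2 x m
    # filter (half weight on both ends) so the window is truly centered.
    if period % 2 == 0:
        w = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        w = np.ones(period) / period
    half = len(w) // 2
    trend = np.full(n, np.nan)                 # the edges have no full window
    trend[half:n - half] = np.convolve(y, w, mode="valid")
    # Step 2: detrend, then average every position in the cycle.
    detrended = y - trend
    pos = np.arange(n) % period
    means = np.array([np.nanmean(detrended[pos == k]) for k in range(period)])
    means -= means.mean()                      # seasonal effects sum to zero
    seasonal = means[pos]
    # Step 3: whatever is left is the residual.
    resid = y - trend - seasonal
    return trend, seasonal, resid

# Hypothetical data: linear trend plus a fixed yearly pattern with a Dec peak.
t = np.arange(48)
pattern = np.array([-0.2, -0.4, 0.0, -0.1, 0.0, -0.1, 0.1, 0.0, 0.0, 0.1, 0.1, 0.5])
y = 7.5 + 0.03 * t + pattern[t % 12]
trend, seasonal, resid = decompose_additive(y, period=12)
print(np.round(seasonal[:12], 2))
```

Because the synthetic series has no noise, the residual is zero wherever the trend is defined; on real data it should look like random scatter.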

Decomposition: Observed, Trend, Seasonal, Residual panels
Reading the chart: Top panel = original data. Blue trend line rises over time. Green seasonal repeats every 12 months. Bottom residual should be random — if you see patterns there, the decomposition missed something.

Python code: see Appendix

Part I · Fundamentals

Resampling Changes the Granularity

Resampling means changing the time granularity — downsampling (daily → monthly) aggregates, upsampling (monthly → daily) interpolates.

Same data at daily, weekly, monthly, and quarterly granularity
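
A small pandas sketch of both directions, using a made-up daily series so the totals are easy to check by hand. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series: values 1..90 for 1 Jan to 31 Mar 2023.
days = pd.date_range("2023-01-01", periods=90, freq="D")
daily = pd.Series(np.arange(1, 91, dtype=float), index=days)

# Downsample: daily -> monthly totals ("MS" labels each bin by its month start).
monthly_total = daily.resample("MS").sum()

# Upsample: monthly averages -> daily, filling the gaps by linear interpolation.
monthly_mean = daily.resample("MS").mean()
daily_again = monthly_mean.resample("D").asfreq().interpolate(method="linear")

print(monthly_total)
print(len(daily_again), "daily points after upsampling")
```

Downsampling needs an aggregation choice (sum, mean, last), while upsampling invents values between observations, so the interpolated points should not be treated as real measurements.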
Live Demo

Moving Average Explorer

Algorithm: Moving Average
1. for each time step t ≥ w:
forecast[t] = (y[t−w] + … + y[t−1]) / w
2. MAE = mean(|actual − forecast|)
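
The two steps above translate almost line for line into Python. This is a minimal sketch on the illustrative 2023 table with a window of 3; the function names are invented for the example, and indices are 0-based.

```python
# Illustrative monthly remittances (B USD), Jan-Dec 2023.
y = [7.8, 7.5, 8.0, 7.9, 8.2, 8.1, 8.4, 8.3, 8.5, 8.7, 8.9, 9.8]

def moving_average_forecast(series, w):
    """Return {t: forecast}, where forecast[t] is the mean of the w values before t."""
    return {t: sum(series[t - w:t]) / w for t in range(w, len(series))}

def mae(series, forecast):
    """Mean absolute error over the time steps that have a forecast."""
    return sum(abs(series[t] - f) for t, f in forecast.items()) / len(forecast)

w = 3
forecast = moving_average_forecast(y, w)
print(f"First forecast (Apr): {forecast[3]:.2f}")
print(f"MAE with w={w}: {mae(y, forecast):.2f}")
```

On a rising series the averaged forecast trails the data, which is why a plain moving average can lose to the naive baseline.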
Part II

The Stationarity Requirement

Most forecasting models assume the future looks statistically like the past.

If the mean or variance drifts over time, predictions break down.

Part II · Stationarity

Forecasting Breaks When the Rules Keep Changing

Side-by-side stationary noise vs non-stationary random walk
Part II · Stationarity

The ADF Test

Hypothesis

H0: Series has a unit root (non-stationary)
H1: Series is stationary

p < 0.05 → Reject H0 → Stationary
p ≥ 0.05 → Fail to reject H0 → Treat as non-stationary → difference & retest

ADF Equation
$\Delta y_t = \alpha + \beta y_{t-1} + \sum \gamma_i \Delta y_{t-i} + \varepsilon_t$
Test $H_0: \beta = 0$ (unit root, shocks never fade) against $H_1: \beta < 0$
if p < 0.05 → reject $\beta = 0$ → stationary
else → $y'_t = y_t - y_{t-1}$, retest

Unit root = past shocks never decay  |  p-value = probability of a test statistic at least this extreme if H0 is true

Two-panel figure. Raw series (upward trend): ADF = -2.34, p = 0.16 > 0.05 → ✗ non-stationary. After differencing (y'ₜ = yₜ − yₜ₋₁, mean ≈ 0): ADF = -5.67, p = 0.0001 < 0.05 → ✓ stationary.

Python code: see Appendix

Part II · Stationarity

Differencing Removes the Trend

Three-panel differencing: original, first diff, second diff with ADF p-values
Live Demo

Differencing Demo

Algorithm: Differencing for Stationarity
1. plot original → check for trend/seasonality
2. if trend: diff₁[t] = y[t] − y[t−1]
3. if seasonal: diff_s[t] = y[t] − y[t−period]
4. run ADF test → p < 0.05 = stationary ✓
5. now fit ARMA on the differenced (stationary) series, or equivalently ARIMA with d = number of differences on the original
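
Steps 2 and 3 of the algorithm are one-liners. The sketch below uses a made-up series (growth of 2 per step plus a pattern that repeats every 4 steps) so the effect of each difference can be checked by hand; `difference` is a helper name chosen for this example.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical series: steady growth plus a pattern repeating every 4 steps.
pattern = [0, 5, 0, -5]
y = [10 + 2 * t + pattern[t % 4] for t in range(12)]

first_diff = difference(y, lag=1)        # trend gone, repeating pattern remains
seasonal_diff = difference(y, lag=4)     # pattern gone, constant growth remains
both = difference(seasonal_diff, lag=1)  # trend and seasonality both removed

print(first_diff)
print(seasonal_diff)
print(both)
```

The seasonal difference leaves a constant 8 (growth of 2 per step times a period of 4), and differencing once more brings the series to zero.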
Knowledge Check

Stationarity Quiz

A PSEi closing price series has a clear upward trend over 5 years. What should you do before applying ARIMA?

A) Nothing, ARIMA handles trends
B) Apply first differencing
C) Remove all outliers first
D) Use a larger rolling window

Click to reveal answer

B) Apply first differencing

An upward trend means the series is non-stationary. First differencing (d=1) removes the linear trend and makes the series suitable for ARIMA.

Part III

Reading the Autocorrelation Signature

ACF and PACF plots are the fingerprint of any time series — they tell you which model to use.

Part III · Autocorrelation

Past Values Predict Future Values

ACF (Autocorrelation Function)

Correlation between $Y_t$ and $Y_{t-k}$ at each lag $k$. Includes indirect effects through intermediate lags.

PACF (Partial Autocorrelation)

Direct correlation between $Y_t$ and $Y_{t-k}$ after removing effects of intervening lags.

$\rho_k = \mathrm{Cov}(Y_t, Y_{t-k}) / \mathrm{Var}(Y_t)$

Intuition: If sales were high last month, are they likely high this month too? Autocorrelation measures exactly this — how much the past predicts the future.

Definitions

  • Autocorrelation — how much a series correlates with its own past values
  • Lag — a delayed version of the series (lag-1 = last month, lag-12 = same month last year)
  • ACF (Autocorrelation Function) — shows correlation at every lag
  • PACF (Partial ACF) — shows direct correlation at each lag, removing intermediate effects
ACF showing exponential decay and PACF showing cutoff at lag 2 for AR(2) process
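
The ACF formula can be computed directly. This is a minimal sketch of the usual sample estimator (deviations from the overall mean, divided by the total sum of squares); the function name `acf` is chosen for this example, and the PACF is not covered here.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)          # n times the sample variance
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]

# Illustrative monthly remittances (B USD), Jan-Dec 2023.
y = [7.8, 7.5, 8.0, 7.9, 8.2, 8.1, 8.4, 8.3, 8.5, 8.7, 8.9, 9.8]
r = acf(y, 3)
for k, value in enumerate(r):
    print(f"lag {k}: {value:+.2f}")
```

Lag 0 is always 1, since a series correlates perfectly with itself. With only 12 points the higher lags are noisy, so read them as a rough guide.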
Part III · Autocorrelation

What Is Autoregression?

Autoregression (AR) is a regression where the predictors are the series' own past values — lagged versions of itself. An AR(p) model uses the last p observations to predict today.

$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t$

What

A linear model that regresses $Y_t$ on its own lags. Each coefficient $\phi_i$ tells you exactly how much lag $i$ pulls on today.

Why

Captures temporal dependence cheaply. Simple, interpretable, and a strong baseline when past values carry real predictive signal.

When

PACF cuts off at lag p, the series is stationary (after differencing if needed), and no external regressors are required. Not for non-linear regimes.

OFW Remittance Intuition

This December's remittance is strongly predicted by last December's. $Y_t$ and $Y_{t-12}$ correlate year after year, and an AR term at lag 12 captures that persistence directly.

  • AR(1) — last month predicts this month (short memory).
  • AR(12) — the last 12 months, including the same month last year (yearly cycle).
  • AR(p) — blend of the last p months.

Diagnostic Signal

A good AR fit leaves white-noise residuals — no pattern remains in εt. If residuals still autocorrelate, increase p or move to ARMA / ARIMA.

TL;DR AR(p) = "predict today from the last p yesterdays."
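
Since AR(p) is just a regression on lagged copies of the series, it can be fitted with ordinary least squares. The sketch below is an illustration under that assumption (statsmodels uses more refined estimators); `fit_ar` and `forecast_ar` are hypothetical helper names, and the test series is generated without noise so the true coefficients are known.

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares AR(p) fit. Returns (c, array of phi_1..phi_p)."""
    y = np.asarray(y, dtype=float)
    # The row for time t holds [1, y[t-1], ..., y[t-p]].
    X = np.column_stack([np.ones(len(y) - p)] +
                        [y[p - i:len(y) - i] for i in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef[0], coef[1:]

def forecast_ar(y, c, phi, steps):
    """Iterate the fitted equation forward, feeding forecasts back in."""
    history = list(y)
    for _ in range(steps):
        lags = history[-1:-len(phi) - 1:-1]   # most recent value first
        history.append(c + float(np.dot(phi, lags)))
    return history[len(y):]

# Noise-free AR(2) series with known coefficients c=1, phi=(0.6, -0.2).
y = [0.0, 0.0]
for _ in range(10):
    y.append(1.0 + 0.6 * y[-1] - 0.2 * y[-2])

c, phi = fit_ar(y, p=2)
print(round(c, 3), np.round(phi, 3))
print(forecast_ar(y, c, phi, steps=2))
```

With real data the fit will not be exact, and the residuals are what the diagnostic check above asks you to inspect.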

Part III · Autocorrelation

ACF/PACF Patterns Guide Model Choice

  • AR(p) (Autoregressive) — predict from p past values: $\hat{y}_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots$
  • MA(q) (Moving Average) — predict from q past errors: $\hat{y}_t = \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots$

ACF Pattern | PACF Pattern | Model Suggested | Interpretation
Cuts off at lag q | Exponential decay | MA(q) | Past errors drive the series
Exponential decay | Cuts off at lag p | AR(p) | Past values drive the series
Exponential decay | Exponential decay | ARMA(p,q) | Both values and errors matter
Significant at lag s | Significant at lag s | Seasonal | Calendar-driven pattern
Part IV

Smoothing the Signal

Before forecasting, we need to separate signal from noise.

Smoothing techniques reveal underlying patterns by reducing random variation.

Part IV · Smoothing Interactive

Moving Averages Trade Detail for Clarity

Part IV · Smoothing Interactive

Exponential Smoothing Weights Recent Data More

Exponential Smoothing — a technique giving exponentially decreasing weights to older data. Recent observations matter more.

SES (Simple Exponential Smoothing) uses one parameter α to balance between the latest data point and the previous smoothed value.

α (alpha) — smoothing factor (0–1). Higher α = trusts recent data more. Lower α = smoother, slower to react.

$S_t = 0.30 \cdot Y_t + 0.70 \cdot S_{t-1}$  (α = 0.30)

The α Parameter

  • α → 0: Smooth, slow to react
  • α → 1: Reactive, follows every wiggle
  • Sweet spot: 0.2–0.3 is a common starting point for business data; tune α on held-out data
Live Demo

SES Step-Through Simulator

Algorithm: Simple Exponential Smoothing
1. initialize: S₁ = Y₁
2. for t = 2, 3, …, n:
   Sₜ = α · Yₜ + (1−α) · Sₜ₋₁
3. forecast: ŷₙ₊₁ = Sₙ
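
The SES algorithm above fits in a few lines of plain Python. This sketch runs it on the illustrative 2023 table with α = 0.3; the function name `ses` is chosen for this example.

```python
def ses(series, alpha):
    """Simple exponential smoothing. Returns the smoothed values S_1..S_n."""
    smoothed = [series[0]]                     # initialize S_1 = Y_1
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Illustrative monthly remittances (B USD), Jan-Dec 2023.
y = [7.8, 7.5, 8.0, 7.9, 8.2, 8.1, 8.4, 8.3, 8.5, 8.7, 8.9, 9.8]
s = ses(y, alpha=0.3)
next_forecast = s[-1]                          # forecast for Jan 2024 = S_n
print([round(v, 3) for v in s[:3]])
print(f"Forecast for next month: {next_forecast:.2f}")
```

By hand: S₂ = 0.3 · 7.5 + 0.7 · 7.8 = 7.71 and S₃ = 0.3 · 8.0 + 0.7 · 7.71 = 7.797. SES has no trend term, so its forecast sits below the latest values of a rising series.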
Part IV · Smoothing

From SES to Holt-Winters: Handling Trend and Seasonality

  • Simple Exponential Smoothing (SES): handles level only. Sₜ = αYₜ + (1−α)Sₜ₋₁. Parameter: α
  • + Trend → Holt's Method (Double): handles level + trend. Adds slope equation bₜ. Parameters: α, β
  • + Season → Holt-Winters (Triple): handles level + trend + seasonality. Parameters: α, β, γ

Each method builds on the previous one. α = level, β = trend, γ = seasonal smoothing.
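
As a sketch of the middle rung, here is Holt's linear method in plain Python using the standard textbook level and slope updates. The initialization (level = first value, slope = first difference) is one common choice among several, and the function name `holt` is invented for this example. The seasonal (Holt-Winters) step is not shown.

```python
def holt(series, alpha, beta, horizon):
    """Holt's linear (double) exponential smoothing; returns h-step forecasts."""
    level = series[0]
    slope = series[1] - series[0]              # one common initialization
    for value in series[1:]:
        prev_level = level
        # Level: blend the new observation with the previous one-step forecast.
        level = alpha * value + (1 - alpha) * (prev_level + slope)
        # Slope: blend the latest change in level with the previous slope.
        slope = beta * (level - prev_level) + (1 - beta) * slope
    return [level + h * slope for h in range(1, horizon + 1)]

# Hypothetical perfectly linear series: 1, 3, 5, ..., 19.
line = [1 + 2 * t for t in range(10)]
forecasts = holt(line, alpha=0.3, beta=0.1, horizon=3)
print([round(f, 6) for f in forecasts])
```

On a perfectly linear series the method continues the line, which is the behavior SES alone cannot produce.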
Live Demo

ARIMA Forecast Animator

Algorithm: ARIMA(2,1,0)
1. split → train (1–36) | test (37–48)
2. difference train data (d=1)
3. fit AR(2) on differenced data (p = 2; no MA terms since q = 0)
4. forecast h steps + confidence intervals
5. compare vs actual → MAE
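
The five steps can be sketched with NumPy alone. This is a simplified illustration, assuming a least-squares AR(2) fit rather than the maximum-likelihood estimation statsmodels performs, and it skips the confidence intervals in step 4. The function names and the noise-free synthetic series are assumptions made for this example.

```python
import numpy as np

def fit_ar2(d):
    """Least-squares fit of d[t] = c + phi1*d[t-1] + phi2*d[t-2]."""
    X = np.column_stack([np.ones(len(d) - 2), d[1:-1], d[:-2]])
    coef, *_ = np.linalg.lstsq(X, d[2:], rcond=None)
    return coef                                   # [c, phi1, phi2]

def arima_210_forecast(train, steps):
    train = np.asarray(train, dtype=float)
    d = np.diff(train)                            # step 2: difference once
    c, phi1, phi2 = fit_ar2(d)                    # step 3: AR(2) on differences
    diffs, level, out = list(d[-2:]), train[-1], []
    for _ in range(steps):                        # step 4: iterate forward
        nxt = c + phi1 * diffs[-1] + phi2 * diffs[-2]
        diffs.append(nxt)
        level += nxt                              # undo the differencing
        out.append(level)
    return np.array(out)

def mae(actual, forecast):
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(forecast))))

# Hypothetical 48-month series whose differences follow a noise-free AR(2).
d = [0.2, -0.1]
for _ in range(45):
    d.append(0.03 + 0.5 * d[-1] - 0.3 * d[-2])
y = 7.5 + np.concatenate([[0.0], np.cumsum(d)])

train, test = y[:36], y[36:]                      # step 1: months 1-36 | 37-48
forecast = arima_210_forecast(train, steps=len(test))
print(f"MAE on months 37-48: {mae(test, forecast):.6f}")   # step 5
```

The synthetic series has no noise, so the forecast tracks the held-out months almost exactly; real data will leave a non-zero MAE.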
Big Picture

The Forecasting Ladder: Simple → Complex

Each method builds on the last. More complexity usually buys accuracy, but not always (the moving average below loses to naive), and it makes the model harder to explain.

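# | Method | Equation | MAE | Beats Naive?
1 | Naive | $\hat{y}_t = y_{t-1}$ | 0.26 | Baseline
2 | Moving Avg | $\hat{y}_t = \mathrm{mean}(y_{t-w}, \dots, y_{t-1})$ | 0.43 | No — lags behind
3 | Diff + MA | MA on Δy + past | 0.23 | ✓ Yes
4 | SES (α=0.3) | $S_t = \alpha Y_t + (1-\alpha) S_{t-1}$ | 0.21 | ✓ Yes
5 | ARIMA(2,1,1) | $\phi_1 y'_{t-1} + \phi_2 y'_{t-2} + \theta_1 \varepsilon_{t-1}$ | 0.18 | ✓✓ Best classical
6 | Prophet | $g(t) + s(t) + h(t)$ | 0.15 | ✓✓✓ Best overall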

Session 1 Key Takeaways

  1. Time series = ordered data where today depends on yesterday
  2. Decomposition separates trend, seasonality, and noise
  3. Stationarity is required for ARIMA — test with ADF, fix with differencing
  4. ACF/PACF plots are your model selection guide
  5. Exponential smoothing adapts to trend and seasonality

Next: Session 2 — Forecasting Methods (ARIMA, Prophet, Evaluation)

CMSC 178DA | Week 10 — Session 2

From Understanding to Prediction

ARIMA, Prophet & Evaluation

Lecture 20: Forecasting & Evaluation

Jollibee Group Opens 700+ Stores Per Year

Every new location needs a multi-year sales forecast before opening day.

10,000+ outlets (all brands)
700+ new stores/year
5 yr strategic plan horizon

The tool they need? ARIMA and Prophet.

Agenda

Session 2 Objectives

ARIMA

Build ARIMA/SARIMA models and choose p, d, q parameters systematically.

Prophet

Use Meta Prophet for business forecasting with holidays and changepoints.

Evaluation

Measure forecast accuracy with MAE, RMSE, MAPE and proper temporal splits.

Part I

ARIMA: The Workhorse of Forecasting

Three ideas from Session 1 — autoregression, differencing, and moving average — combined into one powerful model.

Part I · ARIMA

What Is ARIMA?

ARIMA = AutoRegressive Integrated Moving Average — a single model that stitches together three Session 1 ideas into one framework, controlled by parameters (p, d, q).

What

AR(p) uses p past values, I(d) differences the series d times to remove trend, and MA(q) corrects using q past forecast errors. Parameters (p, d, q) say how much of each.

Why

One unified framework for trend + autocorrelation + shock-persistence. Well-understood theory, fast to fit on a laptop, and delivers built-in confidence intervals out of the box.

When

Univariate series with mild-to-moderate patterns, stationary after d differences, at least ~50 observations. Not for strongly seasonal series (use SARIMA) or abrupt regime shifts (use Prophet).

Picking (p, d, q)

p → PACF cutoff  ·  d → ADF test  ·  q → ACF cutoff
  • Run ADF; difference until p-value < 0.05 → the number of differences applied is d.
  • Inspect PACF of differenced series → last significant lag before the cutoff = p.
  • Inspect ACF of differenced series → last significant lag before the cutoff = q.

Connect Back to Session 1

You already know each piece: AR from autocorrelation (Part III), I from differencing & the ADF test (Part II), and MA as a residual-correction mechanism. ARIMA just composes them.

TL;DR ARIMA(p, d, q) = AR + differencing + MA.

Part I · ARIMA

ARIMA Combines Three Ideas You Already Know

ARIMA = AutoRegressive Integrated Moving Average — the workhorse of classical time series forecasting. p = number of past values used (AR order) | d = number of times differenced | q = number of past errors used (MA order).

  • AR(p), AutoRegressive: past values predict the future. ŷₜ = φ₁yₜ₋₁ + φ₂yₜ₋₂ + …
  • I(d), Integrated: differencing to make stationary. y'ₜ = yₜ − yₜ₋₁ (applied d times)
  • MA(q), Moving Average: past errors correct the forecast. ŷₜ = θ₁εₜ₋₁ + θ₂εₜ₋₂ + …

AR(p) + I(d) + MA(q) = ARIMA(p, d, q), the workhorse of time series forecasting
Part I · ARIMA

The ARIMA Equation in Plain English

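$y'_t = c + \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p} + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t$, where $y'_t$ is the series after $d$ differences ($y'_t = y_t$ when $d = 0$)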

In words: "Today's value = constant + weighted past values + weighted past errors + new shock."

Example with p = 2, d = 1, q = 1: y'ₜ = c + φ₁y'ₜ₋₁ + φ₂y'ₜ₋₂ + θ₁εₜ₋₁ + εₜ

  • AR (AutoRegressive): φ₁y'ₜ₋₁ + φ₂y'ₜ₋₂. Use p past values to predict.
  • I (Integrated): y'ₜ = yₜ − yₜ₋₁. Differencing d times for stationarity.
  • MA (Moving Avg): θ₁εₜ₋₁ + θ₂εₜ₋₂ + … Use q past errors to correct.
  • εₜ (Noise): random error, unpredictable. Should look like white noise.

ARIMA(p, d, q) = AR(p) + I(d) + MA(q)
p = how many past values | d = how many diffs | q = how many past errors
Part I · ARIMA

Building ARIMA in Python

The statsmodels ARIMA class handles fitting, diagnostics, and forecasting.

Workflow

  1. Choose p, d, q (from ACF/PACF or auto)
  2. Fit model & check summary
  3. Run diagnostics (residual plots)
  4. Forecast with confidence intervals
Algorithm: Box-Jenkins Method
1. plot series → look for trend and seasonality
2. difference until ADF says stationary:
$y'_t = y_t - y_{t-1}$   ($d$ = number of diffs needed)
3. read ACF/PACF of differenced series:
PACF cuts off at lag $p$ → AR order  |  ACF cuts off at lag $q$ → MA order
4. fit the ARIMA equation:
$y'_t = c + \phi_1 y'_{t-1} + \phi_2 y'_{t-2} + \theta_1 \varepsilon_{t-1} + \varepsilon_t$
5. check residuals $\varepsilon_t \approx$ white noise? (no patterns left)
6. forecast: iterate forward   CI ≈ $\hat{y} \pm 1.96\,\sigma_h$, where $\sigma_h$ grows with the horizon $h$ (for a random walk, $\sigma_h = \sigma\sqrt{h}$)

Model Diagnostics (residuals should look random)
  • Residuals Over Time: ✓ random scatter = good
  • Residual Distribution: ✓ bell-shaped = normal
  • ACF of Residuals: ✓ all within bands = no pattern
  • Q-Q Plot (Normality): ✓ points on line = normal residuals

ARIMA(2,1,1) Model Summary

Parameter | Coeff | Std Err | Meaning
ar.L1 | 0.72 | 0.08 | Strong positive from 1 month ago
ar.L2 | -0.21 | 0.07 | Mild correction from 2 months ago
ma.L1 | -0.89 | 0.05 | Strong error correction

AIC: 478.3 (lower = better)

Python code: see Appendix

Part I · ARIMA

Choosing p, d, q

Step 1: ADF test. Is the series stationary (p < 0.05)? Yes → d = 0. No → difference & retest (d += 1) until stationary ✓.
Step 2: Plot PACF. Cuts off at lag p → AR order.
Step 3: Plot ACF. Cuts off at lag q → MA order.
Result: ARIMA(p, d, q). Fit model → check residuals → forecast.

Manual approach: 3 steps to find (p, d, q)

PACF → find p: cuts off at lag 2 → p = 2
ACF → find q: tails off exponentially → AR process (q = 0)

Automatic (pmdarima)

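Searches over candidate (p, q) orders (a stepwise search by default rather than every combination), chooses d with a unit-root test, and keeps the model with the lowest AIC. No manual ACF/PACF reading needed.

AIC (Akaike Information Criterion) = 2k − 2 ln(L), where k is the number of parameters and L the likelihood: it rewards fit and penalizes complexity. Lower = better.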

auto_arima Search Results

Model | AIC | Note
ARIMA(0,1,0) | 523.1 |
ARIMA(1,1,0) | 498.4 |
ARIMA(2,1,0) | 485.2 |
ARIMA(2,1,1) | 478.3 | ← Best!
ARIMA(3,1,1) | 479.8 | worse
Which to use? Start with auto_arima for speed. Use Box-Jenkins manually when you want to understand why specific parameters were chosen.
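
The selection rule itself is simple: compute AIC for each candidate and keep the smallest. The sketch below states the standard AIC formula and applies the rule to the AIC values listed in the search results; the `aic` helper is illustrative and the dictionary simply copies the slide's numbers.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

# AIC values copied from the auto_arima search results in this section.
results = {
    (0, 1, 0): 523.1,
    (1, 1, 0): 498.4,
    (2, 1, 0): 485.2,
    (2, 1, 1): 478.3,
    (3, 1, 1): 479.8,
}
best_order = min(results, key=results.get)
print("Lowest AIC:", best_order, results[best_order])
```

ARIMA(3,1,1) has one more parameter than ARIMA(2,1,1) but a higher AIC, meaning the extra term did not improve the fit enough to pay for its penalty.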

Python code: see Appendix

Part I · ARIMA Interactive

Forecasting with Confidence Intervals

Forecast horizon — how many time steps into the future you're predicting. Longer = more uncertain. Confidence interval (CI) — a range (e.g., 95% CI) where the true value is likely to fall. CI widens with longer horizons.
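
To see why intervals widen, consider the simplest case, a random walk, where the forecast standard deviation grows with the square root of the horizon. The sketch below assumes that special case only; general ARIMA models have their own variance formulas, and the σ = 0.3 used here is a made-up value.

```python
import math

def random_walk_interval(last_value, sigma, horizon, z=1.96):
    """Approximate 95% interval for a random-walk forecast h steps ahead."""
    half_width = z * sigma * math.sqrt(horizon)
    return last_value - half_width, last_value + half_width

widths = {}
for h in (1, 4, 9):
    lo, hi = random_walk_interval(last_value=9.8, sigma=0.3, horizon=h)
    widths[h] = hi - lo
    print(f"h={h}: {lo:.2f} to {hi:.2f}")
```

Quadrupling the horizon doubles the interval width in this case, so uncertainty grows steadily but more slowly than the horizon itself.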

Part I · ARIMA

SARIMA Adds Seasonal Intelligence

SARIMA = Seasonal ARIMA. Adds seasonal AR, differencing, and MA on top of regular ARIMA. (P,D,Q,m): P = seasonal AR lags, D = seasonal differencing, Q = seasonal MA terms, m = seasonal period (12 for monthly data with a yearly cycle). The period is also written s, as in the notation below.

ARIMA(p,d,q)(P,D,Q)s

Seasonal Parameters

  • P: Seasonal AR order
  • D: Seasonal differencing
  • Q: Seasonal MA order
  • s: Seasonal period (12 = monthly data with a yearly cycle, 7 = daily data with a weekly cycle)

Example: SARIMA(1,1,1)(1,1,1)12

Chart: training data and forecast with 95% CI, with the December peak marked.

SARIMA(1,1,1)(1,1,1)12 — 12-Month Forecast

Month | Forecast (B$) | 95% CI | Note
Jan | 9.2 | 8.6 – 9.8 |
Mar | 8.8 | 8.0 – 9.6 |
Jun | 8.7 | 7.8 – 9.6 |
Sep | 9.1 | 8.0 – 10.2 | CI widens
Dec | 10.1 | 8.8 – 11.4 | Peak!
Key: The (1,1,1)12 seasonal component captures the December remittance peak. CI grows from ±0.6 to ±1.3 further out.

Python code: see Appendix

Knowledge Check

ARIMA Quiz

Your ADF test gives p=0.03 after first differencing. PACF cuts off at lag 2 and ACF decays exponentially. What ARIMA order should you try?

A) ARIMA(0, 1, 2)
B) ARIMA(2, 1, 0)
C) ARIMA(1, 0, 1)
D) ARIMA(2, 0, 2)

Click to reveal answer

B) ARIMA(2, 1, 0)

PACF cutoff at 2 → p=2. One differencing needed (p=0.03 after) → d=1. ACF decays (doesn't cut off) → q=0. This is a pure AR(2) model on differenced data.

Part II

Prophet: Built for Business

Meta's open-source tool handles missing data, holidays, and changepoints automatically.

Designed for analysts who need good forecasts fast, not ARIMA experts.

Part II · Prophet

What Is Prophet?

Prophet is Meta's open-source additive decomposable forecasting model. Instead of requiring stationarity, it fits trend, seasonality, and holidays directly — and combines them with Bayesian parameter estimation.

y(t) = g(t) + s(t) + h(t) + εt

trend + seasonality + holidays + noise

What

Piecewise-linear trend g(t) with auto-detected changepoints, Fourier-series seasonality s(t), and user-defined holiday effects h(t), fit in Stan (MAP estimation by default; full MCMC sampling is optional).

Why

Robust to missing data and outliers. Detects changepoints automatically. Interpretable components (trend / seasonality / holiday) you can plot separately. Analyst-friendly API — two columns: ds and y.

When

Business forecasting with strong calendar/holiday effects, messy real-world data, multiple seasonalities (daily + weekly + yearly), or when a non-specialist needs a good default quickly.

ARIMA vs Prophet — Pick the Right Tool

Property | ARIMA | Prophet
Needs stationarity? | Yes (differencing) | No
Handles missing data? | Poorly | Natively
Framework | Classical / MLE | Bayesian
Holidays & regressors | Manual (SARIMAX) | First-class
Best fit for | Clean, stationary | Messy, calendar-driven

Philippine Use Case

Monthly OFW remittances peak around December (Christmas sendings) and dip around Undas. Prophet models these as holiday effects and a piecewise trend without requiring us to difference the series first.

TL;DR Different tools, different jobs — we evaluate both in Part III.

Part II · Prophet

Prophet Solves Real Business Problems

Prophet — Meta's open-source library for business forecasting. Handles missing data, holidays, and trend changes automatically. Uses Fourier series (sine and cosine waves) to mathematically represent repeating seasonal patterns.

Missing Data

Handles gaps automatically — no imputation needed.

Outlier Robust

COVID-era spikes won't break your forecast.

Changepoints

Detects trend shifts automatically (e.g., policy changes).

Changepoint — a moment where the trend's growth rate shifts (e.g., new competitor enters market, policy change).

Holiday Effects

Add Christmas, Undas, or any custom event.

Part II · Prophet

Prophet Setup and Forecasting

Algorithm: Facebook Prophet
1. decompose into 3 learnable components:
$y(t) = g(t) + s(t) + h(t) + \varepsilon_t$
$g(t)$ = piecewise linear trend with auto-detected changepoints
$s(t) = \sum_{n=1}^{N}\left[a_n \cos\!\left(\frac{2\pi nt}{P}\right) + b_n \sin\!\left(\frac{2\pi nt}{P}\right)\right]$ (Fourier)
$h(t)$ = indicator function for holidays (Christmas, Undas, …)
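2. fit all parameters in Stan (MAP estimation by default; full MCMC sampling is optional)
3. forecast with uncertainty: simulate future trend changes and observation noise (or sample from the posterior when MCMC is enabled)
Prophet forecast components
Output: DataFrame with ds (date), yhat (prediction), yhat_lower/upper (uncertainty interval: 80% by default, set interval_width=0.95 for 95%). Use model.plot_components(forecast) for trend + seasonality breakdown.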
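
The Fourier term s(t) is the least familiar piece, so here is a NumPy sketch that builds the cosine and sine columns and fits their coefficients by plain least squares. This illustrates only the idea: Prophet estimates these coefficients jointly with the trend and holidays and with priors on them. The function name `fourier_features` and the synthetic pattern are assumptions made for this example.

```python
import numpy as np

def fourier_features(t, period, order):
    """Columns cos(2*pi*n*t/P) and sin(2*pi*n*t/P) for n = 1..order."""
    t = np.asarray(t, dtype=float)
    cols = []
    for n in range(1, order + 1):
        angle = 2 * np.pi * n * t / period
        cols += [np.cos(angle), np.sin(angle)]
    return np.column_stack(cols)

# Three years of monthly time steps, yearly period P = 12, N = 3 harmonics.
t = np.arange(36)
X = fourier_features(t, period=12, order=3)

# Hypothetical seasonal signal: a single yearly cosine wave of amplitude 0.5.
y = 0.5 * np.cos(2 * np.pi * t / 12)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # [a1, b1, a2, b2, a3, b3]
print(np.round(coef, 3))
```

A larger N lets s(t) follow sharper seasonal shapes, at the cost of more coefficients to estimate.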

Python code: see Appendix

Part II · Prophet

Philippine Holidays Make Forecasts Smarter

Forecast comparison with and without Philippine holiday effects
Part III

Measuring Forecast Quality

A forecast without an error estimate is just a guess.

This section covers metrics, temporal splits, and model comparison.

Part III · Evaluation

Four Metrics Every Analyst Must Know

Metric | Formula | Interpretation | When to Use
MAE | $\mathrm{mean}(\lvert y - \hat{y} \rvert)$ | Average absolute error in original units | General purpose
RMSE | $\sqrt{\mathrm{mean}((y - \hat{y})^2)}$ | Penalizes large errors more | When big misses are costly
MAPE | $\mathrm{mean}(\lvert y - \hat{y} \rvert / \lvert y \rvert) \times 100$ | Percentage error, scale-free | Stakeholder reports
MASE | MAE / naive_MAE | <1 means better than naive forecast | Comparing across datasets
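
All four metrics are short enough to write by hand. In this sketch the MASE denominator is the one-step naive MAE on the training data, which is the usual convention but an assumption here, since the table only says "naive_MAE". The forecast values are made up for illustration.

```python
import math

def mae(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mape(actual, forecast):
    """Percentage error; undefined if any actual value is zero."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

def mase(actual, forecast, train):
    """MAE scaled by the in-sample MAE of the one-step naive forecast."""
    naive_mae = mae(train[1:], train[:-1])
    return mae(actual, forecast) / naive_mae

train = [7.8, 7.5, 8.0, 7.9, 8.2, 8.1]   # illustrative values from Session 1
actual = [8.4, 8.3]
forecast = [8.2, 8.4]                     # hypothetical model output

print(f"MAE  = {mae(actual, forecast):.3f}")
print(f"RMSE = {rmse(actual, forecast):.3f}")
print(f"MAPE = {mape(actual, forecast):.2f}%")
print(f"MASE = {mase(actual, forecast, train):.3f}")
```

RMSE is never smaller than MAE; the gap between them grows when a few errors are much larger than the rest.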
Part III · Evaluation

Time Series Train-Test Split: Never Shuffle

Temporal split — split by TIME (train on past, test on future). Never shuffle time series data!

White noise — a series of completely random values with no pattern (mean=0, constant variance). Good residuals look like white noise.

Correct temporal split vs wrong random shuffle comparison
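
A temporal split is just a slice. The sketch below holds out the last 12 of 48 months, using month numbers as stand-ins for the data; `temporal_split` is a name chosen for this example.

```python
def temporal_split(series, n_test):
    """Hold out the last n_test observations; the order is never shuffled."""
    return series[:-n_test], series[-n_test:]

months = list(range(1, 49))                # stand-in for 48 monthly values
train, test = temporal_split(months, n_test=12)

print("train:", train[0], "to", train[-1])
print("test: ", test[0], "to", test[-1])

# Every training point precedes every test point, so nothing leaks backwards.
no_leakage = max(train) < min(test)
print("no leakage:", no_leakage)
```

A random shuffle would place future months in the training set, which lets the model peek at values it is later asked to predict.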
Part III · Capstone

Philippine Remittances: Complete Forecasting Pipeline

Complete pipeline comparing ARIMA vs Prophet on Philippine remittance data
Live Demo

Method Showdown Race

Current Method
1. select a forecasting method
2. train on months 1–36
3. forecast months 37–48
4. compute MAE = mean(|actual − forecast|)
Resources

Keep Learning: Free Courses & Datasets

These free resources go deeper into every topic we covered today.

Session 2 Key Takeaways

  1. ARIMA(p,d,q) = AR + differencing + MA in one model
  2. Use ACF/PACF or auto_arima to choose parameters
  3. SARIMA adds seasonal (P,D,Q)s for periodic data
  4. Prophet is ideal for business forecasting with holidays and missing data
  5. Always use temporal train-test splits, never random shuffle
  6. MAPE is the most intuitive metric for business stakeholders

Lab 10: Time Series Forecasting Project

Forecast a Philippine economic indicator. Compare ARIMA vs Prophet. Present results to "management."

Python Code Reference

Complete runnable code for every algorithm.
Copy-paste into your Jupyter notebook.

Appendix

Decomposition Code

What This Does

Splits a time series into trend, seasonal, and residual components using statsmodels.

Key Parameters

  • model: 'additive' or 'multiplicative'
  • period: seasonal cycle length (12 for monthly)
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
import matplotlib.pyplot as plt

df = pd.read_csv('bsp_remittances.csv',
                  parse_dates=['date'])
df.set_index('date', inplace=True)

decomp = seasonal_decompose(
    df['remittances_usd'],
    model='additive',  # or 'multiplicative'
    period=12           # monthly seasonality
)

decomp.plot()
plt.tight_layout()
plt.show()
Appendix

ADF Test Code

What This Does

Tests whether a time series is stationary using the Augmented Dickey-Fuller test. p < 0.05 means stationary.

from statsmodels.tsa.stattools import adfuller

result = adfuller(df['remittances_usd'])

print(f'ADF Statistic: {result[0]:.4f}')
print(f'p-value:       {result[1]:.4f}')

if result[1] < 0.05:
    print("Stationary! Ready to model.")
else:
    print("Non-stationary. Apply diff.")
    # Apply differencing
    df_diff = df['remittances_usd'].diff().dropna()
    result2 = adfuller(df_diff)
    print(f'After diff - p: {result2[1]:.4f}')
Appendix

ARIMA Code

What This Does

Fits an ARIMA(p,d,q) model and generates forecasts with confidence intervals.

order=(p, d, q)

  • p: AR lags
  • d: differencing order
  • q: MA terms
from statsmodels.tsa.arima.model import ARIMA

# Fit model
# order=(p, d, q) = (2 AR, 1 diff, 1 MA)
# df is the remittance DataFrame loaded in the Decomposition Code
model = ARIMA(df['remittances_usd'], order=(2, 1, 1))
results = model.fit()

# Model summary
print(results.summary())

# Diagnostic plots (residuals)
results.plot_diagnostics(figsize=(12, 8))

# Forecast 30 steps ahead
forecast = results.get_forecast(steps=30)
mean = forecast.predicted_mean
ci = forecast.conf_int()  # 95% CI
Appendix

auto_arima Code

What This Does

Automatically searches for the best (p,d,q) by comparing AIC scores. No manual ACF/PACF reading needed.

from pmdarima import auto_arima

auto_model = auto_arima(
    df['remittances_usd'],  # same DataFrame as in the Decomposition Code
    start_p=0, max_p=5,
    start_q=0, max_q=5,
    d=None,        # auto-detect d
    seasonal=False,
    trace=True     # show AIC comparisons
)

print(auto_model.summary())
# Best model: ARIMA(2,1,1) AIC=478.3
Appendix

SARIMA Code

What This Does

Extends ARIMA with seasonal components. The seasonal_order=(P,D,Q,m) adds monthly patterns.

seasonal_order=(P, D, Q, m)

  • P,D,Q: seasonal AR, diff, MA
  • m=12: monthly cycle
from statsmodels.tsa.statespace.sarimax import SARIMAX

model = SARIMAX(
    df['remittances_usd'],
    order=(1, 1, 1),
    seasonal_order=(1, 1, 1, 12)
    # P=1, D=1, Q=1, period=12 months
)
results = model.fit(disp=False)

# Forecast next 12 months
forecast = results.forecast(steps=12)
print(forecast)
Appendix

Prophet Code

What This Does

Meta's Prophet handles trends, seasonality, and holidays automatically. Requires columns named 'ds' (date) and 'y' (value).

from prophet import Prophet

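# Prepare data (must have 'ds' and 'y')
df_p = (df['remittances_usd']
        .reset_index()
        .rename(columns={'date': 'ds', 'remittances_usd': 'y'}))

# Fit model (monthly data cannot support weekly or daily seasonality)
model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=False,
    daily_seasonality=False
)
model.fit(df_p)

# Forecast the next 12 months.
# freq='MS' (month start) assumes dates fall on the 1st; the default is daily.
future = model.make_future_dataframe(periods=12, freq='MS')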
forecast = model.predict(future)

# Visualize
model.plot(forecast)
model.plot_components(forecast)