Time Series Fundamentals
Department of Computer Science
University of the Philippines Cebu
Lecture 19: Fundamentals & Smoothing
OFW remittances follow the same seasonal pattern year after year.
The remittance data is a time series, and that repeating pattern is its seasonality. Today we learn to analyze it.
Decompose time series into trend, seasonality, and residuals.
Test and transform data for forecasting readiness using ADF and differencing.
Apply moving average and exponential smoothing methods to extract signal from noise.
Moving from "what happened" to "what will happen next."
Two sessions: Fundamentals (today) + Forecasting (next).
Unlike cross-sectional data, time series carries memory. Today depends on yesterday.
This section covers time series structure, decomposition, and resampling.
A time series is a sequence of data points indexed in time order; unlike independent samples, each observation typically depends on the ones before it.
Yt = Tt + St + Rt
Seasonal swing is constant (e.g., +$500M every Dec).
Yt = Tt × St × Rt
Seasonal swing grows with level (e.g., +15% every Dec).
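When the seasonal swing grows with the level, a log transform converts the multiplicative model into an additive one, since log(T × S × R) = log T + log S + log R. A minimal sketch with synthetic data (the 5% growth rate and +15% December factor are invented for illustration):

```python
import numpy as np

# Synthetic multiplicative series: level grows 5% per month,
# December gets a 15% seasonal boost
trend = 100 * 1.05 ** np.arange(48)           # 4 years, monthly
seasonal = np.tile([1.0] * 11 + [1.15], 4)    # +15% every Dec
y = trend * seasonal

# Raw December bump grows with the level ...
dec_bumps = y[11::12] - trend[11::12]
print(dec_bumps)     # increasing absolute swing

# ... but on the log scale the seasonal swing is constant
log_bumps = np.log(y[11::12]) - np.log(trend[11::12])
print(log_bumps)     # log(1.15) every year
```

This is why analysts often log-transform a series first and then proceed with the additive decomposition.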
Observed = Trend + Seasonal + Residual. Try changing the period and model to see the effect.
seasonal_decompose(df, model='additive', period=12)
Python's statsmodels handles decomposition automatically.
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
import matplotlib.pyplot as plt

# Load PH remittance data
df = pd.read_csv('bsp_remittances.csv',
                 parse_dates=['date'])
df.set_index('date', inplace=True)

# Decompose (monthly, annual cycle)
decomp = seasonal_decompose(
    df['remittances_usd'],
    model='additive',
    period=12
)

# Plot all four components
decomp.plot()
plt.tight_layout()
plt.show()
'W' = weekly · 'M' = month-end · 'Q' = quarter-end · 'Y' = year-end (pandas ≥ 2.2 prefers 'ME', 'QE', 'YE')
weekly = df['sales'].resample('W').sum()
monthly = df['sales'].resample('M').mean()
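The aggregation function matters: `.sum()` preserves totals, `.mean()` gives a per-period average. A self-contained sketch (the daily sales numbers are invented):

```python
import pandas as pd

# Two weeks of invented daily sales
daily = pd.Series(
    [120, 135, 90, 150, 160, 200, 210,
     125, 140, 95, 155, 165, 205, 215],
    index=pd.date_range('2024-01-01', periods=14, freq='D'),
)

weekly_total = daily.resample('W').sum()   # totals per week
weekly_avg = daily.resample('W').mean()    # average daily sales
print(weekly_total)
```

Summing and then summing again gives back the grand total, so no sales are lost in the downsampling.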
Most forecasting models assume the future looks statistically like the past.
If the mean or variance drifts over time, predictions break down.
ARIMA, exponential smoothing, and most forecasting models assume stationarity. Non-stationary data must be transformed first.
ADF null hypothesis (H0): the series has a unit root, i.e., is non-stationary.
p < 0.05 → reject H0, declare stationary.
p ≥ 0.05 → fail to reject H0: non-stationary, needs differencing.
from statsmodels.tsa.stattools import adfuller

result = adfuller(df['remittances_usd'])
print(f'ADF Statistic: {result[0]:.4f}')
print(f'p-value: {result[1]:.4f}')

# Interpret
if result[1] < 0.05:
    print("Stationary! Ready to model.")
else:
    print("Non-stationary. Apply diff.")
diff_1 = df['sales'].diff() # 1st
diff_2 = df['sales'].diff().diff() # 2nd
seasonal = df['sales'].diff(12) # seasonal
First differencing (d=1) fixes most trends. Second differencing (d=2) is rarely needed. Seasonal differencing handles calendar cycles.
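A quick numeric check of why first differencing works: it turns a linear trend into a constant-mean series (the slope of 2.5 and the noise level are arbitrary choices for the demo):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
t = np.arange(200)
trending = pd.Series(2.5 * t + rng.normal(0, 3, 200))  # slope 2.5

diffed = trending.diff().dropna()

# The mean of the differenced series is just the slope,
# and it no longer drifts: the first and second halves agree
print(diffed.mean())  # ~2.5
print(diffed.iloc[:100].mean(), diffed.iloc[100:].mean())
```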
A PSEi closing price series has a clear upward trend over 5 years. What should you do before applying ARIMA?
Click to reveal answer
B) Apply first differencing
An upward trend means the series is non-stationary. First differencing (d=1) removes the linear trend and makes the series suitable for ARIMA.
ACF and PACF plots are the fingerprint of any time series — they tell you which model to use.
Correlation between Yt and Yt-k at each lag k. Includes indirect effects through intermediate lags.
Direct correlation between Yt and Yt-k after removing effects of intervening lags.
| ACF Pattern | PACF Pattern | Model Suggested | Interpretation |
|---|---|---|---|
| Cuts off at lag q | Exponential decay | MA(q) | Past errors drive the series |
| Exponential decay | Cuts off at lag p | AR(p) | Past values drive the series |
| Exponential decay | Exponential decay | ARMA(p,q) | Both values and errors matter |
| Significant at lag s | Significant at lag s | Seasonal | Calendar-driven pattern |
TL;DR ACF → MA order (q). PACF → AR order (p).
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
plot_acf(df['sales'], lags=30)
plot_pacf(df['sales'], lags=30)
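The numbers behind those plots are just correlations between the series and k-shifted copies of itself, which pandas computes directly. A sketch on a simulated AR(1) process (the coefficient 0.8 and series length are invented for illustration):

```python
import numpy as np
import pandas as pd

# Simulate an AR(1) process: y_t = 0.8 * y_{t-1} + noise
rng = np.random.default_rng(0)
y = np.zeros(2000)
for t in range(1, 2000):
    y[t] = 0.8 * y[t - 1] + rng.normal()
s = pd.Series(y)

# ACF at lag k = corr(Y_t, Y_{t-k})
print(s.autocorr(lag=1))  # close to 0.8
print(s.autocorr(lag=2))  # close to 0.8**2 = 0.64
```

The lag-2 ACF is high only through the lag-1 channel (the indirect effect); the PACF removes that channel and would show a near-zero direct correlation at lag 2, which is exactly the "PACF cuts off at lag p" signature of an AR(1).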
Before forecasting, we need to separate signal from noise.
Smoothing techniques reveal underlying patterns by reducing random variation.
sma = df['sales'].rolling(7).mean()
Small window = responsive but noisy. Large window = smooth but laggy. Drag the sliders to see it.
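The tradeoff can be checked numerically: a larger window suppresses more variation around the signal, at the cost of lag. A sketch on synthetic data (the 30-day cycle, noise level, and window sizes of 3 and 21 are arbitrary choices):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
t = np.arange(365)
sales = pd.Series(
    100 + 10 * np.sin(2 * np.pi * t / 30) + rng.normal(0, 5, 365)
)

sma_small = sales.rolling(3).mean()    # responsive but noisy
sma_large = sales.rolling(21).mean()   # smooth but laggy

# Larger window -> much less variation left in the smoothed series
print(sma_small.std(), sma_large.std())
```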
from statsmodels.tsa.holtwinters import SimpleExpSmoothing, ExponentialSmoothing

# Simple exponential smoothing (level only)
ses = SimpleExpSmoothing(y).fit(smoothing_level=0.2)

# Holt's method (level + trend)
holt = ExponentialSmoothing(y, trend='add').fit()

# Holt-Winters (level + trend + seasonality)
hw = ExponentialSmoothing(y,
                          trend='add',
                          seasonal='add',
                          seasonal_periods=12).fit()
Next: Session 2 — Forecasting Methods (ARIMA, Prophet, Evaluation)
ARIMA, Prophet & Evaluation
Department of Computer Science
University of the Philippines Cebu
Lecture 20: Forecasting & Evaluation
Every new location needs a multi-year sales forecast before opening day.
The tools they need? ARIMA and Prophet.
Build ARIMA/SARIMA models and choose p, d, q parameters systematically.
Use Meta Prophet for business forecasting with holidays and changepoints.
Measure forecast accuracy with MAE, RMSE, MAPE and proper temporal splits.
From understanding patterns to predicting outcomes.
ARIMA (classical) vs Prophet (modern) — we learn both.
Three ideas from Session 1 — autoregression, differencing, and moving average — combined into one powerful model.
Past values predict future. PACF tells you p.
Differencing order for stationarity. ADF test tells you d.
Past errors correct future. ACF tells you q.
In words: "Today's value = constant + weighted past values + weighted past errors + new shock."
AR coefficients — how much past values influence the present.
MA coefficients — how much past errors correct the present.
White noise — the unpredictable random shock at time t.
Not in the equation directly — it's the number of times you differenced before fitting.
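Written out, the ARIMA(p, d, q) model the bullets above describe is (with Y′ denoting the series after d rounds of differencing):

```latex
Y'_t = c + \phi_1 Y'_{t-1} + \cdots + \phi_p Y'_{t-p}
         + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}
         + \varepsilon_t
```

This matches the plain-English reading exactly: constant (c) + weighted past values (φ terms) + weighted past errors (θ terms) + new shock (ε_t).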
The statsmodels ARIMA class handles fitting, diagnostics, and forecasting.
from statsmodels.tsa.arima.model import ARIMA

# Fit ARIMA(2,1,1)
model = ARIMA(df['sales'], order=(2, 1, 1))
results = model.fit()

# Summary table
print(results.summary())

# Diagnostic plots (residuals)
results.plot_diagnostics(figsize=(12, 8))

# Forecast 30 steps ahead
forecast = results.get_forecast(steps=30)
mean = forecast.predicted_mean
ci = forecast.conf_int()  # 95% CI
Searches over all combinations and picks the best by AIC.
from pmdarima import auto_arima

auto_model = auto_arima(
    df['sales'],
    start_p=0, max_p=5,
    start_q=0, max_q=5,
    d=None,           # auto-detect
    seasonal=False,
    trace=True
)
print(auto_model.summary())
Uncertainty compounds over time. Drag the slider to see how confidence bands grow with longer forecasts.
fc = results.get_forecast(30)
ci = fc.conf_int()
Example: SARIMA(1,1,1)(1,1,1)[12]
from statsmodels.tsa.statespace.sarimax import SARIMAX

# SARIMA with monthly seasonality
model = SARIMAX(
    df['remittances_usd'],
    order=(1, 1, 1),
    seasonal_order=(1, 1, 1, 12)
)
results = model.fit(disp=False)

# Forecast next 12 months
forecast = results.forecast(steps=12)

# Or use auto_arima with seasonal terms
from pmdarima import auto_arima
auto = auto_arima(df['remittances_usd'],
                  seasonal=True, m=12,
                  trace=True)
Your ADF test gives p=0.03 after first differencing. PACF cuts off at lag 2 and ACF decays exponentially. What ARIMA order should you try?
Click to reveal answer
B) ARIMA(2, 1, 0)
PACF cutoff at lag 2 → p = 2. The series became stationary after one round of differencing (ADF p-value = 0.03) → d = 1. ACF decays rather than cutting off → q = 0. This is a pure AR(2) model on the differenced data.
Meta's open-source tool handles missing data, holidays, and changepoints automatically.
Designed for analysts who need good forecasts fast, not ARIMA experts.
Handles gaps automatically — no imputation needed.
COVID-era spikes won't break your forecast.
Detects trend shifts automatically (e.g., policy changes).
Add Christmas, Undas, or any custom event.
Requirement: Prophet needs only two columns: ds (date) and y (value).
Install: pip install prophet
from prophet import Prophet

# Prepare data (must be ds + y)
df_p = df.reset_index()
df_p.columns = ['ds', 'y']

# Create and fit model
model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False
)
model.fit(df_p)

# Create future dates
future = model.make_future_dataframe(
    periods=30
)

# Predict
forecast = model.predict(future)

# Visualize
model.plot(forecast)
model.plot_components(forecast)
ph_holidays = pd.DataFrame({
    'holiday': 'ph_holiday',
    'ds': pd.to_datetime(['2024-12-25', '2024-11-01',
                          '2024-06-12', '2024-04-09']),
    'lower_window': 0,
    'upper_window': 1
})
model = Prophet(holidays=ph_holidays)
lower_window: days before the holiday affected. upper_window: days after. E.g., Christmas shopping starts early: lower_window=-7.
A forecast without an error estimate is just a guess.
This section covers metrics, temporal splits, and model comparison.
| Metric | Formula | Interpretation | When to Use |
|---|---|---|---|
| MAE | mean(|y − ŷ|) | Average absolute error in original units | General purpose |
| RMSE | √mean((y − ŷ)²) | Penalizes large errors more | When big misses are costly |
| MAPE | mean(|y − ŷ|/y) × 100 | Percentage error — scale-free | Stakeholder reports |
| MASE | MAE / naive_MAE | <1 means better than naive forecast | Comparing across datasets |
TL;DR Use MAPE for business stakeholders. Use RMSE when large errors are costly.
The baseline: tomorrow = today. Any good model should beat this. MASE < 1 means your model adds value.
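All four metrics reduce to a few lines of NumPy. A sketch against the naive "tomorrow = today" baseline (the actuals and forecasts below are invented toy numbers):

```python
import numpy as np

y_true = np.array([100, 110, 105, 120, 130])
y_pred = np.array([102, 108, 109, 118, 127])  # some model's forecast

mae = np.mean(np.abs(y_true - y_pred))
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
mape = np.mean(np.abs(y_true - y_pred) / y_true) * 100

# Naive baseline: predict each value with the previous actual
naive_pred = y_true[:-1]
naive_mae = np.mean(np.abs(y_true[1:] - naive_pred))
mase = mae / naive_mae  # < 1 means the model beats naive

print(mae, rmse, mape, mase)
```

Note that RMSE ≥ MAE always holds, with the gap growing when a few errors dominate; that is the "penalizes large errors" property in the table.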
# Preserve time order!
train = df['sales'][:-30]
test = df['sales'][-30:]
# Never use: train_test_split(shuffle=True)
Random shuffling lets future data leak into training. Your model "sees the future" and metrics look artificially good.
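Beyond a single holdout, time series cross-validation uses expanding windows: each fold trains on everything up to a cutoff and tests on the next block, so the future never leaks backward. A minimal sketch of the idea (the fold count and test size are arbitrary; sklearn's TimeSeriesSplit implements the same pattern):

```python
import numpy as np

def expanding_window_splits(n, n_folds=3, test_size=30):
    """Yield (train_idx, test_idx) pairs that never leak the future."""
    for fold in range(n_folds):
        test_end = n - (n_folds - 1 - fold) * test_size
        test_start = test_end - test_size
        yield np.arange(test_start), np.arange(test_start, test_end)

# Every training index precedes every test index in each fold
for train_idx, test_idx in expanding_window_splits(365):
    print(len(train_idx), test_idx[0], test_idx[-1])
```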
Neither model is universally better. ARIMA excels on clean, stationary data. Prophet handles messy, real-world data with holidays and gaps.
Lab 10: Time Series Forecasting Project
Forecast a Philippine economic indicator. Compare ARIMA vs Prophet. Present results to "management."