
Solution Manual — Fama-French Tutorial Series

Complete Worked Solutions to All Exercises


Where This Fits

This is the companion notebook to the series. The recommended reading order is:

  1. Statistical Foundations — CLT, hypothesis testing, OLS, sandwich estimators, Newey-West derivation

  2. Time Series Foundations — Stationarity, autocorrelation, volatility clustering, ergodicity

  3. Fama-French 3-Factor Model — Data, regression mechanics, diagnostics, robust SEs, interpretation

  4. Advanced Factor Models — FF5, momentum, profitability, beta anomaly, rolling windows

  5. Practical Factor Investing — Real ETF analysis, portfolio construction, constrained optimisation

  6. Solution Manual — this notebook

This notebook provides detailed solutions to every exercise from the tutorial notebooks. Try each exercise yourself first before looking at the solution.

What’s Included

| Source Notebook | Exercises |
| --- | --- |
| Statistical Foundations | 5 exercises |
| Time Series Foundations | 4 exercises |
| Fama-French 3-Factor Model | 5 exercises |
| Advanced Factor Models | 4 exercises |

Each solution includes the code, the output interpretation, and an explanation of why the answer is what it is. Discussion-type exercises include model answers that you can compare against your own reasoning.

Prerequisites

  • Notebooks 01–04 — Work through the exercises yourself before consulting the solutions

  • The data downloads and library imports below mirror those tutorials

# ============================================================================
# Setup: Import all libraries needed across all solutions
# ============================================================================

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import statsmodels.api as sm
from statsmodels.tsa.stattools import acf, pacf, adfuller
from statsmodels.stats.diagnostic import acorr_ljungbox, het_arch
import yfinance as yf
import urllib.request
import zipfile
import tempfile
import os
import re
import warnings
warnings.filterwarnings('ignore', category=FutureWarning)

try:
    plt.style.use('seaborn-v0_8-darkgrid')
except Exception:
    plt.style.use('default')
sns.set_palette("husl")
np.random.seed(42)


def get_close(raw, ticker=None):
    """Extract Close price from yf.download result, handling MultiIndex columns."""
    if isinstance(raw.columns, pd.MultiIndex):
        try:
            return raw[('Close', ticker)]
        except KeyError:
            return raw.xs('Close', axis=1, level=0).iloc[:, 0]
    elif 'Close' in raw.columns:
        return raw['Close']
    else:
        return raw.iloc[:, 0]


print("All libraries imported successfully!")
# ============================================================================
# Helper: Download and parse a CSV from Kenneth French's data library
# (reused from Notebook 04)
# ============================================================================

def download_ff_csv(zip_filename):
    """Download and parse a CSV from Kenneth French's data library."""
    url = f"https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/{zip_filename}"
    with tempfile.TemporaryDirectory() as tmpdir:
        zip_path = f"{tmpdir}/data.zip"
        urllib.request.urlretrieve(url, zip_path)
        with zipfile.ZipFile(zip_path, 'r') as zf:
            zf.extractall(tmpdir)
        files = [f for f in os.listdir(tmpdir) if f.lower().endswith('.csv')]
        if not files:
            raise ValueError(f"No CSV found in {zip_filename}")
        csv_path = f"{tmpdir}/{files[0]}"
        
        with open(csv_path, 'r') as f:
            lines = f.readlines()
        
        skip = 0
        for i, line in enumerate(lines):
            stripped = line.strip()
            if re.match(r'^\d{6}', stripped):
                skip = i
                break
        
        header_line = skip - 1 if skip > 0 else 0
        df = pd.read_csv(csv_path, skiprows=header_line, index_col=0)
    
    df.index = pd.Index(df.index.astype(str).str.strip())
    df = df[df.index.str.match(r'^\d{6}$')]
    df.index = pd.to_datetime(df.index, format='%Y%m')
    df = df.apply(pd.to_numeric, errors='coerce').dropna()
    df = df / 100  # Convert percentages to decimals
    return df


def run_ff_regression(y, factor_names, data, name="Asset"):
    """Run a Fama-French regression with Newey-West SEs and print results."""
    X = sm.add_constant(data[factor_names].values)
    model = sm.OLS(y if y is not None else data.iloc[:, 0].values, X).fit()
    n = len(data)
    lag = int(np.floor(4 * (n / 100) ** (2/9)))
    robust = model.get_robustcov_results(cov_type='HAC', maxlags=lag)
    print(f"\n{'='*60}")
    print(f"  {name}  (Newey-West, lag={lag})")
    print(f"{'='*60}")
    print(robust.summary().tables[1])
    print(f"  R² = {robust.rsquared:.4f}   Adj-R² = {robust.rsquared_adj:.4f}")
    return robust


print("Helper functions defined.")
# ============================================================================
# Download all data needed for solutions
# ============================================================================

# --- Fama-French 3-Factor (2000-2023, for NB02 exercises) ---
print("Downloading Fama-French 3-Factor data...")
ff3 = download_ff_csv("F-F_Research_Data_Factors_CSV.zip")
ff3.columns = ['Mkt_RF', 'SMB', 'HML', 'RF']
ff3_full = ff3.loc['2000-01-01':'2023-12-31'].copy()
# Also keep a 2010-2023 slice for NB03 exercises
ff3_short = ff3.loc['2010-01-01':'2023-12-31'].copy()

# --- Fama-French 5-Factor ---
print("Downloading Fama-French 5-Factor data...")
ff5 = download_ff_csv("F-F_Research_Data_5_Factors_2x3_CSV.zip")
ff5.columns = ['Mkt_RF', 'SMB', 'HML', 'RMW', 'CMA', 'RF']
ff5 = ff5.loc['2000-01-01':'2023-12-31']

# --- Momentum ---
print("Downloading Momentum factor...")
mom = download_ff_csv("F-F_Momentum_Factor_CSV.zip")
mom.columns = ['UMD']
mom = mom.loc['2000-01-01':'2023-12-31']

# --- Combine all factors (keep RF for excess-return calculations) ---
factors = ff5.join(mom, how='inner')

print(f"\nAll data loaded.")
print(f"  3-Factor (full):  {len(ff3_full)} months  ({ff3_full.index[0].date()} – {ff3_full.index[-1].date()})")
print(f"  3-Factor (short): {len(ff3_short)} months ({ff3_short.index[0].date()} – {ff3_short.index[-1].date()})")
print(f"  5-Factor + Mom:   {len(factors)} months")
print(f"  Factor columns:   {list(factors.columns)}")

Part 1: Statistical Foundations (Notebook 01) — Solutions


Exercise 1: Hand Computation

Problem: Given $x = [2, 5, 3, 8, 7]$ and $y = [4, 7, 5, 10, 9]$:

  1. Compute mean, variance, and std dev of $x$ and $y$

  2. Compute $\text{Cov}(x, y)$ and $\text{Corr}(x, y)$

  3. Set up the design matrix and compute $\hat\beta$ via the normal equations

# ============================================================================
# NB01 Exercise 1 — Hand Computation (verified with NumPy)
# ============================================================================

x = np.array([2, 5, 3, 8, 7], dtype=float)
y = np.array([4, 7, 5, 10, 9], dtype=float)
n = len(x)

# ── Part 1: Mean, Variance, Standard Deviation ──
mean_x = x.sum() / n
mean_y = y.sum() / n

var_x = ((x - mean_x)**2).sum() / (n - 1)   # sample variance (Bessel's correction)
var_y = ((y - mean_y)**2).sum() / (n - 1)

std_x = np.sqrt(var_x)
std_y = np.sqrt(var_y)

print("PART 1: Descriptive Statistics")
print("=" * 50)
print(f"  x = {x}")
print(f"  y = {y}")
print(f"\n  mean(x) = {x.sum()}/{n} = {mean_x:.2f}")
print(f"  mean(y) = {y.sum()}/{n} = {mean_y:.2f}")
print(f"\n  var(x)  = Σ(xᵢ − x̄)² / (n−1) = {((x - mean_x)**2).sum():.2f} / {n-1} = {var_x:.4f}")
print(f"  var(y)  = {((y - mean_y)**2).sum():.2f} / {n-1} = {var_y:.4f}")
print(f"\n  std(x)  = √{var_x:.4f} = {std_x:.4f}")
print(f"  std(y)  = √{var_y:.4f} = {std_y:.4f}")

# Verify
print(f"\n  Verify with NumPy:  std(x, ddof=1) = {x.std(ddof=1):.4f}  ✓")

# ── Part 2: Covariance and Correlation ──
cov_xy = ((x - mean_x) * (y - mean_y)).sum() / (n - 1)
corr_xy = cov_xy / (std_x * std_y)

print(f"\nPART 2: Covariance and Correlation")
print("=" * 50)
print(f"  Cov(x,y) = Σ(xᵢ-x̄)(yᵢ-ȳ)/(n-1) = {((x-mean_x)*(y-mean_y)).sum():.2f}/{n-1} = {cov_xy:.4f}")
print(f"  Corr(x,y) = Cov(x,y)/(s_x · s_y) = {cov_xy:.4f}/({std_x:.4f}·{std_y:.4f}) = {corr_xy:.4f}")
print(f"\n  Verify: np.corrcoef = {np.corrcoef(x, y)[0, 1]:.4f}  ✓")

# ── Part 3: Normal Equations ──
X = np.column_stack([np.ones(n), x])   # design matrix with intercept
XtX     = X.T @ X
XtX_inv = np.linalg.inv(XtX)
Xty     = X.T @ y
beta_hat = XtX_inv @ Xty

print(f"\nPART 3: OLS via Normal Equations")
print("=" * 50)
print(f"  Design matrix X (n×2):\n{X}")
print(f"\n  X'X =\n{XtX}")
print(f"\n  (X'X)⁻¹ =\n{XtX_inv}")
print(f"\n  X'y = {Xty}")
print(f"\n  β̂ = (X'X)⁻¹ X'y = [{beta_hat[0]:.6f}, {beta_hat[1]:.6f}]")
print(f"\n  Interpretation:  ŷ = {beta_hat[0]:.4f} + {beta_hat[1]:.4f} · x")
print(f"    → intercept ≈ {beta_hat[0]:.2f}, slope ≈ {beta_hat[1]:.2f}")

# Quick sanity check: slope should be close to Cov(x,y)/Var(x)
print(f"\n  Sanity: Cov(x,y)/Var(x) = {cov_xy/var_x:.6f}  (should match slope)  ✓")

Exercise 2: CLT in Action — Beta(0.5, 0.5) Distribution

Problem: Repeat the CLT experiment with $\text{Beta}(0.5, 0.5)$ — a bimodal distribution. Show that sample means become normal as $n$ grows.

# ============================================================================
# NB01 Exercise 2 — CLT with Beta(0.5, 0.5)
# ============================================================================

np.random.seed(42)
n_sims = 5000
a, b = 0.5, 0.5
mu_beta  = a / (a + b)                          # = 0.5
var_beta = (a * b) / ((a + b)**2 * (a + b + 1)) # = 0.125

sample_sizes = [5, 30, 100]

fig, axes = plt.subplots(1, len(sample_sizes) + 1, figsize=(18, 4))

# Panel 0: Show the raw Beta(0.5, 0.5) density (bimodal)
x_grid = np.linspace(0.001, 0.999, 500)
axes[0].plot(x_grid, stats.beta.pdf(x_grid, a, b), 'k-', linewidth=2)
axes[0].set_title('Beta(0.5, 0.5) density\n(bimodal!)', fontweight='bold')
axes[0].set_xlabel('x')
axes[0].set_ylabel('Density')
axes[0].grid(True, alpha=0.3)

# Panels 1–3: Histograms of standardised sample means
for i, n in enumerate(sample_sizes):
    means = np.array([np.random.beta(a, b, size=n).mean() for _ in range(n_sims)])
    z = (means - mu_beta) / np.sqrt(var_beta / n)   # standardise to N(0,1) if CLT holds
    
    ax = axes[i + 1]
    ax.hist(z, bins=50, density=True, alpha=0.6, color='steelblue', edgecolor='white')
    
    # Overlay N(0,1) density
    t_grid = np.linspace(-4, 4, 200)
    ax.plot(t_grid, stats.norm.pdf(t_grid), 'r-', linewidth=2, label='N(0,1)')
    
    ax.set_title(f'n = {n}', fontweight='bold', fontsize=13)
    ax.set_xlabel('Standardised z')
    ax.legend(fontsize=9)
    ax.set_xlim(-4, 4)
    ax.grid(True, alpha=0.3)

fig.suptitle('CLT Demonstration: Beta(0.5, 0.5) — Sample Means Become Normal',
             fontweight='bold', fontsize=13, y=1.03)
plt.tight_layout()
plt.show()

# ── Discussion answers ──
print("ANSWERS")
print("=" * 60)
print("""
Q3: At what n does the distribution look approximately normal?
   Even at n = 5 the histogram is already quite close to the
   N(0,1) bell curve. By n = 30 it is virtually indistinguishable.
   This is FASTER than for Exponential(1).

Q4: Why does a bimodal distribution converge faster?
   The Beta(0.5, 0.5) distribution is SYMMETRIC around its mean
   (μ = 0.5). Its third central moment (skewness) is exactly zero.
   The CLT's convergence rate depends heavily on skewness: the
   Berry-Esseen bound is proportional to E[|X − μ|³] / σ³.
   Because the Beta(0.5,0.5) has zero skewness, the leading error
   term in the normal approximation vanishes, giving much faster
   convergence than the heavily right-skewed Exponential(1).
""")

Exercise 3: Sandwich Estimator Comparison

Problem: Take the 5-observation dataset from NB01 Section 5, inflate $y$ for the last two observations by a factor of 3 to create heteroscedasticity, and compare Classic, HC0, and Newey-West standard errors.

# ============================================================================
# NB01 Exercise 3 — Sandwich Estimator with Artificial Heteroscedasticity
# ============================================================================

# Original 5-observation dataset from NB01 Section 5
x_orig = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_orig = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Step 1: Inflate last two y-values by factor 3
multiplier = np.array([1, 1, 1, 3, 3], dtype=float)
y_het = y_orig * multiplier

print("STEP 1: Create heteroscedastic data")
print("=" * 60)
print(f"  y_original  = {y_orig}")
print(f"  multiplier  = {multiplier}")
print(f"  y_inflated  = {y_het}")

# Step 2: Fit OLS on the inflated data
X_ex3 = sm.add_constant(x_orig)
model_ex3 = sm.OLS(y_het, X_ex3).fit()

# Classic SE
se_classic = model_ex3.bse

# HC0
model_hc0 = model_ex3.get_robustcov_results(cov_type='HC0')
se_hc0 = model_hc0.bse

# Newey-West (L=1 for n=5)
model_nw = model_ex3.get_robustcov_results(cov_type='HAC', maxlags=1)
se_nw = model_nw.bse

print(f"\nSTEP 2: OLS on inflated data")
print(f"  β̂₀ (intercept) = {model_ex3.params[0]:.4f}")
print(f"  β̂₁ (slope)     = {model_ex3.params[1]:.4f}")

print(f"\nSTEP 3: Compare Standard Errors for β̂₁ (slope)")
print("=" * 60)
print(f"  {'Method':<20s}  {'SE(β̂₁)':>10s}  {'% vs Classic':>14s}")
print(f"  {'-'*48}")
print(f"  {'Classic OLS':<20s}  {se_classic[1]:10.4f}  {'(baseline)':>14s}")
print(f"  {'HC0':<20s}  {se_hc0[1]:10.4f}  {(se_hc0[1]/se_classic[1]-1)*100:+13.1f}%")
print(f"  {'Newey-West (L=1)':<20s}  {se_nw[1]:10.4f}  {(se_nw[1]/se_classic[1]-1)*100:+13.1f}%")

print(f"""
INTERPRETATION
{'='*60}
The HC0 standard error changes the most relative to classic OLS.

Why? Classic OLS assumes Var(εᵢ) = σ² for all i — a single 
variance for every observation. But by inflating the last two 
y-values, we made the residuals for observations 4 and 5 much 
larger, violating homoscedasticity.

HC0 replaces the single σ² with observation-specific ε̂ᵢ², so 
it correctly accounts for the larger residuals at the end. The 
Newey-West estimator also picks this up (it nests HC0 at lag 0) 
but with only 5 observations the additional autocorrelation 
correction is negligible.

With n = 5, ALL estimators are unreliable — this is just to 
show the mechanics. With realistic sample sizes (n ≥ 100), the 
differences become both larger and more trustworthy.
""")

Exercise 4: Monte Carlo — Varying the Autocorrelation (Challenge)

Problem: Run Monte Carlo simulations with $\rho \in \{0, 0.3, 0.6, 0.9\}$ and plot rejection rates for Classic OLS vs Newey-West.

# ============================================================================
# NB01 Exercise 4 — Monte Carlo: Rejection Rate vs Autocorrelation
# ============================================================================

np.random.seed(0)
num_mc = 2000
n_mc = 150
rho_values = [0.0, 0.3, 0.6, 0.9]

results_mc = []  # list of (rho, classic_rate, nw_rate)

for rho in rho_values:
    reject_classic = 0
    reject_nw = 0
    
    for sim in range(num_mc):
        x_mc = np.random.randn(n_mc) * 0.04
        
        # AR(1) errors with heteroscedasticity and fat tails
        innov = stats.t.rvs(df=5, size=n_mc)
        eps = np.zeros(n_mc)
        for t in range(n_mc):
            eps[t] = rho * eps[t-1] + innov[t] if t > 0 else innov[t]
        eps = eps * (0.01 + 0.4 * np.abs(x_mc))
        
        y_mc = 0.0 + 0.0 * x_mc + eps   # true β = 0
        
        X_mc = sm.add_constant(x_mc)
        res_mc = sm.OLS(y_mc, X_mc).fit()
        lag_mc = int(np.floor(4 * (n_mc / 100) ** (2/9)))
        res_nw_mc = res_mc.get_robustcov_results(cov_type='HAC', maxlags=lag_mc)
        
        if res_mc.pvalues[1] < 0.05:
            reject_classic += 1
        if res_nw_mc.pvalues[1] < 0.05:
            reject_nw += 1
    
    rate_c = reject_classic / num_mc * 100
    rate_nw = reject_nw / num_mc * 100
    results_mc.append((rho, rate_c, rate_nw))
    print(f"  ρ = {rho:.1f}:  Classic = {rate_c:.1f}%,  Newey-West = {rate_nw:.1f}%")

# ── Plot ──
rhos    = [r[0] for r in results_mc]
classic = [r[1] for r in results_mc]
nw      = [r[2] for r in results_mc]

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(rhos, classic, 'o-', color='#d9534f', linewidth=2, markersize=8, label='Classic OLS')
ax.plot(rhos, nw,      's-', color='#5cb85c', linewidth=2, markersize=8, label='Newey-West HAC')
ax.axhline(5, color='black', linestyle='--', linewidth=1.5, label='Nominal 5% level')
ax.set_xlabel('Autocorrelation coefficient ρ', fontsize=12)
ax.set_ylabel('Rejection rate (%)', fontsize=12)
ax.set_title('False Rejection Rate Under H₀ vs Autocorrelation Strength\n'
             f'({num_mc} simulations, n={n_mc}, heteroscedastic t(5) errors)',
             fontweight='bold', fontsize=12)
ax.set_xticks(rho_values)
ax.set_ylim(0, max(classic) * 1.25)
ax.legend(fontsize=10)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print(f"""
INTERPRETATION
{'='*60}
• At ρ = 0 (no autocorrelation), classic OLS is already slightly
  above 5% because of heteroscedasticity alone.

• By ρ = 0.3, classic OLS rejects roughly {classic[1]:.0f}% of the time
  — already a meaningful over-rejection.

• At ρ = 0.6, classic OLS rejects at ~{classic[2]:.0f}% — you'd be "finding"
  significant results roughly {classic[2]/5:.0f}× more often than you should.

• At ρ = 0.9, classic OLS is wildly unreliable ({classic[3]:.0f}% rejection).

• Newey-West stays close to 5% across all ρ values, confirming
  that the HAC correction works as advertised.

CONCLUSION: Even moderate autocorrelation (ρ ≈ 0.3) is enough
to make classic OLS seriously misleading. Always use Newey-West
for time-series regressions.
""")

Exercise 5: Why Does the Mixing CLT Matter? (Discussion)

Model Answer

Q1. Why can’t the standard CLT (Lindeberg-Lévy) be applied directly to time-series data?

The Lindeberg-Lévy CLT requires the observations $X_1, X_2, \ldots, X_n$ to be independent and identically distributed (i.i.d.). In time-series data, consecutive observations are typically correlated — today’s stock return is not independent of yesterday’s. This dependence violates the independence assumption. The standard CLT doesn’t apply because its proof relies on independence to decompose the variance of the sum into a sum of variances: $\text{Var}(\sum X_i) = \sum \text{Var}(X_i)$. With dependence, cross-covariance terms $\text{Cov}(X_i, X_j)$ appear and the decomposition breaks.
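
A small numerical illustration of that broken decomposition (not part of the original notebook; the AR(1) parameters are arbitrary): for a stationary AR(1) with φ = 0.8, the variance of the sum is far larger than the sum of the variances once the cross-covariance terms are included.

# ============================================================================
# Optional illustration: Var(sum X_i) != sum Var(X_i) under dependence
# Stationary AR(1), phi = 0.8, unit innovation variance (assumed parameters)
# ============================================================================
phi, n_obs = 0.8, 50
gamma0 = 1.0 / (1.0 - phi**2)                              # Var(X_t) for this AR(1)
lags = np.abs(np.arange(n_obs)[:, None] - np.arange(n_obs)[None, :])
var_sum = (gamma0 * phi**lags).sum()                       # sum over all Cov(X_i, X_j)

print(f"  sum of Var(X_i) (i.i.d. decomposition):  {n_obs * gamma0:8.1f}")
print(f"  Var(sum X_i) with cross-covariances:     {var_sum:8.1f}")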

Q2. What additional condition does the Mixing CLT require, and why is it satisfied for financial returns?

The Mixing CLT requires that the dependence between observations decays sufficiently fast as they get further apart in time. Formally, the $\alpha$-mixing coefficients (which measure the maximum dependence between the “past” and the “future” separated by a gap of $k$ periods) must converge to zero at a fast enough polynomial rate as $k \to \infty$.

This is satisfied for financial returns because:

  • Return autocorrelations are typically very small (close to zero at lag 1, essentially zero at lag 2+), as shown empirically in Notebook 02 Section 3.

  • Volatility clustering creates dependence in squared returns, but the autocorrelation of squared returns also decays exponentially (as demonstrated with GARCH models).

  • The mixing condition requires only that dependence decays — it doesn’t require zero dependence. Financial returns easily clear this bar.

Q3. If the mixing condition were NOT satisfied (e.g., a random walk), what would go wrong with Newey-West?

A random walk $P_t = P_{t-1} + \epsilon_t$ is non-stationary: its variance grows linearly with $t$, so $\text{Var}(P_t) = t \sigma^2_\epsilon$. The mixing condition fails because the process never “forgets” past shocks — $P_t$ depends on all past innovations equally.

Consequences for Newey-West:

  • The long-run variance $\sigma^2_{LR} = \gamma_0 + 2 \sum_{k=1}^{\infty} \gamma_k$ diverges (does not converge to a finite limit), because the cumulative autocovariance grows without bound.

  • The Newey-West estimator tries to estimate a (presumed finite) long-run variance by summing Bartlett-weighted sample autocovariances up to lag $L$. But as $L$ grows with $n$, the partial sums keep growing rather than stabilising.

  • The resulting “standard errors” are meaningless — they don’t converge to anything useful.

  • Takeaway: Newey-West requires stationarity (or at least a finite long-run variance). This is why we always test for stationarity (ADF test) before running regressions, and why we use returns (stationary) rather than prices (non-stationary) in financial econometrics. The sketch below illustrates the divergence numerically.
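
A minimal simulation of this failure mode (not part of the original exercise; the series length, seed, and lag choices are arbitrary): the Bartlett-weighted long-run variance estimate settles down for a stationary series but keeps growing with the truncation lag for a random walk.

# ============================================================================
# Optional sketch: Bartlett/Newey-West long-run variance estimate
# stabilises for white noise but diverges for a random walk
# ============================================================================
def bartlett_lrv(series, max_lag):
    """Bartlett-weighted long-run variance estimate up to lag max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = len(x)
    lrv = np.sum(x * x) / n                                  # gamma_0
    for k in range(1, max_lag + 1):
        gamma_k = np.sum(x[k:] * x[:-k]) / n                 # sample autocovariance
        lrv += 2 * (1 - k / (max_lag + 1)) * gamma_k         # Bartlett weight
    return lrv

rng = np.random.default_rng(1)
shocks = rng.standard_normal(2000)
white_noise = shocks                  # stationary, mixing holds
random_walk = np.cumsum(shocks)       # unit root, mixing fails

print(f"  {'max_lag':>8s}  {'white noise':>12s}  {'random walk':>12s}")
for L in [5, 20, 80, 320]:
    print(f"  {L:8d}  {bartlett_lrv(white_noise, L):12.2f}  {bartlett_lrv(random_walk, L):12.2f}")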


Part 2: Time-Series Foundations (Notebook 02) — Solutions


Exercise 1: Stationarity on a New Series (AAPL)

Problem: Download daily AAPL prices, compute log returns, run ADF tests on prices and returns, and plot rolling moments.

# ============================================================================
# NB02 Exercise 1 — Stationarity: AAPL Prices vs Log Returns
# ============================================================================

# Download AAPL data
raw_aapl = yf.download('AAPL', start='2015-01-01', end='2023-12-31', progress=False)
if isinstance(raw_aapl.columns, pd.MultiIndex):
    prices = raw_aapl[('Close', 'AAPL')]
else:
    prices = raw_aapl['Close']
prices = prices.dropna()

log_returns = np.log(prices / prices.shift(1)).dropna()

# ── Part 1: ADF tests ──
print("ADF TEST RESULTS")
print("=" * 60)

adf_price = adfuller(prices.values, maxlag=20, autolag='AIC')
print(f"\nAAPL PRICES:")
print(f"  ADF statistic:  {adf_price[0]:.4f}")
print(f"  p-value:        {adf_price[1]:.4f}")
print(f"  Conclusion:     {'STATIONARY' if adf_price[1] < 0.05 else 'NON-STATIONARY'}")

adf_ret = adfuller(log_returns.values, maxlag=20, autolag='AIC')
print(f"\nAAPL LOG RETURNS:")
print(f"  ADF statistic:  {adf_ret[0]:.4f}")
print(f"  p-value:        {adf_ret[1]:.6f}")
print(f"  Conclusion:     {'STATIONARY' if adf_ret[1] < 0.05 else 'NON-STATIONARY'}")

print(f"\nDo results match Fama-French findings?")
print(f"  YES — prices are non-stationary (unit root), returns are stationary.")
print(f"  This is universal for financial assets, not specific to any one stock.")

# ── Part 2: Rolling 60-day mean and std dev ──
fig, axes = plt.subplots(2, 2, figsize=(15, 8))

axes[0, 0].plot(prices.index, prices, color='darkblue', linewidth=0.8)
axes[0, 0].set_title('AAPL Daily Close Price', fontweight='bold')
axes[0, 0].set_ylabel('Price ($)')
axes[0, 0].grid(True, alpha=0.3)

axes[0, 1].plot(log_returns.index, log_returns * 100, color='steelblue', linewidth=0.5, alpha=0.7)
axes[0, 1].set_title('AAPL Daily Log Returns', fontweight='bold')
axes[0, 1].set_ylabel('Return (%)')
axes[0, 1].grid(True, alpha=0.3)

rolling_mean = log_returns.rolling(60).mean() * 100
axes[1, 0].plot(rolling_mean.index, rolling_mean, color='darkred', linewidth=1)
axes[1, 0].axhline(log_returns.mean() * 100, color='black', linestyle='--', alpha=0.5,
                     label=f'Full-sample mean = {log_returns.mean()*100:.3f}%')
axes[1, 0].set_title('Rolling 60-Day Mean of Log Returns', fontweight='bold')
axes[1, 0].set_ylabel('Mean return (%)')
axes[1, 0].legend(fontsize=9)
axes[1, 0].grid(True, alpha=0.3)

rolling_std = log_returns.rolling(60).std() * 100
axes[1, 1].plot(rolling_std.index, rolling_std, color='darkorange', linewidth=1)
axes[1, 1].axhline(log_returns.std() * 100, color='black', linestyle='--', alpha=0.5,
                     label=f'Full-sample std = {log_returns.std()*100:.3f}%')
axes[1, 1].set_title('Rolling 60-Day Std Dev (Volatility)', fontweight='bold')
axes[1, 1].set_ylabel('Std dev (%)')
axes[1, 1].legend(fontsize=9)
axes[1, 1].grid(True, alpha=0.3)

plt.suptitle('AAPL: Stationarity Analysis', fontweight='bold', fontsize=14, y=1.01)
plt.tight_layout()
plt.show()

print("""
INTERPRETATION
  The rolling mean fluctuates around the full-sample mean and doesn't
  show a systematic trend — consistent with stationarity in the mean.
  
  The rolling std dev varies substantially (volatility clustering), but
  this is TIME-VARYING VOLATILITY, not non-stationarity in the traditional
  sense. The series is still covariance-stationary to a reasonable
  approximation, as confirmed by the ADF test.
""")

Exercise 2: Interpreting ACF Plots (Conceptual)

Problem: For (A) white noise, (B) AR(1) with φ = 0.8, and (C) financial returns with GARCH volatility, describe the ACF of $X_t$ and $X_t^2$, and explain why (C) makes Newey-West necessary.

Model Answer

| Process | ACF of $X_t$ | ACF of $X_t^2$ |
| --- | --- | --- |
| (A) White noise | All zero beyond lag 0. The ACF plot shows no bars exceeding the 95% confidence band. | Also all zero — $X_t^2$ is i.i.d. $\chi^2(1)$ (rescaled), so its ACF is also flat. |
| (B) AR(1), φ = 0.8 | Exponential decay: $\rho_k = 0.8^k$. Significant at many lags, slowly dying out. | Also shows significant autocorrelation (because $X_t$ is autocorrelated, so is $X_t^2$), decaying roughly as $\rho_k^2 \approx 0.64^k$. |
| (C) Financial returns | Nearly zero at all lags — looks almost identical to (A). Perhaps a tiny significant value at lag 1, but mostly within the bands. | Highly significant at many lags — strong, slowly decaying autocorrelation. This is the hallmark of volatility clustering: large returns (of either sign) tend to follow large returns. |

Why does (C) make Newey-West necessary?

Process (C) is the tricky one. Looking at the ACF of returns, you’d conclude “basically white noise, no correction needed.” But this is misleading. The returns are uncorrelated but not independent — their conditional variance is serially dependent (GARCH structure). This dependence affects the variance of OLS estimates in a subtle way:

  • The classic OLS standard error formula assumes both homoscedasticity AND no autocorrelation.

  • Process (C) violates homoscedasticity (time-varying variance = heteroscedasticity), even though it satisfies zero autocorrelation in levels.

  • Newey-West handles both problems simultaneously: it makes no assumption about the variance structure. Even if autocorrelation is zero, the HAC estimator correctly computes the sandwich formula using the actual squared residuals (which are autocorrelated).

  • In practice, financial return residuals from a factor regression often show mild autocorrelation (even if raw returns don’t), because the factor model may not perfectly capture all time-series dynamics. Newey-West is insurance against this. The simulation sketch below makes the uncorrelated-but-dependent pattern concrete.
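
A short simulation of case (C), using simple GARCH(1,1)-style parameters chosen for illustration (this cell is an addition, not the notebook's own code): the simulated returns are essentially uncorrelated, while their squares are clearly autocorrelated.

# ============================================================================
# Optional sketch: uncorrelated returns with autocorrelated squares
# (GARCH(1,1)-style simulation, assumed parameters)
# ============================================================================
rng = np.random.default_rng(7)
n_obs = 2000
omega, a1, b1 = 0.05, 0.10, 0.85                   # assumed GARCH(1,1) parameters

sigma2 = np.full(n_obs, omega / (1 - a1 - b1))     # start at the unconditional variance
r = np.zeros(n_obs)
for t in range(1, n_obs):
    sigma2[t] = omega + a1 * r[t-1]**2 + b1 * sigma2[t-1]
    r[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

acf_r, acf_r2 = acf(r, nlags=5), acf(r**2, nlags=5)
print(f"  {'lag':>4s}  {'ACF(r)':>8s}  {'ACF(r^2)':>9s}")
for k in range(1, 6):
    print(f"  {k:4d}  {acf_r[k]:8.3f}  {acf_r2[k]:9.3f}")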


Exercise 3: Effect of Sample Period on ARCH Tests

Problem: Split Mkt-RF into 2000–2011 and 2012–2023. Run ARCH(6) tests on each and compare.

# ============================================================================
# NB02 Exercise 3 — ARCH Tests on Sub-Periods
# ============================================================================

mkt_early = ff3_full['Mkt_RF'].loc[:'2011-12']
mkt_late  = ff3_full['Mkt_RF'].loc['2012-01':]

print("ARCH(6) TEST ON MKT-RF SUB-PERIODS")
print("=" * 60)

for label, series in [("2000-2011 (includes GFC)", mkt_early),
                       ("2012-2023 (post-GFC)",     mkt_late)]:
    lm_stat, lm_pval, fstat, f_pval = het_arch(series.values, nlags=6)
    print(f"\n  {label}")
    print(f"  {'─'*50}")
    print(f"  Number of months:  {len(series)}")
    print(f"  Monthly std dev:   {series.std()*100:.2f}%")
    print(f"  LM statistic:      {lm_stat:.4f}")
    print(f"  LM p-value:        {lm_pval:.4f}")
    sig = "SIGNIFICANT" if lm_pval < 0.05 else "NOT significant"
    print(f"  Conclusion:        {sig} volatility clustering")

# Visualise
fig, axes = plt.subplots(1, 2, figsize=(14, 4.5))

for ax, (label, series) in zip(axes, [("2000–2011 (GFC era)", mkt_early),
                                        ("2012–2023 (post-GFC)", mkt_late)]):
    ax.plot(series.index, series * 100, linewidth=0.8, color='darkblue')
    ax.fill_between(series.index, series * 100, 0, alpha=0.3, color='steelblue')
    ax.set_title(f'Mkt-RF: {label}', fontweight='bold')
    ax.set_ylabel('Return (%)')
    ax.set_ylim(-25, 20)
    ax.grid(True, alpha=0.3)
    ax.axhline(0, color='black', linewidth=0.5)

plt.tight_layout()
plt.show()

print("""
INTERPRETATION
  The 2000–2011 period includes the dot-com bust (2000–2002) and the
  Global Financial Crisis (2008–2009) — two episodes of extreme
  volatility clustering. The ARCH test statistic is likely much
  larger (stronger evidence of volatility clustering).

  The 2012–2023 period is calmer overall, though it includes the
  COVID crash (March 2020). Volatility clustering is typically
  still present but weaker.

  KEY INSIGHT: Volatility clustering is ubiquitous in financial
  data, but its STRENGTH varies with the sample period. This is
  why Newey-West standard errors are always recommended regardless
  of what any single-period test shows.
""")

Exercise 4 — When Does the Mixing Condition Fail in Practice?

Problem: Name two realistic scenarios where financial return data might violate the mixing (weak dependence) condition required for the CLT and Newey-West consistency.


Model Answer:

  1. Structural Breaks / Regime Changes: When the data-generating process itself changes — e.g., a central bank switches from inflation targeting to yield-curve control, or a market transitions from floor trading to electronic trading — the return series is no longer a single stationary, ergodic process. Early observations carry no information about the new regime, so the “forgetting” property that mixing requires is violated. In finite samples the Newey-West estimator uses autocovariances from the old regime to estimate uncertainty in the new one, which can be wildly wrong.

  2. Unit-Root or Near-Unit-Root Behaviour: Asset prices follow a near-random walk (the ratio $\operatorname{Var}[\bar{X}_n] / (\sigma^2/n) \to \infty$). If a researcher accidentally regresses price levels instead of returns, the mixing condition fails because shocks never decay. Even for returns, very persistent predictors such as log dividend-price ratios ($\rho \approx 0.99$ monthly) can produce near-unit-root regressors whose mixing rate is too slow for the bandwidth available in a typical sample.

  3. Long Memory / Fractional Integration (Bonus): Realised-volatility measures often exhibit long memory ($d \approx 0.4$). Autocovariances decay hyperbolically rather than exponentially, so the strong-mixing coefficients $\alpha(k)$ decay too slowly for the standard Newey-West theory. Special long-run variance estimators (e.g., VARHAC or fixed-$b$ asymptotics) are needed in this case.

Practical take-away: Always plot the data, run rolling-window checks, and test for structural breaks (Chow, CUSUM) before trusting any single asymptotic standard-error estimate.
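
As one concrete version of such a check, here is a sketch of a classic Chow break test on simulated returns with an assumed mean shift halfway through the sample (all numbers are illustrative, not taken from the tutorials): it compares the pooled residual sum of squares against the two subsample fits.

# ============================================================================
# Optional sketch: Chow break test on simulated data with a mean shift
# (assumed data-generating process; illustrative only)
# ============================================================================
rng = np.random.default_rng(3)
n_half = 120
y_brk = np.concatenate([0.000 + 0.04 * rng.standard_normal(n_half),   # regime 1
                        0.020 + 0.04 * rng.standard_normal(n_half)])  # regime 2 (higher mean)
X_brk = np.ones((2 * n_half, 1))                                       # mean-only model
k = X_brk.shape[1]

def rss(y, X):
    """Residual sum of squares from a plain OLS fit."""
    return sm.OLS(y, X).fit().ssr

rss_pooled = rss(y_brk, X_brk)
rss_split = rss(y_brk[:n_half], X_brk[:n_half]) + rss(y_brk[n_half:], X_brk[n_half:])

f_stat = ((rss_pooled - rss_split) / k) / (rss_split / (2 * n_half - 2 * k))
p_val = stats.f.sf(f_stat, k, 2 * n_half - 2 * k)
print(f"  Chow F = {f_stat:.2f},  p = {p_val:.4f}   (small p → evidence of a structural break)")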


Part 3: Fama-French 3-Factor Model (Notebook 03) — Solutions

Exercise 1 — Predicting Factor Loadings (Conceptual)

Problem: Before running any code, predict the factor loadings ($\beta_{\text{MKT}}$, $\beta_{\text{SMB}}$, $\beta_{\text{HML}}$) for QQQ, IWM, and a 2× leveraged S&P 500 ETF.


Model Answer:

| ETF | $\beta_{\text{MKT}}$ | $\beta_{\text{SMB}}$ | $\beta_{\text{HML}}$ | Reasoning |
| --- | --- | --- | --- | --- |
| QQQ | ≈ 1.0–1.1 | Slightly negative | Negative (strong) | Nasdaq-100 is large-cap (negative SMB) and heavily tilted toward growth stocks (negative HML, since HML is value minus growth). |
| IWM | ≈ 1.0–1.1 | Positive (strong) | Near zero | Russell 2000 is explicitly small-cap (high SMB). It is a blend index — roughly equal value and growth — so HML exposure should be near zero. |
| 2× S&P 500 | ≈ 2.0 | Near zero | Near zero | A 2× leveraged fund amplifies all returns by 2, so $\beta_{\text{MKT}} \approx 2$. The S&P 500 is diversified across value/growth and is large-cap, so SMB ≈ 0 and HML ≈ 0. Alpha may be slightly negative due to daily rebalancing costs and financing. |

Important caveat: These are predictions. Actual estimates will differ because of tracking error, rebalancing effects, and the specific sample period. Running the regression in Exercise 2 or 5 lets us check these predictions empirically.

Exercise 2 — Comparing ETFs (Hands-On)

Problem: Run the FF3 regression on IWN (Small Cap Value). Compare its $R^2$ and $\beta_{\text{SMB}}$ to SPY and VTV.

# ============================================================================
# NB03 Exercise 2 — Compare SPY, IWN, VTV factor loadings
# ============================================================================

tickers = ['SPY', 'IWN', 'VTV']
results_dict = {}

for tic in tickers:
    raw = yf.download(tic, start='2010-01-01', end='2023-12-31', progress=False)
    monthly = get_close(raw, tic).resample('MS').first().pct_change().dropna()
    # Shift index back 1 month: pct_change at Feb 1 = Jan→Feb return = "January" return
    monthly.index = monthly.index - pd.DateOffset(months=1)
    monthly.name = tic
    df = ff3_short[['Mkt_RF', 'SMB', 'HML', 'RF']].join(monthly, how='inner')
    df['Excess'] = df[tic] - df['RF']
    Y = df['Excess'].values
    X = sm.add_constant(df[['Mkt_RF', 'SMB', 'HML']].values)
    res = sm.OLS(Y, X).fit(cov_type='HAC', cov_kwds={'maxlags': 6})
    results_dict[tic] = res

# Pretty summary table
labels = ['Alpha', 'Mkt-RF', 'SMB', 'HML']
print(f"{'':12s} {'SPY':>16s} {'IWN':>16s} {'VTV':>16s}")
print("=" * 62)
for i, lab in enumerate(labels):
    row = f"  {lab:8s}"
    for tic in tickers:
        b = results_dict[tic].params[i]
        se = results_dict[tic].bse[i]
        star = '*' if results_dict[tic].pvalues[i] < 0.05 else ' '
        row += f"  {b:7.4f}{star}({se:.4f})"
    print(row)
print("-" * 62)
for tic in tickers:
    r2 = results_dict[tic].rsquared
    print(f"  R²  {tic:6s}  {r2:.4f}")

print("""
INTERPRETATION
  • IWN (Small Cap Value) should have the HIGHEST R² because it has
    strong exposure to ALL three factors (Mkt, SMB, HML), so the
    3-factor model captures more of its variation.
  • SPY is dominated by Mkt-RF alone — SMB and HML add little.
  • β_SMB ranking:  IWN >> SPY (near 0) ≈ VTV (near 0 or slightly negative).
    IWN is explicitly small-cap, while SPY and VTV are large/mega-cap.
  • β_HML ranking:  IWN > VTV > SPY (likely negative).
    Both IWN and VTV are value-tilted, while SPY is blend/slight growth.
""")

Exercise 3 — Sensitivity of Newey-West Standard Errors

Problem: Re-fit the SPY regression with maxlags=0, default, and maxlags=8. How sensitive are conclusions to lag choice?

# ============================================================================
# NB03 Exercise 3 — Newey-West Lag Sensitivity
# ============================================================================

# Use the SPY data from Exercise 2
spy_res = results_dict['SPY']
spy_model = spy_res.model                     # underlying OLS model

lag_configs = {
    'Classic OLS':  spy_model.fit(),
    'NW lag=0':     spy_model.fit(cov_type='HAC', cov_kwds={'maxlags': 0}),
    'NW lag=3':     spy_model.fit(cov_type='HAC', cov_kwds={'maxlags': 3}),
    'NW lag=6 (default)': spy_model.fit(cov_type='HAC', cov_kwds={'maxlags': 6}),
    'NW lag=8':     spy_model.fit(cov_type='HAC', cov_kwds={'maxlags': 8}),
    'NW lag=12':    spy_model.fit(cov_type='HAC', cov_kwds={'maxlags': 12}),
}

labels = ['Alpha', 'Mkt-RF', 'SMB', 'HML']

print("STANDARD ERRORS FOR SPY 3-FACTOR REGRESSION ACROSS LAG CHOICES")
print("=" * 80)
header = f"{'Coeff':>10s}"
for name in lag_configs:
    header += f"  {name:>14s}"
print(header)
print("-" * 80)

for i, lab in enumerate(labels):
    row = f"{lab:>10s}"
    for name, res in lag_configs.items():
        row += f"  {res.bse[i]:14.6f}"
    print(row)

print()
print("P-VALUES FOR ALPHA ACROSS LAG CHOICES")
for name, res in lag_configs.items():
    print(f"  {name:>22s}:  p = {res.pvalues[0]:.4f}")

print("""
INTERPRETATION
  • NW lag=0 corrects for heteroscedasticity ONLY (equivalent to
    White/HC0 robust SEs). These are usually very close to classic OLS
    for monthly data, which often shows little heteroscedasticity.
  • As lags increase, NW incorporates more autocovariance structure.
    Standard errors typically INCREASE modestly because serial
    correlation in the errors is accounted for.
  • For SPY at monthly frequency, the differences are usually small,
    because monthly factor-model residuals have weak autocorrelation.
  • KEY LESSON: If conclusions flip based on reasonable lag choices
    (e.g., 4 vs 8), the result is fragile. A robust finding should
    survive a range of bandwidth choices. Andrews (1991) or Newey-West
    (1994) automatic bandwidth selection provides a principled default.
""")

Exercise 4 — Economic vs. Statistical Significance (Discussion)

Problem: $\hat{\alpha} = 0.0003$ (0.03%/month), $p = 0.04$. Is this statistically significant? Economically significant? What happens with NW standard errors?


Model Answer:

  1. Statistically significant? Yes — the p-value (0.04) is below the conventional 5 % threshold.

  2. Economically significant? Almost certainly no.

    • 0.03% per month ≈ 0.36% per year.

    • Typical ETF expense ratios are 0.03–0.20 %; bid-ask spreads and market impact add more. After transaction costs the apparent alpha evaporates.

    • A rational investor would not restructure their portfolio to capture 36 basis points per year, especially when that estimate is noisy.

  3. Effect of Newey-West SEs:

    • Newey-West SEs are generally larger than classic OLS SEs because they correct for positive residual autocorrelation and heteroscedasticity.

    • A larger SE → a smaller $t$-statistic → a larger p-value. The result that was barely significant at 5% with classic OLS could easily become insignificant (e.g., $p = 0.08$) once robust SEs are used.

Take-away: Statistical significance ≠ Economic significance. A tiny alpha with a low p-value may be a real pricing anomaly, but it is not an investable opportunity once real-world frictions are included. Always convert monthly alphas into annualised terms and compare against realistic trading costs.
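
A back-of-the-envelope version of this comparison (the cost figures below are assumptions for illustration, not estimates from the tutorials):

# ============================================================================
# Back-of-the-envelope: is 0.03%/month of alpha economically meaningful?
# (cost assumptions are illustrative)
# ============================================================================
alpha_monthly = 0.0003                       # 0.03% per month, from the exercise
alpha_annual = alpha_monthly * 12            # simple annualisation ≈ 0.36% per year

expense_ratio = 0.0010                       # assumed 0.10% annual expense ratio
trading_drag  = 0.0020                       # assumed 0.20% annual spread/turnover cost

net_alpha = alpha_annual - expense_ratio - trading_drag
print(f"  Gross annualised alpha:  {alpha_annual*100:6.2f}%")
print(f"  Assumed annual costs:    {(expense_ratio + trading_drag)*100:6.2f}%")
print(f"  Net of assumed costs:    {net_alpha*100:+6.2f}%")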

Exercise 5 — Extending to Another Asset (Challenge)

Problem: Download a stock of your choice, compute monthly excess returns, run the FF3 regression with Newey-West SEs, and interpret the results with diagnostics.

# ============================================================================
# NB03 Exercise 5 — Full Worked Example: AAPL
# ============================================================================

# Step 1: Download AAPL monthly returns
aapl_raw = yf.download('AAPL', start='2010-01-01', end='2023-12-31', progress=False)
aapl_monthly = get_close(aapl_raw, 'AAPL').resample('MS').first().pct_change().dropna()
# Shift index back 1 month: pct_change at Feb 1 = Jan→Feb return = "January" return
aapl_monthly.index = aapl_monthly.index - pd.DateOffset(months=1)
aapl_monthly.name = 'AAPL'

# Step 2: Align with FF3 factors and compute excess returns
df_aapl = ff3_short[['Mkt_RF', 'SMB', 'HML', 'RF']].join(aapl_monthly, how='inner')
df_aapl['Excess'] = df_aapl['AAPL'] - df_aapl['RF']
print(f"Sample: {df_aapl.index[0].strftime('%Y-%m')} to {df_aapl.index[-1].strftime('%Y-%m')}  "
      f"({len(df_aapl)} months)")

# Step 3: Run regression with Newey-West SEs
Y = df_aapl['Excess'].values
X = sm.add_constant(df_aapl[['Mkt_RF', 'SMB', 'HML']].values)
res_aapl = sm.OLS(Y, X).fit(cov_type='HAC', cov_kwds={'maxlags': 6})
print(res_aapl.summary(xname=['Alpha', 'Mkt-RF', 'SMB', 'HML'],
                        title='AAPL — Fama-French 3-Factor (NW SEs)'))

# Step 4: Diagnostic plots
fig, axes = plt.subplots(2, 2, figsize=(13, 9))

# 4a. Residual time series
resids = res_aapl.resid
axes[0, 0].plot(df_aapl.index, resids * 100, linewidth=0.7, color='steelblue')
axes[0, 0].axhline(0, color='black', linewidth=0.5)
axes[0, 0].set_title('Residuals Over Time', fontweight='bold')
axes[0, 0].set_ylabel('Residual (%)')

# 4b. Histogram + normal overlay
axes[0, 1].hist(resids * 100, bins=30, density=True, alpha=0.7,
                color='steelblue', edgecolor='white')
x_grid = np.linspace(resids.min()*100, resids.max()*100, 200)
axes[0, 1].plot(x_grid, stats.norm.pdf(x_grid, resids.mean()*100, resids.std()*100),
                'r-', linewidth=2, label='Normal')
axes[0, 1].set_title('Residual Distribution', fontweight='bold')
axes[0, 1].legend()

# 4c. QQ plot
stats.probplot(resids, dist='norm', plot=axes[1, 0])
axes[1, 0].set_title('Q-Q Plot', fontweight='bold')
axes[1, 0].get_lines()[0].set_markersize(3)

# 4d. ACF of residuals
from statsmodels.graphics.tsaplots import plot_acf
plot_acf(resids, lags=15, ax=axes[1, 1], alpha=0.05, color='steelblue')
axes[1, 1].set_title('Residual ACF', fontweight='bold')

plt.suptitle('AAPL — Regression Diagnostics', fontsize=14, fontweight='bold', y=1.01)
plt.tight_layout()
plt.show()

# Interpretation
alpha_ann = res_aapl.params[0] * 12 * 100
print(f"\nINTERPRETATION")
print(f"{'='*60}")
print(f"  Annualised alpha: {alpha_ann:.2f}%")
print(f"  β_Mkt: {res_aapl.params[1]:.3f}  — AAPL is a high-beta stock")
print(f"  β_SMB: {res_aapl.params[2]:.3f}  — Negative = large-cap behaviour")
print(f"  β_HML: {res_aapl.params[3]:.3f}  — Negative = growth stock behaviour")
print(f"  R²:    {res_aapl.rsquared:.3f}  — Low for a single stock (idiosyncratic risk)")
print(f"""
  AAPL behaves as a large-cap growth stock with substantial 
  idiosyncratic risk. Its strong positive alpha over this period 
  reflects AAPL's extraordinary price appreciation that cannot be 
  explained by standard risk factors — consistent with the view 
  that single-stock alphas capture both true skill/innovation AND 
  idiosyncratic luck.
  
  Diagnostics: residuals likely show fat tails (QQ plot deviates 
  at extremes) — typical for individual stocks — confirming the 
  need for robust standard errors.
""")

Part 4: Advanced Factor Models (Notebook 04) — Solutions

Exercise 1 — 5-Factor Model on IWM

Problem: Run the full 5-factor regression on IWM (Russell 2000 small-cap blend). Compare SMB loading to IWN, check HML significance, and compare 3-factor vs 5-factor alpha.

# ============================================================================
# NB04 Exercise 1 — 5-Factor Regression on IWM
# ============================================================================

# Download IWM
iwm_raw = yf.download('IWM', start='2000-01-01', end='2023-12-31', progress=False)
iwm_monthly = get_close(iwm_raw, 'IWM').resample('MS').first().pct_change().dropna()
# Shift index back 1 month: pct_change at Feb 1 = Jan→Feb return = "January" return
iwm_monthly.index = iwm_monthly.index - pd.DateOffset(months=1)
iwm_monthly.name = 'IWM'

# Also download IWN for comparison
iwn_raw = yf.download('IWN', start='2000-01-01', end='2023-12-31', progress=False)
iwn_monthly = get_close(iwn_raw, 'IWN').resample('MS').first().pct_change().dropna()
iwn_monthly.index = iwn_monthly.index - pd.DateOffset(months=1)
iwn_monthly.name = 'IWN'

# Merge with 5-factor data
factor_cols_3 = ['Mkt_RF', 'SMB', 'HML', 'RF']
factor_cols_5 = ['Mkt_RF', 'SMB', 'HML', 'RMW', 'CMA', 'RF']

def run_3f_5f(monthly_ret, label):
    """Run both 3-factor and 5-factor regressions and print comparison."""
    df = factors[factor_cols_5].join(monthly_ret, how='inner').dropna()
    df['Excess'] = df[monthly_ret.name] - df['RF']
    Y = df['Excess'].values
    
    # 3-factor
    X3 = sm.add_constant(df[['Mkt_RF', 'SMB', 'HML']].values)
    res3 = sm.OLS(Y, X3).fit(cov_type='HAC', cov_kwds={'maxlags': 6})
    
    # 5-factor
    X5 = sm.add_constant(df[['Mkt_RF', 'SMB', 'HML', 'RMW', 'CMA']].values)
    res5 = sm.OLS(Y, X5).fit(cov_type='HAC', cov_kwds={'maxlags': 6})
    
    print(f"\n{'='*60}")
    print(f"  {label}  —  {len(df)} months")
    print(f"{'='*60}")
    print(f"{'':>14s}  {'3-Factor':>12s}  {'5-Factor':>12s}")
    print(f"  {'─'*46}")
    
    names_3 = ['Alpha', 'Mkt-RF', 'SMB', 'HML']
    names_5 = ['Alpha', 'Mkt-RF', 'SMB', 'HML', 'RMW', 'CMA']
    for i, name in enumerate(names_5):
        v3 = f"{res3.params[i]:.4f}" if i < 4 else "—"
        v5 = f"{res5.params[i]:.4f}"
        print(f"  {name:>10s}    {v3:>12s}  {v5:>12s}")
    print(f"  {'R²':>10s}    {res3.rsquared:>12.4f}  {res5.rsquared:>12.4f}")
    print(f"  {'α p-val':>10s}    {res3.pvalues[0]:>12.4f}  {res5.pvalues[0]:>12.4f}")
    
    return res3, res5

iwm_3, iwm_5 = run_3f_5f(iwm_monthly, 'IWM (Russell 2000 — Small Cap Blend)')
iwn_3, iwn_5 = run_3f_5f(iwn_monthly, 'IWN (Russell 2000 Value)')

print(f"""
INTERPRETATION
{'='*60}
1. SMB Comparison:
   Both IWM and IWN have LARGE positive SMB loadings (both are
   small-cap funds). IWN's SMB loading may be slightly higher
   because small-cap value stocks tend to have even smaller
   market caps on average.

2. HML Significance:
   - IWM (blend): HML loading near zero / insignificant
     → blend of growth + value, so value factor washes out
   - IWN (value): HML loading is LARGE and significant
     → explicit value tilt captured by the model

3. Alpha (3F vs 5F):
   Adding RMW and CMA can change the alpha:
   - If alpha shrinks in 5F → the fund's excess return was partly
     explained by profitability/investment factors
   - R² improvement shows whether extra factors add explanatory power

4. RMW and CMA loadings:
   Small-cap value stocks often load NEGATIVELY on RMW (lower
   profitability) and positively on CMA (more conservative
   investment) — consistent with the Fama-French interpretation
   of the value premium.
""")

Exercise 2 — Rolling Sharpe Ratios for Combined Strategies

Problem: Compute 36-month rolling Sharpe ratios for (a) HML alone, (b) Value + Profitability, (c) Value + Profitability + Momentum. Plot and compare.

# ============================================================================
# NB04 Exercise 2 — Rolling Sharpe Ratios
# ============================================================================

# Build strategy returns (equal-weight combinations)
strats = pd.DataFrame({
    'HML only':        factors['HML'],
    'Value + Prof':    (factors['HML'] + factors['RMW']) / 2,
    'Val + Prof + Mom':(factors['HML'] + factors['RMW'] + factors['UMD']) / 3,
})

# Rolling 36-month annualised Sharpe ratio
window = 36
rolling_sharpe = (strats.rolling(window).mean() / strats.rolling(window).std()) * np.sqrt(12)
rolling_sharpe = rolling_sharpe.dropna()

# Plot
fig, ax = plt.subplots(figsize=(14, 5.5))
colors = ['#1f77b4', '#ff7f0e', '#2ca02c']
for col, c in zip(rolling_sharpe.columns, colors):
    ax.plot(rolling_sharpe.index, rolling_sharpe[col], label=col, linewidth=1.2, color=c)

ax.axhline(0, color='black', linewidth=0.5)
ax.fill_between(rolling_sharpe.index, 0, rolling_sharpe.min(axis=1),
                where=rolling_sharpe.min(axis=1) < 0, alpha=0.1, color='red')
ax.set_title('36-Month Rolling Sharpe Ratios: Single vs Combined Factor Strategies',
             fontweight='bold', fontsize=13)
ax.set_ylabel('Annualised Sharpe Ratio')
ax.legend(loc='upper right', framealpha=0.9)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Analysis
print("SUMMARY STATISTICS")
print("=" * 60)
print(f"{'Strategy':>25s}  {'Mean SR':>8s}  {'Std SR':>8s}  {'% < 0':>8s}")
print("-" * 60)
for col in rolling_sharpe.columns:
    sr = rolling_sharpe[col]
    print(f"{col:>25s}  {sr.mean():8.3f}  {sr.std():8.3f}  {(sr < 0).mean()*100:7.1f}%")

# Check if all three are simultaneously negative
all_neg = (rolling_sharpe < 0).all(axis=1)
if all_neg.any():
    periods = rolling_sharpe.index[all_neg]
    print(f"\nPeriods where ALL three strategies have negative rolling Sharpe:")
    print(f"  {periods[0].strftime('%Y-%m')} to {periods[-1].strftime('%Y-%m')} "
          f"({all_neg.sum()} months)")
else:
    print("\nNo period where ALL three strategies are simultaneously negative.")

print("""
INTERPRETATION
  • The combined strategy (green) has a HIGHER and MORE STABLE
    rolling Sharpe than HML alone. Diversification across low-
    correlated factors smooths returns.
  • The biggest benefit shows during HML's worst periods (e.g.,
    2018–2020 "value winter"): momentum and profitability partly
    offset HML's losses.
  • Adding momentum is especially valuable because UMD has near-
    zero or negative correlation with HML, providing genuine
    diversification.
""")

Exercise 3 — Momentum Crash Anatomy

Problem: Find the worst 3-month period for UMD. What was the market doing? How much would an equal-weight UMD+HML portfolio have reduced the drawdown?

# ============================================================================
# NB04 Exercise 3 — Momentum Crash Anatomy
# ============================================================================

# Part 1: Worst 3-month period for UMD
rolling_3m_umd = factors['UMD'].rolling(3).sum()
worst_end = rolling_3m_umd.idxmin()
worst_start = worst_end - pd.DateOffset(months=2)

print("WORST 3-MONTH PERIOD FOR MOMENTUM (UMD)")
print("=" * 60)
print(f"  Period:  {worst_start.strftime('%Y-%m')} to {worst_end.strftime('%Y-%m')}")
print(f"  3-month cumulative UMD return: {rolling_3m_umd.loc[worst_end]*100:.2f}%")
print()

# Show individual months
crash_months = factors.loc[worst_start:worst_end, ['UMD', 'Mkt_RF', 'HML']]
print("  Monthly detail:")
print(f"  {'Month':>10s}  {'UMD':>8s}  {'Mkt-RF':>8s}  {'HML':>8s}")
for date, row in crash_months.iterrows():
    print(f"  {date.strftime('%Y-%m'):>10s}  {row['UMD']*100:8.2f}%  "
          f"{row['Mkt_RF']*100:8.2f}%  {row['HML']*100:8.2f}%")

# Part 2: Market context
mkt_3m = factors['Mkt_RF'].rolling(3).sum()
print(f"\n  Market (Mkt-RF) over same 3 months: {mkt_3m.loc[worst_end]*100:+.2f}%")
print("""
  Daniel & Moskowitz (2016) narrative: Momentum crashes occur when
  the market RECOVERS sharply after a downturn. Past losers (now
  high-beta, beaten-down stocks) surge on the recovery, while past
  winners (defensive/low-beta stocks) lag. The momentum portfolio
  is short the rebounding losers and long the lagging winners,
  creating catastrophic losses.
""")

# Part 3: Hedged portfolio
hedged = (factors['UMD'] + factors['HML']) / 2  # equal-weight combo
rolling_3m_hedged = hedged.rolling(3).sum()

print("HEDGING EFFECT: UMD + HML (equal weight)")
print("=" * 60)
print(f"  Worst 3-month UMD alone:   {rolling_3m_umd.loc[worst_end]*100:+.2f}%")
print(f"  Hedged portfolio same dates: "
      f"{rolling_3m_hedged.loc[worst_end]*100:+.2f}%")
print(f"  Drawdown reduction:          "
      f"{(1 - rolling_3m_hedged.loc[worst_end]/rolling_3m_umd.loc[worst_end])*100:.1f}%")

# Visual comparison
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Left: Cumulative returns around the crash
window_start = worst_start - pd.DateOffset(months=6)
window_end   = worst_end   + pd.DateOffset(months=6)
subset = factors.loc[window_start:window_end, ['UMD', 'HML', 'Mkt_RF']].copy()
subset['UMD+HML'] = (subset['UMD'] + subset['HML']) / 2
cum = (1 + subset[['UMD', 'HML', 'UMD+HML', 'Mkt_RF']]).cumprod()

for col, c in zip(['UMD', 'UMD+HML', 'HML', 'Mkt_RF'],
                   ['red', 'darkorange', 'steelblue', 'gray']):
    ls = '--' if col == 'Mkt_RF' else '-'
    axes[0].plot(cum.index, cum[col], label=col, linewidth=1.8, color=c, linestyle=ls)

axes[0].axvspan(worst_start, worst_end, alpha=0.15, color='red', label='Crash window')
axes[0].set_title('Cumulative Returns Around Momentum Crash', fontweight='bold')
axes[0].set_ylabel('Growth of $1')
axes[0].legend(loc='lower left', fontsize=9)
axes[0].grid(True, alpha=0.3)

# Right: Full-sample max drawdowns
for label, series in [('UMD', factors['UMD']),
                       ('UMD+HML', hedged)]:
    cum_full = (1 + series).cumprod()
    running_max = cum_full.cummax()
    dd = (cum_full - running_max) / running_max
    axes[1].plot(dd.index, dd * 100, label=label, linewidth=1)

axes[1].set_title('Maximum Drawdown Over Time', fontweight='bold')
axes[1].set_ylabel('Drawdown (%)')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("""
KEY INSIGHTS
  • Momentum's worst crashes coincide with sharp market rebounds.
  • HML often does well when momentum suffers (value stocks = past
    losers that recover first), making it a natural hedge.
  • The equal-weight UMD+HML combination substantially reduces
    the worst drawdown, albeit at the cost of lower average return.
  • This is a practical example of why multi-factor portfolios
    dominate single-factor bets in risk-adjusted terms.
""")

Exercise 4 — The Beta Anomaly: Subsample Robustness (Discussion + Hands-On)

Problem: Split the beta-sorted decile portfolios into 2000–2011 and 2012–2023. Plot the empirical SML for each subperiod. Regress a simple BAB factor on FF5 + Momentum for each half. Discuss whether leverage-constraint changes (leveraged ETFs, zero-commission brokers) might affect the premium.

# ============================================================================
# NB04 Exercise 4 — Beta Anomaly Subsample Robustness
# ============================================================================

import re
from io import StringIO

# ── Step 1: Download beta-sorted decile portfolios ──
def download_beta_portfolios():
    """Download beta-sorted DECILE portfolio returns from Ken French's library."""
    url = "https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/Portfolios_Formed_on_BETA_CSV.zip"
    with tempfile.TemporaryDirectory() as tmpdir:
        zip_path = f"{tmpdir}/data.zip"
        urllib.request.urlretrieve(url, zip_path)
        with zipfile.ZipFile(zip_path, 'r') as zf:
            zf.extractall(tmpdir)
        files = [f for f in os.listdir(tmpdir) if f.lower().endswith('.csv')]
        csv_path = f"{tmpdir}/{files[0]}"
        with open(csv_path, 'r') as f:
            raw_lines = f.readlines()

    sections = []
    current_block = []
    for line in raw_lines:
        stripped = line.strip()
        if re.match(r'^\s*\d{6}', stripped):
            current_block.append(stripped)
        else:
            if current_block:
                sections.append(current_block)
                current_block = []
    if current_block:
        sections.append(current_block)

    target = None
    for block in sections:
        ncols = len(block[0].split(',')) - 1 if ',' in block[0] else len(block[0].split()) - 1
        if ncols >= 10:
            target = block
            break
    if target is None:
        raise ValueError("Could not find decile section in beta file.")

    csv_text = '\n'.join(target)
    df = pd.read_csv(StringIO(csv_text), header=None, index_col=0)
    df = df.iloc[:, :10]
    df.columns = [f'D{i+1}' for i in range(10)]
    df.index = pd.to_datetime(df.index.astype(str).str.strip(), format='%Y%m')
    df = df.apply(pd.to_numeric, errors='coerce').dropna()
    df = df / 100
    return df

print("Downloading beta-sorted decile portfolios...")
beta_ports = download_beta_portfolios()
beta_ports = beta_ports.loc['2000-01-01':'2023-12-31']

# Align with our factor data
rf_aligned = factors['RF'].reindex(beta_ports.index).dropna()
mkt_aligned = factors['Mkt_RF'].reindex(beta_ports.index).dropna()
common_idx = beta_ports.index.intersection(rf_aligned.index).intersection(mkt_aligned.index)
beta_ports = beta_ports.loc[common_idx]
rf_aligned = rf_aligned.loc[common_idx]
mkt_aligned = mkt_aligned.loc[common_idx]

# ── Step 2: Split into two subperiods ──
split_date = '2011-12-31'
periods = {
    '2000–2011': (beta_ports.loc[:split_date], rf_aligned.loc[:split_date], mkt_aligned.loc[:split_date]),
    '2012–2023': (beta_ports.loc['2012-01-01':], rf_aligned.loc['2012-01-01':], mkt_aligned.loc['2012-01-01':]),
}

fig, axes = plt.subplots(1, 2, figsize=(16, 6))

for ax, (period_name, (bp, rf, mkt)) in zip(axes, periods.items()):
    avg_excess = []
    realized_betas = []
    for col in bp.columns:
        excess = bp[col] - rf
        X = sm.add_constant(mkt.values)
        res = sm.OLS(excess.values, X).fit()
        realized_betas.append(res.params[1])
        avg_excess.append(excess.mean() * 12 * 100)
    
    avg_excess = np.array(avg_excess)
    realized_betas = np.array(realized_betas)
    
    ax.scatter(realized_betas, avg_excess, s=120, c='darkblue', zorder=5,
               edgecolors='white', linewidth=1.5)
    for i, label in enumerate(bp.columns):
        ax.annotate(label, (realized_betas[i], avg_excess[i]),
                    textcoords="offset points", xytext=(8, 5), fontsize=9)
    
    # CAPM line
    mkt_prem = mkt.mean() * 12 * 100
    beta_range = np.linspace(0, max(realized_betas) * 1.1, 100)
    ax.plot(beta_range, beta_range * mkt_prem, 'r--', linewidth=2, label='CAPM', alpha=0.7)
    
    # Empirical fit
    slope, intercept = np.polyfit(realized_betas, avg_excess, 1)
    ax.plot(beta_range, slope * beta_range + intercept, 'b-', linewidth=2.5,
            label=f'Empirical (slope={slope:.1f})', alpha=0.9)
    
    ax.set_xlabel('Realized Beta', fontsize=11)
    ax.set_ylabel('Ann. Excess Return (%)', fontsize=11)
    ax.set_title(f'Empirical SML: {period_name}\n(CAPM slope={mkt_prem:.1f})',
                 fontweight='bold', fontsize=12)
    ax.legend(fontsize=9)
    ax.grid(True, alpha=0.3)

plt.suptitle('Beta Anomaly Subsample Robustness', fontsize=14, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()

# ── Step 3: BAB alpha in each subperiod ──
print("\n" + "=" * 60)
print("BAB ALPHA BY SUBPERIOD (FF5 + Momentum controls)")
print("=" * 60)

factor_cols_full = ['Mkt_RF', 'SMB', 'HML', 'RMW', 'CMA', 'UMD']

for period_name, (bp, rf, mkt) in periods.items():
    # Simple BAB = D1 excess - D10 excess
    bab = (bp['D1'] - rf) - (bp['D10'] - rf)
    ff = factors.loc[bab.index, factor_cols_full].dropna()
    bab = bab.loc[ff.index]
    
    X_ff = sm.add_constant(ff.values)
    res_bab = sm.OLS(bab.values, X_ff).fit(cov_type='HAC', cov_kwds={'maxlags': 6})
    
    alpha_ann = res_bab.params[0] * 12 * 100
    print(f"\n  {period_name} ({len(bab)} months)")
    print(f"    Alpha (ann.): {alpha_ann:+.2f}%   t = {res_bab.tvalues[0]:.2f}   "
          f"p = {res_bab.pvalues[0]:.4f}")
    sig = "SIGNIFICANT" if res_bab.pvalues[0] < 0.05 else "NOT significant"
    print(f"    Conclusion:   {sig}")

print(f"""
DISCUSSION
{'='*60}
1. The empirical SML is flat (or negative-sloped) in BOTH subperiods,
   confirming the beta anomaly's robustness. However, the magnitude
   may differ — the 2000–2011 period includes the GFC, which was
   favourable for low-beta stocks.

2. BAB alpha may vary in significance across subperiods. If it weakens
   in 2012–2023, this could reflect:
   - The rise of leveraged ETFs (SSO, UPRO launched ~2006–2009),
     which give retail investors cheap leverage access
   - Zero-commission brokers (Robinhood, 2015+), reducing friction
   - Both developments ease the LEVERAGE CONSTRAINT that Frazzini
     & Pedersen cite as the primary driver of BAB
   - If leverage becomes more accessible, constrained investors no
     longer need to "reach for beta," reducing the overpricing of
     high-beta stocks

3. However, institutional constraints (pension funds, mutual funds)
   remain binding — many fiduciaries still cannot use leverage,
   which may preserve some of the premium even as retail access
   improves.

4. KEY TAKEAWAY: The beta anomaly is robust in direction (the SML is
   consistently flatter than the CAPM predicts) but its magnitude and
   investability remain debated, especially after costs (see Novy-Marx
   & Velikov, 2022).
""")

End of Solutions Manual

All 18 exercises from Notebooks 01–04 have been solved above. Key themes across the solutions:

  • Robust inference matters. Newey-West (HAC) standard errors are essential whenever residuals exhibit autocorrelation or heteroscedasticity — which is essentially always in financial data.

  • Statistical ≠ Economic significance. Always convert monthly estimates to annualised terms and compare against realistic transaction costs.

  • Diagnostics before conclusions. Check residual plots, ACF/PACF, ARCH tests, and stationarity before trusting regression output.

  • Diversification across factors works. Combining low-correlated factors (Value + Profitability + Momentum) materially improves risk-adjusted performance and reduces tail risk.