Tutorial 6 — Regression with Domain Adaptation

Most ShiftKit examples use classification tasks, but domain shift is just as common in regression. A model trained to predict house prices in Northern California, or to fit a sine wave at one phase, will generalise poorly when deployed in a different geographic region or on a phase-shifted signal.

This tutorial walks through two regression DA experiments using SourceOnlyRegressionTrainer (baseline) and MMDRegressionTrainer (alignment). The same MLPRegressor architecture and DataManager interface are used throughout.

New classes introduced

Class	Module	Purpose
`MLPRegressor`	`shiftkit.models`	Flexible MLP with `encode` / `regress` interface
`SourceOnlyRegressionTrainer`	`shiftkit.methods`	MSE baseline, no alignment
`MMDRegressionTrainer`	`shiftkit.methods`	MSE + λ·MMD² feature alignment
`SineWaveDataset`	`shiftkit.data`	Synthetic phase-shifted sine regression
`CaliforniaHousingDataset`	`shiftkit.data`	Geographic CA housing split

All regression trainers track RMSE per epoch (source and target) and report MSE, RMSE, and R² from evaluate().
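
For reference, the three reported metrics relate to each other as in the sketch below. This is a plain-Python illustration, not ShiftKit's implementation: the helper name `regression_metrics` is hypothetical, and only the key names `mse`, `rmse`, and `r2` are taken from the `evaluate()` output documented in this tutorial.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE and R² for two equal-length sequences of floats.

    Hypothetical helper for illustration; not part of ShiftKit.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares

    mse = ss_res / n
    # R² is undefined for a constant target; report NaN in that case.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": r2}


metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(metrics)
```

An R² below zero on the target domain means the model does worse than predicting the target mean, which can happen under strong shift.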

Example 1 — Sine Wave (Phase Shift)

The domain gap

The source domain is y = sin(x), with x sampled uniformly from (−π, π). The target domain has the same shape but is phase-shifted by π/3 (60°):

source: y = sin(x)
target: y = sin(x + π/3)
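
A minimal sketch of how two such domains could be generated, in plain Python. The helper `make_sine_domain` and its arguments are hypothetical; the actual `SineWaveDataset` may differ in sample count, noise, or input range.

```python
import math
import random


def make_sine_domain(n_samples, phase=0.0, seed=0):
    """Sample x uniformly from (-pi, pi) and return (xs, ys) with y = sin(x + phase).

    Hypothetical helper for illustration; not ShiftKit's SineWaveDataset.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-math.pi, math.pi) for _ in range(n_samples)]
    ys = [math.sin(x + phase) for x in xs]
    return xs, ys


# Same seed, so both domains share the same inputs; only the phase differs.
src_x, src_y = make_sine_domain(1000, phase=0.0, seed=0)
tgt_x, tgt_y = make_sine_domain(1000, phase=math.pi / 3, seed=0)
```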

A model trained only on source data predicts the source phase on target inputs, which produces a high target RMSE. MMD alignment encourages the encoder to map both domains into a shared latent space, reducing the phase-induced gap.
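
For intuition, the squared MMD between two batches of latent features can be estimated as below. This is a sketch under assumptions: it uses the biased (V-statistic) estimator and a sum of Gaussian kernels exp(−‖a−b‖²/(2σ²)) over the bandwidths listed as the `mmd_sigmas` default in the API reference. ShiftKit's actual estimator and kernel parametrisation may differ, and the function names are hypothetical.

```python
import math

DEFAULT_SIGMAS = (0.1, 1.0, 5.0, 10.0, 50.0)  # bandwidths quoted in the API reference


def multi_rbf(a, b, sigmas=DEFAULT_SIGMAS):
    """Sum of Gaussian kernels between two feature vectors (assumed kernel form)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return sum(math.exp(-sq_dist / (2.0 * s * s)) for s in sigmas)


def mmd2_biased(src_feats, tgt_feats, sigmas=DEFAULT_SIGMAS):
    """Biased estimate of MMD² between two lists of feature vectors."""
    def mean_kernel(p, q):
        total = sum(multi_rbf(a, b, sigmas) for a in p for b in q)
        return total / (len(p) * len(q))

    return (mean_kernel(src_feats, src_feats)
            + mean_kernel(tgt_feats, tgt_feats)
            - 2.0 * mean_kernel(src_feats, tgt_feats))


src = [(0.0, 0.0), (0.1, -0.1), (-0.2, 0.1), (0.05, 0.2)]
tgt_shifted = [(x + 3.0, y + 3.0) for x, y in src]

print(mmd2_biased(src, src))          # identical batches
print(mmd2_biased(src, tgt_shifted))  # clearly separated batches
```

The estimate needs only features from both domains, never target labels, which is why it can serve as an unsupervised alignment term.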

Setup

import shiftkit
from shiftkit import DataManager, MLPRegressor
from shiftkit import SourceOnlyRegressionTrainer, MMDRegressionTrainer

dm = DataManager(batch_size=256)
train_src, train_tgt = dm.load("sine_wave")           # phase 0  vs  π/3
test_src,  test_tgt  = dm.load("sine_wave", train=False)

Source-only baseline

model_so = MLPRegressor(input_dim=1, latent_dim=32, hidden_dims=(64, 64))

trainer_so = SourceOnlyRegressionTrainer(
    model_so, train_src, train_tgt, lr=1e-3
)
history_so = trainer_so.fit(epochs=30)

result_so = trainer_so.evaluate(test_tgt, domain="target")
print(f"Source-only → target  RMSE={result_so['rmse']:.4f}  R²={result_so['r2']:.3f}")

MMD-aligned model

model_mmd = MLPRegressor(input_dim=1, latent_dim=32, hidden_dims=(64, 64))

trainer_mmd = MMDRegressionTrainer(
    model_mmd, train_src, train_tgt,
    mmd_weight=0.5, warmup_epochs=5, lr=1e-3
)
history_mmd = trainer_mmd.fit(epochs=30)

result_mmd = trainer_mmd.evaluate(test_tgt, domain="target")
print(f"MMD          → target  RMSE={result_mmd['rmse']:.4f}  R²={result_mmd['r2']:.3f}")
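
The `mmd_weight` and `warmup_epochs` arguments control how the alignment term enters the loss. Below is a rough sketch of the objective, assuming the weight is switched on abruptly after `warmup_epochs` source-only epochs and that epochs are counted from zero. The tutorial does not specify either detail, so ShiftKit may ramp the weight or index epochs differently. The function names are hypothetical.

```python
def effective_mmd_weight(epoch, mmd_weight=0.5, warmup_epochs=5):
    """Return the λ applied at a given 0-based epoch (assumed hard switch)."""
    return 0.0 if epoch < warmup_epochs else mmd_weight


def total_loss(mse, mmd2, epoch, mmd_weight=0.5, warmup_epochs=5):
    """Combine the terms as MSE + λ·MMD², with λ = 0 during warmup."""
    return mse + effective_mmd_weight(epoch, mmd_weight, warmup_epochs) * mmd2


# With mmd_weight=0.5 and warmup_epochs=5, alignment starts at epoch index 5.
schedule = [effective_mmd_weight(e) for e in range(8)]
print(schedule)  # [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5]
```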

Compare training curves

import matplotlib.pyplot as plt

epochs = [e["epoch"] for e in history_so]

plt.figure(figsize=(8, 4))
plt.plot(epochs, [e["tgt_rmse"] for e in history_so],  label="Source-only (target)")
plt.plot(epochs, [e["tgt_rmse"] for e in history_mmd], label="MMD (target)")
plt.plot(epochs, [e["src_rmse"] for e in history_mmd], label="MMD (source)", linestyle="--")
plt.xlabel("Epoch")
plt.ylabel("RMSE")
plt.title("Sine wave: source and target RMSE during training")
plt.legend()
plt.tight_layout()
plt.show()

Typical results

Model	Source RMSE	Target RMSE	Target R²
Source-only	~0.06	~0.47	~0.56
MMD (λ=0.5)	~0.07	~0.12	~0.93

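Without adaptation the domain gap is large: target RMSE is roughly 8× source RMSE (~0.47 vs ~0.06). With MMD alignment, target RMSE drops to ~0.12 and target R² rises to ~0.93, at the cost of a slight increase in source RMSE (~0.07).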

Example 2 — California Housing (Geographic Split)

The domain gap

The California Housing dataset (available through scikit-learn, which fetches and caches it on first use) contains housing statistics aggregated per census block group. We split by latitude:

Source (North, lat ≥ 36°): Bay Area, Sacramento, Fresno — ~11 k samples
Target (South, lat < 36°): Greater Los Angeles, San Diego — ~9 k samples

Northern and Southern California have meaningfully different price distributions, housing density, and income patterns — a realistic geographic covariate shift.
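
The split itself is a simple threshold on the latitude feature. Here is a minimal sketch, assuming rows follow scikit-learn's California Housing feature order, where `Latitude` is column 6. `split_by_latitude` is a hypothetical helper, not a ShiftKit function, and the rows below are made-up values for illustration only.

```python
LATITUDE_COLUMN = 6  # index of "Latitude" in scikit-learn's feature order


def split_by_latitude(rows, targets, threshold=36.0, lat_col=LATITUDE_COLUMN):
    """Partition (row, target) pairs into north (lat >= threshold) and south."""
    north, south = [], []
    for row, target in zip(rows, targets):
        bucket = north if row[lat_col] >= threshold else south
        bucket.append((row, target))
    return north, south


# Made-up rows: MedInc, HouseAge, AveRooms, AveBedrms, Population, AveOccup, Latitude, Longitude
rows = [
    [8.3, 41.0, 6.9, 1.0, 322.0, 2.5, 37.88, -122.23],  # Bay Area latitude
    [3.1, 30.0, 5.0, 1.1, 900.0, 3.0, 34.05, -118.24],  # Los Angeles latitude
    [4.0, 20.0, 5.5, 1.0, 700.0, 2.8, 36.00, -119.80],  # exactly on the boundary
]
targets = [4.5, 2.1, 1.8]

source, target = split_by_latitude(rows, targets)
print(len(source), len(target))  # 2 1
```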

The prediction target is median house value (in units of $100 k).

Setup

dm = DataManager(batch_size=128)
train_src, train_tgt = dm.load("california_housing")
test_src,  test_tgt  = dm.load("california_housing", train=False)

Both domains are normalised with statistics computed from the full dataset, so feature scales are consistent across domains.
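
One way to obtain such shared scaling is to pool both domains before computing per-feature statistics. This is a sketch in plain Python: `standardise_jointly` is a hypothetical name, and ShiftKit's preprocessing may differ in detail (for example in how it treats constant features).

```python
import statistics


def standardise_jointly(src_rows, tgt_rows):
    """Z-score both domains with per-feature mean/std of the pooled data."""
    pooled = src_rows + tgt_rows
    columns = list(zip(*pooled))
    means = [statistics.fmean(col) for col in columns]
    # Population std; fall back to 1.0 so constant features do not divide by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

    return scale(src_rows), scale(tgt_rows)


src = [[1.0, 10.0], [2.0, 20.0]]
tgt = [[3.0, 30.0], [4.0, 40.0]]
src_z, tgt_z = standardise_jointly(src, tgt)
```

Because the statistics are pooled, each domain on its own will generally not have zero mean or unit variance. The remaining offset between domains is part of the shift the model has to handle.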

Source-only baseline

model_so = MLPRegressor(
    input_dim=8,          # 8 housing features
    latent_dim=64,
    hidden_dims=(128, 64),
    dropout=0.1,
)

trainer_so = SourceOnlyRegressionTrainer(
    model_so, train_src, train_tgt, lr=1e-3
)
history_so = trainer_so.fit(epochs=40)

result_so = trainer_so.evaluate(test_tgt, domain="target")
print(f"Source-only → target  RMSE={result_so['rmse']:.4f}  R²={result_so['r2']:.3f}")

MMD-aligned model

model_mmd = MLPRegressor(
    input_dim=8,
    latent_dim=64,
    hidden_dims=(128, 64),
    dropout=0.1,
)

trainer_mmd = MMDRegressionTrainer(
    model_mmd, train_src, train_tgt,
    mmd_weight=0.3, warmup_epochs=5, lr=1e-3
)
history_mmd = trainer_mmd.fit(epochs=40)

result_mmd = trainer_mmd.evaluate(test_tgt, domain="target")
print(f"MMD          → target  RMSE={result_mmd['rmse']:.4f}  R²={result_mmd['r2']:.3f}")

Evaluate on both domains

for trainer, label in [(trainer_so, "Source-only"), (trainer_mmd, "MMD")]:
    r_src = trainer.evaluate(test_src, domain="source")
    r_tgt = trainer.evaluate(test_tgt, domain="target")
    print(f"{label:12s}  src R²={r_src['r2']:.3f}  tgt R²={r_tgt['r2']:.3f}")

Typical results

Model	Source R²	Target R²	Target RMSE
Source-only	~0.64	~0.55	~0.61
MMD (λ=0.3)	~0.63	~0.61	~0.57

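The improvement from MMD is modest but consistent: target R² rises from ~0.55 to ~0.61 and target RMSE falls from ~0.61 to ~0.57, while source R² is essentially unchanged. California Housing has a relatively small geographic gap, since the two regions share many feature correlations. More pronounced shifts (e.g. different countries or years) would be expected to show larger gains.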

Tuning tips

Symptom	Likely cause	Try
Target RMSE worse with MMD than without	λ too large — MMD dominates MSE	Reduce `mmd_weight` (try 0.1–0.5)
Source RMSE rises sharply after warmup	Too aggressive alignment	Increase `warmup_epochs`
Both RMSE curves stay flat	Learning rate too low	Increase `lr` or reduce `hidden_dims`
Target RMSE oscillates	Batch size too small for stable MMD estimate	Increase `batch_size` (≥ 128 recommended)
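
The batch-size row can be checked empirically: even when both batches come from the same distribution, a mini-batch MMD² estimate fluctuates, and the fluctuation shrinks as batches grow. The sketch below uses a single Gaussian kernel on 1-D samples and a biased estimator as simplifying assumptions. The helper names are hypothetical, and the batch sizes are kept small only so that pure Python runs quickly.

```python
import math
import random
import statistics


def mmd2_1d(xs, ys, sigma=1.0):
    """Biased MMD² estimate for 1-D samples with one Gaussian kernel."""
    def mean_kernel(p, q):
        total = sum(math.exp(-(a - b) ** 2 / (2.0 * sigma ** 2)) for a in p for b in q)
        return total / (len(p) * len(q))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def estimate_spread(batch_size, repeats=30, seed=0):
    """Std of MMD² estimates over repeated batch pairs drawn from the same N(0, 1)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        xs = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
        ys = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
        estimates.append(mmd2_1d(xs, ys))
    return statistics.stdev(estimates)


spread_small = estimate_spread(batch_size=8)
spread_large = estimate_spread(batch_size=64)
print(f"batch=8: {spread_small:.4f}   batch=64: {spread_large:.4f}")
```

A noisier MMD term means noisier gradients for the encoder, which is one plausible source of the oscillating target RMSE described in the table.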

API reference

`MLPRegressor`

MLPRegressor(
    input_dim,                  # number of input features (required)
    latent_dim=64,              # bottleneck size (passed to MMD)
    output_dim=1,               # regression output size
    hidden_dims=(128, 64),      # encoder hidden layer sizes
    dropout=0.1,
)

`MMDRegressionTrainer`

MMDRegressionTrainer(
    model, source_loader, target_loader,
    mmd_weight=1.0,             # λ for the MMD term
    lr=1e-3,
    warmup_epochs=0,            # source-only pre-training epochs
    device=None,                # auto-detected
    mmd_sigmas=None,            # RBF bandwidths (default: [0.1, 1, 5, 10, 50])
)

evaluate(loader, domain) returns a dict with the keys "domain", "mse", "rmse", "r2", and "n_samples".
