Validation layer
I validate incoming tabular batches against a JSON schema, compare them to a frozen reference window (numeric PSI/KS, categorical drift), monitor prediction behaviour, and emit machine-readable reports plus a static HTML roll-up. CI exercises the same logic with pytest.
Artifacts are produced by `run_full_monitoring`; the hosted Streamlit viewer is read-only over those files — not a production observability stack.
Each incoming CSV batch runs through validation against `data/metadata/schema_definition.json`, drift estimation versus the frozen reference window, and optional prediction-behaviour checks. Outputs land under `artifacts/reports/` and `artifacts/drift/` for programmatic inspection or dashboards.
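A minimal sketch of that per-batch gate, assuming a hypothetical layout for `schema_definition.json` (the `required_columns`, `numeric_ranges`, and `categorical_domains` keys and the batch path are illustrative, not the repository's actual contract):

```python
import json
from pathlib import Path

import pandas as pd

def validate_batch(df: pd.DataFrame, schema_path: str) -> list[str]:
    """Return human-readable violations; an empty list means the batch passes."""
    schema = json.loads(Path(schema_path).read_text())
    errors: list[str] = []
    # Required columns must all be present before any value-level checks run.
    missing = set(schema["required_columns"]) - set(df.columns)
    if missing:
        errors.append(f"missing columns: {sorted(missing)}")
    # Numeric ranges: every non-null value must fall inside [lo, hi].
    for col, (lo, hi) in schema.get("numeric_ranges", {}).items():
        if col in df and not df[col].dropna().between(lo, hi).all():
            errors.append(f"{col}: values outside [{lo}, {hi}]")
    # Categorical domains: no category labels outside the declared set.
    for col, allowed in schema.get("categorical_domains", {}).items():
        if col in df:
            unknown = set(df[col].dropna()) - set(allowed)
            if unknown:
                errors.append(f"{col}: unexpected categories {sorted(unknown)}")
    return errors

batch = pd.read_csv("data/batches/batch_001.csv")  # hypothetical batch file
issues = validate_batch(batch, "data/metadata/schema_definition.json")
```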
Orchestration lives in `python -m src.pipeline.run_full_monitoring`. Simulation helpers regenerate deterministic datasets when needed.
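For flavour, a seeded generator in the spirit of those helpers; the function name, column names, and drift knob are invented for the example:

```python
import numpy as np
import pandas as pd

def simulate_batch(seed: int, n_rows: int = 1_000, shift: float = 0.0) -> pd.DataFrame:
    """Deterministic synthetic batch: the same seed reproduces the same rows."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "amount": rng.lognormal(mean=3.0 + shift, sigma=0.5, size=n_rows),
        "channel": rng.choice(["web", "store", "app"], size=n_rows),
    })

reference = simulate_batch(seed=0)           # frozen reference window
drifted = simulate_batch(seed=1, shift=0.2)  # later batch with injected drift
```

Fixing the seed is what makes regeneration deterministic: reruns reproduce the exact reference window, so drift numbers stay comparable across runs.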
- **Schema validation:** required columns, dtypes, numeric ranges, categorical domains, and missing-value policies.
- **Drift detection:** population stability index (PSI) and Kolmogorov–Smirnov tests on numeric columns, plus categorical shift summaries against the frozen reference (sketched below).
- **Prediction behaviour:** distribution checks on model scores and derived labels when present, highlighting silent shifts before downstream KPIs react.
- **Reporting:** rule metadata surfaces in JSON reports; decision summaries consolidate severity for reviewers.
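The numeric drift math reduces to a few lines. This is a generic sketch of PSI over reference quantile bins plus a two-sample KS test, not the project's own implementation:

```python
import numpy as np
from scipy import stats

def psi(reference: np.ndarray, batch: np.ndarray, bins: int = 10) -> float:
    """PSI = sum((b_i - r_i) * ln(b_i / r_i)) over reference quantile bins."""
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf   # catch values beyond the window
    ref_frac = np.histogram(reference, edges)[0] / len(reference)
    bat_frac = np.histogram(batch, edges)[0] / len(batch)
    eps = 1e-6                              # keep empty bins out of log(0)
    ref_frac = np.clip(ref_frac, eps, None)
    bat_frac = np.clip(bat_frac, eps, None)
    return float(np.sum((bat_frac - ref_frac) * np.log(bat_frac / ref_frac)))

reference = np.random.default_rng(0).normal(0.0, 1.0, 5_000)
batch = np.random.default_rng(1).normal(0.3, 1.0, 1_000)  # mean shifted
print("PSI:", psi(reference, batch))  # a shift this size lands near the common 0.1 alert level
print("KS :", stats.ks_2samp(reference, batch).statistic)
```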
The hosted Streamlit page reads the committed artifacts only — select a batch, inspect alerts, drift tables, and score histograms without recomputing metrics server-side. Treat it as an exploration aid for reviewers, not an online scoring tier.
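A read-only viewer in that spirit fits in a handful of lines; the file layout under `artifacts/reports/` and the report keys below are assumptions for illustration:

```python
import json
from pathlib import Path

import streamlit as st

reports = sorted(Path("artifacts/reports").glob("*.json"))
if not reports:
    st.warning("No committed reports found.")
    st.stop()
choice = st.selectbox("Batch report", reports, format_func=lambda p: p.stem)
report = json.loads(choice.read_text())
# Everything below renders precomputed artifacts; nothing is recalculated here.
st.metric("Alerts", len(report.get("alerts", [])))
st.dataframe(report.get("drift", []))
```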
Pytest covers drift math, alert thresholds, simulations, and exporters so changes stay grounded. GitHub Actions installs `requirements.txt` and runs the suite on each push to `main`.
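The kind of invariant such a suite can pin down, sketched as an illustrative test (the name and thresholds are mine, not the repository's):

```python
import numpy as np
from scipy import stats

def test_no_drift_when_batch_equals_reference():
    # Identical samples must yield zero KS distance and no drift alert.
    window = np.random.default_rng(42).normal(size=2_000)
    result = stats.ks_2samp(window, window)
    assert result.statistic == 0.0
    assert result.pvalue > 0.99
```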