# Statistical Methodology Note v1.0

Status: reviewer reference for percentile, cap, and tail reporting semantics.

## Scope

This note specifies how Sovrient computes and publishes daily corroboration distribution metrics used in:

- `web/data/ms_zones_<DATE>.json`
- `web/data/ms_zones_latest.json`
- `web/data/sealed_catalog_<DATE>.json`
- `web/data/sealed_catalog_latest.json`

## Percentile Definition

Sovrient percentiles use a nearest-rank ceiling rule on sorted values.

- Input: sorted values `vals` of length `n`
- Percentile parameter: `p` in `[0, 1]`
- Index (0-based): `idx = ceil(p * n) - 1`
- Clamp: `idx` is clamped to `[0, n - 1]`
- Output: `vals[idx]`

Operational mappings:

- `p95 = percentile(values, 0.95)`
- `p99 = percentile(values, 0.99)`
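
The rule above can be sketched in Python as follows. This is a minimal illustration, not the production implementation: the defensive sort and the `ValueError` on empty input or out-of-range `p` are assumptions, since this note does not specify that behavior.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Nearest-rank ceiling percentile: vals[ceil(p * n) - 1], clamped."""
    if not values:
        # Assumption: empty-input handling is not defined by this note.
        raise ValueError("values must be non-empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    vals = sorted(values)              # the rule operates on sorted values
    n = len(vals)
    idx = math.ceil(p * n) - 1         # 0-based nearest-rank index
    idx = max(0, min(idx, n - 1))      # clamp to [0, n - 1]
    return vals[idx]


samples = list(range(1, 21))           # 1, 2, ..., 20
p95 = percentile(samples, 0.95)        # ceil(19.0) - 1 = 18 -> 19
p99 = percentile(samples, 0.99)        # ceil(19.8) - 1 = 19 -> 20
```

The clamp matters at `p = 0`, where `ceil(0) - 1 = -1` would otherwise index from the end of the list.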

## Temporal Metrics: Bounded vs Uncapped

Two temporal surfaces are published and must not be conflated:

1. Bounded temporal agreement (`temporal_agreement_*`):
   - Derived from confirmed-event rollups.
   - Enforces deterministic cap `TEMPORAL_AGREEMENT_CAP_SEC = 30.0`.

2. Uncapped temporal tails (`temporal_uncapped_*`):
   - Derived from all corroboration comparison rows before DT0 threshold gating.
   - Preserves raw tail behavior for methodology auditability.

Cap policy semantics:

- `temporal_cap_role = "ui_reporting_cap"` (cap is a reporting lens)
- `uncapped_metrics_authoritative_for_dispersion = true` (uncapped tails are the dispersion truth surface)
- `cap_pressure_temporal_p95 = temporal_uncapped_sec_p95 / temporal_cap_seconds`
- `cap_pressure_temporal_p99 = temporal_uncapped_sec_p99 / temporal_cap_seconds`
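
As a worked illustration of the cap pressure ratios, the sketch below divides an uncapped percentile by the cap. The function name `cap_pressure` and the rejection of non-positive caps are assumptions made for this example; the 30.0 second default mirrors `TEMPORAL_AGREEMENT_CAP_SEC`.

```python
def cap_pressure(uncapped_percentile_sec: float, cap_sec: float = 30.0) -> float:
    """Ratio of an uncapped tail percentile to the reporting cap."""
    if cap_sec <= 0:
        # Assumption: a non-positive cap is treated as a configuration error.
        raise ValueError("cap_sec must be positive")
    return uncapped_percentile_sec / cap_sec


pressure_p95 = cap_pressure(12.0)   # 12.0 / 30.0 = 0.4
pressure_p99 = cap_pressure(45.0)   # 45.0 / 30.0 = 1.5
```

A ratio above `1.0` means the uncapped percentile exceeds the cap, so the bounded surface is hiding tail mass at that percentile.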

Both surfaces publish:

- `avg`, `p95`, `p99`, `max_observed`
- `cap_hits`, `sample_count`, `cap_hit_rate`

`cap_hit_rate = cap_hits / sample_count` when `sample_count > 0`, otherwise `0`.
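
A minimal sketch of the hit-rate rule, including the zero-sample guard. The helper `count_cap_hits` and its `>=` comparison are assumptions for illustration: this note does not state whether a sample exactly at the cap counts as a hit.

```python
from typing import Iterable, Tuple


def count_cap_hits(samples_sec: Iterable[float], cap_sec: float = 30.0) -> Tuple[int, int]:
    """Return (cap_hits, sample_count). Assumes a hit is sample >= cap."""
    cap_hits = 0
    sample_count = 0
    for value in samples_sec:
        sample_count += 1
        if value >= cap_sec:
            cap_hits += 1
    return cap_hits, sample_count


def cap_hit_rate(cap_hits: int, sample_count: int) -> float:
    """cap_hits / sample_count, or 0 when there are no samples."""
    return cap_hits / sample_count if sample_count > 0 else 0.0


hits, count = count_cap_hits([1.5, 30.0, 42.0, 7.25])
rate = cap_hit_rate(hits, count)    # 2 / 4 = 0.5
```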

## Tail Stability Windows

Rolling uncapped tail stability is computed over daily corroboration logs for:

- `7d`
- `30d`
- `90d`

In `statistics_spec`, window definitions are exposed under:

- `tail_stability_windows_days` (canonical key)
- `rolling_windows_days` (backward-compatible alias)

Each window publishes:

- `days_scanned`
- `days_with_data`
- `sample_count`
- `temporal_uncapped_sec_p95`
- `temporal_uncapped_sec_p99`
- `temporal_uncapped_sec_max_observed`
- `temporal_uncapped_sec_cap_hits`
- `temporal_uncapped_sec_cap_hit_rate`
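
One way to picture a window computation is sketched below. It rests on several assumptions not stated in this note: the window ends on the export date (inclusive), samples are pooled across days before percentiles are taken, an empty window reports `None` for percentiles and maximum, and a cap hit is any sample at or above 30.0 seconds. The input shape (a mapping from date to a list of uncapped deltas) is hypothetical.

```python
import math
from datetime import date, timedelta
from typing import Dict, List, Optional

CAP_SEC = 30.0  # mirrors TEMPORAL_AGREEMENT_CAP_SEC


def nearest_rank(sorted_vals: List[float], p: float) -> Optional[float]:
    """Nearest-rank ceiling percentile; None for an empty window (assumption)."""
    n = len(sorted_vals)
    if n == 0:
        return None
    idx = max(0, min(math.ceil(p * n) - 1, n - 1))
    return sorted_vals[idx]


def window_stats(daily: Dict[date, List[float]], end: date, days: int) -> dict:
    """Pool uncapped samples over `days` calendar days ending at `end`."""
    pooled: List[float] = []
    days_with_data = 0
    for offset in range(days):
        samples = daily.get(end - timedelta(days=offset), [])
        if samples:
            days_with_data += 1
            pooled.extend(samples)
    pooled.sort()
    n = len(pooled)
    cap_hits = sum(1 for v in pooled if v >= CAP_SEC)
    return {
        "days_scanned": days,
        "days_with_data": days_with_data,
        "sample_count": n,
        "temporal_uncapped_sec_p95": nearest_rank(pooled, 0.95),
        "temporal_uncapped_sec_p99": nearest_rank(pooled, 0.99),
        "temporal_uncapped_sec_max_observed": pooled[-1] if n else None,
        "temporal_uncapped_sec_cap_hits": cap_hits,
        "temporal_uncapped_sec_cap_hit_rate": cap_hits / n if n > 0 else 0.0,
    }


def all_windows(daily: Dict[date, List[float]], end: date) -> dict:
    """Compute the 7d, 30d, and 90d windows keyed by label."""
    return {f"{d}d": window_stats(daily, end, d) for d in (7, 30, 90)}
```

Pooling raw samples rather than averaging daily percentiles keeps the window percentile consistent with the nearest-rank definition, at the cost of weighting busy days more heavily.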

## Tail Segmentation

Uncapped temporal tails are segmented by:

- `region_bucket`
- `source_pair`

Inclusion threshold:

- `TAIL_SEGMENT_MIN_SAMPLES = 3`

Outputs include both total and displayed counts:

- `region_bucket_count_total`
- `source_pair_count_total`
- `region_bucket_count_displayed`
- `source_pair_count_displayed`
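
The relationship between total and displayed counts can be sketched as below. This assumes a segment is displayed when it has at least `TAIL_SEGMENT_MIN_SAMPLES` samples (inclusive), and that each comparison row is a dict carrying the segment key plus a hypothetical `delta_sec` field; neither detail is confirmed by this note.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

TAIL_SEGMENT_MIN_SAMPLES = 3


def segment_tails(rows: Iterable[dict], key: str) -> Tuple[dict, Dict[str, List[float]]]:
    """Group rows by `key` and keep segments meeting the sample threshold."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["delta_sec"])  # hypothetical field name
    displayed = {
        name: samples
        for name, samples in groups.items()
        if len(samples) >= TAIL_SEGMENT_MIN_SAMPLES  # assumption: inclusive
    }
    counts = {
        f"{key}_count_total": len(groups),
        f"{key}_count_displayed": len(displayed),
    }
    return counts, displayed


rows = [
    {"region_bucket": "A", "source_pair": "x|y", "delta_sec": 1.0},
    {"region_bucket": "A", "source_pair": "x|y", "delta_sec": 2.0},
    {"region_bucket": "A", "source_pair": "x|z", "delta_sec": 3.0},
    {"region_bucket": "B", "source_pair": "x|y", "delta_sec": 4.0},
]
region_counts, region_segments = segment_tails(rows, "region_bucket")
# region_counts == {"region_bucket_count_total": 2, "region_bucket_count_displayed": 1}
```

Publishing both counts lets a reader see how many segments were suppressed for having too few samples.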

## Threshold Sensitivity Disclosure (Advisory)

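`threshold_sensitivity` publishes a non-gating disclosure of how many events sit near the M >= 4.0 magnitude cliff, where a small magnitude drift could flip an event across the threshold: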

- `thresholds.gul_mag_min`
- `near_threshold_counts.within_0_05`
- `near_threshold_counts.within_0_10`
- `expected_flip_risk.using_mag_drift_p95`
- `expected_flip_risk.using_mag_drift_p99`

Boundary:

- advisory only
- not settlement logic
- not trigger logic
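
For illustration only, the near-threshold counts might be derived as in the sketch below. It assumes the counts are two-sided (events within 0.05 or 0.10 magnitude units on either side of `gul_mag_min`), that `within_0_10` includes the events already counted in `within_0_05`, and that `gul_mag_min` is 4.0. The `expected_flip_risk` fields are omitted because this note does not define their formula.

```python
from typing import Dict, Iterable


def near_threshold_counts(magnitudes: Iterable[float], gul_mag_min: float = 4.0) -> Dict[str, int]:
    """Count events whose magnitude lies close to the threshold (two-sided)."""
    mags = list(magnitudes)
    # Note: values exactly on a band edge are sensitive to float representation.
    return {
        "within_0_05": sum(1 for m in mags if abs(m - gul_mag_min) <= 0.05),
        "within_0_10": sum(1 for m in mags if abs(m - gul_mag_min) <= 0.10),
    }


counts = near_threshold_counts([3.5, 3.97, 4.02, 4.08, 4.5])
# counts == {"within_0_05": 2, "within_0_10": 3}
```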

## Rounding Rules

- Percentiles and maxima: rounded to 5 decimal places
- Cap hit rates: rounded to 6 decimal places
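
A minimal sketch of applying these precisions with Python's built-in `round`. The helper names are hypothetical, and the tie-breaking mode is an assumption: this note does not say which rounding mode the exporter uses.

```python
PERCENTILE_DECIMALS = 5     # percentiles and maxima
CAP_HIT_RATE_DECIMALS = 6   # cap hit rates


def round_percentile(value: float) -> float:
    """Round a percentile or maximum to the published precision."""
    return round(value, PERCENTILE_DECIMALS)


def round_cap_hit_rate(value: float) -> float:
    """Round a cap hit rate to the published precision."""
    return round(value, CAP_HIT_RATE_DECIMALS)


p95_published = round_percentile(12.3456789)     # 12.34568
rate_published = round_cap_hit_rate(1 / 3)       # 0.333333
```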

## Determinism and Ordering Invariants

- For a given date, `ms_zones` export must complete before `sealed_catalog` export.
- `sealed_catalog` carries input artifact hashes (`ms_zones_sha256`, `dsd_sha256`) to bind provenance.
- Changes to cap values, percentile algorithm, windows, or segmentation thresholds require:
  - method note version update
  - changelog entry
  - redeploy with fresh trust probe receipt
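
The provenance binding can be pictured with the sketch below. It assumes the hashes are lowercase hex SHA-256 digests over the raw bytes of each input artifact; the function names and the file paths passed in are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 65536) -> str:
    """Hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_fields(ms_zones_path: Path, dsd_path: Path) -> dict:
    """Hash fields to embed in sealed_catalog (assumed to cover raw file bytes)."""
    return {
        "ms_zones_sha256": sha256_file(ms_zones_path),
        "dsd_sha256": sha256_file(dsd_path),
    }
```

Because `sealed_catalog` embeds the `ms_zones` hash, the `ms_zones` file must be final before hashing, which is one reason the export ordering invariant exists.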

## Versioning

- Method version: `SOVRIENT_STATS_METHOD_V1.0`
- Canon binding: `SOVOS_CANON_V1`
