Chapter 5 Schema & Contract Management


I. Chapter Purpose & Scope

Fix the versioning, compatibility, evolution policy, and validation workflow of schemas and data contracts in pipelines; define reserved keys such as schema_ref/compat_mode/evolution_policy; standardize contract registration, shadow comparison, and release gates; ensure consistency with Dataset/Model Cards, the Metrology chapter, and citation anchors.

II. Terminology & Dependencies


III. Fields & Structure (Normative)

contract:
  schema_ref: "contracts/<name>@vX.Y"   # versioned schema reference (required)
  compat_mode: "forward|backward|both|break"
  evolution_policy:
    add_field: "optional-by-default|feature-flag"
    remove_field: "forbid|deprecate-then-remove"
    change_type: "coercible|forbid"
    change_sematic: "requires-shadow-and-signoff"
  constraints:
    primary_key: ["<col1>", "<col2?>"]
    partition_by: ["<pcol?>"]
    unique: [["<colA>", "<colB>"]]
    not_null: ["<colX>", "<colY>"]
    range:
      - {col: "<metric>", rule: "[lo,hi]"}
    enum:
      - {col: "<status>", values: ["A", "B", "C"]}
    units: {"<col>": "<SI-unit>"}   # aligned with Metrology
  validation:
    mode: "strict|lenient"
    sample: {rows: 10000, strategy: "head|random|stratified"}
    significance: {alpha: 0.05}
  shadow:
    enabled: true
    route: "percent:5"   # shadow ratio or selector
    compare_metrics: ["dq.pass_rate", "error_rate", "latency_ms.p95"]
  lineage_bind:
    produce: ["<artifact_path>"]
    consume: ["<upstream_schema_ref>"]
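As an illustration of how the constraints block might be enforced, the following is a minimal Python sketch, not the normative validator. It assumes rows arrive as plain dictionaries, that a range rule of the form "[lo,hi]" denotes a closed numeric interval, and that null means a missing key or None. The function names check_constraints and parse_range are hypothetical.

```python
from typing import Any


def parse_range(rule: str) -> tuple[float, float]:
    """Parse a rule such as "[0,100]" into (lo, hi). Assumes a closed numeric interval."""
    lo, hi = rule.strip().strip("[]").split(",")
    return float(lo), float(hi)


def check_constraints(rows: list[dict[str, Any]], constraints: dict[str, Any]) -> list[str]:
    """Return human-readable violations; an empty list means every check passed."""
    violations: list[str] = []
    pk = constraints.get("primary_key", [])
    seen_keys: set[tuple] = set()
    for i, row in enumerate(rows):
        # not_null: the column must be present and must not be None
        for col in constraints.get("not_null", []):
            if row.get(col) is None:
                violations.append(f"row {i}: {col} is null")
        # enum: a non-null value must be one of the declared values
        for rule in constraints.get("enum", []):
            val = row.get(rule["col"])
            if val is not None and val not in rule["values"]:
                violations.append(f"row {i}: {rule['col']}={val!r} not in enum")
        # range: a non-null value must lie inside [lo, hi]
        for rule in constraints.get("range", []):
            val = row.get(rule["col"])
            lo, hi = parse_range(rule["rule"])
            if val is not None and not (lo <= val <= hi):
                violations.append(f"row {i}: {rule['col']}={val!r} outside {rule['rule']}")
        # primary_key: the tuple of key columns must be unique across rows
        if pk:
            key = tuple(row.get(c) for c in pk)
            if key in seen_keys:
                violations.append(f"row {i}: duplicate primary key {key}")
            seen_keys.add(key)
    return violations
```

In strict mode a non-empty result would presumably fail the stage; how lenient mode treats violations is not specified here and is left open.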


IV. Contract Registration & Release Workflow

  1. Registration: record schema_ref in the schema registry with checksum and change summary; first release must include a minimal example and DQ baseline.
  2. Compatibility matrix:
    • forward: downstream accepts upstream additive optional fields;
    • backward: upstream can output a subset for older downstreams;
    • both: bidirectional compatibility;
    • break: breaking changes require shadow comparison and sign-off.
  3. Evolution policy: new fields default optional; removals use a deprecate → remove two-step; type changes only when coercible, with a declared conversion rule.
  4. Release gates: schema validation = pass, DQ = pass, shadow diffs within thresholds, metrology.check_dim=true, citation anchors complete.
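The evolution policy and release gates above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: schemas are represented as {field: {"type", "required", "deprecated"}} dictionaries, the COERCIBLE table stands in for the declared conversion rules, and the gate names are hypothetical labels for the five gates in step 4.

```python
# Hypothetical conversion table: (old_type, new_type) pairs treated as coercible.
COERCIBLE = {("int", "long"), ("int", "double"), ("float", "double")}

# Hypothetical gate names mirroring the five release gates listed in step 4.
REQUIRED_GATES = (
    "schema_validation",
    "dq",
    "shadow_within_thresholds",
    "metrology_check_dim",
    "citation_anchors_complete",
)


def classify_change(old: dict[str, dict], new: dict[str, dict]) -> list[tuple[str, str]]:
    """Return (field, verdict) pairs; verdict is 'ok', 'deprecate-first', or 'break'."""
    findings: list[tuple[str, str]] = []
    for name, spec in new.items():
        if name not in old:
            # New fields default to optional; a new required field is treated as breaking.
            findings.append((name, "break" if spec.get("required") else "ok"))
        elif spec["type"] != old[name]["type"]:
            # Type changes are allowed only when a conversion rule is declared.
            pair = (old[name]["type"], spec["type"])
            findings.append((name, "ok" if pair in COERCIBLE else "break"))
    for name, spec in old.items():
        if name not in new:
            # Removal follows the two-step deprecate -> remove sequence.
            findings.append((name, "ok" if spec.get("deprecated") else "deprecate-first"))
    return findings


def release_gates_pass(gates: dict[str, bool]) -> bool:
    """All gates must be explicitly true; a missing gate counts as a failure."""
    return all(gates.get(g, False) for g in REQUIRED_GATES)
```

Treating a missing gate as a failure is a design choice of this sketch: it keeps a release from passing merely because a check was never run.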

V. Schema Design Constraints


VI. Shadow Comparison & Rollback


VII. Machine-Readable (Normative Excerpt)

layers:
  - name: "validate"
    stages:
      - name: "schema.check"
        type: "validate.schema"
        impl: "I16-2.schema_check"
        inputs: ["raw_rows"]
        outputs: ["clean_rows"]
        contract:
          schema_ref: "contracts/raw_rows@v1.2"
          compat_mode: "both"
          evolution_policy:
            add_field: "optional-by-default"
            remove_field: "deprecate-then-remove"
            change_type: "coercible"
            change_sematic: "requires-shadow-and-signoff"
          constraints:
            primary_key: ["id"]
            not_null: ["id", "ts"]
            enum: [{col: "status", values: ["ok", "warn", "err"]}]
            units: {"lat": "deg", "lon": "deg", "power_w": "W"}
          validation:
            mode: "strict"
            sample: {rows: 50000, strategy: "stratified"}
            significance: {alpha: 0.05}
          shadow:
            enabled: true
            route: "percent:5"
            compare_metrics: ["dq.pass_rate", "error_rate", "latency_ms.p95"]
          lineage_bind:
            produce: ["lake/clean/2025/09/"]
            consume: ["contracts/raw_json@v1.2"]
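One possible reading of the shadow route "percent:5" is a deterministic split that sends roughly 5% of records down the shadow path. The Python sketch below illustrates that reading only; the hash-bucket scheme and the names parse_route and in_shadow are assumptions, and other selector forms are not covered.

```python
import hashlib


def parse_route(route: str) -> float:
    """Parse "percent:N" into N. Other selector kinds are rejected in this sketch."""
    kind, _, value = route.partition(":")
    if kind != "percent":
        raise ValueError(f"unsupported route selector: {route!r}")
    pct = float(value)
    if not 0 <= pct <= 100:
        raise ValueError(f"percent out of range: {pct}")
    return pct


def in_shadow(record_key: str, route: str) -> bool:
    """Decide deterministically whether a record is mirrored to the shadow path."""
    pct = parse_route(route)
    digest = hashlib.sha256(record_key.encode("utf-8")).digest()
    # Map the key to one of 10000 buckets so that 0.01% granularity is possible.
    bucket = int.from_bytes(digest[:8], "big") % 10000
    return bucket < pct * 100
```

Hashing a stable record key, rather than drawing a random number, keeps the same record on the same path across reruns, which makes the compare_metrics values reproducible.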


VIII. Lint Rules (Excerpt, Normative)

lint_rules:
  - id: SCHEMA.REF_FORMAT
    when: "$..schema_ref"
    assert: "matches('^contracts/[a-z0-9_\\-]+@v\\d+\\.\\d+$')"
    level: error
  - id: SCHEMA.COMPAT_ALLOWED
    when: "$..compat_mode"
    assert: "value in ['forward','backward','both','break']"
    level: error
  - id: SCHEMA.UNITS_DECLARED
    when: "$..constraints.units"
    assert: "all_units_in_SI(value)"
    level: error
  - id: SCHEMA.PK_NOT_NULL
    when: "$..constraints"
    assert: "primary_key != null and all_not_null(primary_key, not_null)"
    level: error
  - id: SCHEMA.SHADOW_REQUIRED_ON_BREAK
    when: "$..compat_mode"
    assert: "value != 'break' or $.shadow.enabled == true"
    level: error
  - id: SCHEMA.METROLOGY_CHECKDIM
    when: "$.pipeline.metrology"
    assert: "units == 'SI' and check_dim == true"
    level: error
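To make four of these rules concrete, here is a small Python sketch that applies them to one contract mapping. It is an approximation, not the lint engine itself: it assumes all_not_null(primary_key, not_null) means every primary-key column also appears in not_null, and that shadow is read from the same mapping that holds compat_mode. The function name lint_contract is hypothetical.

```python
import re

# Same pattern as SCHEMA.REF_FORMAT, written as a Python raw string.
SCHEMA_REF_RE = re.compile(r"^contracts/[a-z0-9_\-]+@v\d+\.\d+$")
COMPAT_MODES = {"forward", "backward", "both", "break"}


def lint_contract(contract: dict) -> list[str]:
    """Return the ids of the rules that the contract violates, in rule order."""
    errors: list[str] = []

    if not SCHEMA_REF_RE.match(contract.get("schema_ref", "")):
        errors.append("SCHEMA.REF_FORMAT")

    mode = contract.get("compat_mode")
    if mode not in COMPAT_MODES:
        errors.append("SCHEMA.COMPAT_ALLOWED")

    constraints = contract.get("constraints", {})
    pk = constraints.get("primary_key")
    not_null = set(constraints.get("not_null", []))
    # primary_key must be set (an empty list is rejected too, a stricter reading)
    # and every key column must be declared not_null.
    if not pk or not set(pk) <= not_null:
        errors.append("SCHEMA.PK_NOT_NULL")

    # A breaking change requires shadow comparison to be enabled.
    if mode == "break" and not contract.get("shadow", {}).get("enabled", False):
        errors.append("SCHEMA.SHADOW_REQUIRED_ON_BREAK")

    return errors
```

SCHEMA.UNITS_DECLARED and SCHEMA.METROLOGY_CHECKDIM are omitted because they depend on the unit table and pipeline settings defined in the Metrology chapter.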


IX. Contract Evolution & Notices


X. Export Manifest & Audit Trail

export_manifest:
  version: "v1.0"
  artifacts:
    - {path: "contracts/raw_rows.schema.json", sha256: "..."}
    - {path: "contracts/changelog.md", sha256: "..."}
    - {path: "validate/dq.report.jsonl", sha256: "..."}
    - {path: "validate/shadow.diff.csv", sha256: "..."}
  references:
    - "EFT.WP.Core.DataSpec v1.0:EXPORT"
    - "EFT.WP.Core.Metrology v1.0:check_dim"
    - "EFT.WP.Data.DatasetCards v1.0:Ch.12"
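The sha256 entries in the manifest could be produced along the following lines. This is a minimal sketch using Python's standard hashlib; the helper names sha256_of and build_manifest are hypothetical, and the references list is left out because it is not derived from files.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large reports do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path, rel_paths: list[str], version: str = "v1.0") -> dict:
    """Build the artifacts part of an export_manifest for files under root."""
    return {
        "export_manifest": {
            "version": version,
            "artifacts": [
                {"path": p, "sha256": sha256_of(root / p)} for p in rel_paths
            ],
        }
    }
```

An auditor can rerun sha256_of on each listed path and compare the digests to confirm that the exported artifacts have not changed since release.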


XI. Chapter Compliance Checklist