Appendix B — Data Specification and I/O


I. One-Sentence Goal

Anchor all data objects and I/O for Early Objects to Template v0.1 (EFT Technical Whitepaper & Engineering Memos — Complete Checklist v0.1). Define schemas, units, serialization, directory layout, I/O contracts, and error semantics so that Catalog/Seeds/Trajectory, Phi_T/grad_Phi_T, L_nu/LC, n_eff, { ell_i }, Delta_T_sigma, {R_env,T_trans,A_sigma}, and both arrival-time forms T_arr/Delta_T_arr are operational, reproducible, and auditable.


II. Scope & Non-Goals


III. Global Constraints & Conventions

  1. Coords/metric/units are mandatory: coords_spec, metric_spec, units_spec must be present; normalize ingress to SI. If inputs arrive in km/ms, map to m/s and log the mapping.
  2. Inline symbols: always use backticks for T_arr, Delta_T_arr, n_eff, c_ref, gamma(ell), Sigma_env, Delta_T_sigma, etc.
  3. Naming isolation: T_fil ≠ T_trans; n ≠ n_eff.
  4. Dimensionality & lower bound: ingress must pass check_dimension. Enforce dim(T_arr)=[T], dim(n_eff)=1, dim(c_ref)=[L][T^-1]. Outputs must satisfy the lower bound T_arr ≥ L_path / c_ref (the general form is equivalent).
  5. Energy consistency at interfaces: every event satisfies R_env + T_trans + A_sigma = 1, and must produce in-band curves with residuals.
  6. Two-form arrival time:
    • Constant pull-out: T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )
    • General form: T_arr = ( ∫ ( n_eff / c_ref ) d ell )
      Record mode ∈ {constant, general}.
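The two arrival-time forms can be sketched over a discretized path as follows. This is a minimal illustration, not the Template implementation: the function names `arrival_time` and `lower_bound`, the per-segment representation of n_eff and c_ref, and the sample n_eff values are all assumptions. The general-mode bound (sum of Δell / c_ref per segment) is likewise an assumed reading of item 4.

```python
import math

def arrival_time(n_eff, d_ell, c_ref, mode="constant"):
    """Discretized T_arr over path segments (hypothetical helper).

    n_eff : per-segment effective index (dimensionless)
    d_ell : per-segment length in m
    c_ref : scalar in m/s for mode="constant", per-segment list for mode="general"
    """
    if len(n_eff) != len(d_ell):
        raise ValueError("n_eff and d_ell must have one entry per segment")
    if mode == "constant":
        # T_arr = (1 / c_ref) * sum(n_eff * d_ell)
        return sum(n * dl for n, dl in zip(n_eff, d_ell)) / float(c_ref)
    if mode == "general":
        if len(c_ref) != len(d_ell):
            raise ValueError("general mode needs one c_ref value per segment")
        # T_arr = sum((n_eff / c_ref) * d_ell)
        return sum(n / c * dl for n, c, dl in zip(n_eff, c_ref, d_ell))
    raise ValueError("mode must be 'constant' or 'general'")

def lower_bound(d_ell, c_ref, mode="constant"):
    """Lower bound L_path / c_ref (assumed per-segment form in general mode)."""
    if mode == "constant":
        return sum(d_ell) / float(c_ref)
    return sum(dl / c for dl, c in zip(d_ell, c_ref))

# Illustrative values only: segment lengths in m, hypothetical n_eff per segment.
d_ell = [2.0e2, 1.0e3]
n_eff = [1.0002, 1.0001]
c = 2.99792458e8

t_const = arrival_time(n_eff, d_ell, c, mode="constant")
t_general = arrival_time(n_eff, d_ell, [c, c], mode="general")
assert math.isclose(t_const, t_general, rel_tol=1e-12)
assert t_const >= lower_bound(d_ell, c)
```

With a spatially uniform c_ref the two modes agree to rounding error, which is why the recorded mode matters only when c_ref varies along the path.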

IV. Data Objects & Primary Keys (minimal fields)


Contract (measurement contract)


Catalog (object directory)

Required: { id, type, z_form, z_obs, env_ref, seed_ref }, plus hash(Catalog)

Seeds/Triggers

Required: priors, seed_samples (incl. seed_rng), triggers:[{event,type,time}], hash(Seeds)

Trajectory (state series)

Required: state_series:[{t, M, R, J, a_bh, SFR, Z, …}], events:[…], hash(Trajectory)

Field (fields & refractive index)


SeaProfile / Interfaces (optional)


Path


Spectral/Obs


RTParams (energy triplet)

Required: in-band curves & clamped intervals for R_env(f), T_trans(f), A_sigma(f)

CalibCref (reference speed calibration)

Required: gamma_ref_id, T_arr_ref_s, n_eff_ref_hash, c_ref_est, u_stat, u_sys, env_block

Report/Log

Required: run_id, contract_id, hashes, metrics:{eps_T,eta_T,eta_c,eta_w,tau_switch,GB,u_c}, notes

V. Serialization & Directory Layout


VI. Field & Unit Rules (key fields)


VII. I/O Contracts (aligned to Template family)

This section anchors Template APIs (not the volume’s implementation). Engineering mappings may be appended as “Template → I70-*”.


End-to-end (object → spectrum → propagation)


Causation & triggers


Energy consistency & interface audit


VIII. Data-Quality Checks (DQC, automated)


IX. Error Semantics (aligned to Template error family)


X. JSONL Examples (minimal viable)

Contract (/contracts/eo.contract.json)

{
  "id": "ct-eo-001",
  "spec_version": "EFT.WP.Cosmo.EarlyObjects v1.0",
  "coords_spec": "Comoving-Spherical",
  "units_spec": {"length":"m","time":"s","speed":"m•s^-1","frequency":"Hz"},
  "metric_spec": {"type":"FLRW-like","S_k":"sin","a_ref":1.0},
  "mode": "constant",
  "gauge": {"x_ref":[0,0,0], "t_ref":"2025-01-01T00:00:00Z"},
  "boundary_config": {"type":"Dirichlet","Phi_T_far":0},
  "tolerances": {"eps_T":1e-9,"eta_T":5e-10,"eta_w":0.03,"tau_switch":5e-12},
  "n_eff_dependencies": "F(Phi_T, grad_Phi_T, rho, f)",
  "hashes": {
    "hash(Catalog)":"aa22bb33",
    "hash(SeaProfile)":"77cc11dd",
    "hash(Phi_T)":"ab12cd34",
    "hash(grad_Phi_T)":"de98fa76",
    "hash(gamma)":"ef56ab78",
    "hash(code)":"aa11bb22"
  }
}
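A contract like the one above could be screened at ingress along these lines. This is a sketch under stated assumptions: `validate_contract` is a hypothetical helper, not a Template API, and it checks only the rules stated in Section III (the three mandatory specs and the recorded mode).

```python
from typing import List

REQUIRED_SPECS = ("coords_spec", "metric_spec", "units_spec")
ALLOWED_MODES = {"constant", "general"}

def validate_contract(contract: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    for key in REQUIRED_SPECS:
        # A spec that is absent, null, or empty counts as missing.
        if not contract.get(key):
            problems.append(f"missing required field: {key}")
    mode = contract.get("mode")
    if mode not in ALLOWED_MODES:
        problems.append(f"mode must be one of {sorted(ALLOWED_MODES)}, got {mode!r}")
    return problems

contract = {
    "id": "ct-eo-001",
    "coords_spec": "Comoving-Spherical",
    "units_spec": {"length": "m", "time": "s"},
    "metric_spec": {"type": "FLRW-like"},
    "mode": "constant",
}
print(validate_contract(contract))
```

Returning a list rather than raising on the first failure lets a run log record every violation in one pass.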

Catalog (/catalog/eo.catalog.json)

{"objects":[{"id":"obj001","type":"BHSeed","z_form":18.2,"z_obs":12.7,"env_ref":"sea_v1","seed_ref":"sd001"}]}
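The source does not state how hash(Catalog) and the other hashes are computed. One possible approach, shown purely as an assumption, is a SHA-256 digest over a canonical JSON encoding, truncated to eight hex characters to match the length of the example hashes. The helper name `content_hash` is hypothetical.

```python
import hashlib
import json

def content_hash(obj, digits: int = 8) -> str:
    """Hash a JSON-compatible object independent of key order (assumed scheme)."""
    # Canonical form: sorted keys, no insignificant whitespace, UTF-8 bytes.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:digits]

catalog = {"objects": [{"id": "obj001", "type": "BHSeed", "z_form": 18.2,
                        "z_obs": 12.7, "env_ref": "sea_v1", "seed_ref": "sd001"}]}
reordered = {"objects": [{"seed_ref": "sd001", "env_ref": "sea_v1", "z_obs": 12.7,
                          "z_form": 18.2, "type": "BHSeed", "id": "obj001"}]}

# Key order does not change the digest, so re-serialized files still audit cleanly.
assert content_hash(catalog) == content_hash(reordered)
print(content_hash(catalog))
```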

Seeds (/seeds/sd001.seeds.json)

{"id":"sd001","priors":{"M0":{"dist":"lognormal","mu":2e4,"sigma":0.3}},"seed_samples":[{"M0":2.3e4,"R0":1.5e15,"J0":1.0e50}],"seed_rng":20250905}

SeaProfile (/seaprofile/sea.v1.json)

{"layers":[{"model":"tanh","chi_k":1.2e3,"Delta_k":2.0e2,"sigma_k":1.0e2}],"eta_w":0.03,"hash(SeaProfile)":"77cc11dd"}

Path (/paths/p001.path.jsonl)

{"path_id":"p001","gamma":[[0,0,1.1e3],[0,0,1.3e3],[0,0,2.3e3]],"Δell":[2.0e2,1.0e3],"t_hat":[[0,0,1],[0,0,1]],"interface_marks":[1]}
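The Path record carries redundant geometry (gamma, Δell, t_hat), which makes a consistency check possible. The sketch below assumes, for illustration only, that gamma holds Cartesian points in metres with Euclidean distances; under that assumption it reproduces the Δell and t_hat values in the example. The helper `segments` is hypothetical.

```python
import math

def segments(gamma):
    """Return (segment lengths, unit tangents) for consecutive points of gamma."""
    lengths, tangents = [], []
    for a, b in zip(gamma, gamma[1:]):
        dl = math.dist(a, b)  # Euclidean distance (assumption, see lead-in)
        if dl == 0.0:
            raise ValueError("degenerate segment: repeated path point")
        lengths.append(dl)
        tangents.append([(q - p) / dl for p, q in zip(a, b)])
    return lengths, tangents

record = {
    "path_id": "p001",
    "gamma": [[0, 0, 1.1e3], [0, 0, 1.3e3], [0, 0, 2.3e3]],
    "Δell": [2.0e2, 1.0e3],
    "t_hat": [[0, 0, 1], [0, 0, 1]],
}

lengths, tangents = segments(record["gamma"])
# Compare derived geometry against the stored fields.
assert all(math.isclose(x, y) for x, y in zip(lengths, record["Δell"]))
assert all(math.isclose(u, v, abs_tol=1e-12)
           for t, s in zip(tangents, record["t_hat"]) for u, v in zip(t, s))
L_path = sum(lengths)  # total path length in m, used in the T_arr lower bound
```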

Observations (/obs/p001.obs.jsonl)

{"obs_id":"o001","path_id":"p001","f_hz":1.0e9,"T_arr_obs_s":6.2001e-3,"Delta_T_arr_obs_s":-7.0e-7,"u_stat_s":2.0e-6,"u_sys_s":3.0e-6,"timestamp":"2025-01-01T00:00:00Z"}
{"obs_id":"o002","path_id":"p001","f_hz":1.05e9,"T_arr_obs_s":6.2008e-3,"Delta_T_arr_obs_s":0.0,"u_stat_s":2.0e-6,"u_sys_s":3.0e-6,"timestamp":"2025-01-01T00:00:01Z"}
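The two observation records can be cross-checked against each other. The sketch below assumes that Delta_T_arr_obs_s is measured relative to the record whose Delta_T_arr_obs_s is 0.0 (o002 in the example). The source does not state this convention, but the example values are consistent with it. The helper `delta_residuals` is hypothetical.

```python
import json

JSONL = """\
{"obs_id":"o001","path_id":"p001","f_hz":1.0e9,"T_arr_obs_s":6.2001e-3,"Delta_T_arr_obs_s":-7.0e-7}
{"obs_id":"o002","path_id":"p001","f_hz":1.05e9,"T_arr_obs_s":6.2008e-3,"Delta_T_arr_obs_s":0.0}
"""

def delta_residuals(records):
    """Residual of each stored Delta_T_arr against T_arr minus the reference T_arr."""
    # Assumed convention: the reference record is the one with zero differential.
    ref = next(r for r in records if r["Delta_T_arr_obs_s"] == 0.0)
    return {
        r["obs_id"]: r["Delta_T_arr_obs_s"] - (r["T_arr_obs_s"] - ref["T_arr_obs_s"])
        for r in records
    }

# JSONL: one JSON object per non-empty line.
records = [json.loads(line) for line in JSONL.splitlines() if line.strip()]
residuals = delta_residuals(records)
for obs_id, res in residuals.items():
    print(obs_id, f"{res:.3e}")
```

Residuals near floating-point rounding level indicate that both arrival-time forms in the file agree; a larger residual would be worth comparing against u_stat_s and u_sys_s.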

RTParams (/rtparams/rt.p001.json)

{"R_env":[["9.5e8",0.18],["1.0e9",0.20],["1.05e9",0.19]],
 "T_trans":[["9.5e8",0.77],["1.0e9",0.76],["1.05e9",0.78]],
 "A_sigma":[["9.5e8",0.05],["1.0e9",0.04],["1.05e9",0.03]]}
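The energy-consistency rule R_env + T_trans + A_sigma = 1 can be checked per frequency on a record like the one above. This is a minimal sketch: `energy_residuals` is a hypothetical helper, and the tolerance of 1e-9 is an assumed value, not one taken from the contract.

```python
def energy_residuals(rt):
    """Return {frequency_hz: R_env + T_trans + A_sigma - 1} for an RTParams record."""
    # Frequencies appear as strings in the example; float() accepts strings and numbers.
    curves = {name: {float(f): v for f, v in rt[name]}
              for name in ("R_env", "T_trans", "A_sigma")}
    grid = set(curves["R_env"])
    if set(curves["T_trans"]) != grid or set(curves["A_sigma"]) != grid:
        raise ValueError("R_env, T_trans and A_sigma must share one frequency grid")
    return {f: curves["R_env"][f] + curves["T_trans"][f] + curves["A_sigma"][f] - 1.0
            for f in sorted(grid)}

rt = {
    "R_env": [["9.5e8", 0.18], ["1.0e9", 0.20], ["1.05e9", 0.19]],
    "T_trans": [["9.5e8", 0.77], ["1.0e9", 0.76], ["1.05e9", 0.78]],
    "A_sigma": [["9.5e8", 0.05], ["1.0e9", 0.04], ["1.05e9", 0.03]],
}

TOL = 1e-9  # assumed tolerance for illustration
residuals = energy_residuals(rt)
assert all(abs(r) <= TOL for r in residuals.values())
```

Keeping the signed residual per frequency, rather than a single pass/fail flag, matches the requirement that in-band curves be produced with residuals.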

CalibCref (/calib/c_ref.json)

{"gamma_ref_id":"p_ref","T_arr_ref_s":6.2000e-3,"n_eff_ref_hash":"99aa33bb",
 "c_ref_est":2.99792458e8,"u_stat":5.0e3,"u_sys":1.0e3,
 "env_block":{"temp_C":20.0,"clock":"UTC"}}


XI. Typical I/O Workflow Alignment (Template family)

The Template family is authoritative; engineering may add a “Template → I70-*” mapping.


A. Object → spectrum → propagation (E2E)


B. Energy consistency & interface audit


C. Causation & triggers


XII. Data Quality & Audit Checklist (pre-publish self-check)


XIII. Security & Integrity


XIV. Cross-Volume Alignment (data side)


XV. Deliverables