spec_to_rest

Project-level phase ledger, the per-program milestone history (Cats Effect 3 migration, translator soundness, test generation, synthesis), the per-pipeline capability inventory, the live follow-up backlog, and the wontfix decisions of record.

Last updated:

This page is the single source of truth for project status. Every other page in this site focuses on how a pipeline works; this page answers what's done, what's planned, and what's intentionally not coming. It is regenerated from the per-page status content that used to live in Architecture, Verification Engine, Test Generation, and Synthesis Pipeline; those pages now link here for status questions.

Project phases

phase-N issue labels gate each entry. "Shipped" means merged on main and continuously hardened by the workflows under .github/workflows/. "Not started" means an issue thread exists but no code has merged.

PhaseThemeStatus
1Core parser / IR buildShipped. ANTLR4 grammar at parser/Spec.g4; Parse.parseSpec -> Builder.buildIR returns IO[Either[VerifyError, ServiceIR]].
2Convention engine (M1-M10)Shipped. Operation classification + naming/path/schema/validate at modules/convention/; see Convention Engine.
3python-fastapi-postgres codegen + Alembic + OpenAPIShipped. See python-fastapi-postgres for the deployment-target spec.
4Verification (Z3 + Alloy + Isabelle/HOL universal soundness)Shipped. Pivoted from Lean 4 to Isabelle/HOL via #193; IR canonicalized in #202; production verify routes in-subset checks through the Isabelle-extracted translator (#192). See Verification capabilities below.
5Test generation (native, per-target)Largely shipped. M5.2 stateful, M5.3 structural, M5.4 conformance runner, M5.6 custom strategies, M5.8 sensitive fields, M5.9 transitions, M5.10 built-in predicates all on main. M5.11 multi-target native conformance shipped — the Backend.scala seam emits the suite in each target's own language (pytest+Hypothesis+Schemathesis / Vitest+fast-check / go test+rapid) with Python as the byte-identical differential oracle (#278, #279, #280; closes #265). See Test generation follow-ups.
6LLM / Dafny synthesis (CEGIS)Shipped through M6.7. Operation classification (#31), Dafny signature generation (#32), LLM integration (#28), CEGIS loop (#29), compile --with-synthesis (#27), graduated fallback (#30), DafnyPro-style hint augmentation (#229). See Synthesis follow-ups.
7Multi-target codegen, Go/chi, TS/ExpressIn progress. Go/chi target shipped in #33; see go-chi-postgres. TS/Express target shipped in #35; see ts-express-postgres. CLI polish shipped in #36: compile --dry-run, diff, test, --color/--no-color, exit-code reference — see CLI Reference. Distribution shipped in #34: 4-platform release binaries (linux-amd64, macos-arm64, macos-amd64, windows-amd64), GHCR Docker image (ghcr.io/hardmax71/spec-to-rest), and a composite GitHub Action — see Install. Diff migrations across all three targets shipped in #56: snapshot-driven Alembic / golang-migrate / Prisma deltas with reversible up/down — see Migrations. Alternative database dialects shipped in #58: python-fastapi-sqlite and python-fastapi-mysql profiles via a pluggable Dialect strategy (per-dialect SQLAlchemy types, triggers, partial-index handling, connection/compose wiring).
8Authentication DSLNot started. #53-#55.

Recent milestones

History of internal milestone identifiers and the issue/PR they shipped through. Use this to map a section title like ## Hint-Augmentation (M6.7) back to the activation issue and closure date. Older completed milestones are summarised; newer ones get a one-line context.

M_CE.*, Cats Effect 3 migration (umbrella #95)

MilestoneThemeClosed
M_CE.1 (#96)Cats Effect 3 build dep + munit-cats-effect test harnessshipped
M_CE.2 (#97)Typed VerifyError ADT across parse/build/translate/backendshipped
M_CE.3 (#98)Resource[IO, _]-managed Z3 / Alloy / dump-sink backendsshipped
M_CE.4 (#99)Pipeline entries return IO[Either[VerifyError, _]]; Consistency.runConsistencyChecks returns IO[ConsistencyReport]shipped
M_CE.5 (#100)Parallel check dispatch (parTraverseN, --parallel <n>)shipped
M_CE.6 (#101)Cancel-aware Resource lifecycle for Z3 and Alloy backendsshipped
M_CE.7 (#102)spec-to-rest as CommandIOApp, structured IO[ExitCode], SIGINT wired to fiber cancellationshipped
M_CE.8 (#103)Single-API pipeline, synchronous wrappers removed, tests on CatsEffectSuiteshipped
M_CE.9 (#104)Concurrency docs + JMH ParallelVerifyBench with fixtures/golden/bench/parallel_verify.csvshipped

M_L.*, Translator soundness (umbrella #88, pivot #193)

Issue #88 closed 2026-04-26 after decomposition into M_L.0-M_L.4 + the global-proof umbrella #170. The Isabelle/HOL pivot #193 sealed the production cutover. See research/10 for the trust-chain framing.

MilestoneThemeClosed
M_L.0 (#126)Scope, scaffolding, contributor handoffshipped
M_L.1 (#127)IR denotational semantics for verified subset (Semantics.thy)shipped
M_L.2 (#128)Translator soundness theorem for verified subsetshipped
M_L.3 (#129)Per-run certificate emissionshipped, then deleted post-pivot because universal soundness made it vestigial
M_L.4 (#130)Subset expansion (a-l, including M_L.4.l IndexRel widen #210)shipped
Isabelle pivot (#193)Lean -> Isabelle/HOL + Code_Target_Scala; universal soundness theorem closes with zero sorryshipped
IR canonicalization (#202)Full input-language IR in Isabelle; lower :: expr_full => expr option projection; flat case-class extractionshipped
Trust dimension (#205)Per-check trust: "sound" | "best-effort"; new exit code 4 (ExitCodes.Trust); JSON schemaVersion 1->2shipped
Production routing (#192)Translator.translateExpr routes in-subset shapes through SpecRestGenerated.translate; hand-written translator only fires on out-of-subset declaration-level expressionsshipped
Codegen drift gate (#231)isabelle-build.yml regenerates SpecRestGenerated.scala and diffs against the committed copy on every PRshipped

M5.*, Test generation

MilestoneThemeClosed
M5.2Stateful tests (Hypothesis state machine)shipped
M5.3Structural tests (Schemathesis)shipped
M5.4Conformance runner (tests/run_conformance.py)shipped
M5.6Custom strategies (per-field @strategy)shipped
M5.8Sensitive fields (PII redaction in test logs)shipped
M5.9Transition tests (from/to/when transition rules)shipped
M5.10Built-in predicates (isValidURI, regex matches, len())shipped
M5.11 (#278 / #279 / #280)Native multi-target conformance — Backend.scala seam (ExprBackend/StrategyBackend/HarnessTemplates); ts-express Vitest+fast-check, go-chi go test+rapid, structural-lite for both; Python byte-identical oracle (closes #265)shipped
Temporal predicates (#86)Runtime artifacts for always(P) / eventually(P)shipped
Guarded positive transitions (#152)GuardSatisfier for predicate-conditional transitionsshipped
Per-status bundles for transition-driven entities (#153)Schemathesis bundle routing by statusshipped
Body and query inputs on via operations (#155)Operation-input plumbing for transition testsshipped

M6.*, LLM synthesis (umbrella #95-context)

MilestoneThemeClosed
M6.1 (#31)Operation classification, DIRECT_EMIT vs LLM_SYNTHESISshipped
M6.2 (#32)Dafny signature generation; inspect --format dafny[-prompt]shipped
M6.3 (#28)LLM integration + prompt engineering; synth tryshipped
M6.4 (#29)CEGIS feedback loop wired to dafny verify --log-format json; synth verifyshipped
M6.5 (#27)compile --with-synthesis, Dafny -> target translation, splice into Python projectshipped
M6.6 (#30)Graduated fallback, L1 prompt ladder + L3 model escalation + L4 skeleton + L5 report; L2 deferredshipped
M6.7 (#229)DafnyPro-style hint augmentation in repair prompts; replaces the originally planned #227 L2 path (closed wontfix-with-pivot, see research/12)shipped

Other shipped infrastructure

  • Cross-solver verdict agreement (#131), native cvc5 cross-check job in CI replays each per-check .smt2 and asserts agreement with the recorded rawStatus.
  • Property-based translator tests (#162), TranslatorPropTest.scala runs 50 random IRs/property over the verified subset.
  • Typed __init__(body) on ORM, SecretStr at the schema boundary (#212), Python codegen ergonomics.
  • Isabelle codegen drift gate (#231), isabelle-build.yml re-extracts and diffs against the committed SpecRestGenerated.scala on every PR.

Verification capabilities

Per-feature ledger for the verifier. For the prose explanation of each entry, see Verification Engine.

CapabilityStatus
IR-to-SMT translator (entities, enums, state relations, invariants)shipped
Synchronous Java Z3 backend + spec-to-rest verify CLIshipped
Invariant-satisfiability smoke checkshipped
SMT-LIB artifact export (--dump-smt)shipped
Invariant consistency checking (per-op requires, dead ops)shipped
Primitive-type alias refinement (type Positive = Int where ...)shipped
Cardinality on state relations (#store, uninterpreted >= 0)shipped
Invariant preservation checking (operation pre/post => invariant)shipped
Prime (X'), Pre(X), With record update, inline y in {...}shipped
Two-world state encoding + smart frame synthesisshipped
Cardinality pre/post delta axioms (insert / delete / identity)shipped
Counterexample-driven diagnostics (spans, entity/state decoding)shipped
Category-based error reporting + per-category suggestion hintsshipped
Verify-as-gate (refuse codegen on failed verification)shipped (#78), compile runs verify by default; --ignore-verify opts out
Machine-readable (--json / --json-out) diagnostics outputshipped
Richer suggested-fix templates (op/invariant/field-aware; opt-out via --no-suggestions)shipped
Human-readable "why this fails" narration (opt-out via --no-narration)shipped (#89), covers invariant_violation_by_operation, contradictory_invariants, unreachable_operation
Structural spec lints (type mismatch, unused entity, ...)shipped, check runs L01-L06 (#81); L04 ships a syntactic over-approximation, SAT-based overlap is candidate verify-side work
Per-check VC dump (--dump-vc <dir>), Z3 SMT-LIB and Alloy .als artifactsshipped
Unsat-core extraction (--explain), surface contributing spec spans on unsat diagnosticsshipped (Z3 always; Alloy when minisat.prover is bundled)
Native Alloy CLI cross-check job in CIshipped
Native cvc5 cross-check job in CI (cross-solver verdict agreement)shipped (#131), replays each per-check .smt2 through native cvc5 and asserts agreement with the recorded rawStatus
Set-literal membership (x in {A, B, C}), set algebra (union, intersect, minus, subset), non-state set membership via (select ...)shipped (Z3)
Standalone set comprehension as equality RHS (s = {x in D | P})shipped (Z3, via extensional sort-coerced quantifier axiom; requires binder element sort to match receiver element sort)
Powerset (^s), existential binder (some t in ^s | P(t))shipped (Alloy, bounded scope; default 5)
Powerset (^s), universal binder (all t in ^s | P(t), no t in ^s | ...)sharp error, Alloy rejects as higher-order-non-skolemizable regardless of body; rewrite as existential or as a first-order statement about s
Alloy-routed requires / enabled / preservation (invariants or ensures using ^)shipped, two-world preservation encoding (State + StatePost sigs, frame synthesis, negated post-invariant). Note: Alloy cannot discharge preservation when the invariant's negation is itself a universal over a powerset
Alloy source export (--dump-alloy, --dump-alloy-out)shipped
Temporal always(P), P holds in every state satisfying the invariantsshipped (Alloy, bounded scope; implemented as I and not P unsat-check). Runtime artifacts: temporal_always_<name> per-op behavioral test + @invariant() block in stateful test + x-temporal OpenAPI extension (see Test Generation section 4)
Temporal eventually(P), some state satisfies P and the invariantsshipped (Alloy, bounded scope; implemented as I and P sat-check). Runtime artifacts: stateful state machine emits a _eventually_seen_<name> flag in _reset, an @invariant() observer that flips it when P holds, and a teardown(self) method that asserts the flag (see Test Generation section 4). Strictly weaker than the Alloy proof, bounded by per-case trace length
Temporal fairness(op)not supported in v1, sharp error; requires trace-based verification via Alloy's var-sig mode (future work). No runtime artifacts emitted; testgen records a _testgen_skips.json entry instead
Cats Effect 3 build dependency + munit-cats-effect test harnessshipped in M_CE.1 (#96)
Typed VerifyError ADT across parse/build/translate/backendshipped in M_CE.2 (#97)
Resource[IO, _]-managed Z3 / Alloy / dump-sink backendsshipped in M_CE.3 (#98)
Pipeline entries (Parse.parseSpec, Builder.buildIR, Z3/Alloy Translator.*, WasmBackend.check, AlloyBackend.check) return IO[Either[VerifyError, _]]shipped in M_CE.4 (#99)
Consistency.runConsistencyChecks returns IO[ConsistencyReport] (per-check failures remain data inside the report, not in the Either channel)shipped in M_CE.4 (#99)
Parallel check dispatch (parTraverseN, --parallel <n>)shipped in M_CE.5 (#100)
Cancel-aware Resource lifecycle for Z3 and Alloy backends (per-check timeouts enforced inside the solvers)shipped in M_CE.6 (#101)
spec-to-rest as CommandIOApp, structured IO[ExitCode] subcommands, SIGINT wired to fiber cancellationshipped in M_CE.7 (#102)
Single-API pipeline, every Parse, Builder, Translator, WasmBackend, AlloyBackend, and Consistency entry point returns IO; synchronous wrappers removed; test suites migrated to CatsEffectSuite with per-module SpecFixtures.loadIR helpersshipped in M_CE.8 (#103)
Concurrency docs (docs/content/docs/pipelines/concurrency.mdx) + JMH ParallelVerifyBench with checked-in CSV at fixtures/golden/bench/parallel_verify.csvshipped in M_CE.9 (#104)
Property-based tests for the Z3/Alloy translators (translation determinism on identical IRs, free-identifier resolution against TranslatorArtifact, structural invariant preservation between Z3 and Alloy) via scalacheck-effect-munitshipped (#162), verify/test/TranslatorPropTest.scala runs 50 random IRs/property over the verified subset from research/10
Mechanically verified translator soundness in Isabelle/HOL: universal soundness theorem in proofs/isabelle/SpecRest/Soundness.thy closes with zero sorry over the verified subset; Code_Target_Scala extracts translate/eval/smt_eval plus the canonical IR ADT to ir/generated/SpecRestGenerated.scalashipped via the Isabelle pivot (#193); IR canonicalized in #202; CI runs isabelle build SpecRest per PR via .github/workflows/isabelle-build.yml. Per-run cert emission deleted post-pivot, universal soundness made it vestigial.
Per-check Trust dimension, each CheckResult carries trust: "sound" | "best-effort" driven by the extracted Isabelle lower projection; best-effort checks short-circuit to Skipped (category=soundness_limitation) before backend dispatch; new exit code 4 (ExitCodes.Trust)shipped (#205), JSON schemaVersion 1->2
Production verify path routes in-subset checks via the extracted (Isabelle-verified) translator, Translator.translateExpr calls lower(enums, e); on Some it routes through SpecRestGenerated.translate >>> SmtTermToZ3, on None falls back to the hand-written translator (used only by declaration-level expressions and out-of-subset shapes). Best-effort checks at the verify level always skip with soundness_limitation (no flag)shipped (#192)

Synthesis follow-ups

Open follow-ups for the M6 synthesis pipeline. For the prose context, see Synthesis Pipeline.

ConcernTracked in
Operation decomposition (L2 fallback)#227 closed wontfix-with-pivot, see research/12; replaced by M6.7 hint-augmentation
Dafny ServiceState reconciliation with SQLAlchemy / Postgresfollow-up
Go / Java targets for --with-synthesis (Python only today)follow-up
Few-shot examples machine-verified by dafny verify in CIfollow-up
Counterexample formatting (Z3 model -> human-readable)follow-up
Cross-family escalation (Anthropic -> OpenAI) within one --fallback runfollow-up

Test generation follow-ups

For the prose context, see Test Generation.

LimitTracked under
State-dependent preconditions on non-transition ops still skip ensures testsresidual tail of M5.9; not currently scoped
Guards outside > / >= / < / <= / = literal / != none / their conjunctionsextend GuardSatisfier in Behavioral.scala to cover further patterns case-by-case
Multi-target test gen (ts-express Vitest+fast-check, go-chi go test+rapid)shippedM5.11 (#139) via #278/#279/#280, closing #265
Default --with-tests after M5.4M5.12 (#140)

Closed not-planned

Decisions of record. These items have closed issues with explicit rationale; they're not on the roadmap and will not be picked up unless the underlying constraint changes.

  • Full Z3 proof-term export (Alethe / Z3-native), #90, closed 2026 not-planned. Z3 has no Alethe export and its native proof format buries quantifier instantiation in opaque steps; superseded by cross-solver agreement (#131) for solver-side defense in depth and by translator-soundness work (#88) for translator-side trust.
  • L2 operation decomposition for graduated fallback, #227, closed 2026-05-10 wontfix-with-pivot. Replaced by M6.7 hint-augmentation per the literature analysis in research/12 (DafnyComp 7% ceiling, Pass@4 plateau, +16pp from hint augmentation in DafnyPro POPL 2026).

Cross-references

On this page