Verification Engine
Edit on GitHubZ3 + Alloy 6 verification of .spec files before code generation
Last updated:
At a glance
The verification engine model-checks a .spec file against its own invariants before any code is
generated. Two solvers are wired in, each owning a disjoint class of checks:
- Z3 SMT solver (via
tools.aqua:z3-turnkey) handles all first-order-expressible checks: invariant satisfiability, consistency, preservation, dead-op detection. Unbounded proof; fast. - Alloy 6 (in-process via
org.alloytools:org.alloytools.alloy.core) handles checks whose expression uses supported second-order constructs (currently powerset^s) or temporal declarations. Bounded-scope search only; scope 5 by default, tunable via--alloy-scope.
Every verification check is routed to exactly one backend based on structural inspection of
the expression (there is no overlap), and each check line is tagged [z3] or [alloy] in the
CLI output so the attribution is visible.
For the per-feature shipped/planned/not-supported inventory of the verifier (44 rows covering temporal, powerset, set algebra, JSON output, narration, cross-solver, trust dimension, etc.), see Roadmap Verification capabilities.
Preservation VC shape
For every (operation, invariant) pair the engine asks Z3 whether the negated post-invariant is satisfiable under the pre-invariant, the operation's contract, and a synthesised frame:
unsat the invariant is preserved; sat a counterexample exists and the reporter decodes
it; unknown the solver gave up (treated as a failure).
Total checks per spec, with operations, invariants, temporal declarations:
Non-overlap routing (Z3 vs. Alloy)
Each check has exactly one owner. A small Classifier inspects the expression involved and
returns either VerifierTool.Z3 or VerifierTool.Alloy:
def classifyInvariant(inv: InvariantDecl): VerifierTool =
if inv.expr contains powerset (^s)
then VerifierTool.Alloy
else VerifierTool.Z3Rule is deterministic and structural, no runtime fall-through, no "try Z3 then Alloy". Each
CheckResult carries a tool field that surfaces in CLI output as [z3] or [alloy]. Specs
whose invariants are pure first-order (the vast majority) never invoke Alloy.
What "bounded" means for Alloy routing
Alloy's decision procedure explores instances within a scope, an upper bound on the number
of atoms per signature. Default is 5; tune with --alloy-scope N. Finding a counterexample
within scope real bug. No counterexample no bug up to scope N; it is not an unbounded
proof. The CLI line reads e.g.:
OK [alloy] global sat 23msThe sat means "a model was found within scope 5"; increasing --alloy-scope makes the search
deeper but more expensive.
--dump-alloy
Mirrors --dump-smt. Emits the generated Alloy .als source to stdout or to a file. Useful
for debugging translator output, feeding the source into a standalone Alloy Analyzer, or
snapshot-testing translator stability.
Temporal block (pragmatic v1)
Services may declare temporal properties next to invariants:
temporal everyUserReachable:
eventually(some u in users | u.active)
temporal alwaysOK:
always(all u in users | u.verified)Semantics in v1 (bounded single-state, not trace-based):
always(P)is checked by asking whether is satisfiable; if Alloy finds any -satisfying state where is false within the current scope, the property is violated.eventually(P)is checked by asking whether is satisfiable; if Alloy cannot find any -satisfying state where holds within the current scope, the property is unreachable.fairness(op)is not supported and raises a translator error; implementing it requires trace-based verification via Alloy 6'svar-sig mode, deferred to a future milestone.
The bounded-scope caveat applies: absence of a counterexample is not a proof for all sizes.
Increase --alloy-scope for deeper search.
Language convention
Inside ensures clauses, unprimed state references are pre-state; primed references (x')
are post-state; pre(x) is an explicit-but-equivalent way to name the pre-state.
Smart frame synthesis
For each state slot X, the translator classifies how the operation's ensures mentions it:
| Pattern | Synthesised axiom |
|---|---|
X' = RHS | none; the ensures fully specifies X' |
k not in X' (no other mention) | and |
X'[k].f = v | plus ; other fields of equal the corresponding pre-state fields; |
| No Prime mention | and |
| Unclassified Prime mention | no axiom (conservative; the under-constrained post-state admits more models, so the solver may return sat on the VC negation, surfacing as a spurious preservation failure the user investigates) |
Cardinality delta synthesis
A structural pattern-matcher over ensures recognises three shapes on state relations and emits
the corresponding cardinality-pre-post relationship:
- or frame-implied
Any other arithmetic shape leaves card_X_post uninterpreted, which is sound but weak;
affected preservation VCs may come back unknown.
dom(X) = dom(Y) lowers extensionally
dom(X) = dom(Y) and dom(X) != dom(Y) where both sides are Call(dom, [state-relation])
lower to (and its negation). This is load-bearing for invariants like
metadataConsistent in url_shortener.spec.
Reading diagnostics
On failure the CLI prints a spec-aware multi-line diagnostic per failing check. Diagnostics
carry a category, a primary source span, related spans, an optional counterexample, and an
opt-in suggestion.
Categories
| Category | When it fires | Typical remedy |
|---|---|---|
contradictory_invariants | global check is unsat | two invariants overlap-but-contradict, narrow one's range |
unsatisfiable_precondition | <Op>.requires is unsat | requires can never be true in isolation, relax an input predicate |
unreachable_operation | <Op>.enabled is unsat while <Op>.requires is sat | invariants block every valid pre-state, relax invariants or tighten the input type |
invariant_violation_by_operation | preservation VC is satisfied (i.e. invariant fails post) | tighten ensures so the invariant's constrained fields are pinned by = or a range predicate |
solver_timeout | Z3 returns unknown | raise --timeout, simplify the invariant, or split a heavy quantifier |
translator_limitation | IR construct not yet supported by the verifier | skip the affected check or narrow the invariant so the unsupported construct doesn't appear |
backend_error | solver crash / init failure | re-run with -v; if reproducible, file an issue with --dump-smt output |
Worked example: preservation violation
service BrokenUrlShortener {
entity UrlMapping {
click_count: Int
}
state {
metadata: Int -> lone UrlMapping
totalClicks: Int
}
operation Tamper {
input: code: Int
requires:
code in metadata
ensures:
metadata'[code].click_count = pre(metadata)[code].click_count - 100
}
operation Drain {
requires:
true
ensures:
totalClicks' = -1
}
invariant clickCountNonNegative:
all c in metadata | metadata[c].click_count >= 0
invariant totalClicksNonNegative:
totalClicks >= 0
}✘ <spec>: 2 failure(s), 0 skipped in 9 consistency checks (<elapsed>)
✘
✘ <spec>:22:2: error: operation 'Drain' violates invariant 'totalClicksNonNegative'
related: <spec>:33:2 (invariant 'totalClicksNonNegative' declared here)
Counterexample:
entities:
UrlMapping#0 { click_count = 0 }
UrlMapping#1 { click_count = 0 }
pre-state:
metadata = {}
totalClicks = 0
post-state:
metadata' = {}
totalClicks' = -1
Why this violates the invariant:
1. Invariant 'totalClicksNonNegative' requires:
(totalClicks >= 0)
2. Operation 'Drain' computes 'totalClicks' from:
(- 1)
3. The solver picked pre(totalClicks) = 0, producing post-state totalClicks' = -1.
4. The post-state value violates the bound on 'totalClicks' from invariant 'totalClicksNonNegative'.
hint: 'Drain' violates 'totalClicksNonNegative'. Tighten 'ensures' so the fields 'totalClicksNonNegative' constrains are pinned by '=' or a range predicate; see counterexample.
✘
✘ <spec>:12:2: error: operation 'Tamper' violates invariant 'clickCountNonNegative'
related: <spec>:30:2 (invariant 'clickCountNonNegative' declared here)
Counterexample:
inputs:
code = 1
entities:
UrlMapping#0 { click_count = -1 }
UrlMapping#1 { click_count = 99 }
UrlMapping#2 { click_count = -1 }
pre-state:
metadata = { 1 → UrlMapping#1 }
totalClicks = 0
post-state:
metadata' = { 1 → UrlMapping#0 }
totalClicks' = 0
Why this violates the invariant:
1. Invariant 'clickCountNonNegative' requires:
(all c in metadata | (metadata[c].click_count >= 0))
2. Operation 'Tamper' computes 'click_count' from:
(pre(metadata)[code].click_count - 100)
3. The solver picked code = 1, pre(metadata)[1].click_count = 99, producing post-state metadata'[1].click_count = -1.
4. The post-state value violates the bound on 'click_count' from invariant 'clickCountNonNegative'.
hint: 'Tamper' violates 'clickCountNonNegative' on field(s) 'click_count' — see counterexample. Tighten 'ensures' with a range predicate or a constructor that initialises click_count correctly.The hint names the violating operation, the violated invariant, and the field(s) the
invariant constrains, extracted from the IR rather than re-rendered. Suggestions are
capped at 200 chars and can be turned off entirely with --no-suggestions (the
suggestion field then comes back as null in JSON output as well). The phrasing is
intentionally non-stable text intended for humans; downstream tooling should switch on
the category enum, not on the suggestion string.
The numbered "Why this violates" block is the narration layer added in #89: a
deterministic structural walk over the violated invariant's IR, the relevant ensures
clause, and the decoded counterexample. Steps 1 and 2 pretty-print IR (fully
parenthesized, readability cost is small, ambiguity cost is none); step 3 reads
input/pre/post values directly off the model (no interpreter, no arithmetic
re-evaluation); step 4 is a fixed phrasing keyed to the invariant name. Narration is
populated for invariant_violation_by_operation, contradictory_invariants, and
unreachable_operation. Other categories surface narrative: null in JSON. Suppress
with --no-narration (mirrors --no-suggestions).
The decoder enumerates the model's entity universe, evaluates field functions, and shows
pre-/post-state relations side by side. Uninterpreted sort values like UrlMapping!val!0 are
rewritten to the readable UrlMapping#N labels.
Worked example: contradictory invariants
service Contradiction {
invariant: 1 >= 10
invariant: 1 <= 5
invariant: 1 >= 10 and 1 <= 5
}✘ <spec>: 1 failure(s), 0 skipped in 1 consistency checks (<elapsed>)
✘
✘ <spec>:2:2: error: invariants are jointly unsatisfiable — no valid state exists
related: <spec>:3:2 (invariant 'inv_1')
related: <spec>:4:2 (invariant 'inv_2')
Why these invariants conflict:
1. The verifier could not satisfy all invariants jointly.
2. Invariants involved: inv_0, inv_1, inv_2.
(Run with --explain to see the contributing pair.)
hint: The invariant set is jointly unsatisfiable; for example, review 'inv_0', 'inv_1', 'inv_2' for a pair whose range constraints cannot overlap (e.g., 'x >= 10' alongside 'x <= 5'); narrow or drop one.No counterexample is emitted here: the engine concluded unsat on the invariant conjunction
alone, so there's no model to decode. Pass --explain to extract the unsat core and sharpen
the related: list to the minimal contradictory subset; with --explain off, the related
spans cover the full invariant set.
Using the CLI
service UrlShortener {
// --- Type Definitions ---
type ShortCode = String where len(value) >= 6 and len(value) <= 10
and value matches /^[a-zA-Z0-9]+$/
type LongURL = String where len(value) > 0 and isValidURI(value)
type BaseURL = String where isValidURI(value)
// --- Entities ---
entity UrlMapping {
code: ShortCode
url: LongURL
created_at: DateTime
click_count: Int where value >= 0
invariant: isValidURI(url)
}
// --- State ---
state {
store: ShortCode -> lone LongURL
metadata: ShortCode -> lone UrlMapping
base_url: BaseURL
}
// --- Operations ---
operation Shorten {
input: url: LongURL
output: code: ShortCode, short_url: String
requires:
isValidURI(url)
ensures:
code not in pre(store)
store' = pre(store) + {code -> url}
short_url = base_url + "/" + code
#store' = #pre(store) + 1
metadata'[code].url = url
metadata'[code].click_count = 0
}
operation Resolve {
input: code: ShortCode
output: url: LongURL
requires:
code in store
ensures:
url = store[code]
store' = store
metadata'[code].click_count = pre(metadata)[code].click_count + 1
}
operation Delete {
input: code: ShortCode
requires:
code in store
ensures:
code not in store'
code not in metadata'
#store' = #pre(store) - 1
}
operation ListAll {
output: entries: Set[UrlMapping]
requires:
true
ensures:
entries = { m in metadata | true }
store' = store
}
// --- Global Invariants ---
invariant allURLsValid:
all c in store | isValidURI(store[c])
invariant metadataConsistent:
dom(store) = dom(metadata)
invariant clickCountNonNegative:
all c in metadata | metadata[c].click_count >= 0
// --- Convention Overrides ---
conventions {
Shorten.http_method = "POST"
Shorten.http_path = "/shorten"
Shorten.http_status_success = 201
Resolve.http_method = "GET"
Resolve.http_path = "/{code}"
Resolve.http_status_success = 302
Resolve.http_header "Location" = output.url
Delete.http_method = "DELETE"
Delete.http_path = "/{code}"
Delete.http_status_success = 204
ListAll.http_method = "GET"
ListAll.http_path = "/urls"
ListAll.http_status_success = 200
}
}⚠ <spec>: 4/21 checks passed; 17 skipped (17 soundness coverage) (<elapsed>)
⚠
⚠ <spec>:87:2: warning: check 'global' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:62:2: warning: check 'Delete.enabled' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:62:2: warning: check 'Delete.preserves.allURLsValid' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:62:2: warning: check 'Delete.preserves.metadataConsistent' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:62:2: warning: check 'Delete.preserves.clickCountNonNegative' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:74:2: warning: check 'ListAll.enabled' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:74:2: warning: check 'ListAll.preserves.allURLsValid' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:74:2: warning: check 'ListAll.preserves.metadataConsistent' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:74:2: warning: check 'ListAll.preserves.clickCountNonNegative' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:49:2: warning: check 'Resolve.enabled' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:49:2: warning: check 'Resolve.preserves.allURLsValid' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:49:2: warning: check 'Resolve.preserves.metadataConsistent' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:33:2: warning: check 'Shorten.requires' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:33:2: warning: check 'Shorten.enabled' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:33:2: warning: check 'Shorten.preserves.allURLsValid' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:33:2: warning: check 'Shorten.preserves.metadataConsistent' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…
⚠
⚠ <spec>:33:2: warning: check 'Shorten.preserves.clickCountNonNegative' skipped: outside lower's coverage
hint: The check involves a construct that is outside the formally verified subset captured by the Isabelle 'lower' projection. Narrow the spec to lowerable shapes — or, if the construct is essential to you…Add -v for stage timing, per-check timing, and the full check table. Each per-check line
carries its routed backend tag ([z3] or [alloy]):
service SafeCounter {
state {
count: Int
}
operation Increment {
requires:
true
ensures:
count' = count + 1
}
operation Decrement {
requires:
count > 0
ensures:
count' = count - 1
}
invariant countNonNegative:
count >= 0
} Parsed in <elapsed>ms
Built IR in <elapsed>ms
Timeout: <elapsed>ms
Alloy scope: 5
Max parallel: <cores>
✔ <spec>: 7/7 consistency checks passed (<elapsed>)
✔ [z3] [sound] global sat <elapsed>ms
✔ [z3] [sound] Decrement.requires sat <elapsed>ms
✔ [z3] [sound] Decrement.enabled sat <elapsed>ms
✔ [z3] [sound] Decrement.preserves.countNonNegative sat <elapsed>ms
✔ [z3] [sound] Increment.requires sat <elapsed>ms
✔ [z3] [sound] Increment.enabled sat <elapsed>ms
✔ [z3] [sound] Increment.preserves.countNonNegative sat <elapsed>msSkipped checks (translator coverage gap) print as a separate WARN translator limitation on check '<id>': <reason> warning block before the surviving entries in the per-check table;
they don't appear inline as a skipped row.
Options
Prop
Type
Exit codes
| Code | Meaning |
|---|---|
0 | All translated checks passed (sat); no checks were skipped. |
1 | Specification or spec-file issue: at least one ran check failed (unsat / preservation sat / timeout), or the spec could not be read / parsed / IR-built. |
2 | Translator coverage gap: the verifier hit an unsupported construct, either when translating the whole spec (e.g., under --dump-smt) or while running individual checks. In normal check runs affected checks surface as skipped; other checks may still pass. |
3 | Backend error (solver init failure or solver crash). |
4 | Soundness coverage gap: at least one check was skipped because its source exprs fall outside the Isabelle lower projection, and no higher-precedence outcome (Backend / Violation / Translator) was raised. |
Precedence (highest -> lowest): Backend (3) -> Violation (1) -> Translator (2) -> Trust (4) -> Ok (0). A backend crash always wins over a translator gap; a violation always wins over a soundness gap.
Exit 2 is a partial-success signal: the spec may still be correct, but the verifier could
not fully discharge every check. Skipped checks print as warnings with a translator_limitation
diagnostic so CI / editors can surface the coverage gap alongside successes. Exit 4 is the same
shape but driven by the verified subset: the skipped checks carry a soundness_limitation
diagnostic, signalling the spec uses constructs the formal soundness theorem does not yet cover
(rather than constructs the translator does not handle). Verify routes every check that does
run through the extracted (Isabelle-verified) translator; checks that would route through the
hand-written translator at the check-body level always skip instead.
Inspecting the SMT-LIB
--dump-smt emits a standalone .smt2 file you can hand to any SMT solver, useful for
cross-solver sanity checks, bug reports, or pasting into the
Z3 web playground.
$ spec-to-rest verify --dump-smt-out /tmp/out.smt2 fixtures/spec/url_shortener.spec
Wrote SMT-LIB to /tmp/out.smt2
$ z3 /tmp/out.smt2
satCI runs exactly this cross-check against the checked-in
fixtures/golden/smt/url_shortener.smt2 snapshot using apt-get install -y z3, if the
emitter ever produces invalid SMT-LIB or loses satisfiability, CI fails loudly.
Machine-readable output (--json / --json-out)
spec-to-rest verify --json <spec> runs the full engine and emits a structured JSON report
to stdout instead of the human-readable text. --json-out <file> writes the same JSON to a
file and keeps stdout clean. Either flag can compose with --explain (core spans land in the
JSON) and --dump-vc <dir> (VC artifacts go to the dir, JSON to stdout/file).
Exit codes (text and JSON modes alike): 0 pass, 1 violation, 2 translator gap, 3
backend error, 4 soundness gap (at least one check was skipped because its source exprs fall
outside the verified subset). Backend errors take precedence over violations and over soundness
gaps; the JSON complements the exit signal rather than replacing it.
Combining --json / --json-out with --dump-smt / --dump-alloy is rejected with a clear
error: the dump flags short-circuit before any checks run, so there is no report to serialize.
Top-level schema
{
"schemaVersion": 2,
"specFile": "fixtures/spec/url_shortener.spec",
"ok": false,
"totalMs": 842.0,
"checks": [ /* ... */ ]
}Each entry in checks mirrors the internal CheckResult 1:1:
{
"id": "Tamper.preserves.clickCountNonNegative",
"kind": "preservation",
"tool": "z3",
"trust": "best-effort",
"operationName": "Tamper",
"invariantName": "clickCountNonNegative",
"status": "unsat",
"durationMs": 142.0,
"detail": "operation 'Tamper' does not preserve invariant '...', counterexample found",
"sourceSpans": [{ "startLine": 11, "startCol": 2, "endLine": 19, "endCol": 3 }],
"diagnostic": {
"level": "error",
"category": "invariant_violation_by_operation",
"message": "operation 'Tamper' violates invariant 'clickCountNonNegative'",
"primarySpan": { "startLine": 11, "startCol": 2, "endLine": 19, "endCol": 3 },
"relatedSpans": [{ "span": {...}, "note": "invariant '...' declared here" }],
"counterexample": {
"entities": [{ "sortName": "UrlMapping", "label": "UrlMapping#0",
"rawElement": "UrlMapping!val!0",
"fields": [{ "name": "click_count",
"value": { "display": "-1", "entityLabel": null } }] }],
"stateRelations": [{ "stateName": "metadata", "side": "pre",
"entries": [{ "key": {...}, "value": {...} }] }],
"stateConstants": [],
"inputs": [{ "name": "code", "value": { "display": "2", "entityLabel": null } }]
},
"suggestion": "Tighten the 'ensures' clause so...",
"narrative": "Why this violates the invariant:\n 1. Invariant 'clickCountNonNegative' requires:\n (all c in metadata | (metadata[c].click_count >= 0))\n ...",
"coreSpans": []
}
}Passing checks have diagnostic: null. Checks without a counterexample (e.g. global
unsat) have counterexample: null. coreSpans is an empty array when --explain is off
or the backend can't provide a core. narrative is null outside the three categories that
support narration (see Reading diagnostics) or when --no-narration
is set.
Stable enum tokens
These snake_case strings are the contract consumers depend on. Breaking changes bump
schemaVersion.
| Field | Values |
|---|---|
kind | global, requires, enabled, preservation, temporal |
status | sat, unsat, unknown, skipped |
tool | z3, alloy |
trust | sound, best-effort |
level | error, warning |
category | contradictory_invariants, unsatisfiable_precondition, unreachable_operation, invariant_violation_by_operation, solver_timeout, translator_limitation, backend_error, soundness_limitation |
Example consumer queries
# Did verification pass?
spec-to-rest verify --json spec.spec | jq .ok
# List every failing check ID.
spec-to-rest verify --json spec.spec | jq '.checks[] | select(.status != "sat" and .status != "skipped") | .id'
# Extract the first preservation counterexample's inputs.
spec-to-rest verify --json spec.spec \
| jq '.checks[] | select(.diagnostic.category == "invariant_violation_by_operation") | .diagnostic.counterexample.inputs'The JSON is complementary to --dump-vc: --dump-vc emits
the raw solver inputs for replay; --json emits the decoded diagnostic data for consumption
by editors, CI bots, and aggregators.
Verification certificates
The verifier can write per-check VC artifacts for replay and audit, and can extract unsat cores for diagnosing failure causes.
Per-check VC dump (--dump-vc)
spec-to-rest verify --dump-vc out/ <spec> writes one file per check into out/, plus
out/verdicts.json:
out/
Z3-routed checks emit SMT-LIB; Alloy-routed checks emit .als. Each file is the exact
input that was passed to the live backend, replayable with a fresh z3 -in or the Alloy
CLI to re-derive the verdict against an independent solver / version. The verdicts.json
index records the verdict, duration, and tool for each check.
The optional z3-cross-check CI job iterates fixtures/golden/vc/*/verdicts.json,
replays each .smt2 file listed under an entry with tool == "z3" through native Z3,
and asserts agreement with the entry's rawStatus. The symmetric alloy-cross-check
job iterates the same verdicts.json files, selects entries with tool == "alloy",
replays each entry's .file through the native Alloy 6.2.0 CLI (java -jar org.alloytools.alloy.dist.jar exec), parses receipt.json (presence of a solution
key on the single run command = sat, absence = unsat), and asserts the same
agreement. The third sibling job, cvc5-cross-check, replays the same Z3-tool
.smt2 files through native cvc5 (cvc5 --finite-model-find --tlimit-per=30000)
to catch divergence between two independent SMT engines (defense in depth against
solver-specific soundness bugs, since neither solver has had a CVE the other shared).
A cvc5 unknown is treated as a skip (not all checks decide in 30s on cvc5's default
strategy); a definite verdict that disagrees with the recorded rawStatus fails the
job and uploads the offending .smt2 artifact. Regenerate the goldens by running
sbt "cli/run verify <spec> --dump-vc fixtures/golden/vc/<name>".
rawStatus records what the solver actually returned for the SMT-LIB (sat / unsat /
unknown). outcome records the user-facing verdict after preservation-style inversion (a
preservation check is "preserved = sat outcome" when the solver returned unsat = no
counterexample). The cross-check uses rawStatus; the CLI report uses outcome.
Unsat-core extraction (--explain)
spec-to-rest verify --explain <spec> re-runs the engine in core-extracting mode. For each
check whose outcome is unsat, the diagnostic grows an unsat core (contributing assertions): section listing the spec source positions of the assertions that participated
in the contradiction:
spec.spec:42:3: error: invariants are jointly unsatisfiable, no valid state exists
unsat core (contributing assertions):
spec.spec:14:3 contributing invariant
spec.spec:30:3 contributing invariantMechanism per backend:
- Z3 uses
Solver.assertAndTrackwith one tracker constant per assertion; afterunsat,getUnsatCore()returns the tracker names whose originating assertions form the core. - Alloy switches
SATFactorytominisat.prover(the only proof-emitting solver bundled in Pardinus 6.2.0), enablescoreGranularity = 1+coreMinimization = 1, and readsA4Solution.highLevelCoreafterunsat. Each returnedPosis mapped back to a spec span via the line-map produced during.alsrendering.
--explain adds modest overhead (Z3 switches to assert-and-track; Alloy switches to a
core-capable SAT backend and minimizes). Use it when investigating failures, not on every run.
Known limitation: Alloy's highLevelCore is empty for contradictions that depend on
arithmetic encoding (e.g. cardinality constraints #s = N). Those still produce a correct
unsat verdict; only the per-fact attribution is unavailable. Relational / quantifier-driven
contradictions return populated cores.
Mechanically verified translator (Isabelle/HOL)
The translator's correctness is mechanically validated by the universal soundness theorem in
proofs/isabelle/SpecRest/Soundness.thy. The theorem closes with zero sorry for the
verified subset (atoms; logical/arithmetic/comparison operators; state-relation
membership/cardinality/lookup/subset; set literals and set-valued operations; FieldAccess on
entity-typed values; single-state Prime/Pre collapse; quantifiers over enums and
state-relations; record-update via Skolem encoding).
When verify returns UNSAT for an in-subset obligation, that verdict reflects a property of
the spec, not a coincidence between the translator and Z3.
Isabelle's Code_Target_Scala extracts the verified translate, eval, and smt_eval
functions plus the canonical IR ADT to ~2.4 kLoC of idiomatic Scala 3 (BigInt-mapped) at
ir/generated/SpecRestGenerated.scala. The Scala
layer's translate is no longer hand-written; it is the extracted Isabelle definition.
Since #202, the IR ADT consumed by every module is also extracted.
CI builds the proofs every PR via .github/workflows/isabelle-build.yml (~1m 30s for
isabelle build SpecRest).
Trust closure: Isabelle/HOL kernel + Z3 driver + the extracted-Scala output. The previously
emitted per-run --emit-cert Lake bundles were deleted post-pivot; universal soundness
covers what they validated, structurally.
Delivered through #88 (closed 2026-04-26 after decomposition into M_L.0-M_L.4 + the #170 global-proof umbrella) and the Isabelle/HOL pivot #193.
What's intentionally not exported
Full Z3 proof terms (Alethe-format, machine-checkable by carcara / veriT) are not
exported. Solver.getProof() is reachable through the JNI binding but format / replay tooling
is unsettled and there is no concrete consumer today. Tracked separately in
#90.
Timeout, cancellation, and parallel dispatch
--timeout maps to a solver-native deadline (Z3's timeout parameter, Alloy's
Future.get(timeout, MILLISECONDS)); there is no outer IO.timeoutTo wrapper. The
solver_timeout diagnostic above is the only verifier-specific surface , the inner
deadline expiry produces a CheckOutcome.Unknown carrying that category. 0 disables the
solver-native deadline; the 30 s default is conservative headroom for heavier preservation
checks on large specs.
--parallel n (or 0 for serial) and the cancel-on-Ctrl+C / SIGINT path live entirely in
the effect layer: per-check fibers via parTraverseN, Resource[IO, WasmBackend] /
Resource[IO, AlloyBackend] for backend lifecycle, Context.interrupt() wired into
onCancel, and a CommandIOApp shutdown hook for signal handling. The full mechanics
(JMH numbers, native-memory caveats, the cancellation contract, and the Context.interrupt()
mid-call abort details) are documented in
Concurrency & Cancellation.
Set theory
Set[T] maps to Z3's built-in set theory (extensional Array T Bool under the hood). The
SMT-LIB emitter and the Z3 Java backend agree on the same surface:
| Spec form | SMT-LIB |
|---|---|
Set[Int] type | (Set Int) |
x in {A, B, C} | (or (= x A) (= x B) (= x C)), exact lowering |
x not in {A, B, C} | (not (or ...)) |
x in (a union b) etc. | (select <expr> x) |
a union b | (union a b) |
a intersect b | (intersection a b) |
a minus b | (setminus a b) |
a subset b | (subset a b) |
{ } (empty set) | ((as const (Set T)) false), requires receiver context |
{ e1, e2 } (non-empty lit) | (store (store ((as const (Set T)) false) e1 true) e2 true) |
Set-literal membership (in {A, B, C}) lowers to a disjunction of equalities rather than a
select against a skolem set; this is exact and avoids introducing uninterpreted
infrastructure for the common enum-membership pattern.
{} as a standalone expression is rejected (the element sort can't be inferred from the
literal itself); assign it to a typed receiver or constrain it by equality.
Standalone set comprehensions (entries = { x in D | P }) remain a translator throw: the
comprehension's element sort is the binder's sort (e.g. a relation's key sort), which
typically mismatches the receiver's declared type and would produce a sort-mismatch rather
than a useful verdict. Use the inline membership form instead: y in { x in S | P }.
Extending the translator
Adding a new IR expression kind is a three-file change:
verify/z3/Translator.scala, add a case totranslateExpr; declare any helper uninterpreted functions lazily viactx.declareFunc. Spans flow automatically thanks to the wrappingtranslateExprcall.verify/z3/SmtLib.scala, add a case torenderExprthat produces valid SMT-LIB text.verify/z3/Backend.scala, add a case torenderExprthat produces acom.microsoft.z3.BoolExpr/Arith/Expras appropriate.
Unit-test the new case in verify/test/z3/SmtLibGoldenTest.scala
(Z3Script shape + SMT-LIB render, no solver), and add a scenario to
verify/test/z3/BackendTest.scala if the case requires solver
round-tripping. End-to-end behaviour goes in verify/test/ConsistencyTest.scala
or verify/test/z3/CounterExampleTest.scala. Update fixtures/golden/smt/url_shortener.smt2 by dumping the new SMT
via sbt "cli/run verify --dump-smt-out fixtures/golden/smt/url_shortener.smt2 fixtures/spec/url_shortener.spec"
when the snapshot legitimately changes.
Verify-as-gate in compile
spec-to-rest compile runs the verification engine as a pre-codegen gate. If any check
reports a failure, compile exits non-zero and writes no files, the output directory stays
untouched (not even created). For a deployment profile that claims "verified correct", the
gate is the guarantee that what ships matches what was checked.
The gate shares one code path with spec-to-rest verify: same Consistency.runConsistencyChecks
call, same diagnostic reporter, same exit-code mapping (0 / 1 / 2 / 3). A spec that verify
exits 1 on will make compile exit 1 before any emit runs. A spec that verify clears
(exit 0) compiles normally.
Escape hatch
When you need to emit code against a spec that doesn't (yet) pass verification (active
debugging, a known translator-coverage gap, a demo on a counterexample spec), pass
--ignore-verify:
$ spec-to-rest compile --ignore-verify --out /tmp/out fixtures/spec/unsat_invariants.spec
WARN proceeding without verification (--ignore-verify)
OK wrote 26 files to /tmp/outThe warning prints on every --ignore-verify run; the flag is conspicuous by design, not a
silent passthrough.
Gate defaults and tuning
The gate uses verify's defaults: 30 s per-check timeout, Alloy scope 5, host-CPU parallelism.
There are no --timeout / --parallel / --alloy-scope flags on compile today. If the
defaults don't fit your spec, run spec-to-rest verify --timeout X --parallel N <spec> first
to confirm green, then spec-to-rest compile --ignore-verify <spec>. File an issue if the
pattern comes up often and the gate deserves its own tuning knobs.
Exit codes
The gate inherits verify's contract unchanged:
| Code | Meaning |
|---|---|
| 0 | Gate passed. Codegen ran; files written. |
| 1 | One or more checks reported unsat or unknown. No files written. |
| 2 | Translator coverage gap (Skipped checks with TranslatorLimitation). |
| 3 | Backend error (Z3 / Alloy crash or setup failure). |
A spec that exits 2 blocks codegen because the gate cannot prove the spec correct; use
--ignore-verify if you accept the coverage gap.
See also
- Convention Engine, the other half of the M1-M10 pipeline.
- Architecture, where verification sits in the overall compiler.
- Research: Spec Verification, the full verification design (pipeline, Alloy/Quint backends, counterexample shape).