spec_to_rest
Research

DSL Compiler Frameworks

Edit on GitHub

Parser and IDE framework evaluation (ANTLR4 vs Langium vs others)

Last updated:

Post-Audit Update (April 2026): The original analysis below recommended Langium as the primary parser/IDE framework. After a devil's advocate audit (documented in audit/06_langium_risks.md before it was folded into this update), the recommendation has changed to ANTLR4 via antlr-ng in TypeScript. The Langium analysis is preserved in full as valuable reference material, the framework remains a viable future option if IDE support becomes the top priority and the project matures further. See the Updated Recommendation section below for the revised decision and rationale.

Key audit findings that drove the change:

  1. Typir cannot handle our type system. It lacks refinement types, relation types, generics, and quantified expressions. We must build ~80% of our type checker ourselves regardless, eliminating Typir as a load-bearing advantage.
  2. Langium AI is effectively vaporware. Version 0.0.2, 20 GitHub stars, incompatible with Langium 4.0, and the "AI features that leverage LLMs directly" are listed as "coming soon." Not a differentiating factor.
  3. Fastbelt (Go-based successor) signals strategic drift. TypeFox is building a Go-based high-performance successor that is 21-33x faster than Langium. This raises questions about where TypeFox's best engineers will focus over the next 2-3 years.
  4. Community is very thin. 985 stars, 22 contributors (mostly TypeFox employees). If we hit a framework bug, we may need to debug and patch Langium ourselves.
  5. Z3 WASM works but is 2-5x slower than native. For performance-critical constraint solving, a native Z3 subprocess is preferable, with WASM as a fallback for environments where native binaries are unavailable.
  6. ANTLR4/antlr-ng provides a stronger foundation. 17k+ GitHub stars, battle-tested ALL(*) parsing, massive community, no framework lock-in, and the antlr-ng TypeScript port is production-ready and blessed by Terence Parr. We build our own type checker (which we had to do anyway) and add LSP support incrementally when needed.

Context

We are building a compiler for a formal specification DSL for REST services. The DSL requires: entities, state declarations, operations with pre/postconditions, invariants, and a type system. The compiler needs: a parser, type checker, IDE support (LSP), error reporting, and code generation (to OpenAPI, test harnesses, Z3 constraints, etc.).

This document evaluates seven language engineering frameworks for building this compiler.


1. Langium (TypeScript-based)

What it is: An open-source language engineering framework built entirely in TypeScript by TypeFox (the creators of Xtext). It is the spiritual successor to Xtext, designed from the ground up for the web and Node.js ecosystem. Currently at version 4.0, released mid-2025.

GitHub: 985 stars, 92 forks, actively maintained (last update April 2026) npm: ~254K weekly downloads; 82+ dependent packages

Grammar definition

Langium uses an EBNF-like grammar language (.langium files) that simultaneously defines:

  • The concrete syntax (what the user types)
  • The abstract syntax tree (TypeScript interfaces generated automatically)
  • Cross-references (name resolution linkage)

Key grammar features:

  • Parser rules: Person: 'person' name=ID;
  • Assignments: = (single), += (array), ?= (boolean flag)
  • Cross-references: person=[Person:ID], resolves references by name
  • Cardinalities: ? (optional), * (zero-or-more), + (one-or-more)
  • Alternatives: | for choice
  • Unordered groups: & operator for properties in any order
  • Tree-rewriting actions: For left-recursive expression patterns
  • Infix operators (4.0): Declarative precedence/associativity definitions that parse ~50% faster
  • Guard conditions: Parameterized rules with conditional sections
  • Data type rules: QualifiedName returns string: ID ('.' ID)*;
  • Rule fragments: Reusable grammar patterns

Example grammar sketch for a service DSL:

grammar ServiceSpec

entry Specification:
    (entities+=Entity | services+=Service | invariants+=Invariant)*;

Entity:
    'entity' name=ID '{'
        (fields+=Field)*
    '}';

Field:
    name=ID ':' type=TypeRef;

TypeRef:
    primitive=PrimitiveType | reference=[Entity:QualifiedName]
    | collection=CollectionType;

CollectionType:
    kind=('List' | 'Set' | 'Map') '<' typeArgs+=TypeRef (',' typeArgs+=TypeRef)* '>';

PrimitiveType:
    name=('String' | 'Integer' | 'Boolean' | 'DateTime' | 'UUID');

Service:
    'service' name=ID '{'
        ('state' '{' (stateDecls+=StateDecl)* '}')?
        (operations+=Operation)*
    '}';

StateDecl:
    name=ID ':' type=TypeRef ('=' init=Expression)?;

Operation:
    'operation' name=ID '(' (params+=Parameter (',' params+=Parameter)*)? ')'
    (':' returnType=TypeRef)?
    '{'
        ('requires' '{' preconditions+=Condition* '}')?
        ('ensures' '{' postconditions+=Condition* '}')?
    '}';

Parameter:
    name=ID ':' type=TypeRef;

Condition:
    expression=Expression ';';

Invariant:
    'invariant' name=ID 'on' target=[Entity:QualifiedName]
    '{' expression=Expression '}';

// Expressions would use infix operators in Langium 4.0
infix BinaryExpr on PrimaryExpr:
    '&&' | '||'
    > '==' | '!=' | '<' | '>' | '<=' | '>='
    > '+' | '-'
    > '*' | '/';

What you get for free

CapabilityQualityNotes
ParserExcellentChevrotain-based, outperforms ANTLR in JS benchmarks
Type-safe ASTExcellentTypeScript interfaces generated from grammar rules
Linking & scopingGoodCross-reference resolution with customizable scoping
LSP serverExcellentDeeply integrated; code completion, diagnostics, find references, hover, rename, formatting
VS Code extensionScaffoldYeoman generator creates complete extension project
CLIScaffoldGenerated alongside the language server
Web workerScaffoldFor browser-based editors
Workspace managementGoodMulti-file projects, incremental updates
Validation frameworkGoodRegister custom checks per AST node type
Code generation utilitiesBasicText generation helpers with source-map support

What you must build yourself

  • Type checker: Langium does not include a built-in type system engine. However, Typir (also by TypeFox) is a companion library (typir-langium on npm) providing type inference, assignability checking, and validation hooks that integrate into Langium's lifecycle. It is usable but still maturing.
  • Code generators: You write TypeScript functions that traverse the AST. Langium provides utilities for text generation with traceability.
  • Solver integration: No built-in support, but z3-solver npm package (official Microsoft Z3 WASM bindings) integrates naturally into the same TypeScript project.

Maturity assessment

  • Version 1.0 released 2023; version 4.0 released mid-2025 with significant features
  • Built by TypeFox, the same company behind Xtext, Theia, and Eclipse Sprotty
  • Eclipse Foundation project since 2023
  • Used in production across multiple companies (specific names not publicly listed)
  • Langium AI (announced April 2025, updated June 2025): A toolbox for grounding LLMs on DSL knowledge, provides evaluation pipelines, document splitting respecting syntactic boundaries, and BNF-derived constrained decoding for LLM token output. Directly relevant to our LLM integration needs.

Learning curve

Moderate. Any developer fluent in TypeScript can be productive within 1-2 weeks. The grammar language is intuitive for anyone familiar with BNF/EBNF. The Yeoman generator scaffolds a complete project in minutes.


2. Xtext (eclipse/JVM-based)

What it is: The mature predecessor to Langium, built on Java/Eclipse/EMF. Has been the industry standard for DSL engineering since ~2010. Still maintained (requires Java 17+, Eclipse 2024-03+).

GitHub: 823 stars, 330 forks

What xtext gives you vs Langium

AspectXtextLangium
Maturity15+ years, battle-tested5 years, rapidly maturing
Feature completenessMore complete (formatting, serialization)Catching up; most features present
Grammar languageVery similar EBNF-like syntaxNearly identical, evolved from Xtext's
ASTEMF-based EObjectsPlain TypeScript objects
Type systemXbase integration for Java-like type systemsTypir library (newer, less mature)
Code generationXtend templates (powerful)TypeScript functions
LSP supportAdded later, some architectural frictionNative, deeply integrated
Startup time~4 seconds (JVM cold start)~1 second
IDEEclipse-native + LSP for othersVS Code native + LSP for others
Web deploymentPossible but complexFirst-class (web workers)

Is eclipse still required?

For language development: practically yes. The Xtext tooling (grammar editor, generator) runs inside Eclipse. You can build standalone language servers and CLIs that don't require Eclipse at runtime, but the development workflow is Eclipse-centric. Attempts to decouple fully from Equinox have faced architectural challenges (the xtext.ide bundle still depends on org.eclipse.core.runtime).

LSP support quality

Functional but not native. Xtext's architecture was designed for Eclipse's own editor framework; LSP was retrofitted. LSP4J is used but has dependency entanglements with IDE-specific code. Works for common operations but can have gaps compared to the native Eclipse experience.

Performance for large files

This is a critical weakness. Xtext's CST consumes ~80% of memory. Full workspace builds require loading every resource. EMF objects lack thread-safe guarantees, limiting parallelization. For files exceeding 1MB, response times can exceed 1000ms.

Recommendation for xtext

Do not choose Xtext for a new project in 2026. TypeFox (who created both) explicitly recommends Langium for new projects. Xtext is in maintenance mode. The technology stack (Java, EMF, Eclipse) adds overhead without proportional benefits for our use case. The only reason to choose Xtext would be if you needed Xbase (Java-like type system integration), but Typir is filling that gap for Langium.


3. Spoofax (TU delft)

What it is: An academic language workbench from the Programming Languages group at TU Delft. Unique in offering declarative meta-languages for every aspect of language definition.

GitHub: 163 stars (Spoofax 2), 14 stars (Spoofax 3/PIE); last Spoofax 2 release: v2.5.23 (April 2025)

Architecture: Three declarative meta-languages

  1. SDF3 (Syntax Definition Formalism 3): Declarative syntax specification with disambiguation, layout sensitivity, and error recovery. More powerful than BNF-based approaches, supports scannerless parsing (no separate lexer), which means it can handle language composition without ambiguity.

  2. Statix: The most unique component. A constraint-based meta-language for static semantics using scope graphs.

    • You declare type-checking rules as constraints over terms
    • Name binding is modeled via scope graphs, a formalism where scopes are nodes and edges represent containment, import, and inheritance relationships
    • Type-checking and name resolution are unified: resolving a name and checking its type happen in the same constraint-solving framework
    • Supports generics (demonstrated with Featherweight Generic Java), structural records, parametric polymorphism
    • The specification is itself statically checked
  3. Stratego: A term-rewriting language for transformations. Used for code generation and interpretation through pattern-matching rewrite rules with strategy combinators.

What makes it unique

Statix/scope graphs is genuinely novel. Instead of hand-coding a type checker, you declaratively specify what your type system is, and the solver handles the checking. For a DSL with entities, operations, pre/postconditions, and invariants, Statix would let you express the type rules very naturally:

typeOfExpr(s, FieldAccess(e, f)) = T :-
    typeOfExpr(s, e) == ENTITY(entityScope),
    resolveField(entityScope, f) == T.

This is intellectually elegant and would produce very high-quality error messages since the constraint solver knows exactly which constraint failed.

Practical concerns

AspectAssessment
IDE supportEclipse plugins only (generated from specs)
LSPNo standalone LSP server
DistributionEclipse plugin or standalone JVM application
Learning curveSteep, SDF3 + Statix + Stratego are three separate meta-languages
CommunitySmall, primarily academic (~20-30 active contributors)
DocumentationImproving but still patchy; Spoofax 3 docs are incomplete
Spoofax 3 status"Experimental, work-in-progress, rather than recommended for production"
Z3 integrationDifficult, JVM-based, would need JNI bindings
LLM integrationNo ecosystem support
Web deploymentNot supported

Industrial use

One notable case study: OIL (Open Interaction Language), an industrial DSL for control software. Research found Spoofax more productive than Python for implementing this DSL, especially for editor services. However, this remains exceptional.

Verdict on spoofax

The Statix type system specification approach is the most powerful and theoretically sound of any framework evaluated here. However, the practical tradeoffs are severe: Eclipse-only IDE, no LSP, tiny community, steep learning curve across three meta-languages, no web or VS Code story, and Spoofax 3 is not production-ready. We should study Statix's scope graph approach as intellectual inspiration for our type checker design but not adopt Spoofax as our framework.


4. Jetbrains MPS

What it is: A projectional editing environment for DSLs. Instead of text-based editing with parsing, users directly edit the AST, which is projected as text, tables, diagrams, or mixed notations.

GitHub: 1,644 stars, 311 forks

How projectional editing differs

In traditional DSL tools: User types text -> parser converts to AST -> tools operate on AST. In MPS: User edits AST directly -> MPS projects it as whatever notation you choose.

Key consequences:

  • No parsing required: The AST is the source of truth
  • No ambiguity: Multiple languages can be freely composed without grammar conflicts
  • Rich notations: Tables, images, math notation, GUI widgets can be part of the syntax
  • No syntax errors: The editor only allows structurally valid edits

Git integration

This is the major practical problem. Since code is stored as XML (not human-readable text):

  • Standard git diff and git merge are nearly useless on MPS model files
  • MPS provides custom diff/merge tools that work on the AST level using UUIDs
  • Merge conflicts must be resolved inside MPS, rather than in any text editor or GitHub UI
  • Code review on GitHub/GitLab is impractical, you cannot read the XML diffs
  • Teams must use MPS's built-in VCS integration for effective collaboration

Scalability

  • Single-root elements handle up to ~4,000 lines without issues
  • Tested with ~100,000 lines of C code
  • Adoption curve: "a few days for most users to become accustomed"

Practical assessment for our use case

FactorAssessment
User experienceUsers must install MPS or a standalone MPS-based IDE (~500MB+)
DistributionStandalone IDE or MPS plugin (heavyweight)
Learning curveSteep for language designers; moderate for end users
Git workflowSeverely impacted, no standard code review
Web deploymentNot supported natively
LSPNot applicable (no text-based editing)
Z3 integrationPossible via Java/Kotlin, but unconventional
LLM integrationDifficult, LLMs generate text, not AST operations
CommunityModerate but niche; heavily JetBrains-dependent

Verdict on MPS

Not suitable for our use case. Our DSL will be text-based (to integrate with standard developer workflows, version control, code review, CI/CD, LLM generation). The Git story alone is disqualifying. MPS excels for DSLs used by non-programmers (business analysts, domain experts) who benefit from rich visual notations, but that is not our target audience.


5. Racket #lang + rosette

What it is: Racket is a language-oriented programming environment where creating new languages (#lang) is a first-class capability. Rosette extends Racket with solver-aided features backed by Z3.

GitHub: Rosette, 688 stars, 81 forks

Building a DSL as a #lang

In Racket, a "language" is defined by providing a reader (parser) and a module expander. You can create #lang spec-rest and have it be a fully custom syntax. What you get:

  • DrRacket IDE support: Syntax coloring, REPL, debugging, documentation
  • Macro system: The most powerful macro system in any language, can implement arbitrary syntax transformations
  • Module system: First-class language composition
  • Test framework: Built-in

The rosette/Z3 opportunity

This is where Racket becomes uniquely interesting for our project:

  • Write an interpreter for our spec language in Rosette
  • Automatically get verification ("does this precondition ever fail?"), synthesis ("generate an implementation satisfying the postcondition"), and debugging ("find a counterexample")
  • Z3 integration is built-in and deeply optimized
  • Cloudflare uses Rosette in production for DNS policy verification (topaz-lang)

Cloudflare's experience is instructive:

  • Built a Lisp-like DSL (topaz-lang) for DNS policies
  • Rosette verifies satisfiability, reachability, and exclusivity of policies
  • 7 programs verify in ~6 seconds; 50 programs in ~300 seconds
  • Limitation: Must maintain a Racket interpreter synchronized with the production Go interpreter
  • Limitation: String manipulation beyond equality is unsupported in verification mode

Practical concerns

FactorAssessment
IDE supportDrRacket (specialized IDE) + limited VS Code via racket-langserver
LSPBasic, racket-langserver exists but is not feature-rich
DistributionRequires Racket installation (~200MB)
Learning curveSteep, Racket, macros, Rosette, S-expressions
User syntaxS-expression based unless you build a custom reader (significant effort)
CommunityActive but academic-leaning (~5K GitHub stars for Racket)
Error messagesGood within Racket; custom readers need custom error handling
Code generationManual, Racket has string templating but nothing like Langium's infra
LLM integrationNo ecosystem support; LLMs struggle with S-expressions
PerformanceRosette/Z3 verification can be slow for complex specs

Verdict on racket/rosette

The wrong primary framework but the right verification backend. Building our entire compiler in Racket would mean: (a) forcing an S-expression syntax or building a custom reader from scratch, (b) losing LSP/VS Code integration quality, (c) requiring all users to install Racket, (d) making LLM integration much harder. However, Rosette's solver-aided capabilities are directly relevant to our verification needs. The optimal approach: build the compiler in Langium, generate Z3 constraints in TypeScript using the z3-solver npm package, and consider Rosette only if we need synthesis or symbolic execution capabilities that are hard to replicate directly.


6. Tree-sitter + custom tooling

What it is: tree-sitter is an incremental parsing library used by many editors (VS Code, Neovim, Helix, Zed, GitHub). You define a grammar in JavaScript, and it generates a C parser.

GitHub: 24,500 stars, 2,544 forks (by far the most popular tool in this comparison)

What tree-sitter gives you

  • Incremental parsing: Re-parses only changed regions; sub-millisecond updates
  • Error recovery: Always produces a valid tree, even with syntax errors
  • Syntax highlighting: Query-based highlighting that editors consume directly
  • Code folding: Structure-based folding
  • Indentation: Can derive indentation rules
  • Basic structural navigation: Parent/child/sibling traversal

What tree-sitter does NOT give you

  • No AST types: The parse tree is a generic CST; you must define and build your own typed AST
  • No cross-reference resolution: No name binding, no scoping
  • No type checking: Zero semantic analysis
  • No LSP server: tree-sitter is NOT a language server, it provides parsing only
  • No validation framework: You must build all diagnostics from scratch
  • No code generation infrastructure: Nothing
  • No code completion: Beyond syntax-driven suggestions

How much work is "everything else"?

This is the critical question. Estimated effort for a DSL of our complexity:

Componenttree-sitter providesEffort to build
Grammar / parserYes1-2 weeks
Typed AST from CSTNo2-3 weeks
Name resolution / scopingNo3-4 weeks
Type checkerNo4-6 weeks
Validation / error reportingNo2-3 weeks
LSP server (completion, hover, go-to-def, diagnostics, rename)No6-8 weeks
Code generationNo(Same regardless of parser)
Total additional effort beyond parser17-24 weeks

With Langium, most of the "No" items above are provided or scaffolded, reducing the effort to customization rather than implementation.

Verdict on tree-sitter

Not appropriate as the primary framework. tree-sitter is a parser, rather than a language workbench. For our DSL, we would spend months building infrastructure that Langium provides out of the box. However, tree-sitter could be useful as a secondary artifact: we could generate a tree-sitter grammar from our Langium grammar to provide syntax highlighting in editors that use tree-sitter natively (Neovim, Helix, Zed).


7. ANTLR4 + custom tooling

What it is: The most widely-used parser generator in the world. Generates parsers in Java, C#, Python, JavaScript, TypeScript, Go, C++, and more from a single grammar.

GitHub: 18,809 stars, 3,430 forks

What ANTLR4 provides

  • Parser + lexer generation from a .g4 grammar file
  • Visitor and listener patterns for AST traversal
  • Multiple target languages (11+)
  • Excellent error reporting with error recovery strategies
  • Large grammar repository (grammars-v4 on GitHub)
  • Mature, battle-tested (20+ years of development)
  • ALL(*) parsing algorithm (handles most grammars without ambiguity issues)

ANTLR4 vs Langium: The key difference

ANTLR4 gives you a parser. Langium gives you a parser + typed AST + scoping + linking + LSP server + VS Code extension + validation framework + code generation utilities.

ComponentANTLR4Langium
ParserYesYes (Chevrotain, equally capable)
Typed ASTNo (generic parse tree)Yes (generated TypeScript interfaces)
Cross-reference resolutionNoYes
ScopingNoYes (customizable)
LSP serverNoYes (comprehensive)
VS Code extensionNoYes (scaffolded)
Validation frameworkNoYes
Code generation helpersNoYes (with source maps)
Type system libraryNoTypir (companion)
LLM integration toolkitNoLangium AI

Langium uses Chevrotain (not ANTLR) internally but provides a grammar language very similar to ANTLR's. The key philosophical difference: ANTLR is a parser generator; Langium is a language workbench.

When ANTLR4 still makes sense

  • You need the parser in a non-TypeScript language (Java, C#, Python)
  • You already have an ANTLR grammar and want to reuse it
  • You want maximum control over every component
  • You need to generate parsers for multiple target platforms

Effort comparison

For our DSL with full IDE support:

  • ANTLR4 + custom everything: ~20-28 weeks
  • Langium: ~8-12 weeks (grammar + custom type checker + code generators)

Verdict on ANTLR4

Choose Langium instead. ANTLR4 is an excellent tool, but Langium wraps an equally capable parser with all the additional infrastructure we need. The only scenario where ANTLR4 would be preferable is if we needed multi-language parser targets, but our compiler will be TypeScript.


Comparative assessment matrix

CriterionLangiumXtextSpoofaxMPSRacket/Rosettetree-sitterANTLR4
Time to working parser1-2 days2-3 days3-5 days1-2 weeks1-2 weeks1-2 weeks3-5 days
Time to full IDE support2-3 weeks3-4 weeks4-6 weeksBuilt-in (but MPS-only)MonthsMonthsMonths
Error message qualityGood (customizable)GoodExcellent (constraint-based)N/A (no parse errors)VariableBasicGood
IDE support OOBExcellent (VS Code)Good (Eclipse native)Good (Eclipse only)Excellent (MPS only)Fair (DrRacket)Syntax onlyNone
Z3/solver integrationGood (z3-solver npm)Possible (JNI)DifficultPossible (Java)Excellent (native)ManualDepends on target
LLM integrationExcellent (Langium AI)NoneNonePoorNoneNoneNone
Distributionnpm package / VS Code extJVM jar / Eclipse pluginEclipse pluginStandalone IDE (~500MB)Racket installNative binaryDepends on target
Learning curveModerate (TypeScript)Moderate-High (Java/EMF)Steep (3 meta-languages)Steep (projectional)Steep (Racket/macros)Low (JS grammar)Low-Moderate
Community sizeMedium-growing (985 GH stars)Medium-declining (823 GH stars)Small-academic (163 GH stars)Medium-niche (1.6K GH stars)Small-academic (688 GH stars)Large (24.5K GH stars)Large (18.8K GH stars)
Maintenance statusActive (v4.0, 2025)Maintenance modeActive but academicActive (JetBrains)Active but slowVery activeActive

Updated recommendation

Primary choice: ANTLR4 via antlr-ng (typescript)

This section supersedes the original Langium recommendation below, which is preserved for reference. The change was driven by a devil's advocate audit that stress-tested every Langium advantage against primary sources.

ANTLR4 via antlr-ng is the chosen parser framework. The implementation language is TypeScript. We build our own type checker. Z3 integration uses native subprocess for performance-critical checks, with WASM (z3-solver npm) as fallback.

Why ANTLR4/antlr-ng over Langium

  1. Community and ecosystem stability: ANTLR has 17k+ GitHub stars, thousands of contributors, extensive documentation and books, and a 20+ year track record. Langium has 985 stars and 22 contributors, mostly TypeFox employees. If we hit a parser bug in ANTLR, we have a massive community to draw on.

  2. No framework lock-in: Langium couples grammar, AST, scoping, LSP, and validation into a single proprietary framework. Migration away from Langium requires rewriting all of these. With ANTLR4, each concern is a separate, replaceable component.

  3. Typir cannot carry our type system: The audit confirmed that Typir (Langium's companion type system library) lacks refinement types, relation types, generics, and quantified expressions, all of which our DSL requires. We must build ~80% of the type checker ourselves regardless, eliminating Typir as a meaningful advantage.

  4. Langium AI is not a factor: At v0.0.2 with 20 GitHub stars and incompatible with Langium 4.0, Langium AI cannot be counted as a differentiating feature. Framework-agnostic tools (Outlines, LMQL, Guidance) provide grammar-constrained LLM generation without parser framework coupling.

  5. Strategic risk from Fastbelt: TypeFox's announcement of Fastbelt (a Go-based successor that is 21-33x faster than Langium) signals that Langium has fundamental architectural limitations TypeFox themselves acknowledge. The best TypeFox engineers may shift focus to Fastbelt.

  6. antlr-ng is production-ready for TypeScript: The antlr-ng project is a full TypeScript port of ANTLR4, blessed by Terence Parr, with a runtime that is 9-35% faster than other JS/TS ANTLR runtimes.

  7. Z3 via native subprocess is faster: Rather than accepting the 2-5x WASM overhead for all Z3 operations, we use native Z3 as the primary backend (via subprocess) and fall back to WASM only when native binaries are unavailable. This gives us full native performance for verification.

What we give up (and how we mitigate it)

Langium advantage lostMitigation
Integrated LSP serverBuild incrementally using vscode-languageserver when IDE support becomes a priority. Our primary users interact via CLI initially.
Cross-reference resolutionBuild custom name resolution on our own IR. Study Statix scope graphs for design inspiration.
Generated TypeScript ASTDefine our own IR types (we were planning this anyway for the abstraction boundary).
VS Code extension scaffoldGenerate a basic extension with TextMate grammar from ANTLR; enhance later.

Architecture

spec-rest-lang/
  src/
    grammar/
      Spec.g4                    # ANTLR4 grammar definition
      generated/                 # antlr-ng generated parser + lexer
    ast/
      ir-types.ts                # Our own IR type definitions
      cst-to-ir.ts              # Parse tree -> IR conversion
    analysis/
      type-checker.ts            # Custom type checker (built from scratch)
      scope-resolver.ts          # Name resolution and scoping
      validator.ts               # Domain-specific validation rules
    cli/
      generator-openapi.ts       # OpenAPI code generation
      generator-tests.ts         # Test harness generation
      generator-z3.ts            # Z3 constraint generation
    solver/
      z3-native.ts               # Z3 via native subprocess (primary)
      z3-wasm.ts                 # Z3 via WASM (fallback)
  test/                          # Language tests

Secondary tools

  • Z3 native binary: Primary solver backend via subprocess for verification
  • z3-solver (npm/WASM): Fallback solver for environments without native Z3
  • vscode-languageserver: For future LSP server implementation
  • tree-sitter grammar (optional, later): Generate from ANTLR grammar for Neovim/Helix/Zed users
  • Study Statix/scope graphs: As design inspiration for our type checker's name resolution strategy

Estimated timeline

PhaseDurationDeliverable
Grammar design + ANTLR parser1-2 weeksParsing spec files, parse tree available
IR design + CST-to-IR conversion1-2 weeksTyped IR with all DSL constructs
Name resolution + scoping2-3 weeksCross-references resolve (entity refs, type refs)
Type checker4-5 weeksFull type checking with inference for expressions
Validation rules1-2 weeksDomain-specific error messages
OpenAPI code generator2-3 weeksGenerate OpenAPI specs from DSL
Z3 constraint generator2-3 weeksVerify pre/postconditions, invariants
Test harness generator2-3 weeksGenerate test suites
LSP server (if needed)3-4 weeksBasic IDE support (completion, hover, diagnostics)
Total16-27 weeksFull compiler with optional IDE support

The timeline is modestly longer than the original Langium estimate (14-23 weeks) because we build name resolution and scoping ourselves. However, this is offset by eliminating framework migration risk, annual Langium version churn, and the need to work around Langium's assumptions about AST structure.

When to reconsider Langium

Langium remains a viable future option if all of the following conditions hold:

  1. VS Code IDE support with rich completions, hover, and rename becomes a top-3 user requirement (not just nice-to-have).
  2. Langium reaches version 5.0+ with a stable API and a clear commitment from TypeFox to continued investment (not superseded by Fastbelt).
  3. Typir matures to support refinement types and generics, reducing our custom type checker burden.
  4. The community grows beyond TypeFox employees to include independent contributors.

Original recommendation (superseded)

NOTE: The recommendation below was the original analysis before the devil's advocate audit. It is preserved as reference material. The active recommendation is ANTLR4 via antlr-ng above.

Original primary choice: Langium

Langium was initially assessed as the clear winner for our project. The reasoning:

  1. Fastest path to a complete tool: Langium provides parser, AST, scoping, linking, LSP, VS Code extension, CLI, and validation out of the box. We focus engineering effort on our unique concerns: the type checker, code generators, and solver integration.

  2. TypeScript ecosystem alignment: Our compiler, IDE extension, and CLI will all be TypeScript. Z3 is available as an npm package (z3-solver). LLM APIs are trivially accessible. Everything composes naturally.

  3. Langium AI for LLM integration: TypeFox has built exactly the toolkit we need, constrained decoding that forces LLM output to conform to our DSL grammar, evaluation pipelines that use our parser/validator to score LLM output quality, and document splitting that respects syntactic boundaries for RAG.

  4. Distribution simplicity: Ship as an npm package (CLI) + VS Code extension (IDE). Users install with npm install -g specrest and code --install-extension specrest. No JVM, no Eclipse, no special IDE.

  5. Type system support via Typir: While we will need to build a custom type checker for our specific needs (entity types, operation signatures, expression types in conditions), Typir provides the scaffolding for type inference and assignability that integrates into Langium's validation lifecycle.

  6. Performance is adequate: For specification files (typically <10K lines), Langium's performance is excellent. The known performance problems (files >1MB, workspaces with thousands of files) do not apply to our use case.

  7. Future-proofed: TypeFox is working on a Go-based high-performance successor for cases that hit Langium's limits. If we ever need it, migration will be supported. But we almost certainly will not need it.

Original architecture

spec-rest-lang/
  src/
    language/
      spec-rest.langium          # Grammar definition
      spec-rest-validator.ts     # Custom validation rules
      spec-rest-type-checker.ts  # Type system (using Typir)
      spec-rest-scope-provider.ts # Custom scoping rules
    cli/
      generator-openapi.ts       # OpenAPI code generation
      generator-tests.ts         # Test harness generation
      generator-z3.ts            # Z3 constraint generation
    lsp/
      spec-rest-completion.ts    # Custom completion providers
      spec-rest-hover.ts         # Custom hover information
    ai/
      langium-ai-config.ts       # LLM grounding configuration
  extension/                     # VS Code extension
  test/                          # Language tests

Original secondary tools

  • z3-solver (npm): For pre/postcondition verification, invariant checking
  • Langium AI: For LLM-assisted spec generation and validation
  • tree-sitter grammar (optional, later): Generate from Langium grammar for Neovim/Helix/Zed users
  • Study Statix/scope graphs: As design inspiration for our type checker's name resolution strategy

Original estimated timeline

PhaseDurationDeliverable
Grammar design + basic parser1-2 weeksParsing spec files, AST available
Scoping + linking1-2 weeksCross-references resolve (entity refs, type refs)
Type checker3-4 weeksFull type checking with inference for expressions
Validation rules1-2 weeksDomain-specific error messages
LSP polish (completion, hover)1-2 weeksRich IDE experience
OpenAPI code generator2-3 weeksGenerate OpenAPI specs from DSL
Z3 constraint generator2-3 weeksVerify pre/postconditions, invariants
Test harness generator2-3 weeksGenerate test suites
Langium AI integration1-2 weeksLLM-assisted spec writing
Total14-23 weeksFull compiler with IDE support

This compared to 25-40+ weeks if we chose ANTLR4 or tree-sitter and built everything from scratch. However, the post-audit assessment found this comparison overstated the gap: much of Langium's "free" infrastructure (type checking, custom scoping, validation) still required substantial custom implementation for our DSL's complexity.


Sources

On this page