MLIR Evaluation
Assessment of LLVM MLIR for the spec-to-REST DSL compiler
Research assessment of LLVM's MLIR (Multi-Level Intermediate Representation) as infrastructure for building the spec-to-REST compiler. Covers architecture, dialect creation, parsing story, boilerplate costs, learning curve, non-ML uses, alternatives, and a final recommendation.
Table of contents
- What MLIR Actually Is
- Does MLIR Help with Parsing?
- Custom Dialect Creation: What It Takes
- The C++ Requirement
- Non-ML Uses of MLIR
- What MLIR Gives That Simpler Approaches Don't
- REST/Web Service DSLs on MLIR
- Learning Curve
- xDSL: The Python Alternative
- Fit Assessment for Spec-to-REST
- Recommendation
- Sources
1. What MLIR actually is
MLIR is a compiler infrastructure framework: it is not itself a compiler, a language, or a parser. It provides:
- A generic SSA-based IR format with operations, regions, blocks, and values
- An extensible dialect system where you define your own operations, types, and attributes as first-class IR constructs
- A transformation/pass infrastructure for writing optimization and lowering passes
- A progressive lowering model where high-level domain operations are incrementally lowered through intermediate dialects down to machine code (via LLVM IR)
- Built-in dialects for common patterns: arith, func, scf (structured control flow), affine (loop nests), memref (memory), llvm (LLVM IR)
The key architectural insight: MLIR lets multiple levels of abstraction coexist in a single IR. A
program can contain toy.transpose next to affine.for next to llvm.call, each from a
different dialect, at a different abstraction level.
Despite the name, the "ML" in MLIR does not stand for Machine Learning. Chris Lattner (MLIR's creator) and the LLVM community have explicitly stated it was conceived as general-purpose compiler infrastructure. The official MLIR website describes it as "a novel approach to building reusable and extensible compiler infrastructure" with no ML-specific limitation.
2. Does MLIR help with parsing?
No. MLIR provides zero parsing infrastructure for custom DSLs.
This is the single most important finding for our use case. MLIR's role begins after parsing is complete: the lexer, parser, and AST are the part you build yourself, and MLIR's contribution starts at the AST → MLIR boundary.
Evidence from the official Toy language tutorial (the canonical MLIR learning path):
"The code for the lexer is fairly straightforward; it is all in a single header:
examples/toy/Ch1/include/toy/Lexer.h. The parser can be found in examples/toy/Ch1/include/toy/Parser.h."
The Toy tutorial uses a hand-written recursive descent parser in C++. MLIR provides no lexer generator, no parser generator, no grammar specification mechanism, and no AST construction utilities. Chapter 1 of the Toy tutorial is entirely about the hand-written parser; Chapter 2, titled "Emitting Basic MLIR", is where MLIR begins.
ANTLR + MLIR bridge
There is an experimental community project that generates MLIR dialects from ANTLR4 grammars (discussed on LLVM Discourse, author: leothaud). It automatically:
- Transforms an ANTLR4 grammar into an MLIR dialect representing the AST
- Generates an ANTLR4-based frontend that targets this MLIR dialect
However, this project supports only a subset of ANTLR4 features and is experimental. The community reception noted it could be useful but suggested IRDL (Intermediate Representation Definition Language) for broader portability.
Implication for spec-to-REST
Our spec language needs a parser regardless. Whether we use MLIR or not, we still need to build a lexer, parser, and AST, using ANTLR, tree-sitter, pest, hand-written recursive descent, or similar. MLIR does not reduce this work at all.
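To give a sense of the work that stays on our side of the AST boundary, here is a minimal sketch of a hand-written lexer and recursive descent parser in Python. The `entity Name { field: Type, ... }` syntax and the `FieldDecl`/`EntityDecl` shapes are invented for illustration; they are assumptions, not the actual spec grammar.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical spec syntax, invented for this sketch:
#   entity User { name: String, age: Int }
TOKEN_RE = re.compile(r"\s*(?:(?P<ident>[A-Za-z_]\w*)|(?P<punct>[{}:,]))")


@dataclass
class FieldDecl:
    name: str
    type_name: str


@dataclass
class EntityDecl:
    name: str
    fields: List[FieldDecl]


def tokenize(text: str) -> List[str]:
    """Lexer: split the input into identifiers and punctuation."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group("ident") or m.group("punct"))
        pos = m.end()
    return tokens


class Parser:
    """Recursive descent: one method per grammar rule."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> str:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ""

    def expect(self, expected: str) -> None:
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {self.peek()!r}")
        self.pos += 1

    def ident(self) -> str:
        tok = self.peek()
        if not tok or not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected identifier, got {tok!r}")
        self.pos += 1
        return tok

    def parse_entity(self) -> EntityDecl:
        # entity := "entity" IDENT "{" field ("," field)* "}"
        self.expect("entity")
        name = self.ident()
        self.expect("{")
        fields = [self.parse_field()]
        while self.peek() == ",":
            self.expect(",")
            fields.append(self.parse_field())
        self.expect("}")
        return EntityDecl(name, fields)

    def parse_field(self) -> FieldDecl:
        # field := IDENT ":" IDENT
        name = self.ident()
        self.expect(":")
        return FieldDecl(name, self.ident())


def parse(text: str) -> EntityDecl:
    return Parser(tokenize(text)).parse_entity()


print(parse("entity User { name: String, age: Int }"))
```

None of this touches MLIR; the same code would be needed in front of an MLIR dialect or in front of plain dataclasses.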
3. Custom dialect creation: What it takes
Creating an MLIR dialect requires a substantial amount of scaffolding across multiple files, build systems, and languages.
3.1 Files required for a minimal dialect
mlir/include/mlir/Dialect/Foo/
  FooDialect.td      # Dialect declaration (TableGen)
  FooOps.td          # Operation definitions (TableGen)
  FooTypes.td        # Custom type definitions (TableGen, optional)
  FooDialect.h       # C++ header (partly generated)
  FooOps.h           # C++ header (partly generated)
mlir/lib/Dialect/Foo/IR/
  FooDialect.cpp     # C++ implementation
  FooOps.cpp         # Operation implementations
  CMakeLists.txt     # Build configuration
mlir/lib/Dialect/Foo/Transforms/
  FooTransforms.td   # Rewrite rules (TableGen, optional)
  CMakeLists.txt     # Build configuration

3.2 TableGen (ODS) definitions
The dialect is declared in TableGen's ODS (Operation Definition Specification) format:
// FooDialect.td
def Foo_Dialect : Dialect {
  let name = "foo";
  let cppNamespace = "foo";
}

Operations are defined declaratively:
// FooOps.td
def ConstantOp : Foo_Op<"constant"> {
  let summary = "constant operation";
  let arguments = (ins F64ElementsAttr:$value);
  let results = (outs F64Tensor:$result);
  let hasVerifier = 1;
  let builders = [
    OpBuilder<(ins "DenseElementsAttr":$value)>
  ];
  // The format can only reference names declared above, so the result is named.
  let assemblyFormat = "$value attr-dict `:` type($result)";
}

Custom types require additional definitions:
// FooTypes.td -- a parameterized type
def Foo_PolyType : TypeDef<Foo_Dialect, "Polynomial"> {
  // Keyword used by the generated parser/printer for this type.
  let mnemonic = "poly";
  let parameters = (ins "int":$degreeBound);
  let assemblyFormat = "`<` $degreeBound `>`";
}

3.3 CMake configuration
add_mlir_dialect(FooOps foo)
add_mlir_doc(FooOps FooDialect Dialects/ -gen-dialect-doc)
add_mlir_dialect_library(MLIRFoo
  FooDialect.cpp
  FooOps.cpp
  DEPENDS
  MLIRFooOpsIncGen
  LINK_LIBS PUBLIC
  MLIRIR
  MLIRSupport
)

3.4 C++ implementation
Even with TableGen generating most boilerplate, you still write C++ for:
- Dialect initialization and registration
- Custom verification logic (if hasVerifier = 1)
- Custom builders beyond what TableGen generates
- Lowering passes (the ConversionPattern implementations)
- Custom type/attribute storage classes
3.5 Effort estimate
Based on multiple tutorials and real-world examples:
| Component | Estimated lines | Language |
|---|---|---|
| TableGen dialect + ops (5-10 ops) | 100-300 | ODS/TableGen |
| C++ dialect implementation | 200-500 | C++ |
| CMake build configuration | 30-60 | CMake |
| Lowering pass (per target) | 300-1000 | C++ |
| Total minimum viable dialect | ~700-2000 | Mixed |
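The total row is a rounded figure. As a quick check, assuming a single lowering target, the component ranges from the table sum as follows:

```python
# (low, high) line-count estimates copied from the table above.
components = {
    "TableGen dialect + ops": (100, 300),
    "C++ dialect implementation": (200, 500),
    "CMake build configuration": (30, 60),
    "Lowering pass (one target)": (300, 1000),
}

low = sum(lo for lo, _ in components.values())
high = sum(hi for _, hi in components.values())
print(low, high)  # 630 1860
```

The exact sums (630 to 1,860) are consistent with the rounded "~700-2000"; each additional lowering target would add another 300-1,000 lines on top.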
The MLIR-Forge project found that individual dialect components required 56-1,519 lines of code, with each taking a developer less than one week to implement, but these developers already knew MLIR.
Jeremy Kun's tutorial on building a polynomial dialect noted that progression sped up over time but
acknowledged confusion with TableGen's class vs def distinction and uncertainty about optimal
file organization. He described the generated files as "multi-thousand line implementation files."
4. The C++ requirement
MLIR is written in C++ and requires C++ for dialect definitions. This is non-negotiable in upstream MLIR.
What this means practically
- Build times. Compiling MLIR from source takes ~1 hour on a laptop, ~10 minutes on a desktop. This is a one-time cost, but iterating on dialect changes requires incremental rebuilds of generated C++ code.
- Toolchain. Requires a full LLVM/Clang toolchain, CMake, and TableGen.
- Developer profile. The official MLIR introduction page assumes "knowledge of C++ and advanced Python, along with passing familiarity with NVIDIA CUDA."
- Integration with Python/Rust/TS. Our spec-to-REST compiler is designed around Python (primary), Rust (alternative), or TypeScript as implementation languages. Using MLIR means either (a) rewriting in C++, (b) maintaining a C++ MLIR component that communicates with our main codebase via IPC/FFI, or (c) using xDSL (Python, see Section 9).
Is it a dealbreaker?
For this project: almost certainly yes. The implementation architecture document (07) evaluated five languages and chose Python as the primary implementation language for its Z3 integration, LLM API support, development velocity, and distribution story. Introducing a mandatory C++ component for MLIR would:
- Add a second implementation language and build system (CMake alongside pip/poetry)
- Require C++ expertise on the team
- Slow iteration speed for IR changes
- Complicate distribution (native binaries vs. pure Python)
- Provide no benefit for our actual bottleneck (parsing, constraint solving, code generation via templates)
5. Non-ML uses of MLIR
MLIR is genuinely used far beyond machine learning. Notable examples:
5.1 CIRCT, hardware design
The CIRCT project (Circuit IR Compilers and Tools) applies MLIR to hardware design, replacing traditional RTL tools with MLIR-based compilation. Modern hardware DSLs like Chisel are moving their backends to MLIR. This is a major non-ML success story, demonstrating MLIR's generality for representing hardware description languages.
5.2 Flang, Fortran compiler
LLVM's new Fortran compiler (Flang) uses MLIR for its high-level IR (FIR, Fortran IR). This enables powerful transformations for array operations, loop optimizations, and OpenMP parallelism. Flang already achieves performance on par with GCC's Fortran compiler. In 2024, AMD announced its next-gen Fortran compiler will be based on Flang/MLIR.
5.3 Other non-ML uses
From the official MLIR users page:
| Project | Domain |
|---|---|
| CIRCT | Hardware design (EDA) |
| Flang | Fortran HPC compiler |
| ClangIR | C/C++ intermediate representation |
| Concrete / HEIR | Fully homomorphic encryption |
| P4HIR | Network packet processor programming |
| JSIR | JavaScript analysis / malicious code detection |
| MARCO | Modelica language compiler |
| Substrait MLIR | Database query plan representation |
| DSP-MLIR | Digital signal processing |
| Mojo | Python-compatible systems language |
| Firefly | Erlang/Elixir to WebAssembly compiler |
| Pylir | Python ahead-of-time compiler |
| Verona | Concurrent ownership research language |
| Zaozi | Hardware eDSL in Scala 3 |
5.4 Pattern
Every successful non-ML use of MLIR shares a common trait: the domain involves computational operations that benefit from optimization, lowering, and eventually code generation to machine instructions. Hardware synthesis, HPC loop nests, cryptographic circuits, and database query plans all have optimization-rich compilation pipelines where multi-level IR is genuinely valuable.
6. What MLIR gives that simpler approaches don't
What MLIR provides
- Multi-level IR coexistence. Different abstraction levels in one representation, with well-defined lowering between them
- Verification infrastructure. Built-in operation verification, type checking, and trait-based constraints
- Pass infrastructure. Sophisticated pass management, scheduling, and dependency tracking for IR transformations
- Rewrite pattern system. Declarative (DRR/PDLL) and programmatic pattern matching for IR transformations
- SSA form. An SSA-based IR with built-in dominance analysis
- Serialization. Textual and bytecode IR formats with round-tripping
- Ecosystem. Access to LLVM backend for native code generation
- Community. Active development, conferences, weekly public meetings
What our compiler actually needs
| Compiler stage | What we need | MLIR helps? |
|---|---|---|
| Parsing | Lexer + parser for spec DSL | No |
| AST construction | Typed AST from parse tree | No |
| Semantic analysis | Type checking, scope resolution | Partially (type system) |
| IR construction | Service-specific IR (entities, ops, invariants) | Partially (generic IR) |
| Constraint solving | Z3 for invariant checking | No |
| Convention mapping | Entities -> HTTP routes, DB schemas | No |
| LLM integration | Synthesis of operation bodies | No |
| Code generation | Template-based multi-target emission | No (overkill) |
| Test generation | Property-based test synthesis | No |
MLIR would only partially help with two stages (semantic analysis and IR construction), and in both cases our domain-specific needs (relational constraints, pre/postconditions, REST-specific concepts like HTTP methods, status codes, pagination) don't map naturally to MLIR's computation-oriented IR model.
The core mismatch
MLIR is designed for computational IRs: representations of programs that compute values through sequences of operations with data dependencies. Its SSA form, dominance trees, and region structure are designed for analyzing and optimizing computation.
Our spec language is declarative and structural. It describes entities, their relationships, behavioral contracts (pre/postconditions), and invariants. There is no "computation" to optimize. The IR is a structured data model that drives template-based code generation, not a computational graph that gets progressively lowered to machine instructions.
Concretely, our IR looks like this:
from dataclasses import dataclass
from typing import List

# EntityDecl, OperationDecl, InvariantDecl, and StateDecl are defined elsewhere.
@dataclass
class ServiceIR:
    entities: List[EntityDecl]        # Sig-like declarations
    operations: List[OperationDecl]   # With requires/ensures
    invariants: List[InvariantDecl]   # Global constraints
    state: StateDecl                  # Mutable state definition

This is a data structure. MLIR's infrastructure for operation scheduling, memory analysis, loop transformations, vectorization, and LLVM lowering is irrelevant here.
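To make the "data structure, not computation" point concrete, the sketch below fills in the declaration types with deliberately simple, hypothetical fields (the real definitions are not given in this document) and runs a typical analysis over an instance. The analysis is an ordinary list traversal; nothing in it needs SSA values, regions, or a pass manager.

```python
from dataclasses import dataclass, field
from typing import List


# All field layouts below are assumptions made for illustration.
@dataclass
class EntityDecl:
    name: str
    fields: List[str]


@dataclass
class OperationDecl:
    name: str
    requires: List[str] = field(default_factory=list)  # preconditions
    ensures: List[str] = field(default_factory=list)   # postconditions


@dataclass
class InvariantDecl:
    expr: str


@dataclass
class StateDecl:
    variables: List[str]


@dataclass
class ServiceIR:
    entities: List[EntityDecl]
    operations: List[OperationDecl]
    invariants: List[InvariantDecl]
    state: StateDecl


def unguarded_operations(ir: ServiceIR) -> List[str]:
    """Names of operations that declare no precondition."""
    return [op.name for op in ir.operations if not op.requires]


ir = ServiceIR(
    entities=[EntityDecl("ShortUrl", ["code", "target"])],
    operations=[
        OperationDecl("shorten", requires=["is_valid_url(target)"],
                      ensures=["code in store"]),
        OperationDecl("list_all"),
    ],
    invariants=[InvariantDecl("all codes are unique")],
    state=StateDecl(["store"]),
)

print(unguarded_operations(ir))  # ['list_all']
```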
7. REST/web service DSLs on MLIR
None found. After extensive searching, there are zero examples of REST API, web service, or HTTP-related DSLs built on MLIR. This is not an oversight; it reflects the fundamental mismatch described in Section 6.
The closest adjacent projects:
- Substrait MLIR represents database query plans (data processing, rather than web services)
- JSIR analyzes JavaScript code (code analysis, rather than service specification)
- P4HIR handles network packet processing (low-level networking, not HTTP APIs)
Existing REST/API specification tools (OpenAPI, TypeSpec, Smithy, Ballerina) all use their own purpose-built parsers and IRs. None use MLIR or any general-purpose compiler IR framework, because API specifications are structural/declarative, rather than computational.
8. Learning curve
The learning curve for MLIR is widely acknowledged as steep.
Community assessment
- Stephen Diehl's introduction to MLIR opens with "You probably shouldn't" learn MLIR, acknowledging it serves a niche audience
- The MLIR ecosystem "has a steep learning curve, which can intimidate new developers and hinders adoption"
- "Building a new dialect or pass often means delving into MLIR's internals (C++ templates, TableGen definitions, etc.) with sparse documentation"
- Engineers writing ML kernels directly in MLIR found it "a productivity challenge," which led Modular to create the Mojo language as a higher-level syntax
Specific pain points
- TableGen is its own language. You must learn ODS, a DSL embedded in TableGen, which is itself a record-based DSL, so you are learning a DSL-within-a-DSL to define your DSL.
- C++ templates. MLIR's C++ layer uses heavy template metaprogramming
- Build system. CMake + TableGen code generation adds complexity
- Sparse documentation. Many intermediate topics lack documentation; you often read source code
- Moving target. MLIR APIs evolve rapidly; code from tutorials may not compile against current HEAD
Estimated timeline
For a developer experienced in compilers but new to MLIR:
- 1-2 weeks: Complete Toy tutorial, understand basic concepts
- 2-4 weeks: Build a trivial custom dialect with a few operations
- 1-3 months: Build a useful dialect with custom types, lowering passes, and transformations
- 3-6 months: Become productive at debugging MLIR issues and extending the dialect
For a developer NOT experienced in compilers or C++:
- Add 2-4 months to each estimate above
9. xDSL: The Python alternative
xDSL is a Python-native compiler toolkit that is 1:1 compatible with MLIR's textual IR format. It deserves separate consideration because it removes the C++ barrier.
What xDSL offers
- Pure Python. pip install xdsl, no C++ compilation needed
- MLIR-compatible IR format. Same textual representation, can exchange IR with MLIR
- Python dialect definitions. Define operations, types, and transformations in Python instead of TableGen/C++
- Rapid prototyping. Designed for "fast prototyping of MLIR concepts before upstreaming to MLIR itself"
- IRDL support. Dialects can be defined using IRDL (Intermediate Representation Definition Language) for portability
Compilation speed
"Compiling MLIR requires two orders of magnitude more time than xDSL, taking almost 1 hour on a laptop and 10 minutes on a desktop, compared to the few seconds the xDSL setup needs on both machines."
Does xDSL change the recommendation?
xDSL removes the C++ barrier but not the fundamental mismatch. Even with Python dialect definitions, we would still be using SSA-based IR infrastructure designed for computational optimization on a declarative specification language. The infrastructure would provide:
- SSA form (irrelevant, our IR has no data flow to track)
- Pass scheduling (marginally useful, we have a fixed pipeline)
- Operation verification (somewhat useful, but Python dataclasses + Pydantic do this)
- Textual IR format (nice-to-have for debugging, but not essential)
The question isn't "can we make it work" but "does it earn its keep." xDSL adds a dependency and a conceptual framework (dialects, operations, regions, blocks) that maps awkwardly to our domain.
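As a rough illustration of the verification point above, the sketch below uses only the standard-library dataclasses module to enforce the kind of structural checks an MLIR op verifier would provide. The `RouteDecl` shape and its three rules are hypothetical; Pydantic would express the same constraints more declaratively.

```python
from dataclasses import dataclass

ALLOWED_METHODS = {"GET", "POST", "PUT", "PATCH", "DELETE"}


@dataclass(frozen=True)
class RouteDecl:
    """Hypothetical IR node; the checks run on every construction."""

    method: str
    path: str
    success_status: int

    def __post_init__(self) -> None:
        if self.method not in ALLOWED_METHODS:
            raise ValueError(f"unknown HTTP method: {self.method!r}")
        if not self.path.startswith("/"):
            raise ValueError(f"path must start with '/': {self.path!r}")
        if not 200 <= self.success_status < 300:
            raise ValueError(f"not a 2xx status: {self.success_status}")


ok = RouteDecl("POST", "/users", 201)

try:
    RouteDecl("FETCH", "/users", 200)
except ValueError as err:
    print(f"rejected: {err}")
```

An invalid node can never be constructed, which is the property the verifier would give us, without introducing dialects or operations.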
10. Fit assessment for spec-to-REST
Scoring (1-5, where 5 = perfect fit)
| Criterion | Score | Reasoning |
|---|---|---|
| Parsing support | 1/5 | Zero. Must build parser separately regardless |
| IR suitability | 2/5 | SSA-based computational IR for a declarative spec language |
| Code generation model | 1/5 | Progressive lowering to LLVM IR vs. template-based emission |
| Language compatibility | 1/5 | C++ required; our compiler is Python |
| Learning curve | 2/5 | Steep; 1-3 months to a useful dialect, 3-6 months to productive |
| Ecosystem value | 2/5 | Rich but irrelevant (LLVM backend, affine analysis, etc.) |
| Distribution impact | 1/5 | Adds native compilation dependency to Python project |
| Domain precedent | 1/5 | Zero REST/web service DSLs use MLIR |
| Overall | 1.4/5 | Poor fit |
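The overall score is consistent with an unweighted mean of the eight criteria. The table does not state how it was aggregated, so equal weighting is an assumption here:

```python
# Scores copied from the table above; equal weighting is assumed.
scores = {
    "Parsing support": 1,
    "IR suitability": 2,
    "Code generation model": 1,
    "Language compatibility": 1,
    "Learning curve": 2,
    "Ecosystem value": 2,
    "Distribution impact": 1,
    "Domain precedent": 1,
}

overall = sum(scores.values()) / len(scores)
print(f"{overall:.3f}")  # 1.375
print(f"{overall:.1f}")  # 1.4
```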
Why MLIR works for others but not for us
Projects that benefit from MLIR share these characteristics:
- Computational domain. Operations transform data through computation (arithmetic, memory access, control flow)
- Optimization-rich pipeline. Multiple optimization passes that benefit from SSA analysis, dominance, loop analysis
- Hardware targeting. Need to lower to specific hardware (GPUs, TPUs, FPGAs, ASICs)
- Multi-level abstraction. Genuine benefit from representing the same program at different abstraction levels simultaneously
Our spec-to-REST compiler has none of these characteristics. Our pipeline is:
Spec text -> Parse -> Typed IR -> Constraint check (Z3) -> Convention mapping -> Template emission

This is a translation pipeline, rather than a compilation pipeline. We translate specifications into code using conventions and templates. We don't optimize computations, we don't lower through abstraction levels, and we don't target hardware.
11. Recommendation
Do not use MLIR for the spec-to-REST compiler.
MLIR is an impressive piece of engineering, genuinely useful for computational compiler infrastructure, and has proven its value in hardware design (CIRCT), HPC (Flang), encryption (Concrete/HEIR), and many other domains. But it is the wrong tool for this job.
What to use instead
The existing architecture in document 07 is correct:
- Parser. ANTLR, tree-sitter, pest (Rust), or hand-written recursive descent; all are appropriate for our grammar complexity
- IR. Python dataclasses (or Rust structs / TypeScript interfaces) forming a typed AST and service IR. No SSA, no regions, no blocks, just a clean data model
- Transformations. Straightforward Python functions that walk the IR and apply convention rules
- Code generation. Jinja2 (Python) / askama (Rust) / EJS (TypeScript) template engines
- Verification. Z3 via its Python bindings (the z3-solver package, maintained as part of Z3 itself)
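To show what template-based emission means in practice, here is a minimal sketch that uses the standard library's string.Template as a stand-in for Jinja2. The decorator-style output and the `emit_route` helper are illustrative assumptions, not the project's actual templates or target framework.

```python
from string import Template

# Stand-in for a Jinja2 template; the emitted shape is only an example.
ROUTE_TEMPLATE = Template(
    '@app.$method("$path")\n'
    "def $handler():\n"
    "    raise NotImplementedError\n"
)


def emit_route(method: str, path: str, handler: str) -> str:
    """Render one route stub from IR-level facts."""
    return ROUTE_TEMPLATE.substitute(
        method=method.lower(), path=path, handler=handler
    )


print(emit_route("GET", "/users", "list_users"))
```

Emission here is string substitution driven by the IR. There is no instruction selection or lowering to LLVM IR involved, which is why MLIR's code generation model scores poorly for this stage.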
When MLIR would make sense
If the spec-to-REST project evolved to include:
- Runtime optimization of generated service code (JIT compilation of hot paths)
- Hardware targeting (generating FPGA-accelerated request processing)
- Formal verification via compilation (lowering specs through proof-carrying code)
- A general-purpose programming language as the spec language
...then MLIR might become relevant. But for a specification-to-code translation tool, it adds complexity without proportionate value.
The one lesson worth taking from MLIR
MLIR's dialect system demonstrates that multi-level IR with explicit abstraction boundaries is a powerful design pattern. We can adopt this idea cheaply:
# Three IR levels, plain Python, no MLIR needed
from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass
class SpecIR:
    """Level 1: Direct representation of spec language constructs"""
    entities: List[EntityDecl]
    operations: List[OperationDecl]
    invariants: List[InvariantDecl]

@dataclass
class ServiceIR:
    """Level 2: REST-aware intermediate representation"""
    routes: List[RouteDecl]
    models: List[ModelDecl]
    db_schema: List[TableDecl]
    validations: List[ValidationRule]

@dataclass
class TargetIR:
    """Level 3: Language-specific, ready for template emission"""
    files: List[FileDecl]
    dependencies: List[Dependency]
    config: Dict[str, Any]

This gives us explicit abstraction levels and a clear lowering pipeline, the architectural insight from MLIR, without the C++ build system, TableGen DSL, SSA infrastructure, or months of learning curve.
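A lowering step between two of these levels can then be an ordinary function. The sketch below lowers entities to routes under an assumed CRUD convention; the pared-down `EntityDecl` and `RouteDecl` types and the naive pluralization are hypothetical simplifications, not the project's actual convention rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EntityDecl:
    name: str


@dataclass
class RouteDecl:
    method: str
    path: str
    entity: str


def lower_entity(entity: EntityDecl) -> List[RouteDecl]:
    """Assumed convention: each entity gets a CRUD route set under /<name>s."""
    base = f"/{entity.name.lower()}s"  # naive pluralization, for illustration
    item = f"{base}/{{id}}"
    return [
        RouteDecl("GET", base, entity.name),
        RouteDecl("POST", base, entity.name),
        RouteDecl("GET", item, entity.name),
        RouteDecl("PUT", item, entity.name),
        RouteDecl("DELETE", item, entity.name),
    ]


def lower(entities: List[EntityDecl]) -> List[RouteDecl]:
    """Level 1 -> Level 2: apply the convention to every entity."""
    return [route for e in entities for route in lower_entity(e)]


for route in lower([EntityDecl("User")]):
    print(route.method, route.path)
```

Each level boundary stays explicit and testable, which is the part of MLIR's progressive lowering model worth keeping.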
12. Sources
- Creating a Dialect - MLIR Official Tutorial
- Chapter 2: Emitting Basic MLIR - Toy Tutorial
- Chapter 1: Toy Language and AST - Toy Tutorial
- Operation Definition Specification (ODS)
- Defining Dialects - MLIR
- Users of MLIR
- MLIR Part 1 - Introduction to MLIR - Stephen Diehl
- MLIR, Defining a New Dialect - Jeremy Kun
- MLIR Tutorial: Custom Dialect and Lowering - Dhamo Dharan
- BrilIR: An MLIR Dialect for Bril - Cornell CS 6120
- ANTLR4-to-MLIR Bridge - LLVM Discourse
- DSP-MLIR: A DSL and MLIR Dialect for Digital Signal Processing - PLDI 2025
- CIRCT - Circuit IR Compilers and Tools
- Flang and MLIR - arXiv:2409.18824
- xDSL - Python Compiler Toolkit
- xDSL GitHub Repository
- MLIR-Forge: A Modular Framework for Language Smiths
- Modular: What About the MLIR Compiler Infrastructure
- MLIR: A Compiler Infrastructure for the End of Moore's Law - Lattner et al.
- MLIR: Scaling Compiler Infrastructure for Domain Specific Computation
- Hands-on Practical: Creating a Custom Dialect
- How to Build Your Own MLIR Dialect - FOSDEM 2023