PRD-007 — Python Developer Tooling & Code Quality Standards¶
| Field | Value |
|---|---|
| Document ID | PRD-007 |
| Version | 1.0 |
| Status | DRAFT |
| Date | March 2026 |
| Author | Engineering Team |
| Parent | PRD-001 |
| Related Docs | PRD-003 (BugTriageState schema), PRD-006 (Pydantic validation patterns) |
Philosophy¶
One tool per concern, all from the Astral stack (uv + ruff + ty) — consistent, fast, Rust-based, zero config
drift. Standards are enforced by tooling, not code review. If ruff and ty both pass, the code meets the
project's quality bar by definition.
The pyproject.toml is the single source of truth for packaging, tool config, and dependency declarations. No
setup.py, no requirements.txt, no .flake8, no mypy.ini.
Python Version¶
- Minimum: Python 3.12 (managed by uv)
- .python-version file pins the Python minor version (3.12); uv selects the latest 3.12.x patch
- requires-python = ">=3.12" declared in pyproject.toml
Python 3.12 enables:
- type X = ... type alias syntax (PEP 695)
- Full PEP 695 generic syntax (class Foo[T]: ...)
- ExceptionGroup for structured exception handling (available since 3.11)
- tomllib in stdlib (no external dep for TOML parsing; available since 3.11)
Package & Environment Management: uv¶
uv replaces pip, venv, pip-tools, and pipx in a single binary. It is the only tool used to manage
Python environments and dependencies on this project.
Key Commands¶
uv sync # install project + dev dependencies (dev group is default)
uv sync --all-groups # install project + all dependency groups (dev + test)
uv sync --only-dev # install dev tooling only — no project or runtime deps (CI linting)
uv add fastapi # add a runtime dependency to [project].dependencies
uv add --group dev ruff # add a dev dependency to [dependency-groups].dev
uv run pytest # run pytest in the managed venv
uv run ruff check . # run ruff in the managed venv
uv run ty check src/ # run ty in the managed venv
pyproject.toml — uv-owned sections¶
[project]
name = "agent-ops-dashboard"
version = "0.1.0"
requires-python = ">=3.12"
[tool.uv]
# uv-specific settings (index configuration, etc.)
Dependency Groups (PEP 735)¶
Groups are declared in [dependency-groups] (PEP 735), not [project.optional-dependencies]. This is the
uv-preferred approach and avoids the semantic misuse of optional deps for developer tooling.
[project]
dependencies = [
# Runtime — always installed in production
"fastapi>=0.115",
"pydantic>=2.7",
"pydantic-settings>=2.3",
"langgraph>=0.2",
"langchain>=0.3",
"langchain-openai>=0.2",
"langserve>=0.3",
"arq>=0.26",
"redis>=5.0",
"httpx>=0.27",
"uvicorn[standard]>=0.30",
"langsmith>=0.1",
]
[dependency-groups]
dev = [
"ruff>=0.6",
"ty>=0.0.1a1", # Astral type checker — minimum version while stabilising; uv.lock pins the exact build
"pre-commit>=3.8",
]
test = [
"pytest>=8.3",
"pytest-asyncio>=0.24",
"httpx>=0.27", # AsyncClient for FastAPI test client
"pytest-cov>=5.0",
]
Why three groups¶
| Group | Purpose | Installed in |
|---|---|---|
| dependencies | Ships in production: the running application | prod + CI + dev |
| dev | Linters, type checker, pre-commit: developer tooling | dev machines + CI lint and type-check steps |
| test | Test runtime: pytest, coverage | CI test runners + dev |
Linting & Formatting: ruff¶
ruff replaces black, isort, flake8, pyupgrade, pydocstyle, and flake8-annotations: one binary, one
config block in pyproject.toml, and fast enough to run on every save and every commit.
[tool.ruff]
target-version = "py312"
line-length = 100
[tool.ruff.lint]
select = [
"E", # pycodestyle errors
"W", # pycodestyle warnings
"F", # pyflakes
"I", # isort
"UP", # pyupgrade — enforces Python 3.12+ syntax
"D", # pydocstyle — 100% docstring coverage on public API
"ANN", # flake8-annotations — all functions must be typed
"RUF", # ruff-specific rules
"S", # flake8-bandit security lint (subset)
"PLC0415", # import-outside-toplevel — no local imports inside functions
]
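ignore = [
"D100", # missing module docstring (optional at module level)
"D104", # missing package docstring
"ANN101", # self annotation, never required (rule removed in ruff 0.8.0; drop this entry after upgrading)
"ANN102", # cls annotation, never required (rule removed in ruff 0.8.0; drop this entry after upgrading)
]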
[tool.ruff.lint.pydocstyle]
convention = "google"
[tool.ruff.lint.isort]
known-first-party = ["agent_ops_dashboard"]
[tool.ruff.format]
quote-style = "double"
indent-style = "space"
Docstring coverage via D rules¶
All public classes, methods, and functions must have a Google-style docstring. Private (_prefixed) items are
exempt. This gives 100% coverage on the public API surface without noise on internals.
Type Checking: ty¶
ty is Astral's Rust-based type checker — same team as ruff/uv. As of 2025 it is in active alpha; the
project pins it with >=0.0.1a1 and accepts minor churn during stabilisation.
Run: uv run ty check src/
Why ty over mypy/pyright¶
| Criterion | ty | mypy / pyright |
|---|---|---|
| Toolchain alignment | Same Astral ecosystem as ruff/uv | Separate teams and config surfaces |
| Performance | Rust core — significantly faster | Python (mypy) / Node (pyright) |
| uv integration | Native | External install required |
| Maturity | Alpha — may hit edge-case gaps | Mature, broad plugin ecosystem |
Trade-off: ty is alpha. If a blocking issue is encountered, fall back to pyright (Microsoft's type checker,
the engine behind Pylance). Document the blocker in this PRD when that decision is made.
Python 3.12+ Type System Standards¶
Forbidden Patterns¶
Ruff-enforced¶
Violations are caught automatically — no manual review required.
| Forbidden | Replacement | Rule |
|---|---|---|
| from typing import List, Dict, Tuple, Set | list, dict, tuple, set (builtins) | UP035 |
| Optional[X] | X \| None | UP007 |
| Union[X, Y] | X \| Y | UP007 |
| from typing import TypeAlias + X: TypeAlias = ... | type X = ... (PEP 695) | UP040 |
| Any | Specific type, TypeVar, or Protocol | ANN401 |
| Untyped function parameter | Full annotation required | ANN001 |
| Untyped function return | Full annotation required | ANN201 |
| Local import (inside a function or method body) | Move to module-level | PLC0415 |
Architecture policy (not a ruff rule — enforced in code review)¶
| Forbidden | Resolution |
|---|---|
| TYPE_CHECKING / if TYPE_CHECKING: (any use) | Extract shared types to a dedicated models.py or types.py module that neither side of a dependency cycle imports. TYPE_CHECKING is a symptom of import cycles or over-eager imports: fix the architecture, do not guard the import. |
| Try-catch spam: try/except blocks scattered across service functions for cross-cutting concerns (logging, failure events, observability) | Cross-cutting concerns belong in one service-wide handler: a @worker_error_handler decorator applied in WorkerSettings, a FastAPI @app.exception_handler, or a Starlette middleware class. Individual business-logic functions must not handle exceptions they cannot recover from. |
| Try-catch-reraise: catching an exception only to log/publish it then raise again | Never catch-and-reraise inside a business function. Let the exception propagate to the service-wide handler. One try-except per cross-cutting concern, declared once. |
| Multiple nested try/except blocks in a single function | If retry logic genuinely requires catching exceptions, extract it into a dedicated helper with a single try/except inside a loop (max attempts). The caller remains exception-free. |
| isinstance(), cast(), type(), getattr()/setattr() for runtime type dispatch, or any reflection (__class__, __dict__, vars(), dir()) | Completely forbidden. Write well-typed code: declare precise types at function boundaries so the type is always known statically. If you feel the need to check a type at runtime, the function signature is wrong, so tighten it. Use typing.overload or a Protocol if you need to express a union of calling conventions. The only permitted introspection is structured pattern matching (match/case) on Literal or Enum values. Single approved exception: @model_validator(mode="before"). Pydantic calls this validator with object by design (the input may be a dict, another model instance, or any other type). The guard if not isinstance(data, dict): return data is Pydantic-mandated boilerplate required to safely handle non-dict construction paths (e.g. constructing a model from another model instance). This is the one and only approved use of isinstance. |
Dependency injection standards: PRD-012 (Backend Architecture & DI) covers the full rules for FastAPI
Depends() usage, async resource lifecycle, ARQ worker DI, dependency module organization, and test override patterns.
Note: ruff's TCH001/TCH002/TCH003 rules do the opposite — they push imports under if TYPE_CHECKING:.
Those rules are disabled in this project's ruff config (TCH is absent from select) because they encourage
the pattern we forbid.
Still-valid typing imports (not deprecated)¶
These have no builtin replacements and remain correct to import from typing:
Annotated, TypeVar, ParamSpec, TypeVarTuple, Protocol, overload, ClassVar, Final, Literal,
TypeGuard, Never, Self, Unpack
TYPE_CHECKING is explicitly excluded. Annotation-only imports must be at module level unconditionally.
from __future__ import annotations (PEP 563) makes annotation evaluation lazy (annotations are stored
as strings, so forward references never raise NameError), but the import statements themselves still
execute at module load time — it does not eliminate import overhead. If a module-level import is
problematic, that is an architecture signal: fix the dependency, do not guard the import.
No Any¶
ANN401 is enabled, so Any is rejected in annotations. When a genuinely heterogeneous container is needed,
annotate it with object (the true top type) instead. If ANN401 must be suppressed via # noqa: ANN401, add a
comment on the same line explaining why Any cannot be avoided.
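A short sketch of the `object` escape hatch, assuming a hypothetical helper `render_metadata` that only needs operations every Python object supports:

```python
from collections.abc import Mapping


def render_metadata(metadata: Mapping[str, object]) -> list[str]:
    """Format heterogeneous values without Any.

    Only repr() is applied to the values, which every object supports,
    so the type checker accepts object as the value type.
    """
    return [f"{key}={metadata[key]!r}" for key in sorted(metadata)]
```

Unlike `Any`, `object` keeps the type checker engaged: calling a method that not every object provides on a value would be reported as an error.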
Comment Policy¶
Inline comments inside function bodies are forbidden except for one purpose: explaining how a non-obvious implementation works — a quirk, a subtle invariant, or a non-obvious contract that the code alone cannot convey.
Narrating what the code does is never allowed. If a line needs a comment to explain what it does, rewrite the line so it is self-explanatory (better name, extracted function, etc.).
| Allowed | Forbidden |
|---|---|
| Docstrings at the top of a class or function | # Validate state before a validation call |
| # getdel: atomic fetch-and-delete guarantees single-use | # Issue access token before jwt.encode(...) |
| # noqa: ANN401 - heterogeneous mapping, no bound type | # Step 1: fetch user / # Step 2: store token |
| 7 * 24 * 3600, # 7-day TTL, matches refresh token lifetime | # Call the GitHub API |
This applies equally to TypeScript/JavaScript in the frontend: same rule, same exceptions.
Docstring Standards¶
Convention: Google style (enforced by ruff D + convention = "google").
Required on¶
- All public classes (D101)
- All public methods (D102)
- All public functions (D103)
- __init__ methods when the class docstring does not describe args (D107)
Template¶
def fetch_issue(url: GitHubIssueUrl, token: str) -> GitHubIssue:
    """Fetch a GitHub issue via the REST API.

    Args:
        url: Validated GitHub issue URL.
        token: Personal access token with `repo` scope.

    Returns:
        Parsed issue data.

    Raises:
        GitHubAPIError: If the API returns a non-2xx response.
    """
TypedDict vs Pydantic BaseModel: Decision Guide¶
The problem with TypedDict¶
TypedDict provides only static type hints — no runtime validation, no serialization helpers, no default values
without NotRequired boilerplate, no computed fields, no frozen immutability, and dict-access syntax
(state["field"]) instead of attribute access (state.field).
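The lack of runtime validation can be demonstrated with a small standard-library sketch. `FindingDict` is a hypothetical type used only for illustration.

```python
from typing import TypedDict


class FindingDict(TypedDict):
    """Hypothetical TypedDict; the annotations are static hints only."""

    agent_name: str
    confidence: float


# A type checker would reject the string passed for confidence,
# but Python builds the dict at runtime without complaint.
finding = FindingDict(agent_name="log-analyzer", confidence="very high")

# Access uses dict syntax, and the result is a plain dict.
stored_confidence = finding["confidence"]
```

The wrong value is only caught if a type checker runs over the call site; data arriving from Redis or an HTTP body would pass through unchecked.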
Rule: use the right tool for the layer¶
| Use case | Type to use | Reason |
|---|---|---|
| LangGraph state (BugTriageState) | Pydantic BaseModel | PRD-003 deliberately chose BaseModel over TypedDict to leverage @model_validator(mode="before") for checkpoint migration (renaming fields across schema versions). LangGraph accepts BaseModel state and handles partial dict returns from nodes cleanly, so no full model reconstruction per node update is required. |
| API request / response bodies | Pydantic BaseModel | Runtime validation, automatic 422 response, .model_dump() |
| Internal structured data (AgentFinding, HumanExchange, TriageReport) | Pydantic BaseModel | Serialization to/from Redis, validation, attribute access |
| Supervisor LLM output (SupervisorDecision) | Pydantic BaseModel | .with_structured_output() accepts TypedDict / JSON schema too, but BaseModel is preferred: returns a validated object (not a raw dict), attribute access, validation errors surface cleanly (PRD-003 §Supervisor Output Schema) |
| Simple config / constants | dataclass(frozen=True) | No runtime dep, immutable, attribute access |
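For the last row of the table, a minimal sketch of a frozen dataclass used for constants. `RetryPolicy` and its fields are hypothetical values, not settings taken from this project.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical constants grouped in an immutable object."""

    max_attempts: int = 3
    backoff_seconds: float = 0.5


DEFAULT_RETRY_POLICY = RetryPolicy()


def mutation_is_blocked(policy: RetryPolicy) -> bool:
    """Return True if assigning to a field raises FrozenInstanceError."""
    try:
        policy.max_attempts = 10  # a type checker also rejects this assignment
    except FrozenInstanceError:
        return True
    return False
```

The dataclass needs no third-party dependency, offers attribute access, and fails loudly on accidental mutation.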
Implication for PRD-003¶
AgentFinding, HumanExchange, and TriageReport use Pydantic BaseModel for serialisation,
validation, and attribute access.
BugTriageState also uses Pydantic BaseModel — a deliberate choice driven by the need for
@model_validator(mode="before") to handle checkpoint migration when fields are renamed or their
types change across schema versions. LangGraph accepts BaseModel state natively: nodes return
partial dicts (only changed keys), which LangGraph merges into the checkpointed state without
requiring full model reconstruction. The rationale that "BaseModel requires full model
reconstruction per node update" is incorrect — LangGraph handles partial dict returns cleanly
regardless of whether the state type is TypedDict, dataclass, or BaseModel.
Example: correct boundary¶
from __future__ import annotations

from pydantic import BaseModel, Field, model_validator


class BugTriageState(BaseModel):
    issue_url: str
    findings: list[AgentFinding] = Field(default_factory=list)
    report: TriageReport | None = Field(default=None)

    @model_validator(mode="before")
    @classmethod
    def migrate_from_checkpoint(cls, data: object) -> object:
        if not isinstance(data, dict):
            return data
        # migration branches here
        return data


class AgentFinding(BaseModel):
    agent_name: str
    summary: str
    relevant_files: list[str]
    confidence: float


class TriageJobResponse(BaseModel):
    job_id: str
    status: str
    report: TriageReport | None = None
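To make the "migration branches" placeholder concrete, here is one possible shape for a field rename, assuming Pydantic v2 is installed. The model `TriageStateV2` and the old field name `url` are invented for this sketch; the real migration rules live in PRD-003.

```python
from pydantic import BaseModel, Field, model_validator


class TriageStateV2(BaseModel):
    """Hypothetical state model whose field was renamed from url to issue_url."""

    issue_url: str
    findings: list[str] = Field(default_factory=list)

    @model_validator(mode="before")
    @classmethod
    def migrate_from_checkpoint(cls, data: object) -> object:
        """Rename the legacy key before field validation runs."""
        if not isinstance(data, dict):
            return data
        if "url" in data and "issue_url" not in data:
            # Copy first so the caller's checkpoint dict is not mutated.
            migrated = dict(data)
            migrated["issue_url"] = migrated.pop("url")
            return migrated
        return data
```

An old checkpoint such as `{"url": "..."}` and a current one such as `{"issue_url": "..."}` both validate to the same model, so callers never see the legacy key.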
Pre-commit & CI¶
Pre-commit (fast, local)¶
ruff runs on every commit via pre-commit hooks. ty is excluded from pre-commit — type checking is too slow
for a blocking commit hook on large diffs.
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0
    hooks:
      - id: ruff # lint + autofix
        args: [ --fix ]
      - id: ruff-format # format
CI pipeline (GitHub Actions)¶
# .github/workflows/ci.yml (relevant steps)
- name: Install dependencies
  run: uv sync --group dev --group test
- name: Lint
  run: uv run ruff check .
- name: Format check
  run: uv run ruff format --check .
- name: Type check
  run: uv run ty check src/
- name: Test
  run: uv run pytest --cov=src tests/
All four checks must pass for a PR to merge. There is no manual override — fix the code.
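The same four checks can be run locally before pushing. The following is a hypothetical helper, not a script that exists in the repository; the commands are copied from the workflow steps, and the `runner` parameter is an assumption added so the logic can be exercised without invoking uv.

```python
import subprocess
from collections.abc import Callable

# The four CI checks, in the order the workflow runs them.
CI_CHECKS: tuple[tuple[str, ...], ...] = (
    ("uv", "run", "ruff", "check", "."),
    ("uv", "run", "ruff", "format", "--check", "."),
    ("uv", "run", "ty", "check", "src/"),
    ("uv", "run", "pytest", "--cov=src", "tests/"),
)


def run_command(command: tuple[str, ...]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def failed_checks(runner: Callable[[tuple[str, ...]], int] = run_command) -> list[str]:
    """Run every check and return the command lines that exited non-zero."""
    return [" ".join(command) for command in CI_CHECKS if runner(command) != 0]
```

A wrapper script could end with `raise SystemExit(1 if failed_checks() else 0)` to mirror the CI gate.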
pytest Configuration¶
[tool.pytest.ini_options]
asyncio_mode = "auto" # pytest-asyncio: no @pytest.mark.asyncio needed
testpaths = ["tests"]
addopts = "--strict-markers"
[tool.coverage.run]
source = ["src"]
omit = ["tests/*"]
[tool.coverage.report]
fail_under = 80
asyncio_mode = "auto"¶
All async def test_* functions are automatically treated as async tests. No per-test decorator required.
Consistent with the project's async-first architecture (FastAPI, ARQ, LangGraph async).
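A minimal sketch of what such a test looks like. `fetch_job_status` is a hypothetical coroutine standing in for a real async service call.

```python
import asyncio


async def fetch_job_status(job_id: str) -> str:
    """Hypothetical async call; yields to the event loop in place of real I/O."""
    await asyncio.sleep(0)
    return f"{job_id}:done"


async def test_fetch_job_status_reports_done() -> None:
    """Plain async test function with no pytest.mark.asyncio decorator."""
    assert await fetch_job_status("job-1") == "job-1:done"
```

With asyncio_mode = "auto", pytest-asyncio collects the coroutine and runs it on an event loop; outside pytest the same function can be driven with `asyncio.run(...)`.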
Coverage threshold¶
80% line coverage is the minimum for CI to pass. New features must include tests that keep coverage above this
floor. Coverage reports are generated per-run; fail_under = 80 is a hard gate, not a suggestion.