PRD-007 — Python Developer Tooling & Code Quality Standards¶

Field	Value
Document ID	PRD-007
Version	1.0
Status	DRAFT
Date	March 2026
Author	Engineering Team
Parent	PRD-001
Related Docs	PRD-003 (BugTriageState schema), PRD-006 (Pydantic validation patterns)

Philosophy¶

One tool per concern, all from the Astral stack (uv + ruff + ty): consistent, fast, Rust-based, zero config drift. Standards are enforced by tooling wherever a rule exists; the few architecture policies that no linter can express are enforced in code review (see Forbidden Patterns). If ruff and ty both pass, the code meets the project's style and typing bar by definition.

The pyproject.toml is the single source of truth for packaging, tool config, and dependency declarations. No setup.py, no requirements.txt, no .flake8, no mypy.ini.

Python Version¶

Minimum: Python 3.12 (managed by uv)
.python-version file pins the Python minor version (3.12); uv selects the latest 3.12.x patch
requires-python = ">=3.12" declared in pyproject.toml

Python 3.12 enables:

type X = ... type alias syntax (PEP 695)
Full PEP 695 generic syntax (class Foo[T]: ...)
ExceptionGroup and except* for structured exception handling (available since 3.11, guaranteed by the 3.12 floor)
tomllib in stdlib, also available since 3.11 (no external dep for TOML parsing)

Package & Environment Management: `uv`¶

uv replaces pip, venv, pip-tools, and pipx in a single binary. It is the only tool used to manage Python environments and dependencies on this project.

Key Commands¶

uv sync                          # install project + dev dependencies (dev group is default)
uv sync --all-groups             # install project + all dependency groups (dev + test)
uv sync --only-dev               # install dev tooling only — no project or runtime deps (CI linting)
uv add fastapi                   # add a runtime dependency to [project].dependencies
uv add --group dev ruff          # add a dev dependency to [dependency-groups].dev
uv run pytest                    # run pytest in the managed venv
uv run ruff check .              # run ruff in the managed venv
uv run ty check src/             # run ty in the managed venv

pyproject.toml — uv-owned sections¶

[project]
name = "agent-ops-dashboard"
version = "0.1.0"
requires-python = ">=3.12"

[tool.uv]
# uv-specific settings (index configuration, etc.)

Dependency Groups (PEP 735)¶

Groups are declared in [dependency-groups] (PEP 735), not [project.optional-dependencies]. This is the uv-preferred approach and avoids the semantic misuse of optional deps for developer tooling.

[project]
dependencies = [
    # Runtime — always installed in production
    "fastapi>=0.115",
    "pydantic>=2.7",
    "pydantic-settings>=2.3",
    "langgraph>=0.2",
    "langchain>=0.3",
    "langchain-openai>=0.2",
    "langserve>=0.3",
    "arq>=0.26",
    "redis>=5.0",
    "httpx>=0.27",
    "uvicorn[standard]>=0.30",
    "langsmith>=0.1",
]

[dependency-groups]
dev = [
    "ruff>=0.6",
    "ty>=0.0.1a1", # Astral type checker — minimum version while stabilising; uv.lock pins the exact build
    "pre-commit>=3.8",
]
test = [
    "pytest>=8.3",
    "pytest-asyncio>=0.24",
    "httpx>=0.27", # AsyncClient for FastAPI test client
    "pytest-cov>=5.0",
]

Why three groups¶

Group	Purpose	Installed in
`dependencies`	Ships in production — the running application	prod + CI + dev
`dev`	Linters, type checker, pre-commit (developer tooling)	dev machines + CI lint and type-check steps
`test`	Test runtime — pytest, coverage	CI test runners + dev

Linting & Formatting: `ruff`¶

ruff replaces: black, isort, flake8, pyupgrade, pydocstyle, flake8-annotations. One binary, one config block in pyproject.toml, and fast enough to run on every save and every commit.

[tool.ruff]
target-version = "py312"
line-length = 100

[tool.ruff.lint]
select = [
    "E", # pycodestyle errors
    "W", # pycodestyle warnings
    "F", # pyflakes
    "I", # isort
    "UP", # pyupgrade — enforces Python 3.12+ syntax
    "D", # pydocstyle — 100% docstring coverage on public API
    "ANN", # flake8-annotations — all functions must be typed
    "RUF", # ruff-specific rules
    "S", # flake8-bandit security lint (subset)
    "PLC0415", # import-outside-toplevel — no local imports inside functions
]
ignore = [
    "D100", # missing module docstring — optional at module level
    "D104", # missing package docstring
    "ANN101", # self annotation — never required
    "ANN102", # cls annotation — never required
]

[tool.ruff.lint.pydocstyle]
convention = "google"

[tool.ruff.lint.isort]
known-first-party = ["agent_ops_dashboard"]

[tool.ruff.format]
quote-style = "double"
indent-style = "space"

Docstring coverage via `D` rules¶

All public classes, methods, and functions must have a Google-style docstring. Private (_prefixed) items are exempt. This gives 100% coverage on the public API surface without noise on internals.

Type Checking: `ty`¶

ty is Astral's Rust-based type checker, built by the same team as ruff and uv. As of 2025 it is in active alpha; the project declares a floor of >=0.0.1a1, relies on uv.lock to pin the exact build, and accepts minor churn during stabilisation.

[tool.ty]
python-version = "3.12"

[tool.ty.rules]
# All rules on by default in strict mode

Run: uv run ty check src/

Why `ty` over `mypy`/`pyright`¶

Criterion	ty	mypy / pyright
Toolchain alignment	Same Astral ecosystem as ruff/uv	Separate teams and config surfaces
Performance	Rust core — significantly faster	Python (mypy) / Node (pyright)
uv integration	Native	External install required
Maturity	Alpha — may hit edge-case gaps	Mature, broad plugin ecosystem

Trade-off: ty is alpha. If a blocking issue is encountered, fall back to pyright (Microsoft's type checker and the engine behind Pylance; it is not an Astral project, so it adds a second config surface). Document the blocker in this PRD when that decision is made.

Python 3.12+ Type System Standards¶

Forbidden Patterns¶

Ruff-enforced¶

Violations are caught automatically — no manual review required.

Forbidden	Replacement	Rule
`from typing import List, Dict, Tuple, Set`	`list`, `dict`, `tuple`, `set` (builtins)	UP006 (annotation), UP035 (import)
`Optional[X]`	`X | None`	UP007
`Union[X, Y]`	`X | Y`	UP007
`from typing import TypeAlias` + `X: TypeAlias = ...`	`type X = ...` (PEP 695)	UP040
`Any`	Specific type, `TypeVar`, or `Protocol`	ANN401
Untyped function parameter	Full annotation required	ANN001
Untyped function return	Full annotation required	ANN201
Local import (inside a function or method body)	Move to module-level	PLC0415

Architecture policy (not a ruff rule — enforced in code review)¶

Forbidden	Resolution
`TYPE_CHECKING` / `if TYPE_CHECKING:` (any use)	Extract shared types to a dedicated `models.py` or `types.py` module that neither side of a dependency cycle imports. `TYPE_CHECKING` is a symptom of import cycles or over-eager imports — fix the architecture, do not guard the import.
Try-catch spam — `try/except` blocks scattered across service functions for cross-cutting concerns (logging, failure events, observability)	Cross-cutting concerns belong in one service-wide handler: a `@worker_error_handler` decorator applied in `WorkerSettings`, a FastAPI `@app.exception_handler`, or a Starlette middleware class. Individual business-logic functions must not handle exceptions they cannot recover from.
Try-catch-reraise — catching an exception only to log/publish it then `raise` again	Never catch-and-reraise inside a business function. Let the exception propagate to the service-wide handler. One try-except per cross-cutting concern, declared once.
Multiple nested `try/except` blocks in a single function	If retry logic genuinely requires catching exceptions, extract it into a dedicated helper with a single `try/except` inside a loop (max attempts). The caller remains exception-free.
`isinstance()`, `cast()`, `type()`, `getattr()`/`setattr()` for runtime type dispatch, or any reflection (`__class__`, `__dict__`, `vars()`, `dir()`)	Completely forbidden. Write well-typed code: declare precise types at function boundaries so the type is always known statically. If you feel the need to check a type at runtime, the function signature is wrong — tighten it. Use `typing.overload` or a `Protocol` if you need to express a union of calling conventions. The only permitted introspection is structured pattern matching (`match`/`case`) on `Literal` or `Enum` values. Single approved exception: `@model_validator(mode="before")` — Pydantic calls this validator with `object` by design (the input may be a `dict`, another model instance, or any other type). The guard `if not isinstance(data, dict): return data` is Pydantic-mandated boilerplate required to safely handle non-dict construction paths (e.g. constructing a model from another model instance). This is the one and only approved use of `isinstance`.

Dependency injection standards: PRD-012 — Backend Architecture & DI covers the full rules for FastAPI Depends() usage, async resource lifecycle, ARQ worker DI, dependency module organization, and test override patterns.

Note: ruff's TCH001/TCH002/TCH003 rules do the opposite — they push imports under if TYPE_CHECKING:. Those rules are disabled in this project's ruff config (TCH is absent from select) because they encourage the pattern we forbid.

Still-valid `typing` imports (not deprecated)¶

These have no builtin replacements and remain correct to import from typing:

Annotated, TypeVar, ParamSpec, TypeVarTuple, Protocol, overload, ClassVar, Final, Literal, TypeGuard, Never, Self, Unpack

TYPE_CHECKING is explicitly excluded. Annotation-only imports must be at module level unconditionally. from __future__ import annotations (PEP 563) makes annotation evaluation lazy (annotations are stored as strings, so forward references never raise NameError), but the import statements themselves still execute at module load time — it does not eliminate import overhead. If a module-level import is problematic, that is an architecture signal: fix the dependency, do not guard the import.

No `Any`¶

ANN401 is enabled. When a value is genuinely heterogeneous, annotate it as object (the true top type) rather than Any. If Any truly cannot be avoided, suppress the rule with # noqa: ANN401 and state the reason in the same comment.
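
A minimal sketch of the two usual alternatives to `Any`: a `Protocol` when the code only needs a known capability, and `object` when it needs nothing beyond what every value supports. `Summarisable`, `Finding`, `render_all`, and `describe` are hypothetical names for illustration.

```python
from collections.abc import Sequence
from dataclasses import dataclass
from typing import Protocol


class Summarisable(Protocol):
    """Structural type: anything with a summarise() method returning str."""

    def summarise(self) -> str:
        """Return a one-line summary."""
        ...


@dataclass(frozen=True)
class Finding:
    """Hypothetical value type used only in this sketch."""

    agent: str
    detail: str

    def summarise(self) -> str:
        """Return the finding as 'agent: detail'."""
        return f"{self.agent}: {self.detail}"


def render_all(items: Sequence[Summarisable]) -> str:
    """Join the summaries of all items, one per line."""
    return "\n".join(item.summarise() for item in items)


def describe(payload: object) -> str:
    """Format any value; object allows only operations every object supports."""
    return f"payload={payload!r}"
```

`Finding` never inherits from `Summarisable`; it satisfies the protocol structurally, which is what lets the signature stay precise without naming concrete classes.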

Comment Policy¶

Inline comments inside function bodies are forbidden except for one purpose: explaining how a non-obvious implementation works — a quirk, a subtle invariant, or a non-obvious contract that the code alone cannot convey.

Narrating what the code does is never allowed. If a line needs a comment to explain what it does, rewrite the line so it is self-explanatory (better name, extracted function, etc.).

Allowed	Forbidden
Docstrings at the top of a class or function	`# Validate state` before a validation call
`# getdel: atomic fetch-and-delete guarantees single-use`	`# Issue access token` before `jwt.encode(...)`
`# noqa: ANN401 — heterogeneous mapping, no bound type`	`# Step 1: fetch user` / `# Step 2: store token`
`7 * 24 * 3600, # 7-day TTL — matches refresh token lifetime`	`# Call the GitHub API`

This applies equally to TypeScript/JavaScript in the frontend: same rule, same exceptions.
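
The contrast can be sketched with the same function written twice. The first version keeps one comment that explains an invariant the code cannot express; the second narrates each step and would be rejected in review. The names and the cache-key format are hypothetical.

```python
REFRESH_TOKEN_LIFETIME_SECONDS = 7 * 24 * 3600


def session_cache_entry(user_id: str, token: str) -> tuple[str, str, int]:
    """Build the (key, value, ttl_seconds) triple for caching a session token."""
    key = f"session:{user_id}"
    # TTL matches the refresh-token lifetime so the entry cannot outlive the token.
    return key, token, REFRESH_TOKEN_LIFETIME_SECONDS


def session_cache_entry_narrated(user_id: str, token: str) -> tuple[str, str, int]:
    """Same behaviour, written in the forbidden narrating style for contrast."""
    # Build the key
    key = f"session:{user_id}"
    # Return the entry
    return key, token, REFRESH_TOKEN_LIFETIME_SECONDS
```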

Docstring Standards¶

Convention: Google style (enforced by ruff D + convention = "google").

Required on¶

All public classes (D101)
All public methods (D102)
All public functions (D103)
__init__ methods when the class docstring does not describe args (D107)

Template¶

def fetch_issue(url: GitHubIssueUrl, token: str) -> GitHubIssue:
    """Fetch a GitHub issue via the REST API.

    Args:
        url: Validated GitHub issue URL.
        token: Personal access token with `repo` scope.

    Returns:
        Parsed issue data.

    Raises:
        GitHubAPIError: If the API returns a non-2xx response.
    """

TypedDict vs Pydantic BaseModel: Decision Guide¶

The problem with TypedDict¶

TypedDict provides only static type hints — no runtime validation, no serialization helpers, no default values without NotRequired boilerplate, no computed fields, no frozen immutability, and dict-access syntax (state["field"]) instead of attribute access (state.field).

Rule: use the right tool for the layer¶

Use case	Type to use	Reason
LangGraph state (`BugTriageState`)	Pydantic `BaseModel`	PRD-003 deliberately chose `BaseModel` over `TypedDict` to leverage `@model_validator(mode="before")` for checkpoint migration (renaming fields across schema versions). LangGraph accepts `BaseModel` state and handles partial dict returns from nodes cleanly — no full model reconstruction per node update is required.
API request / response bodies	Pydantic `BaseModel`	Runtime validation, automatic 422 response, `.model_dump()`
Internal structured data (`AgentFinding`, `HumanExchange`, `TriageReport`)	Pydantic `BaseModel`	Serialization to/from Redis, validation, attribute access
Supervisor LLM output (`SupervisorDecision`)	Pydantic `BaseModel`	`.with_structured_output()` accepts `TypedDict` / JSON schema too, but `BaseModel` is preferred: returns a validated object (not a raw dict), attribute access, validation errors surface cleanly (PRD-003 §Supervisor Output Schema)
Simple config / constants	`dataclass(frozen=True)`	No runtime dep, immutable, attribute access
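
For the last row of the table, a frozen dataclass might look like the sketch below. `RetryPolicy` and its fields are hypothetical; the point is only that assignment after construction raises `FrozenInstanceError` and that variants are derived with `dataclasses.replace`.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical immutable config value; field names are illustrative."""

    max_attempts: int = 3
    backoff_seconds: float = 0.5


DEFAULT_POLICY = RetryPolicy()

# A modified copy is built explicitly; the original stays untouched.
PATIENT_POLICY = replace(DEFAULT_POLICY, max_attempts=10)


def is_mutable(policy: RetryPolicy) -> bool:
    """Return True only if assigning to a field succeeds (it should not)."""
    try:
        policy.max_attempts = 99  # type: ignore[misc]
    except FrozenInstanceError:
        return False
    return True
```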

Implication for PRD-003¶

AgentFinding, HumanExchange, and TriageReport use Pydantic BaseModel for serialisation, validation, and attribute access.

BugTriageState also uses Pydantic BaseModel — a deliberate choice driven by the need for @model_validator(mode="before") to handle checkpoint migration when fields are renamed or their types change across schema versions. LangGraph accepts BaseModel state natively: nodes return partial dicts (only changed keys), which LangGraph merges into the checkpointed state without requiring full model reconstruction. The rationale that "BaseModel requires full model reconstruction per node update" is incorrect — LangGraph handles partial dict returns cleanly regardless of whether the state type is TypedDict, dataclass, or BaseModel.

Example: correct boundary¶

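from __future__ import annotations

from pydantic import BaseModel, Field, model_validator


class AgentFinding(BaseModel):
    """One agent's finding for a triaged issue."""

    agent_name: str
    summary: str
    relevant_files: list[str]
    confidence: float


class TriageReport(BaseModel):
    """Final triage report. The single field is a placeholder; PRD-003 owns the schema."""

    summary: str


class BugTriageState(BaseModel):
    """LangGraph state for one triage run."""

    issue_url: str
    findings: list[AgentFinding] = Field(default_factory=list)
    report: TriageReport | None = None

    @model_validator(mode="before")
    @classmethod
    def migrate_from_checkpoint(cls, data: object) -> object:
        """Upgrade checkpoints written by older schema versions before validation."""
        if not isinstance(data, dict):
            return data
        # migration branches here
        return data


class TriageJobResponse(BaseModel):
    """API response body for a triage job."""

    job_id: str
    status: str
    report: TriageReport | None = None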

Pre-commit & CI¶

Pre-commit (fast, local)¶

ruff runs on every commit via pre-commit hooks. ty is excluded from pre-commit — type checking is too slow for a blocking commit hook on large diffs.

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0
    hooks:
      - id: ruff          # lint + autofix
        args: [ --fix ]
      - id: ruff-format   # format

CI pipeline (GitHub Actions)¶

# .github/workflows/ci.yml (relevant steps)
- name: Install dependencies
  run: uv sync --group dev --group test

- name: Lint
  run: uv run ruff check .

- name: Format check
  run: uv run ruff format --check .

- name: Type check
  run: uv run ty check src/

- name: Test
  run: uv run pytest --cov=src tests/

All four checks must pass for a PR to merge. There is no manual override — fix the code.

pytest Configuration¶

[tool.pytest.ini_options]
asyncio_mode = "auto"         # pytest-asyncio: no @pytest.mark.asyncio needed
testpaths = ["tests"]
addopts = "--strict-markers"

[tool.coverage.run]
source = ["src"]
omit = ["tests/*"]

[tool.coverage.report]
fail_under = 80

asyncio_mode = "auto"¶

All async def test_* functions are automatically treated as async tests. No per-test decorator required. Consistent with the project's async-first architecture (FastAPI, ARQ, LangGraph async).
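
A test in this mode might look like the sketch below: a plain `async def` with no marker. `fetch_status` is a stand-in coroutine invented for the example, not part of the project's API.

```python
import asyncio


async def fetch_status(job_id: str) -> str:
    """Stand-in coroutine for this sketch; yields to the event loop once."""
    await asyncio.sleep(0)
    return f"{job_id}:done"


async def test_fetch_status_reports_done() -> None:
    """Collected and awaited by pytest-asyncio in auto mode without a marker."""
    assert await fetch_status("job-1") == "job-1:done"
```

Outside pytest the same coroutine can be driven with `asyncio.run(test_fetch_status_reports_done())`, which is a quick way to check it in isolation.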

Coverage threshold¶

80% line coverage is the minimum for CI to pass. New features must include tests that keep coverage above this floor. Coverage reports are generated per-run; fail_under = 80 is a hard gate, not a suggestion.