CI/CD Pipeline¶
The project uses GitHub Actions to automate code quality checks, security scanning, testing, image publishing, and documentation deployment. The pipeline is split across several workflow files that trigger independently based on path filters, so only relevant checks run for each change.
Pipeline overview¶
```mermaid
graph LR
    subgraph "Code Quality (lightweight)"
        Ruff["Ruff Linting"]
        MyPy["MyPy Type Check"]
        Grimp["Grimp Orphan Modules"]
        ESLint["ESLint + Svelte Check"]
    end
    subgraph "Security"
        Bandit["Bandit SAST"]
        SBOM["SBOM & Grype"]
    end
    subgraph "Stack Tests"
        UnitBE["Backend Unit"]
        UnitFE["Frontend Unit"]
        Build["Build & Push Images"]
        E2E_BE["Backend E2E"]
        E2E_FE["Frontend E2E"]
        UnitBE --> Build
        UnitFE --> Build
        Build --> E2E_BE
        Build --> E2E_FE
    end
    subgraph "Docker Scan & Promote"
        Scan["Trivy Scan (4 images)"]
        Promote["Promote SHA → latest"]
        Scan --> Promote
    end
    subgraph "Release & Deploy"
        Release["CalVer Tag + GitHub Release"]
        Deploy["SSH Deploy to Production"]
        Release --> Deploy
    end
    subgraph "Documentation"
        Docs["MkDocs Build"]
        Pages["GitHub Pages"]
    end
    Push["Push / PR"] --> Ruff & MyPy & Grimp & ESLint & Bandit & SBOM & UnitBE & UnitFE & Docs
    Build -->|main, all tests pass| Scan
    Promote -->|main, scans pass| Release
    Docs -->|main only| Pages
```
The three heavyweight workflows are Stack Tests (builds images, runs all tests), Docker Scan & Promote
(scans images with Trivy and promotes to latest), and Release & Deploy (creates CalVer releases and deploys to
production). They're chained: Docker Scan & Promote triggers after Stack Tests succeeds on main, and Release & Deploy
triggers after Docker Scan & Promote succeeds, forming a build-test-scan-promote-release-deploy pipeline where
production only updates when everything passes.
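The chaining relies on GitHub's `workflow_run` trigger. As a rough sketch (not a copy of the actual workflow file), `docker.yml` can declare its dependency on Stack Tests like this:

```yaml
# Hypothetical excerpt from .github/workflows/docker.yml; names and structure are illustrative.
name: Docker Scan & Promote

on:
  workflow_run:
    workflows: ["Stack Tests"]   # fire only after this workflow finishes
    types: [completed]
    branches: [main]

jobs:
  scan:
    # Guard: a completed upstream run may have failed, so check its conclusion.
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "Scanning images built from ${{ github.event.workflow_run.head_sha }}"
```

Release & Deploy uses the same pattern against Docker Scan & Promote.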
Workflow files¶
| Workflow | File | Trigger | Purpose |
|---|---|---|---|
| Stack Tests | `.github/workflows/stack-tests.yml` | Push/PR to `main`, tags `v*` | Unit tests, image build, E2E tests |
| Docker Scan & Promote | `.github/workflows/docker.yml` | After Stack Tests completes on `main` | Trivy scan + promote SHA tag to `latest` |
| Release & Deploy | `.github/workflows/release-deploy.yml` | After Docker Scan & Promote completes on `main` | CalVer release + SSH deploy to production |
| SBOM & Supply Chain | `.github/workflows/sbom-compliance.yml` | Push/PR to `main`, weekly schedule | SPDX SBOM generation + Grype vulnerability scan |
| Ruff Linting | `.github/workflows/ruff.yml` | Push/PR to `main` | Python code style and import checks |
| MyPy Type Checking | `.github/workflows/mypy.yml` | Push/PR to `main` | Python static type analysis |
| Frontend CI | `.github/workflows/frontend-ci.yml` | Push/PR to `main` (frontend changes) | ESLint + Svelte type check |
| Security Scanning | `.github/workflows/security.yml` | Push/PR to `main` | Bandit SAST |
| Dead Code Detection | `.github/workflows/grimp.yml` | Push/PR to `main` | Grimp orphan module detection |
| Documentation | `.github/workflows/docs.yml` | Push/PR (`docs/`, `mkdocs.yml`) | MkDocs build and GitHub Pages deploy |
Composite actions¶
Shared steps are extracted into reusable composite actions under `.github/actions/`. This eliminates duplication between the backend and frontend E2E jobs, which both need k3s and the full docker compose stack but sequence the setup differently.
| Action | File | Purpose |
|---|---|---|
| E2E Boot | `.github/actions/e2e-boot/action.yml` | GHCR login, background image pull + infra pre-warm, k3s install |
| E2E Ready | `.github/actions/e2e-ready/action.yml` | Finalize k3s, start compose stack, health checks |
The split is intentional. Frontend E2E needs to install Node.js and Playwright browsers between boot and ready, overlapping that work with k3s installation to save wall-clock time. Backend E2E calls them back-to-back since it has no setup to overlap.
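To make the overlap concrete, here is a hedged sketch of how the two E2E jobs might call the composite actions; job names, the Node version, and the exact setup steps are assumptions, not copied from `stack-tests.yml`:

```yaml
# Illustrative job shapes only.
backend-e2e:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v6
    - uses: ./.github/actions/e2e-boot    # starts background pull + k3s install
    - uses: ./.github/actions/e2e-ready   # nothing to overlap, so called back-to-back
    - run: docker compose exec -T backend uv run pytest tests/e2e -v

frontend-e2e:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v6
    - uses: ./.github/actions/e2e-boot
    - uses: actions/setup-node@v4         # overlapped with k3s installing in the background
      with:
        node-version: 20
    - run: npm ci && npx playwright install chromium
      working-directory: frontend
    - uses: ./.github/actions/e2e-ready
    - run: npx playwright test
      working-directory: frontend
```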
Stack Tests (the main workflow)¶
This is the core testing workflow. It builds all 4 container images, pushes them to GHCR with immutable SHA-based tags, then runs E2E tests on separate runners that pull images from the registry.
```mermaid
graph TD
    subgraph "Phase 1: Fast feedback"
        A["Backend Unit Tests"]
        B["Frontend Unit Tests"]
    end
    subgraph "Phase 2: Build"
        C["Build & Push 4 Images to GHCR"]
    end
    subgraph "Phase 3: E2E (parallel runners)"
        D["Backend E2E (k3s + full stack)"]
        E["Frontend E2E Shard 1/2 (k3s + Playwright)"]
        F["Frontend E2E Shard 2/2 (k3s + Playwright)"]
    end
    A --> C
    B --> C
    C --> D & E & F
    style A fill:#e8f5e9
    style B fill:#e8f5e9
    style C fill:#e1f5fe
    style D fill:#fff3e0
    style E fill:#fff3e0
    style F fill:#fff3e0
```
Phase 1: Unit tests¶
Backend and frontend unit tests run in parallel. They need no infrastructure and complete quickly. If either fails, the image build is skipped entirely.
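In workflow terms this is a `needs:` edge on the build job; a minimal sketch with assumed job names:

```yaml
build-images:
  # Skipped automatically if either unit-test job fails.
  needs: [backend-unit-tests, frontend-unit-tests]
  runs-on: ubuntu-latest
  steps:
    - run: echo "unit tests passed, building images"
```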
Phase 2: Build and push¶
All 4 images are built on a single runner and pushed to GHCR with an immutable `sha-<7chars>` tag:

| Image | Source |
|---|---|
| `base` | `backend/Dockerfile.base` |
| `backend` | `backend/Dockerfile` |
| `cert-generator` | `cert-generator/Dockerfile` |
| `frontend` | `frontend/Dockerfile` |
Workers reuse the backend image with different `command:` overrides in docker compose, so no separate worker images are needed. All 4 images are scanned by Trivy and promoted to `latest` in the Docker Scan & Promote workflow.
The base image is cached separately as a zstd-compressed tarball since its dependencies rarely change. The backend image uses `docker/build-push-action@v6` with GHA layer cache (`scope=backend`, `mode=max`), so intermediate layers are reused when only application code changes. Utility and frontend images also use GHA layer caching.
All 4 images are pushed to GHCR in parallel, with each push tracked by PID so individual failures are reported:
```bash
declare -A PIDS
for name in base backend cert-generator frontend; do
  docker push "$IMG/$name:$TAG" &
  PIDS[$name]=$!
done

FAILED=0
for name in "${!PIDS[@]}"; do
  if ! wait "${PIDS[$name]}"; then
    echo "::error::Failed to push $name"
    FAILED=1
  fi
done
[ "$FAILED" -eq 0 ] || exit 1
```
Fork PRs skip the GHCR push (no write access), so E2E tests only run for non-fork PRs.
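A job-level condition is the usual way to express that guard; the exact expression in the workflow may differ, but it looks roughly like:

```yaml
build-and-push:
  # Fork PRs have no GHCR write token, so skip the push (and the E2E jobs that need it).
  if: ${{ github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name == github.repository }}
  runs-on: ubuntu-latest
```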
Phase 3: E2E tests¶
Backend and frontend E2E tests run on separate runners. Each runner provisions its own k3s cluster and docker compose stack, pulling pre-built images from GHCR.
E2E Boot (.github/actions/e2e-boot)¶
This action kicks off three slow tasks that can overlap:
- **GHCR login** using `docker/login-action@v3`
- **Background image pull + infra pre-warm** — pulls all compose images, then starts the infrastructure services (mongo, redis, kafka) in a background `nohup` process. The exit status is persisted to `/tmp/infra-pull.exit` so the next action can check for failures (sketched after this list)
- **k3s install** — downloads and installs a pinned k3s version with SHA256 checksum verification (see supply-chain hardening below)
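A rough sketch of the background pre-warm step inside the composite action, assuming the `/tmp/infra-pull.exit` handshake described above (the real script will differ in detail):

```yaml
# Hypothetical step from e2e-boot/action.yml, illustrative only.
- name: Background image pull + infra pre-warm
  shell: bash
  run: |
    nohup bash -c '
      docker compose pull --quiet && docker compose up -d mongo redis kafka
      echo $? > /tmp/infra-pull.exit     # persist the exit status for e2e-ready to check
    ' > /tmp/infra-pull.log 2>&1 &
```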
E2E Ready (.github/actions/e2e-ready)¶
This action finalizes the environment after boot tasks complete:
- **Finalize k3s** — copies the kubeconfig, rewrites the API server address to `host.docker.internal` so containers inside docker compose can reach the k3s API server, and creates the `integr8scode` namespace
- **Start cert-generator** in the background
- **Copy test config** — uses `config.test.toml` (secrets come from env var defaults)
- **Wait for image pull and infra** — blocks until the background pull completes and checks the exit code from `/tmp/infra-pull.exit`, failing fast if the background process had errors
- **Start compose stack** with `docker compose up -d --no-build`
- **Health checks** — waits for the backend (`/api/v1/health/live`) and, optionally, the frontend (`https://localhost:5001`); see the sketch after this list
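Two of those steps can be sketched as composite-action steps; the polling loop, timeout, and `BACKEND_URL` placeholder are assumptions:

```yaml
# Hypothetical steps from e2e-ready/action.yml, illustrative only.
- name: Wait for background image pull and infra
  shell: bash
  run: |
    # Block until the e2e-boot background job has written its exit status.
    while [ ! -f /tmp/infra-pull.exit ]; do sleep 2; done
    [ "$(cat /tmp/infra-pull.exit)" -eq 0 ] || { cat /tmp/infra-pull.log; exit 1; }

- name: Wait for backend health
  shell: bash
  run: |
    for _ in $(seq 1 60); do
      curl -ksf "$BACKEND_URL/api/v1/health/live" > /dev/null && exit 0
      sleep 5
    done
    echo "::error::Backend never became healthy"
    exit 1
```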
Frontend E2E sharding¶
Frontend E2E tests use Playwright with 2 shards running in parallel on separate runners. Between e2e-boot and
e2e-ready, each shard installs Node.js dependencies and Playwright browsers (with caching), overlapping that work
with k3s booting in the background.
```text
e2e-boot (GHCR login + pull + k3s install)
    |
    ├── npm ci + playwright install (overlapped with k3s)
    |
e2e-ready (finalize k3s + start stack + health check)
    |
    └── npx playwright test --shard=N/2
```
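The sharding maps onto a job matrix; a hedged sketch, with everything except the two-way shard split assumed:

```yaml
frontend-e2e:
  runs-on: ubuntu-latest
  strategy:
    fail-fast: false        # let the other shard finish even if one fails
    matrix:
      shard: [1, 2]
  steps:
    # ... e2e-boot, npm ci, playwright install, e2e-ready ...
    - name: Run Playwright shard
      working-directory: frontend
      run: npx playwright test --shard=${{ matrix.shard }}/2
```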
Coverage reporting¶
Each test suite reports coverage to Codecov with separate flags:
- `backend-unit` — backend unit tests
- `backend-e2e` — backend E2E tests
- `frontend-unit` — frontend unit tests (Vitest with `lcov` output)
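Each upload is a flagged Codecov step; a sketch using `codecov/codecov-action` (version, file path, and the token secret are assumptions):

```yaml
- name: Upload backend unit coverage
  uses: codecov/codecov-action@v4
  with:
    files: backend/coverage.xml      # assumed path
    flags: backend-unit              # keeps this suite separate in Codecov
    token: ${{ secrets.CODECOV_TOKEN }}
```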
Log collection on failure¶
When E2E tests fail, logs are collected automatically and uploaded as artifacts:
- All docker compose service logs with timestamps
- Individual service logs for each worker
- Kubernetes events sorted by timestamp (backend E2E only)
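A typical shape for this is a pair of `if: failure()` steps, one collecting logs and one uploading them (names and paths are assumptions):

```yaml
- name: Collect logs
  if: failure()
  run: |
    mkdir -p /tmp/e2e-logs
    docker compose logs --timestamps > /tmp/e2e-logs/compose.log
    kubectl get events -A --sort-by=.metadata.creationTimestamp > /tmp/e2e-logs/k8s-events.log || true

- name: Upload logs
  if: failure()
  uses: actions/upload-artifact@v4
  with:
    name: e2e-logs
    path: /tmp/e2e-logs
```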
Docker Scan & Promote¶
This workflow implements the promotion model: the latest tag is never set during the build. Only this workflow
sets it, and only after all tests pass.
```mermaid
graph LR
    ST["Stack Tests (main, success)"] -->|workflow_run trigger| Scan
    Scan["Trivy Scan (4 images in parallel)"] --> Promote["crane copy sha-xxx → latest"]
    Promote --> Summary["Step Summary"]
```
Trigger¶
Runs automatically when Stack Tests completes successfully on main. Can also be triggered manually via
workflow_dispatch with an optional SHA input to promote a specific commit.
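The manual path is a `workflow_dispatch` trigger with an optional input; roughly (input name assumed):

```yaml
on:
  # workflow_run trigger on Stack Tests, as sketched earlier
  workflow_dispatch:
    inputs:
      sha:
        description: "Commit SHA whose images should be scanned and promoted"
        required: false
```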
Scan¶
Uses Trivy (pinned at v0.68.2) to scan all 4 deployed images in parallel via matrix strategy.
Scans for CRITICAL and HIGH severity vulnerabilities with unfixed issues ignored. Results are uploaded as SARIF
files to GitHub's Security tab.
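A hedged sketch of the scan job, assuming a matrix over the four image names and placeholder values for the registry path and tag:

```yaml
scan:
  runs-on: ubuntu-latest
  strategy:
    matrix:
      image: [base, backend, cert-generator, frontend]
  steps:
    - name: Trivy scan
      uses: aquasecurity/trivy-action@0.33.1
      with:
        image-ref: ghcr.io/OWNER/integr8scode-${{ matrix.image }}:${{ env.SHA_TAG }}  # placeholders
        severity: CRITICAL,HIGH
        ignore-unfixed: true
        format: sarif
        output: trivy-${{ matrix.image }}.sarif
    - name: Upload SARIF
      uses: github/codeql-action/upload-sarif@v3
      with:
        sarif_file: trivy-${{ matrix.image }}.sarif
```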
Promote¶
Uses crane to copy manifests at the
registry level (crane copy sha-tag latest), avoiding any rebuild or re-push. This is a fast, atomic operation that
simply re-tags existing image manifests.
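The promotion step amounts to one `crane copy` per image; a sketch assuming `crane` is installed on the runner and using placeholder image names:

```yaml
- name: Promote SHA tags to latest
  env:
    SHA_TAG: sha-1234567   # placeholder; the real workflow derives this from the commit
  run: |
    for name in base backend cert-generator frontend; do
      # Re-tags the existing manifest in GHCR; nothing is rebuilt or re-pushed.
      crane copy "ghcr.io/OWNER/integr8scode-${name}:${SHA_TAG}" \
                 "ghcr.io/OWNER/integr8scode-${name}:latest"
    done
```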
Release & Deploy¶
This workflow creates a CalVer-tagged GitHub Release and deploys to production. It chains after Docker Scan & Promote, completing the full pipeline from code push to production deployment.
```mermaid
graph LR
    DSP["Docker Scan & Promote (main, success)"] -->|workflow_run trigger| Release
    Release["CalVer Tag + GitHub Release"] --> Deploy["SSH Deploy to Production"]
    Deploy --> Summary["Step Summary"]
```
Trigger¶
Runs automatically when Docker Scan & Promote completes successfully on main. Can also be triggered manually via
workflow_dispatch with an optional skip_deploy flag to create a release without deploying.
CalVer tagging¶
Releases use Calendar Versioning with the format YYYY.M.PATCH:
- `YYYY` — full year (e.g., `2026`)
- `M` — month without leading zero (e.g., `2` for February)
- `PATCH` — auto-incrementing counter within the month, starting at `0`
Examples: 2026.2.0, 2026.2.1, 2026.3.0. The workflow counts existing tags matching the current YYYY.M.* pattern
and increments the patch number. All 4 deployed GHCR images are tagged with the CalVer version using crane (same
registry-level manifest copy as the promote step).
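The counting logic can be sketched in a few lines of shell; this is an assumed implementation, not the literal workflow step:

```yaml
- name: Compute CalVer tag
  id: calver
  run: |
    YEAR=$(date -u +%Y)
    MONTH=$(date -u +%-m)                     # month without leading zero
    git fetch --tags --quiet
    # Next patch = number of existing tags for this year.month.
    PATCH=$(git tag -l "${YEAR}.${MONTH}.*" | wc -l)
    echo "tag=${YEAR}.${MONTH}.${PATCH}" >> "$GITHUB_OUTPUT"
```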
GitHub Release¶
The workflow creates an annotated git tag and a GitHub Release using softprops/action-gh-release@v2 with
generate_release_notes: true, which auto-generates a changelog from merged PRs since the previous release. The release
body includes the commit SHA and docker pull commands for the tagged images.
Production deployment¶
The deploy job SSHs into the production server using appleboy/ssh-action@v1 and runs:
- `git pull origin main` — update config files and compose definitions
- `docker login ghcr.io` — authenticate with a dedicated read-only PAT (passed via `envs`, not embedded in the script)
- `docker compose pull` — pull the latest images
- `docker compose up -d --remove-orphans` — recreate changed containers
- Health check — polls `/api/v1/health/live` with a 120-second timeout
- `docker image prune` — clean up images older than 72 hours
The deploy job is skippable via the skip_deploy input on manual dispatch.
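A hedged sketch of the deploy step, using the documented action and secrets; the remote path, login username, and abbreviated script are assumptions:

```yaml
- name: Deploy to production
  uses: appleboy/ssh-action@v1
  with:
    host: ${{ secrets.DEPLOY_HOST }}
    username: ${{ secrets.DEPLOY_USER }}
    key: ${{ secrets.DEPLOY_SSH_KEY }}
    envs: GHCR_TOKEN                      # pass the PAT as an env var, not inline in the script
    script: |
      cd /opt/integr8scode                # assumed path
      git pull origin main
      echo "$GHCR_TOKEN" | docker login ghcr.io -u USERNAME --password-stdin
      docker compose pull
      docker compose up -d --remove-orphans
  env:
    GHCR_TOKEN: ${{ secrets.DEPLOY_GHCR_TOKEN }}
```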
Required secrets¶
| Secret | Purpose |
|---|---|
| `DEPLOY_HOST` | Production server IP |
| `DEPLOY_USER` | SSH username |
| `DEPLOY_SSH_KEY` | Ed25519 private key for SSH |
| `DEPLOY_GHCR_TOKEN` | GitHub PAT with `read:packages` scope |
| `MAILJET_API_KEY` | Mailjet SMTP username (API key) |
| `MAILJET_SECRET_KEY` | Mailjet SMTP password (secret key) |
| `MAILJET_FROM_ADDRESS` | Verified sender email for alerts |
| `GRAFANA_ALERT_RECIPIENTS` | Email(s) that receive alert notifications |
See Deployment — Production deployment for setup instructions.
SBOM & Supply Chain Security¶
The sbom-compliance.yml workflow generates SPDX Software Bills of Materials for both backend
(Python) and frontend (JavaScript) components. It runs on every push/PR to main and weekly on a schedule.
For each component:
- **Generate SBOM** using `anchore/sbom-action` — produces an SPDX JSON file listing all direct and transitive dependencies
- **Scan SBOM** using `anchore/scan-action` (Grype) — checks for known vulnerabilities with a `high` severity cutoff
- **Upload** — SBOM artifacts are retained for 5 days; vulnerability results are uploaded as SARIF to GitHub's Security tab (a sketch of the step pair follows this list)
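For one component the step pair might look roughly like this; action versions, paths, and file names are assumptions:

```yaml
- name: Generate backend SBOM
  uses: anchore/sbom-action@v0
  with:
    path: ./backend
    format: spdx-json
    output-file: backend-sbom.spdx.json

- name: Scan backend SBOM with Grype
  id: grype
  uses: anchore/scan-action@v4
  with:
    sbom: backend-sbom.spdx.json
    severity-cutoff: high
    fail-build: false

- name: Upload vulnerability SARIF
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: ${{ steps.grype.outputs.sarif }}
```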
Supply-chain hardening¶
k3s version pinning and checksum verification¶
The k3s installation in CI is hardened against supply-chain attacks:
- **Pinned version** — `K3S_VERSION` is set as a workflow-level env var (`v1.32.11+k3s1`), not fetched dynamically
- **Source pinning** — the install script is fetched from the k3s GitHub repository at the exact tagged version (e.g., `https://raw.githubusercontent.com/k3s-io/k3s/v1.32.11%2Bk3s1/install.sh`), not from the `get.k3s.io` CDN
- **SHA256 verification** — the install script is verified against a known checksum before execution:
```bash
K3S_TAG=$(echo "$K3S_VERSION" | sed 's/+/%2B/g')
curl -sfL "https://raw.githubusercontent.com/k3s-io/k3s/${K3S_TAG}/install.sh" -o /tmp/k3s-install.sh
echo "$K3S_INSTALL_SHA256  /tmp/k3s-install.sh" | sha256sum -c -
chmod +x /tmp/k3s-install.sh
INSTALL_K3S_VERSION="$K3S_VERSION" ... /tmp/k3s-install.sh
```
This prevents the common curl | sh anti-pattern where a compromised CDN or MITM could inject malicious code.
GHCR image tags¶
Images are tagged with sha-<7chars> (immutable, tied to a specific commit) during build. The latest tag is only
applied by the Docker Scan & Promote workflow after all tests and security scans pass. This means:
- Every E2E test runs against exactly the images built from that commit
- `latest` is never stale or untested
- Any commit's images can be pulled by their SHA tag for debugging
Dependency pinning¶
All GitHub Actions are pinned to major versions (e.g., actions/checkout@v6, docker/build-push-action@v6). Trivy is
pinned to a specific version (aquasecurity/trivy-action@0.33.1) for scan reproducibility.
Linting and type checking¶
The lightweight lint and type-check workflows run independently of the heavier stack tests, so obvious issues surface quickly.
Backend (Python):
- Ruff checks for style violations, import ordering, and common bugs
- mypy with strict settings catches type mismatches and missing return types
- Grimp detects orphan modules — modules that no other module in the package imports. Unlike symbol-level tools, it catches entire dead files (e.g. removed features) with zero false positives from framework patterns
Frontend (TypeScript/Svelte):
- ESLint checks for code quality issues
- `svelte-check` verifies TypeScript types and Svelte component correctness
Both use dependency caching (uv for Python, npm for Node.js) to skip reinstallation when lockfiles haven't changed.
Security scanning¶
The security.yml workflow uses Bandit to perform static analysis on Python source
files, flagging issues like hardcoded credentials, SQL injection patterns, and unsafe deserialization. It excludes the
test directory and reports only medium-severity and above findings. Container-level vulnerability scanning with Trivy
runs as part of the Docker Scan & Promote workflow.
Documentation¶
The docs workflow builds this documentation site using MkDocs with
the Material theme. It triggers only when files under docs/,
mkdocs.yml, or the workflow itself change.
On pushes to main, the workflow deploys the built site to GitHub Pages.
Build optimizations¶
Docker layer caching¶
All image builds use docker/build-push-action with GitHub Actions cache. Each service has its own cache scope, preventing pollution between unrelated builds:
```yaml
- name: Build cert-generator image
  uses: docker/build-push-action@v6
  with:
    context: ./cert-generator
    file: ./cert-generator/Dockerfile
    load: true
    tags: integr8scode-cert-generator:latest
    cache-from: type=gha,scope=cert-generator
    cache-to: type=gha,mode=max,scope=cert-generator
```
Base image caching¶
The base image (Python + all pip dependencies) changes infrequently, so it's cached as a zstd-compressed tarball keyed
on Dockerfile.base, pyproject.toml, and uv.lock. On cache hit the image is loaded directly with docker load,
skipping the entire build.
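A sketch of that cache pattern, assuming `actions/cache` keyed on the files listed above (exact key, paths, and compression flags will differ):

```yaml
- name: Cache base image tarball
  id: base-cache
  uses: actions/cache@v4
  with:
    path: /tmp/base-image.tar.zst
    key: base-image-${{ hashFiles('backend/Dockerfile.base', 'backend/pyproject.toml', 'backend/uv.lock') }}

- name: Load cached base image
  if: steps.base-cache.outputs.cache-hit == 'true'
  run: zstd -d --stdout /tmp/base-image.tar.zst | docker load
```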
Background infra pre-warm¶
The e2e-boot action pulls all docker compose images and starts infrastructure services in the background while k3s
installs. This overlaps network-bound (image pull) and CPU-bound (k3s compilation) work, saving several minutes per
E2E job.
Frontend Playwright caching¶
Playwright browsers are cached by package-lock.json hash. On cache hit, only system dependencies are installed
(playwright install-deps chromium), skipping the browser download.
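Roughly, with the cache path and key as assumptions:

```yaml
- name: Cache Playwright browsers
  id: pw-cache
  uses: actions/cache@v4
  with:
    path: ~/.cache/ms-playwright         # default browser cache location on Linux runners
    key: playwright-${{ hashFiles('frontend/package-lock.json') }}

- name: Install browsers (cache miss)
  if: steps.pw-cache.outputs.cache-hit != 'true'
  working-directory: frontend
  run: npx playwright install --with-deps chromium

- name: Install system deps only (cache hit)
  if: steps.pw-cache.outputs.cache-hit == 'true'
  working-directory: frontend
  run: npx playwright install-deps chromium
```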
Parallel image push¶
All 4 images are pushed to GHCR concurrently using background processes with PID tracking. Each push failure is
reported individually via ::error:: annotations.
Running locally¶
You can run most checks locally before pushing.
```bash
cd backend

# Linting
uv run ruff check . --config pyproject.toml

# Type checking
uv run mypy --config-file pyproject.toml --strict .

# Dead code detection (orphan modules)
uv run python scripts/check_orphan_modules.py

# Security scan
uv tool run bandit -r . -x tests/ -ll

# Unit tests only (fast, no infrastructure needed)
uv run pytest tests/unit -v
```
For E2E tests, use the deployment script to bring up the full stack:
```bash
# Start full stack with k8s configured locally
./deploy.sh dev --wait

# Run backend E2E tests inside the running container
docker compose exec -T backend uv run pytest tests/e2e -v

# Run frontend E2E tests
cd frontend && npx playwright test
```
Or use ./deploy.sh test which handles stack setup, testing, and teardown automatically.