Saga Orchestrator¶
The saga orchestrator is the stateful choreographer for the execution lifecycle. It coordinates multi-step workflows by subscribing to execution events and publishing saga commands, ensuring that complex operations complete correctly or get properly compensated on failure.
graph LR
Kafka[(Kafka)] --> Saga[Saga Orchestrator]
Saga --> Commands[Saga Commands]
Commands --> Kafka
Saga --> Mongo[(MongoDB)]
How it works¶
When an execution request comes in, the saga orchestrator creates a new saga instance and drives it through its lifecycle. Each saga tracks which steps have been completed and what compensation actions are needed if something fails.
The orchestrator issues commands like CreatePodCommand and DeletePodCommand to the K8s Worker.
It watches for completion events (EXECUTION_COMPLETED, EXECUTION_FAILED, EXECUTION_TIMEOUT) and advances
the saga state accordingly. The execution saga specifically stays in RUNNING state after its steps complete,
waiting for these external completion events before transitioning to a terminal state. If a step fails or times
out, it triggers compensation — like deleting a partially created pod.
The clever part is idempotency. The orchestrator reconstructs saga state from events on restart, so it can resume interrupted workflows without duplicate side effects. If a pod was already created, it won't try to create it again.
Saga states¶
stateDiagram-v2
[*] --> Started: execution_requested
Started --> Running: pod_created
Running --> Completed: execution_completed
Running --> Failed: execution_failed
Running --> TimedOut: timeout
Failed --> Compensating: start_compensation
TimedOut --> Compensating: start_compensation
Compensating --> Compensated: cleanup_complete
Completed --> [*]
Compensated --> [*]
Topics¶
- Consumes:
execution_events, saga-related topics - Produces:
saga_commands
Key files¶
| File | Purpose |
|---|---|
run_saga_orchestrator.py |
Entry point |
saga_orchestrator.py |
Core orchestrator logic |
saga_service.py |
Saga state management |
execution_saga.py |
Execution saga definition |
saga_repository.py |
Saga persistence layer |
Deployment¶
saga-orchestrator:
image: ghcr.io/hardmax71/integr8scode/backend:${IMAGE_TAG:-latest}
command: ["python", "workers/run_saga_orchestrator.py"]
The orchestrator runs as a single replica since it's stateful. Event sourcing allows recovery after restarts.