PRD-002 — Frontend & User Experience¶

Field	Value
Document ID	PRD-002
Version	1.0
Status	DRAFT
Date	March 2026
Parent Doc	PRD-001
Related Docs	PRD-003 (Human-in-the-Loop), PRD-005 (LangSmith)

Overview¶

The AgentOps Dashboard frontend is a Jira-inspired single-page application that makes multi-agent AI execution visible, interactive, and controllable in real time. The UI is intentionally modeled after issue tracking tools because:

- Developers already understand the mental model (jobs as tickets, agents as assignees, outputs as comments)
- Jira-style layouts communicate status, priority, and progress at a glance
- The ticket metaphor maps cleanly onto LangGraph's job lifecycle (queued → running → waiting → done)

The frontend connects to the FastAPI backend via Server-Sent Events (SSE) for live streaming and standard REST calls for user actions (submit job, answer agent question, pause/kill agent).

Detailed specs: Design System · Pages & Wireframes · Interactions & A11y

Design Principles¶

Principle	Description
Live by default	Every job view is streaming. There is no "refresh" — the UI updates as agents execute.
Trust through transparency	Users see every agent's reasoning, not just the final answer. Opacity kills trust in AI systems.
Human always in control	Pause, redirect, and kill controls are always visible. Agents cannot take irreversible actions (GitHub write-back) without explicit user approval.
Familiar, not novel	The Jira mental model is intentional. Developers should feel oriented immediately, not confronted with a new paradigm.
Progressive disclosure	High-level status in the job list; full detail in the workspace; raw traces via LangSmith deep-link. Not everything needs to be visible at once.

Application Layout¶

The application uses a three-zone layout. All three zones are visible simultaneously on desktop (≥1280px wide):

Zone 1 — Job Queue (25%)	Zone 2 — Live Workspace (50%)	Zone 3 — Output Panel (25%)
Ticket cards with statuses	Agent cards streaming in real time	Final triage report
Filter bar	Agent question cards (amber, blocking)	GitHub comment draft
New Job button	Execution timeline	Ticket draft + Post to GitHub
		View in LangSmith link

Zone 1 — Job Queue Sidebar¶

Purpose¶

The sidebar displays all submitted triage jobs as ticket cards, analogous to Jira's issue list. Clicking a job loads it in Zones 2 and 3.

Job Card Structure¶

Each card shows:

┌─────────────────────────────────────────┐
│ ● RUNNING                    #1042      │
│ Auth token expiry causes 500 on /api/me │
│ github.com/org/repo                     │
│ Submitted 4 min ago    3 agents active  │
└─────────────────────────────────────────┘

Field	Description
Status badge	Color-coded: QUEUED (gray), RUNNING (blue, animated), PAUSING (blue, static), WAITING (amber, needs human input), DONE (green), FAILED (red)
Issue title	Truncated GitHub issue title
Repository	`org/repo` short form
Timestamp	Relative time since submission
Active agents count	Only shown when RUNNING

Status Definitions¶

Status	LangGraph State	Color	Description
QUEUED	Thread created, not started	Gray	Job submitted, waiting for resources
RUNNING	Graph executing	Blue (pulse animation)	At least one agent node is active
PAUSING	Pause flag set, not yet reached supervisor boundary	Blue (static)	Pause requested — waiting for current supervisor iteration to complete
WAITING	`interrupt()` fired	Amber (pulse)	Graph paused — user answer required
DONE	Graph reached END node	Green	All outputs produced
FAILED	Unhandled exception	Red	Job errored; LangSmith trace available

PAUSING is a client-side-only status. The frontend sets it optimistically when POST /jobs/{id}/pause returns 200, and clears it when either a graph.paused SSE event arrives (transition to WAITING) or a graph.resumed event arrives (transition back to RUNNING, e.g. if the pause was cancelled).
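The status rules above can be sketched as a pure transition function. This is a minimal sketch, not the actual implementation: the event names come from this document, while `pause.accepted` (standing for a 200 response to POST /jobs/{id}/pause) and the function name `nextStatus` are hypothetical.

```typescript
type JobStatus = 'queued' | 'running' | 'pausing' | 'waiting' | 'done' | 'failed';

// 'pause.accepted' is a hypothetical client-side signal, not an SSE event:
// it stands for a 200 response to POST /jobs/{id}/pause.
type StatusSignal =
  | 'job.started'
  | 'pause.accepted'
  | 'graph.paused'
  | 'graph.interrupt'
  | 'graph.resumed'
  | 'job.done'
  | 'job.failed';

function nextStatus(current: JobStatus, signal: StatusSignal): JobStatus {
  // Terminal states never change.
  if (current === 'done' || current === 'failed') return current;

  switch (signal) {
    case 'job.started':
      return 'running';
    case 'pause.accepted':
      // Optimistic update: only meaningful while the graph is executing.
      return current === 'running' ? 'pausing' : current;
    case 'graph.paused':
    case 'graph.interrupt':
      return 'waiting';
    case 'graph.resumed':
      return 'running';
    case 'job.done':
      return 'done';
    case 'job.failed':
      return 'failed';
  }
}
```

Keeping the transition logic free of side effects makes it easy to unit-test independently of the store and the SSE connection.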

Interactions¶

- Click job card → loads workspace in Zone 2/3
- New Job button → opens a modal: paste GitHub issue URL, optional notes to supervisor, submit
- Filter bar (top of sidebar): filter by status, repository, or date
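The filter bar behaviour could be implemented as a simple predicate over job summaries. The sketch below is illustrative only: the `JobSummary` and `JobFilter` shapes and their field names are assumptions, since this document does not define the job list data model.

```typescript
// Hypothetical summary shape for a job card; field names are illustrative.
interface JobSummary {
  id: string;
  status: string;      // e.g. 'running', 'waiting'
  repository: string;  // short form, e.g. 'org/repo'
  submittedAt: number; // epoch milliseconds
}

// Every field is optional; an omitted field does not restrict the result.
interface JobFilter {
  status?: string;
  repository?: string;
  submittedAfter?: number; // epoch milliseconds, inclusive
}

function filterJobs(jobs: JobSummary[], filter: JobFilter): JobSummary[] {
  return jobs.filter((job) => {
    if (filter.status !== undefined && job.status !== filter.status) return false;
    if (filter.repository !== undefined && job.repository !== filter.repository) return false;
    if (filter.submittedAfter !== undefined && job.submittedAt < filter.submittedAfter) return false;
    return true;
  });
}
```

Criteria combine with AND semantics here; whether the real filter bar should combine them differently is left open by this document.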

Zone 2 — Live Workspace¶

Purpose¶

The center panel is the heart of the product. It shows the active job's execution in real time: each agent's activity, reasoning, tool calls, and outputs stream in as they happen. This is where the Jira analogy extends: the workspace is like a Jira ticket that writes itself, section by section, as agents work.

Workspace Header¶

┌────────────────────────────────────────────────────────────────┐
│  #1042  Auth token expiry causes 500 on /api/me                │
│  github.com/org/repo · Opened by @user · 2 hours ago          │
│                        [↪ Redirect]  [⏸ Pause]  [✕ Kill]  │
└────────────────────────────────────────────────────────────────┘

Agent Cards¶

Each active or completed agent gets an Agent Card in the workspace. Cards appear as agents are spawned and fill in as they execute:

┌────────────────────────────────────────────────────────────────┐
│  🔍  INVESTIGATOR AGENT                    ● DONE  (12s)      │
├────────────────────────────────────────────────────────────────┤
│  Reading issue body...                                         │
│  Identified error: HTTP 500 on authenticated endpoint          │
│  Hypothesis forming: likely JWT validation or middleware issue │
│                                                                │
│  → Passed to: Codebase Searcher, Web Search Agent             │
└────────────────────────────────────────────────────────────────┘

Agent Card states:

State	Visual
Spawning	Card fades in with a shimmer skeleton
Running	Animated "typing" indicator; text streams in token by token
Waiting on tool	Shows tool name: `🔧 Searching codebase for "JWT middleware"...`
Done	Green checkmark; elapsed time shown
Error	Red border; error message; link to LangSmith trace

Agent Question Cards¶

When the supervisor decides to ask the user a question, a question card appears above all other agents, with amber styling to communicate urgency. The entire graph is paused until the user responds.

┌────────────────────────────────────────────────────────────────┐
│  ⚠  SUPERVISOR NEEDS YOUR INPUT               ● WAITING       │
├────────────────────────────────────────────────────────────────┤
│  The error appears in two separate code paths:                 │
│                                                                │
│  (A) auth/middleware.py — JWT token validation                 │
│  (B) db/session.py — Database connection pooling               │
│                                                                │
│  Which code path should agents prioritize for deep analysis?   │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐ │
│  │ Type your answer here...                                  │ │
│  └──────────────────────────────────────────────────────────┘ │
│                                           [Continue →]         │
└────────────────────────────────────────────────────────────────┘

Full specification in PRD-003.

Execution Timeline¶

Below the agent cards, a compact horizontal timeline shows the sequence of node executions:

flowchart LR
    S([START]) --> INV["investigator\n✓ 12s"]
    INV --> CS["codebase_search\n✓ 18s"]
    CS --> HI["⚠ human_input\n● waiting"]
    HI -.-> WS["web_search\n..."]
    WS -.-> CRIT["critic\n..."]
    CRIT -.-> WR["writer\n..."]
    WR -.-> E([END])

This is the most direct visualization of the LangGraph graph state — nodes light up as they complete.

Zone 3 — Output Panel¶

Purpose¶

The right panel accumulates the final structured outputs as the Writer agent produces them. Content is editable before any GitHub write-back occurs.

Structured Report Card¶

TRIAGE REPORT

Severity:    🔴 HIGH
Category:    Authentication / Token Handling
Confidence:  87%

Root Cause:
  JWT expiry check in auth/middleware.py:L142 does not
  account for timezone offset. Token appears valid locally
  but fails on UTC server time.

Relevant Files:
  · auth/middleware.py (lines 138–155)
  · tests/test_auth.py (missing expiry edge case)
  · config/jwt_settings.py

Similar Past Issues:
  · #891 — Fixed 2024-11 (same root cause, different endpoint)

GitHub Comment Draft¶

A text area, pre-populated by the Writer agent, editable by the user:

## 🤖 AgentOps Triage — Issue #1042

**Severity:** High
**Root Cause:** JWT timezone handling bug in `auth/middleware.py:142`

The token expiry check uses local time instead of UTC...
[full comment text]

---
*Triaged by AgentOps Dashboard · [View full trace](#)*

Ticket Draft¶

A structured form (also pre-filled by Writer agent):

Field	Value
Title	`[Bug] JWT expiry fails on UTC server — auth/middleware.py:142`
Labels	`bug`, `authentication`, `high-priority`
Assignee suggestion	`@backend-team`
Effort estimate	`M (2–4 hours)`

Action Buttons¶

- Post Comment to GitHub: posts the comment draft to the original issue (requires GitHub auth)
- Create GitHub Issue: creates a new ticket with the ticket draft fields
- Copy Report: copies the structured report as markdown
- View in LangSmith: deep-link to the full job trace in LangSmith (see PRD-005)

Streaming Architecture¶

SSE Event Types¶

The FastAPI backend emits the event types listed below over the SSE stream. Every message carries an SSE id: field set to a per-job, monotonically increasing integer (e.g. id: 42). On reconnect, the browser automatically sends the last received value in the Last-Event-ID header.

v1 behavior: the backend accepts the reconnection but does not replay missed events. Redis Pub/Sub keeps no message history, so events emitted during the disconnect window are silently lost. Gapless resume (e.g., via Redis Streams or a DB event log) is a v2 concern. See PRD-008 §Token Expiry During an Active Stream for the token-expiry reconnect flow.
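Because v1 does not replay events, the client may want to at least detect that a gap occurred, for example to show a "some updates may be missing" notice. The following is a minimal sketch under one stated assumption: that ids increase by exactly 1 per event. The helper name `checkGap` and the `GapCheck` shape are hypothetical.

```typescript
interface GapCheck {
  lastId: number; // highest id seen so far
  missed: number; // events skipped between the previous id and this one
}

// Compares an incoming SSE id against the last one seen for the same job.
// Assumes ids increase by exactly 1 per event. If the backend only
// guarantees "increasing", any jump is a possible gap, not a certain one.
function checkGap(lastId: number | null, rawId: string): GapCheck {
  const id = parseInt(rawId, 10);
  if (isNaN(id)) throw new Error('non-numeric SSE id: ' + rawId);
  if (lastId === null) return { lastId: id, missed: 0 }; // first event seen
  if (id <= lastId) return { lastId: lastId, missed: 0 }; // duplicate or stale
  return { lastId: id, missed: id - lastId - 1 };
}
```

In an `EventSource` message handler, the id of each message is available as `event.lastEventId`, which can be passed as `rawId`.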

Event	Payload	Frontend Action
`job.started`	`{ job_id, issue_url, issue_title }`	Create job card, open workspace
`agent.spawned`	`{ agent_name, agent_id }`	Create new agent card with skeleton
`agent.token`	`{ agent_id, token }`	Append token to agent card text
`agent.tool_call`	`{ agent_id, tool_name, input }`	Show tool indicator on agent card
`agent.tool_result`	`{ agent_id, tool_name, result_summary }`	Update tool indicator to done
`agent.done`	`{ agent_id, elapsed_ms }`	Mark agent card as done with time
`graph.interrupt`	`{ question, context }`	Show question card, amber status
`graph.resumed`	`{}`	Remove question card, resume status
`graph.node_complete`	`{ node_name }`	Update timeline
`output.token`	`{ section, token }`	Stream into output panel section
`output.section_done`	`{ section }`	Enable edit/copy controls for that section
`job.done`	`{ report, comment_draft, ticket_draft }`	Finalize output panel, green status
`job.failed`	`{ error, langsmith_url }`	Red status, show error card
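The event names in the table can be captured as a closed set so that unknown or malformed messages are ignored rather than crashing the UI. This is a sketch, assuming the event name arrives in the SSE `event:` field and the payload is a JSON object in `data:`; the document does not state the wire encoding, and `parseSseEvent` is a hypothetical helper.

```typescript
const KNOWN_EVENTS = [
  'job.started', 'agent.spawned', 'agent.token', 'agent.tool_call',
  'agent.tool_result', 'agent.done', 'graph.interrupt', 'graph.resumed',
  'graph.node_complete', 'output.token', 'output.section_done',
  'job.done', 'job.failed',
] as const;

type EventName = typeof KNOWN_EVENTS[number];

interface ParsedEvent {
  name: EventName;
  payload: Record<string, unknown>;
}

// Returns null for unknown event names or payloads that are not JSON objects.
function parseSseEvent(name: string, data: string): ParsedEvent | null {
  if ((KNOWN_EVENTS as readonly string[]).indexOf(name) === -1) return null;
  let payload: unknown;
  try {
    payload = JSON.parse(data);
  } catch (e) {
    return null; // malformed JSON
  }
  if (typeof payload !== 'object' || payload === null || Array.isArray(payload)) {
    return null;
  }
  return { name: name as EventName, payload: payload as Record<string, unknown> };
}
```

Per-event payload validation (for example, checking that `agent.token` carries a string `token`) would sit on top of this and is omitted here.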

Concurrent section streaming: The Writer agent uses RunnableParallel, so output.token events for report, comment_draft, and ticket_draft are interleaved; there is no ordering guarantee across sections. The frontend must buffer tokens per section and must not assume a section is complete until it receives output.section_done for that section. job.done is emitted only after all three output.section_done events have been emitted.
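The per-section buffering described above might look like the following. It is a minimal sketch: the section names come from this document, while the `SectionBuffers` shape and helper names are hypothetical.

```typescript
type Section = 'report' | 'comment_draft' | 'ticket_draft';

interface SectionBuffers {
  text: Record<Section, string>;
  done: Record<Section, boolean>;
}

function emptyBuffers(): SectionBuffers {
  return {
    text: { report: '', comment_draft: '', ticket_draft: '' },
    done: { report: false, comment_draft: false, ticket_draft: false },
  };
}

// output.token: append to that section only; other sections are untouched.
function appendToken(b: SectionBuffers, section: Section, token: string): void {
  b.text[section] += token;
}

// output.section_done: the section's text is now final.
function markDone(b: SectionBuffers, section: Section): void {
  b.done[section] = true;
}

// True once all three sections have reported output.section_done.
function allDone(b: SectionBuffers): boolean {
  return b.done.report && b.done.comment_draft && b.done.ticket_draft;
}
```

Because each section accumulates independently, interleaved tokens from different sections never corrupt one another, regardless of arrival order.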

Concurrent connections (multiple tabs)¶

Opening the same job in more than one browser tab is safe and fully supported. Each GET /jobs/{job_id}/stream request creates an independent Redis Pub/Sub subscriber on the jobs:{job_id}:events channel. The ARQ worker publishes each event once; Redis delivers a copy to every active subscriber. Consequences:

- No duplicate processing: subscribers are read-only; publishing happens once in the worker.
- No missed events: each subscriber receives its own copy from Redis from the moment it subscribes.
- No interference: closing or losing one tab's connection unsubscribes only that connection's pubsub object; the worker and all other subscribers are unaffected.

This is specified at the backend level in PRD-003 §Streaming to Frontend. Events emitted before a tab connects are not replayed (same v1 limitation as reconnect — see above).
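The fan-out semantics can be illustrated with a toy in-memory channel. This is not Redis and not the backend implementation, only a small model of the behaviour described above: one publish, one copy per active subscriber, and no replay for late subscribers. The `Channel` class is hypothetical.

```typescript
type Listener = (event: string) => void;

class Channel {
  private listeners: Listener[] = [];

  // Returns an unsubscribe function that removes only this listener.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      const i = this.listeners.indexOf(listener);
      if (i !== -1) this.listeners.splice(i, 1);
    };
  }

  // Delivers one copy to every current subscriber; nothing is stored.
  publish(event: string): void {
    // Copy first so unsubscribing during delivery does not skip listeners.
    this.listeners.slice().forEach((l) => l(event));
  }
}
```

A tab that subscribes after an event was published never sees it, which mirrors the v1 no-replay limitation.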

Frontend State Management¶

State is managed with Zustand. Each job has its own state slice:

interface JobState {
    id: string
    status: 'queued' | 'running' | 'pausing' | 'waiting' | 'done' | 'failed'
    agents: AgentCardState[]
    timelineNodes: TimelineNode[]
    pendingQuestion: Question | null
    report: StructuredReport | null
    commentDraft: string
    ticketDraft: TicketDraft | null
    langsmithUrl: string | null
    completedSections: Set<string>   // populated by output.section_done events
}
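As an illustration of how SSE events could update the `agents` array, the sketch below applies three agent events immutably. The `AgentCard` shape is an assumption (this document does not define `AgentCardState`), and `applyAgentEvent` is a hypothetical helper that could be called from inside a Zustand `set` updater.

```typescript
// Hypothetical shape; this document does not define AgentCardState.
interface AgentCard {
  id: string;
  name: string;
  text: string;
  status: 'spawning' | 'running' | 'done';
  elapsedMs: number | null;
}

// Payload fields mirror the SSE event table; 'type' is added for dispatch.
type AgentEvent =
  | { type: 'agent.spawned'; agent_id: string; agent_name: string }
  | { type: 'agent.token'; agent_id: string; token: string }
  | { type: 'agent.done'; agent_id: string; elapsed_ms: number };

// Returns a new array instead of mutating, so subscribers see a new reference.
function applyAgentEvent(agents: AgentCard[], ev: AgentEvent): AgentCard[] {
  if (ev.type === 'agent.spawned') {
    const card: AgentCard = {
      id: ev.agent_id,
      name: ev.agent_name,
      text: '',
      status: 'spawning',
      elapsedMs: null,
    };
    return agents.concat([card]);
  }
  return agents.map((a): AgentCard => {
    if (a.id !== ev.agent_id) return a; // other cards are left untouched
    if (ev.type === 'agent.token') {
      return { ...a, text: a.text + ev.token, status: 'running' };
    }
    if (ev.type === 'agent.done') {
      return { ...a, status: 'done', elapsedMs: ev.elapsed_ms };
    }
    return a;
  });
}
```

Token events arrive at high frequency, so a production version might batch appends before touching the store; that optimisation is outside the scope of this sketch.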

GitHub Write-Back Flow¶

Write-back to GitHub is always a manual, user-initiated action. The flow:

1. Writer agent produces comment draft and ticket draft
2. User reviews and edits both in Zone 3
3. User clicks "Post Comment to GitHub"
4. Frontend calls POST /jobs/{id}/post-comment with final comment text
5. Backend posts to GitHub API using stored OAuth token (Fernet-encrypted in Redis; see PRD-008 §GitHub OAuth Token Management)
6. UI shows confirmation with link to the GitHub comment

No agent can post to GitHub autonomously. This is a hard constraint in v1.0.
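The frontend call in the flow above could be prepared as follows. This is a sketch only: the endpoint path comes from this document, but the `{ comment: ... }` request body shape and the helper name `buildPostCommentRequest` are assumptions.

```typescript
interface PostCommentRequest {
  method: 'POST';
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds the request for POST /jobs/{id}/post-comment.
// The { comment: ... } body shape is an assumption; this document only
// says the final comment text is sent.
function buildPostCommentRequest(jobId: string, commentText: string): PostCommentRequest {
  if (commentText.trim().length === 0) {
    throw new Error('refusing to post an empty comment');
  }
  return {
    method: 'POST',
    url: '/jobs/' + encodeURIComponent(jobId) + '/post-comment',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ comment: commentText }),
  };
}
```

The resulting object can be handed to whichever HTTP client the app uses. Building it only in the click handler of the "Post Comment to GitHub" button keeps the write-back strictly user-initiated.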

Settings Page¶

A minimal Settings page (/settings) lets users manage their GitHub connection and session.

GitHub Account Panel¶

Element	Behaviour
Connected account row	Shows GitHub avatar, `@login`, and connection date
Disconnect GitHub button	Calls `DELETE /auth/github-token`; disables write-back buttons until re-auth
Re-connect link	Opens the GitHub OAuth flow (`GET /auth/login`) to restore write-back capability

After disconnect, the "Post Comment to GitHub" and "Create GitHub Issue" buttons in the Output Panel are replaced with a "Reconnect GitHub to enable write-back" notice. No triage or analysis functionality is affected — only GitHub write-back is blocked.

Session Panel¶

Element	Behaviour
Log out button	Calls `POST /auth/logout`; invalidates refresh token in Redis, clears cookie, redirects to `/login`

Component Tree¶

graph TD
    App --> Header
    App --> AppLayout
    AppLayout --> JQS["JobQueueSidebar"]
    AppLayout --> LW["LiveWorkspace"]
    AppLayout --> OP["OutputPanel"]

    JQS --> NJM["NewJobModal"]
    JQS --> JFB["JobFilterBar"]
    JQS --> JC["JobCard × N"]

    LW --> WH["WorkspaceHeader"]
    LW --> ET["ExecutionTimeline"]
    LW --> AQC["AgentQuestionCard (conditional)"]
    LW --> ACL["AgentCardList"]
    ACL --> AC["AgentCard × N"]
    WH --> PB["PauseButton"]
    WH --> KB["KillButton"]

    OP --> TRC["TriageReportCard"]
    OP --> GCE["GitHubCommentEditor"]
    OP --> TDF["TicketDraftForm"]
    OP --> OA["OutputActions"]
    OA --> PCB["PostCommentButton"]
    OA --> CTB["CreateTicketButton"]
    OA --> LSD["LangSmithDeepLink"]

Tech Stack¶

Concern	Choice	Rationale
Framework	React 18 + TypeScript	Component model matches streaming UI patterns well
State management	Zustand	Lightweight, minimal boilerplate, handles frequent streaming updates well
Streaming	Native `EventSource` (SSE)	Server-push only; simpler than WebSockets for this use case
Styling	Tailwind CSS	Utility-first, fast iteration on the Jira-like layout
HTTP client	Axios	Clean REST calls for submit/answer/pause/kill
Build tool	Vite	Fast HMR, modern ESM

Non-Functional Requirements¶

Requirement	Target
Time to first streaming token in UI	< 5 seconds from job submission
SSE reconnect on drop	Automatic, within 2 seconds; v1 may miss events emitted during disconnect window (no replay)
Concurrent jobs in UI	Support up to 10 simultaneously with no performance degradation
Browser support	Chrome 110+, Firefox 115+, Safari 16+
Accessibility	WCAG 2.1 AA for all static content; streaming regions use `aria-live="polite"`
Responsive layout	Full functionality at 1280px+; graceful degradation at 1024px
