# Roadmap

What's coming. These are teases, not commitments — order and shape will shift based on what hurts most as FJX gets used in earnest.

# Fully autonomous agents

Today the agents run in bootstrap loops kicked off by a human (just agent-dev, just agent-pm, just agent-qa). They pick up work when it's there and idle when it isn't, but the operator still starts and stops them.

The next step is self-driving agents that hold themselves up: launch on their own cadence, recover from their own crashes, throttle when the queue is empty, escalate when they detect drift in their own ledgers. The plumbing is mostly there — labels and assignees already carry the workflow state; cycle-to-cycle context already lives in the ledger. What's missing is a supervisor surface and a set of operator triggers (cost, irreversibility, confidence, drift) that decide when the human gets paged.

The triggers are the same four categories from the operator chair. This work is gated on the agent-build chair landing first — autonomy without action-tiered permissioning is the loud-failure-mode shape.

# A more mobile-friendly interface

The operator's job — approving specs, dropping /notes, routing handoffs, reading ledgers, merging PRs — should be doable from a phone while standing in line for coffee.

Forgejo's mobile UI handles the basics, but the FJX-specific surface (briefs, role-scoped ledgers, the trust-mapped slash-command vocabulary) deserves better than scrolling through long comment threads on a 6-inch screen. Likely shape: a small companion UI that surfaces the things only the human can do, in the order they need to do them. The state still lives in Forgejo — this is a view, not a second source of truth.

# Garbage collection & compaction

Forgejo issues are durable on purpose — the timeline is the audit trail. However, issues that are months old filled with agent briefs & ledgers is noise. Past a useful retention window, the issue weighs the instance down without contributing anything a search would surface.

The plan is a two-stage process owned by PM (this is librarian work — chief-of-staff chair):

Compact — for issues older than a configurable age, summarize the durable signal into the Forgejo project wiki: the decision, the design rationale, anything that became a convention. The wiki page is the surviving artifact; the issue's value is distilled into a few paragraphs and a link to the merged PR.
Collect — once compacted, delete the issue. The PR stays (commit history is the actual record); the wiki page carries the why; the issue itself is gone.

This is destructive, so it runs under operator-trigger discipline: a dry-run mode that prints the compaction summary for human review, a holdback list for issues tagged keep (or whatever the convention turns out to be), and per-batch confirmation. No surprise sweeps.

The hardest part isn't the deletion — it's the compaction. A bad summary destroys signal that looked redundant but wasn't. Likely shape: PM proposes the summary, the human approves it, the deletion follows. Same maker-checker pattern as everything else, applied to the act of forgetting.

# Tiered working agents

One agent-dev and one agent-qa is a starting shape, not a final one. The right shape is tiered roles picked based on the work's risk and complexity:

# Junior vs. Senior Dev

Jr Dev — cheap, fast, narrow scope. Takes only simple-phase issues (typos, comments, obvious one-liners) where the issue body is the spec and the blast radius is small. Smaller model, shorter context, lower per-cycle cost.
Sr Dev — slower, more expensive, broader judgment. Handles propose and apply, multi-file changes, anything that touches the schema or the API surface. Larger model, fuller context, paired with the more conservative review tier below.

PM (or the brief itself) decides which tier picks the issue up. Routing is by assignee, so this is mostly an account-and-permissions exercise once the prompts are split.

# Lite vs. Chaos Monkey QA

Lite QA — reads the Actions output, checks coverage delta, verifies the diff matches what the brief asked for. Fast, deterministic, cheap. Catches the obvious misses.
Chaos Monkey QA — the current agent-qa prompt. Deliberately misaligned, adversarial, looks for what the harness can't catch. Slower, more expensive, and worth it for PRs above a complexity threshold.

The brief decides which tier runs — trivial PRs get Lite, substantive ones get the full Chaos Monkey. Both tiers can be in flight on the same PR; their findings stack.

Why tier at all? Because uniformly spending Sr-Dev tokens on typos is wasteful, and uniformly running Chaos Monkey on doc-only PRs is wasteful, and the cost differential between tiers is large enough that the routing decision compounds. The PM-brief layer is the natural place to make that call.