Roadmap
Current Status
v0.9.0 is the current release. Phases 1 through 9 are complete.
Version Architecture
| Version | Phase(s) | Target | Description |
|---|---|---|---|
| v0.9.0 | 1–9 | Now (stealth) | Full build pipeline, 8 platforms, marketplace infra |
| v1.0 GA | 10–11 | June 2026 | Knowledge layer (brain index), marketplace live, stealth exit |
| v1.x | 12 | Aug–Oct 2026 | Org scale: ADR governance, A2A, domain layers |
| v2.0 | 13 | Q1 2027 | ADLC governance: agent registry, approval workflows, compliance scoring, full UI |
| v2.x | 14+ | H1 2027+ | Platform/OEM: non-coding agent compilation, white-labeled hub |
See docs/internal/plans/commercialization-roadmap.md for the full business model and go-to-market strategy.
Phase 1: "AgentBoot Builds Itself" -- COMPLETE
AgentBoot compiles its own personas and uses them when developing itself.
What shipped:
- 6 core traits authored and composed into 4 personas
- Build pipeline: validate, compile, sync (all end-to-end)
- Configuration schema (`agentboot.config.json`)
- Claude Code native output: agents, skills, rules, CLAUDE.md with `@imports`
- Always-on instructions (baseline and security)
- Unit and integration tests for the build pipeline
Phase 2: "Usable by Others" -- COMPLETE
Any repo can install AgentBoot personas via CLI. Install, scaffolding, diagnostics, and uninstall all work.
What shipped:
- Scope merging (org to group to team to repo)
- Full CLI: `build`, `validate`, `sync`, `install`, `status`, `doctor`, `uninstall`
- Scaffolding: `add persona`, `add trait`
- Cross-platform output: standalone SKILL.md, copilot-instructions.md
- PERSONAS.md auto-generation
- Gotchas rules (path-scoped knowledge)
- Static prompt linting (`agentboot lint`)
- Token budget calculation at build time
- Prompt style guide baked into scaffolding templates
- Repo platform switching (`agentboot config repo platform`)
- First-session welcome fragment for developer onboarding
- Manifest tracking for managed files (`.agentboot-manifest.json`)
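The scope-merging order listed above (org to group to team to repo) can be pictured as a fold over scope layers. The following is a minimal sketch, not AgentBoot's implementation: the `ScopeLayer` shape, the `mergeScopeLayers` name, the setting keys, and the rule that the most specific scope wins on conflict are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; AgentBoot's real config types may differ.
type ScopeName = "org" | "group" | "team" | "repo";

interface ScopeLayer {
  scope: ScopeName;
  settings: Record<string, string>;
}

// Broadest to most specific, following the order named in the roadmap.
const SCOPE_ORDER: readonly ScopeName[] = ["org", "group", "team", "repo"];

// Merge layers so that a more specific scope overrides a broader one.
// "Most specific wins" is an assumption of this sketch.
function mergeScopeLayers(layers: ScopeLayer[]): Record<string, string> {
  const ordered = [...layers].sort(
    (a, b) => SCOPE_ORDER.indexOf(a.scope) - SCOPE_ORDER.indexOf(b.scope),
  );
  const merged: Record<string, string> = {};
  for (const layer of ordered) {
    Object.assign(merged, layer.settings);
  }
  return merged;
}

// Layers may arrive in any order; they are sorted before merging.
const effective = mergeScopeLayers([
  { scope: "repo", settings: { reviewDepth: "deep" } },
  { scope: "org", settings: { reviewDepth: "standard", tone: "neutral" } },
  { scope: "team", settings: { tone: "direct" } },
]);
console.log(effective);
```

In this sketch the repo layer overrides the org value for `reviewDepth`, and the team layer overrides the org value for `tone`.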
Phase 3: "Ship It" -- COMPLETE
AgentBoot is distributable as a Claude Code plugin with compliance and privacy foundations in place.
What shipped:
- Plugin packaging (plugin.json, agents, skills, hooks)
- `agentboot export --format plugin` and `agentboot publish`
- Private marketplace template (marketplace.json)
- N-tier scope model (flexible node hierarchy replaces fixed groups/teams)
- Extended scaffolding: `add gotcha`, `add domain`, `add hook`
- Per-persona extensions (extend without modifying base)
- Domain layers (agentboot.domain.json)
- Compliance hooks: input scanning (UserPromptSubmit) and output scanning (Stop)
- Audit trail hooks (SubagentStart/Stop, PostToolUse)
- Telemetry: NDJSON output with canonical schema, configurable developer identity
- Three-tier privacy model (Private, Privileged, Organizational)
- Managed settings artifact generation
- Sync via GitHub API with PR creation mode
- MCP configuration generation (.mcp.json)
- Brew tap distribution
- Model selection matrix documentation
- ACKNOWLEDGMENTS.md (prior art credit)
Phase 4: "Core Pipeline" -- COMPLETE
Establish the foundational systems that everything else builds on: composition types for scope merging, lexicon for context compression, AGENTS.md for universal reach, provider abstraction for import, and install completion.
Shipped:
- Two-path install -- interactive onboarding with tab-completing directory selection, agent tool discovery, org slug inference, inline import
- Import system -- scan, LLM classify, composition type assignment, prompts as code
- Composition type classification -- `rule`/`preference` defaults per artifact, `composition_type` in staging files, frontmatter injection on apply
- Lexicon classification -- `lexicon` as artifact type in import classifier, prompt, and schema
- Persona classification -- `persona` as artifact type in import classifier
- Agent tool discovery -- install learns which tools the org uses, derives output formats
- Multi-provider foundation -- `agents` config section, Claude auth flow, billing disclosure
- Prompts as code -- `scripts/prompts/` directory with loader, `--isolated` testing mode
- Doctor --fix and config writes
Planned:
- Composition type core -- `composition` frontmatter field parser, `CompositionConfig` in config, `composition-manifest.json` generation in compile, composition-aware `mergeScopes()` in sync
- Lexicon artifact -- `core/lexicon/` directory with YAML term definitions, compilation first in pipeline, compact glossary block output
- AGENTS.md output -- generate the universal cross-tool agent config standard
- Persona-as-subagent -- compile personas to `.claude/agents/*.md` with tool restrictions
- LLM provider abstraction -- `LLMProvider` interface, `ClaudeCodeProvider`, `ManualProvider`, `resolveProvider()` from config
- Install completion -- same-org repo registration, type reference cleanup
- Token budget enforcement -- `agentboot lint` token counting per persona
- Expand `AgentBootConfig.agents` -- `llmModel`, `billingAcknowledged` fields
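To make the planned LLM provider abstraction more concrete, here is one way it could look. Only the names `LLMProvider`, `ClaudeCodeProvider`, `ManualProvider`, and `resolveProvider()` come from the list above. The method signatures, the `AgentsConfigSketch` type, the `provider` field, and the fall-through behavior are assumptions, and the `complete` bodies are placeholders rather than real integrations.

```typescript
// Assumed interface shape; the roadmap names the interface but not its members.
interface LLMProvider {
  readonly name: string;
  isAvailable(): boolean;
  complete(prompt: string): Promise<string>;
}

class ClaudeCodeProvider implements LLMProvider {
  readonly name = "claude-code";
  private readonly cliInstalled: boolean;

  constructor(cliInstalled: boolean) {
    this.cliInstalled = cliInstalled;
  }

  isAvailable(): boolean {
    return this.cliInstalled;
  }

  async complete(prompt: string): Promise<string> {
    // Placeholder: a real provider would invoke the tool here.
    return `[claude-code placeholder response to: ${prompt}]`;
  }
}

class ManualProvider implements LLMProvider {
  readonly name = "manual";

  isAvailable(): boolean {
    return true; // A human can always paste a prompt somewhere.
  }

  async complete(prompt: string): Promise<string> {
    return `Paste this prompt into your tool of choice:\n${prompt}`;
  }
}

// Hypothetical config fragment; the `provider` field is not from the roadmap.
interface AgentsConfigSketch {
  provider?: string;
}

// Prefer the configured provider when it is usable, else the first usable one.
function resolveProvider(
  config: AgentsConfigSketch,
  candidates: LLMProvider[],
): LLMProvider {
  const requested = candidates.find(
    (p) => p.name === config.provider && p.isAvailable(),
  );
  if (requested) return requested;
  const fallback = candidates.find((p) => p.isAvailable());
  if (!fallback) throw new Error("No LLM provider is available");
  return fallback;
}
```

Listing `ManualProvider` last gives the resolver a candidate that is always available, so import can still proceed when no automated provider is configured.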
Phase 5: "Cross-Platform & Import" -- DONE (v0.5.0, 2026-04-04)
Reach every major agent platform. Import everything, not just markdown. 497 tests, 0 TS errors.
Delivered:
- Cursor output (AB-109) -- `.cursor/rules/*/RULE.md` with YAML list globs from gotchas `paths:` frontmatter
- Copilot agents output (AB-110) -- `.github/agents/*.agent.md` custom agent definitions
- Managed settings fragments (AB-111) -- `managed-settings.d/00-org.json` drop-in files
- AGENTS.md sync (AB-116) -- synced to repo root during sync, regardless of platform
- Expanded import: whole-file strategy (AB-112) -- deterministic import for agents→personas, traits→core/traits/, rules-with-paths→gotchas (no LLM, instant, free)
- Expanded import: config merge (AB-113) -- settings.json permissions extraction (union merge), MCP config import with entropy-based secret detection, hook import with per-hook security confirmation (default NO)
- Skill import with agent linking (AB-114) -- skills linked to imported agents or standalone personas; staging file v2 with `whole_file_imports[]`, `config_merges[]`, `deduplication{}`
- Cross-platform deduplication (AB-115) -- Jaccard similarity, claude > cursor > copilot priority, `--parent` flag wired to 3-strategy expanded pipeline
- Security hardening -- path traversal validation on generates[], trusted-source checks, ALLOWED_CLASSIFICATION_DIRS enforcement, symlink detection, word-boundary secret detection, JSONC-safe comment stripping
- TS error cleanup -- fixed all 45 pre-existing TypeScript errors across 5 files
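The cross-platform deduplication item (AB-115) names Jaccard similarity and a claude > cursor > copilot priority. The sketch below shows how those two ideas can combine. It is an illustration under assumptions, not the shipped code: the word-level tokenization, the 0.8 threshold, and the `ImportedArtifact` shape are all hypothetical.

```typescript
type Platform = "claude" | "cursor" | "copilot";

// Hypothetical artifact shape for this sketch.
interface ImportedArtifact {
  platform: Platform;
  path: string;
  content: string;
}

// Lower index means higher priority, per "claude > cursor > copilot".
const PLATFORM_PRIORITY: readonly Platform[] = ["claude", "cursor", "copilot"];

// Word-level tokenization is an assumption; the real unit is not documented here.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter((t) => t.length > 0));
}

// Jaccard similarity: |A ∩ B| / |A ∪ B| over the two token sets.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  a.forEach((token) => {
    if (b.has(token)) intersection += 1;
  });
  return intersection / (a.size + b.size - intersection);
}

// Keep one artifact per near-duplicate cluster, visiting higher-priority
// platforms first so their copy is the one that survives.
function deduplicate(
  artifacts: ImportedArtifact[],
  threshold = 0.8, // assumed value
): ImportedArtifact[] {
  const byPriority = [...artifacts].sort(
    (x, y) =>
      PLATFORM_PRIORITY.indexOf(x.platform) -
      PLATFORM_PRIORITY.indexOf(y.platform),
  );
  const kept: { artifact: ImportedArtifact; tokens: Set<string> }[] = [];
  for (const artifact of byPriority) {
    const tokens = tokenize(artifact.content);
    const isDuplicate = kept.some((k) => jaccard(k.tokens, tokens) >= threshold);
    if (!isDuplicate) kept.push({ artifact, tokens });
  }
  return kept.map((k) => k.artifact);
}
```

Sorting by priority before the scan means the priority rule needs no separate tie-breaking step: a lower-priority copy is only ever compared against artifacts that already outrank it.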
Phase 6: "Governance & Quality" -- DONE (v0.6.0, 2026-04-04)
Enterprise governance, validation, testing, and CI. Make AgentBoot auditable and reliable at scale.
Delivered:
- Composition validation (AB-118, AB-119) -- check 5 (composition type consistency across scopes) and check 6 (rule override detection). Warnings in normal mode, errors in `--strict`.
- Doctor composition diagnostics (AB-120) -- missing manifests, orphaned overrides, scope shadow detection
- Doctor tool/format consistency (AB-121) -- warns when `agents.tools` and `personas.outputFormats` diverge
- PreToolUse compliance hooks (AB-122) -- compiles `managed.guardrails.denyTools` to PreToolUse bash hooks that block denied tools. Fail-closed (blocks if jq missing).
- Behavioral testing (AB-123) -- YAML-defined test cases with `contains`, `not-contains`, `regex` assertions. 2-of-3 flake tolerance. `agentboot test --behavioral`.
- Snapshot and regression testing (AB-124) -- SHA-256 hashing of dist/ with diff reporting. `agentboot test --snapshot` and `--regression`.
- CI integration (AB-125) -- reusable GitHub Actions workflow (`workflow_call`) with configurable version, snapshot, behavioral, and strict inputs.
- Hub migration (AB-126) -- `agentboot migrate` converts repos to hubs with `--revert` (safety-checked against post-migration content) and `--dry-run`.
- API providers (AB-127) -- `AnthropicAPIProvider`, `OpenAIAPIProvider`, `GoogleAPIProvider` with secure stdin-based prompt passing, TLS enforcement, API error checking, and automatic fallback via `resolveProviderWithFallback()`.
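The behavioral testing item (AB-123) combines three assertion kinds with 2-of-3 flake tolerance. The sketch below shows one plausible reading of that combination. The assertion kinds are from the list above; the `Assertion` type, the function names, and the rule that a run passes only when every assertion holds are assumptions.

```typescript
// The three assertion kinds named in the roadmap, modeled as a tagged union.
type Assertion =
  | { kind: "contains"; value: string }
  | { kind: "not-contains"; value: string }
  | { kind: "regex"; pattern: string };

function checkAssertion(output: string, assertion: Assertion): boolean {
  switch (assertion.kind) {
    case "contains":
      return output.includes(assertion.value);
    case "not-contains":
      return !output.includes(assertion.value);
    case "regex":
      return new RegExp(assertion.pattern).test(output);
  }
}

// Assumption: a single run passes only if every assertion holds.
function runPasses(output: string, assertions: Assertion[]): boolean {
  return assertions.every((a) => checkAssertion(output, a));
}

// 2-of-3 flake tolerance: the case passes when at least two of three
// independent model runs pass.
function passesWithFlakeTolerance(
  outputs: [string, string, string],
  assertions: Assertion[],
): boolean {
  const passing = outputs.filter((o) => runPasses(o, assertions)).length;
  return passing >= 2;
}
```

Tolerating one failed run out of three absorbs occasional nondeterministic model output while still failing a test case that is wrong most of the time.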
Phase 7: "Production Ready" -- COMPLETE
Platform completeness, trait calibration, developer velocity, and harness intelligence.
Delivered (v0.7.0):
- Cursor .mdc output -- flat `.cursor/rules/*.mdc` files with `alwaysApply`/`globs` frontmatter (AB-129)
- Copilot scoped instructions -- `.github/instructions/*.instructions.md` with `applyTo` (AB-130)
- CC Plugin validation -- manifest validation on export (AB-131)
- `--non-interactive` mode -- CI-safe install/import with env var defaults (AB-132)
- Real YAML parser -- js-yaml in test-runner with backward compat (AB-133)
- Trait weight system -- HIGH/MEDIUM/LOW/MAX/OFF calibration per persona (AB-134)
- Harness SME personas -- 5 internal domain experts (AB-135)
- Nightly intelligence pipeline -- GitHub Actions workflow + scripts (AB-136)
- `/learn` skill -- contextual help for AgentBoot users (AB-137)
- Production sync testing -- multi-platform integration tests (AB-138)
Phase 8: "Multi-Platform Coverage & Enterprise" -- COMPLETE
Multi-platform output to counter competitive threats, enterprise compliance enforcement, and infrastructure improvements.
Delivered (v0.8.0):
- Gemini output -- GEMINI.md project instructions + `.gemini/` rules directory (AB-144)
- Windsurf output -- `.windsurfrules` flat text file (AB-146)
- AGENTS.md scope awareness -- per-scope AGENTS.md for group/team nodes (AB-145)
- Compliance hook compilation -- per-persona hooks from persona.config.json (AB-147)
- MCP connection governance -- approved/required server validation (AB-143)
- `agentboot cost-estimate` -- projected monthly costs per persona (AB-139)
- MCP server -- JSON-RPC stdio server for cross-platform persona serving (AB-140)
- Strategic analysis layer -- cross-cutting intelligence synthesis persona (AB-141)
- Monorepo support -- per-package persona deployment via `packages[]` in repos.json (AB-142)
Total: 7 output platforms (skill, claude, copilot, cursor, agents, gemini, windsurf). 711 tests across 15 files.
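As an illustration of the approved/required server validation mentioned under MCP connection governance (AB-143), the check can be thought of as two set comparisons. This is a sketch under assumptions: the `McpPolicy` shape, the `validateMcpServers` name, the example server names, and the choice to treat required servers as implicitly approved are hypothetical.

```typescript
// Hypothetical policy shape for this sketch.
interface McpPolicy {
  approved: string[]; // servers a repo is allowed to connect to
  required: string[]; // servers every repo must connect to
}

interface McpValidationResult {
  unapproved: string[];
  missingRequired: string[];
  ok: boolean;
}

function validateMcpServers(
  configured: string[],
  policy: McpPolicy,
): McpValidationResult {
  // Assumption: a required server is implicitly approved.
  const allowed = new Set([...policy.approved, ...policy.required]);
  const present = new Set(configured);
  const unapproved = configured.filter((name) => !allowed.has(name));
  const missingRequired = policy.required.filter((name) => !present.has(name));
  return {
    unapproved,
    missingRequired,
    ok: unapproved.length === 0 && missingRequired.length === 0,
  };
}

// Example with made-up server names.
const governance = validateMcpServers(["docs-search", "shadow-tool"], {
  approved: ["docs-search"],
  required: ["audit-log"],
});
console.log(governance);
```

Reporting the two lists separately lets a caller word the failure differently for a forbidden connection and for a missing mandatory one.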
Phase 9: "Marketplace & Optimization" — COMPLETE
Marketplace infrastructure, optimization tooling, JetBrains output, and evaluation maturity.
Delivered (v0.9.0):
- Marketplace infrastructure -- `agentboot search`, `agentboot pull`, `agentboot publish`. Three-layer registry: Core/Verified/Community. Web catalog at `agentboot.dev/marketplace`. Contribution validation workflow. (AB-150–151)
- agentskills.io export -- `agentboot export --format agentskills` generates `skills-index.json` from compiled SKILL.md files. (AB-152)
- `agentboot optimize` -- Reads GELF telemetry, aggregates per-persona metrics (invocations, token cost, rephrase rate, finding distribution). LLM-powered trait weight recommendations. HTML report generation. `--apply` flag writes recommendations to `persona.config.json`. (AB-153–154)
- Trait weight calibration, all traits -- Calibration preambles (OFF/LOW/MEDIUM/HIGH/MAX) authored for all 6 traits: critical-thinking, structured-output, source-citation, audit-trail, confidence-signaling, schema-awareness. (AB-155)
- JetBrains output -- 8th output platform. Personas → `.junie/guidelines.md`. Instructions and gotchas → `.aiassistant/rules/*.md`. (AB-156)
- Copilot agent output -- `.github/agents/{name}.agent.md` with `name`, `model`, `tools` frontmatter. Higher-fidelity than `copilot-instructions.md`. (AB-157)
- Agent pattern selection -- `pattern` field in `persona.config.json`: `react`, `rewoo`, `router`, `sequential`, `tool-calling`. Validation warns on misuse. (AB-158)
- Managed settings group/team fragments -- `10-group.json` per group and `20-team.json` per team alongside existing `00-org.json`. Full MDM scope coverage. (AB-159)
- LLM-as-Judge evaluation -- 5-dimension persona quality scoring (accuracy, precision, recall, specificity, actionability). `agentboot test --judge --min-score 0.7`. (AB-160)
- Intelligence-driven roadmap -- Nightly synthesis generates prioritized roadmap suggestions to `docs/internal/plans/roadmap-suggestions.md`. Human review gated. (AB-161)
Total: 8 output platforms (skill, claude, copilot, cursor, agents, gemini, windsurf, jetbrains). 944 tests across 18 files.
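For the LLM-as-Judge item (AB-160), the five dimensions and the `--min-score 0.7` flag come from the list above, but how the dimensions combine into a single score is not stated. The sketch below assumes an unweighted mean on a 0 to 1 scale; the type and function names are hypothetical.

```typescript
// The five dimensions named in the roadmap.
const DIMENSIONS = [
  "accuracy",
  "precision",
  "recall",
  "specificity",
  "actionability",
] as const;

type Dimension = (typeof DIMENSIONS)[number];
type JudgeScores = Record<Dimension, number>;

// Assumption: the overall score is the unweighted mean of the dimensions.
function overallScore(scores: JudgeScores): number {
  const total = DIMENSIONS.reduce((sum, d) => sum + scores[d], 0);
  return total / DIMENSIONS.length;
}

// Gate in the spirit of `--min-score`: pass when the overall score reaches it.
function meetsMinScore(scores: JudgeScores, minScore: number): boolean {
  return overallScore(scores) >= minScore;
}
```

A stricter variant could require every individual dimension to clear the threshold. The mean is used here only because it is the simplest aggregation to show.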
Phase 10: "Engineer Traction" — NEXT (v0.10.0)
Make the harness worth using every day. The Second Brain is the centerpiece.
CLI feature freeze: Starting Phase 10, the agentboot CLI is the CI and scripting interface only. /ab is the human interface. All CLI subcommands are soft-deprecated once /ab covers the equivalent ground — they remain functional but receive no new features or enhancements. Bug fixes are still applied. No new CLI subcommands in Phase 10 unless there is no /ab alternative.
What's planned:
- Second Brain Stage 2 -- SQLite knowledge index + `agentboot-brain` MCP server + `agentboot brain` CLI (index/query/add/stats). Queryable org memory at the file level.
- Second Brain Stage 2.5 -- ADR and incident ingestion as first-class knowledge types. `agentboot add adr` / `agentboot add incident` scaffolding.
- `/ask` skill -- natural language queries against the Second Brain. "Why does the session middleware not use Redis directly?" → ADR surface, zero prompting.
- Harness template library -- `agentboot add template api-service|event-processor|data-pipeline`. Traits + gotchas + personas pre-bundled for common topologies. Ships in Core marketplace tier.
- Import from remote repos + interop -- `agentboot import --url github.com/org/repo`. Supports AGENTS.md repos, Google Conductor repos, Context Hub repos, SuperClaude repos.
- `agentboot audit` -- periodic consistency checks: orphaned traits, dead gotchas, stale ADRs, scope shadows, manifest drift.
- Global hub registry -- `~/.agentboot/config.json`, `agentboot connect`, `agentboot use`, `agentboot hubs`.
- AgentBoot authoring instruction -- `core/instructions/agentboot-authoring.instructions.md` compiled into every hub's `.claude/rules/`. Teaches Claude the trait format, weight semantics, validation rules, and anti-patterns so free-form assistance produces artifacts that pass `agentboot validate --strict` without correction.
- MCP server + `/ab` skill (built together) -- these ship as a unit because `/ab` depends on live hub state from MCP to be useful.
  - MCP server: `agentboot mcp-server` entry added to the compiled `.mcp.json` in `core/`. Tools: `get_repos`, `get_personas`, `get_traits`, `get_build_status`, `get_validate_warnings`. Sets up the Second Brain query path when Stage 2 lands.
  - `/ab` skill: single NL-driven entry point in `core/skills/`, compiled to every repo (hub and spoke). Natural language intent → clarify with user → confirm plan → execute. Replaces the separate `/add-trait`, `/add-gotcha`, `/add-persona`, and `/agentboot` meta-skill concepts -- those become internal routing, not user-facing commands.
  - Interaction model: clarify ambiguity first, then present a concrete plan, then execute with `agentboot` CLI calls. Yolo mode (skip confirm step) planned as a configurable flag in a later release.
  - Scope: `/ab` teaches scope vocabulary (org / group / team / repo / path) through repeated clarification questions rather than requiring the user to know it upfront. Users learn the model by being asked, not by reading docs.
  - Examples: `/ab add a rule for the team that requires structured logging`, `/ab what traits does the code-reviewer persona use?`, `/ab show me what's registered`, `/ab create a persona for data engineers at the group level`.
  - Architecture (orchestrator, not monolith): `ab.md` is a thin orchestrator that classifies intent and handles clarification. It dispatches to specialist sub-agents in `core/skills/ab/` that carry only the context relevant to that operation: `ab/author.md` (trait/gotcha/persona/instruction authoring), `ab/diagnose.md` (doctor, validate, audit), `ab/query.md` (read hub state via MCP), `ab/manage.md` (sync, build, import). Sub-agents receive a clean context fork -- the parent passes only what they need. This keeps `/ab`'s per-invocation context cost flat regardless of how many specialists exist. Adding a capability means adding a specialist file, not growing `ab.md`. The MCP server call happens in the orchestrator before dispatch, so sub-agents receive live hub state as input rather than needing to know how to fetch it.
  - Artifact type classifier: between intent detection and specialist dispatch, the orchestrator infers the correct artifact type from what the user described, not just what they called it. If the inferred type differs from the user's language, `/ab` surfaces the mismatch as a teaching moment before routing. Example: `/ab add a rule that when I say gtd you always know that refers to David Allen's Getting Things Done` -- the user said "rule" but the pattern ("when I say X it means Y") is a lexicon entry. The classifier catches it: "This looks like a lexicon entry -- a domain term definition that teaches Claude vocabulary without using up rule space. Want me to create it as a lexicon entry instead?" The user learns what a lexicon is by being corrected, not by reading docs. Classifier covers the full artifact taxonomy: lexicon (term definitions), gotcha (path-scoped operational knowledge), trait (behavioral building blocks), instruction (always-on guardrails), persona (role definitions).
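The artifact type classifier described above compares what the user called an artifact with what the description looks like. In `/ab` that inference would presumably be done by the model itself, so the sketch below is only a stand-in that uses keyword patterns to show the stated-versus-inferred comparison. The patterns, the mapping of the word "rule" to the instruction type, and all names are assumptions.

```typescript
// The artifact taxonomy listed in the roadmap.
type ArtifactType = "lexicon" | "gotcha" | "trait" | "instruction" | "persona";

interface Classification {
  stated: ArtifactType | null; // what the user called it
  inferred: ArtifactType | null; // what the description looks like
  mismatch: boolean;
}

// Assumed mapping from everyday words to artifact types.
const STATED_WORDS: Record<string, ArtifactType> = {
  rule: "instruction",
  instruction: "instruction",
  gotcha: "gotcha",
  trait: "trait",
  persona: "persona",
  lexicon: "lexicon",
  term: "lexicon",
};

// Hypothetical keyword heuristics, checked in order; first match wins.
const PATTERNS: ReadonlyArray<{ type: ArtifactType; pattern: RegExp }> = [
  { type: "lexicon", pattern: /\bwhen i say\b.+\b(means|refers to|stands for)\b/i },
  { type: "gotcha", pattern: /\b(?:in|under)\s+[\w.-]+\/[\w./*-]*/i },
  { type: "persona", pattern: /\b(persona for|act as|role of)\b/i },
  { type: "instruction", pattern: /\b(always|never|must)\b/i },
  { type: "trait", pattern: /\b(behaviou?r|style|tone)\b/i },
];

function classify(request: string): Classification {
  const word = /\b(rule|instruction|gotcha|trait|persona|lexicon|term)\b/i.exec(
    request,
  )?.[1];
  const stated = word ? STATED_WORDS[word.toLowerCase()] ?? null : null;
  const inferred = PATTERNS.find((p) => p.pattern.test(request))?.type ?? null;
  return {
    stated,
    inferred,
    mismatch: stated !== null && inferred !== null && stated !== inferred,
  };
}

// The roadmap's own example: stated as a "rule", shaped like a lexicon entry.
const gtdRequest = classify(
  "add a rule that when I say gtd you always know that refers to David Allen's Getting Things Done",
);
console.log(gtdRequest);
```

When `mismatch` is true, the orchestrator would ask the teaching question before routing; when it is false, dispatch proceeds with the stated type.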
/ab specialist coverage (replaces CLI subcommands):
- `ab/manage` -- repo management (`/ab show registered repos`, `/ab add this repo`, `/ab remove repo X`), build, sync. Replaces manual `repos.json` editing and `agentboot sync`. Also surfaces the collateral capability: run git operations across all registered repos (`/ab git pull all`).
- `ab/upgrade` -- migrate existing hub to new AgentBoot core content when a new version ships. Replaces `agentboot upgrade` CLI (not yet built). Clarify → show diff of what would change → confirm → apply.
- `ab/import` -- replace the LLM-powered `agentboot import` CLI with a skill. Claude drives the import conversation: scan, classify, surface duplicates, offer promotion, handle source attribution. Consistent with the "Claude as UX layer" principle. CLI import remains for CI use; skill is the human path.
- `ab/diagnose` -- covers doctor, validate, audit, prune. `/ab something seems off`, `/ab prune stale artifacts`, `/ab audit for orphaned traits`. Replaces standalone `agentboot doctor`, `agentboot audit`.
CLI bug fixes (Phase 10, not enhancements):
- `agentboot doctor` runs from any directory, not just hub cwd
- Smart sync: only open PRs for repos affected by a given change, not all registered repos on every sync
- Fix org install: ensure initial commit is created correctly during hub scaffold
- Import: batch all files from the same repo into a single LLM call (perf); track timeouts (exit code 143) and offer retry; fix path-scoped files not appearing in scan; document which model is used
- Import source attribution: record which repo each imported artifact came from
- Import duplicate detection: offer to promote when the same content appears in multiple repos
Docs and narrative:
- Fix "developer never runs agentboot" theme in `docs/org-connection.md` -- this messaging conflicts with the grassroots artifact promotion pipeline, which depends on developers engaging with the tool
- Hub repo CI/CD guide: recommended branching strategy, versioning conventions, GitHub Actions templates for validate/build/sync
- GitHub bot setup: docs and scaffolding to help orgs configure a bot that auto-PRs sync artifacts and auto-merges when CI passes
- Document which subcommands require hub cwd vs work from anywhere (interim until `/ab` covers all of it)
Multi-platform repos:
- Support registering a repo for multiple output platforms (e.g. a repo used by both Claude Code and Copilot users receives both compiled outputs in a single sync)
Phase 11: "Stealth Exit" — (v0.11.0)
Target: AI Engineer World's Fair, June 29–July 2, 2026. Come out of stealth with a live marketplace, a working Second Brain demo, and competitive positioning.
What's planned:
- agentboot.dev/marketplace live -- hosted registry; `agentboot publish` functional; Core/Verified/Community tiers; web catalog.
- Community launch -- Dev.to, Hacker News Show HN, AI Engineer World's Fair, Latent Space Discord. Messaging: "context engineering, not prompt engineering."
- SuperClaude cross-listing -- shared portable trait format standard; cross-list in both marketplaces; submit to AAIF/AGENTS.md working group.
- MCP marketplace listing -- list `agentboot-brain` and `agentboot` MCP servers on mcp.so, pulsemcp, Smithery.
- Security hardening -- CVE audit against CVE-2025-59536/CVE-2026-21852/DXT patterns; security advisory; new `agentboot validate` check for dangerous hook patterns.
- OpenCode / Cline research -- integration viability assessment for 9th/10th output platforms.
- Competitive positioning -- internal docs: Google Conductor, Context Hub, SuperClaude, Copilot Cowork.
- Catalog built-in platform artifacts -- catalog what each agent platform (Claude Code, Copilot, Cursor, Gemini) ships natively alongside AB core artifacts; surface gaps and overlaps per platform in the marketplace.
- Copilot per-request cost model -- `agentboot cost-estimate` currently models token cost; Copilot CLI bills per premium request, not per token. Model both in cost-estimate output.
- Conventions and defaults docs -- how AgentBoot establishes conventions in a volatile/new industry; rationale for current defaults; how orgs override them.
- Anonymized usage telemetry opt-in -- user/org option to share anonymized data; informs roadmap prioritization.
- Platform obsolescence playbook -- design doc for the scenario where a platform ships a native feature that replaces AB-managed content; migration and graceful deprecation path.
- CLI rename deferred from Phase 10 -- install path rename (`hub` → `org`, `connect` → `user`) once CLI freeze lifts; post-adoption timing.
- Shell tab completion / OMZ plugin -- deferred from Phase 10 CLI freeze.
- `agentboot/agentboot` as GitHub template -- research and product planning: using the repo as a `gh repo create --template` source; concerns TBD.
- Richer CLI help -- verbose mode, examples per subcommand, man pages; deferred from Phase 10 CLI freeze.
Phase 12: "Org Scale" — (v0.12.0)
Gated on engineer adoption landing. Goals: compliance differentiators, governance lifecycle, multi-agent enterprise architecture.
What's planned:
- ADR governance -- exception lifecycle, expired ADR validation errors, `agentboot adr` CLI.
- `/insights` skill + org dashboard -- personal prompt pattern extraction; anonymized org metrics (rephrase rates, false positives, cost by team). Raw prompts never leave the machine.
- Domain layers: healthcare/fintech/govtech -- complete compliance packages (traits + personas + gotchas + instructions); Verified marketplace tier.
- A2A protocol support -- Agent-to-Agent server/client; personas as A2A-callable services; complements MCP.
- Autonomy progression -- Advisory → Auto-approve → Autonomous, with telemetry-gated promotion.
- Abstract/binding composition -- org semantic contracts, team implementations, validation check 9.
- Second Brain Stage 3 -- sqlite-vec semantic retrieval for 500+ knowledge item hubs.
- OpenCode + Cline output platforms -- if Phase 11 research confirms viability.
- Personal brain / brain of brains -- human-scoped knowledge federation. A person works across multiple orgs (each with its own hub and Second Brain). The brain of brains is a personal layer that aggregates across them — AT brain, HG brain, AB brain — queryable as a unified personal knowledge index. Org brains remain org-scoped; the personal layer is additive. Design TBD; see discussion notes.
- Distributed repo operations -- collateral of repo management: `agentboot git pull` (or `/ab git pull all`) runs a git operation across all registered repos. Surfaces naturally once `/ab manage` exists.
- OS user-scope via symlinks -- personal AgentBoot installation scoped to the OS user rather than a hub, managed via symlinks. Enables solo developers without a formal org hub.