Roadmap
Capability-driven, not date-driven. Phase boundaries are defined by delivered function. Items move from this page to the changelog when shipped.
This page mirrors FUTURE.md from the blacklight repo. The visual at-a-glance is on the lander under Roadmap; the prose detail is here.
The five phases
Phase 1: Operational hygiene and signing
- GPG-signed releases (item 8)
- Manifest rotation and agent retirement lifecycle (item 9)
- Interactive bl shell REPL (item 2)
- File-level cross-run evidence dedup (item 17, tier 0)
Phase gate (P1 → P2): signed-release pipeline operational; cross-run evidence dedup deployed.
Phase 2: Detection breadth and source-side compression
- Additional trigger sources: imunify, modsec-audit (item 5)
- Additional firewall backends: firewalld, fail2ban, nft jump chains (item 5)
- False-positive baseline tooling (item 12)
- bl setup --retire-queue operator command (extends item 9)
- Source-side log compaction: exact dedup, path normalization, burst compaction (item 17, tiers 1+2+3)
- Inline curator tool wiring: synthesize_defense, reconstruct_intent (item 1)
- Session turn delta envelopes (item 7)
- bl case note annotation surface (item 11)
- Operator config tree expansion (items 13, 15)
Phase gate (P2 → P3): at least one additional trigger source shipped; source-side log compaction deployed with measured ≥10× compression on representative incident replay; live API verification covers the core verb surface.
Phase 3: Fleet operation and cross-source correlation
- Multi-host fan-out: bl observe --fleet (item 3)
- callable_agents integration for hunter dispatch, pending API stability (item 3)
- Live API end-to-end verification across the verb surface (item 14)
- Cross-source IP correlation in evidence stream (item 17, tier 4)
Phase gate (P3 → P4): fleet fan-out operating across multi-host investigations; cross-source IP correlation in production; brief render confirmed against the live curator sandbox.
Phase 4: Skill extensibility and presentation
- Role-swappable substrate config (item 10)
- External skill directories with operator-local authoring (item 10)
- Brief HTML/PDF production-quality rendering (item 6)
- Portability and code quality hardening (item 16)
Phase gate (P4 → P5): OSS feature set stable for a 1.0 release; no breaking changes in the state.json schema for an extended period; release-signing pipeline supports SaaS-tier auto-update.
Phase 5: Commercial control plane
- Multi-tenant SaaS plane with hosted Managed Agents (item 4)
- Per-tenant case retention and audit trail (item 4)
- Web frontend for case review (item 4)
- Role-based access: operator, analyst, regulator (item 4)
Items in detail
1. Inline curator tool wiring: synthesize_defense and reconstruct_intent
The curator agent is provisioned with three custom tools (report_step, synthesize_defense, and reconstruct_intent), but only report_step is wired end-to-end. The other two are registered in the agent body and documented in the curator system prompt, but the wrapper does not consume their tool-use replies as first-class action surfaces.
Today the curator logs synthesis intent as a free-text case-log-note report_step, and the operator manually invokes bl defend modsec --from-action <act-id> to consume the defense payload. This is both an operator friction point and a forward-compatibility gap.
Capabilities delivered: wrapper routes on tool_name rather than verb; reconstruct_intent reply lands directly into attribution.md; per-turn synthesizer input shifts to a delta envelope.
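A minimal sketch of routing on tool_name, assuming the wrapper has already extracted the tool name and payload from the curator's tool-use reply. Only the three tool names come from the text above; the handler functions, their output, and the exit code are hypothetical placeholders rather than code from the blacklight source.

```shell
# Hypothetical handlers: each one only prints what a real handler would act on.
handle_report_step()        { echo "queue-step: $1"; }
handle_synthesize_defense() { echo "defense-payload: $1"; }
handle_reconstruct_intent() { echo "append-to attribution.md: $1"; }

# Dispatch a tool-use reply by tool name rather than by verb.
# $1 = tool_name, $2 = payload (opaque to the router).
route_tool_reply() {
  local tool_name=$1 payload=$2
  case "$tool_name" in
    report_step)        handle_report_step "$payload" ;;
    synthesize_defense) handle_synthesize_defense "$payload" ;;
    reconstruct_intent) handle_reconstruct_intent "$payload" ;;
    *)
      # Unknown tools fail loudly instead of falling through to a default verb.
      echo "unknown tool: $tool_name" >&2
      return 64
      ;;
  esac
}
```

With this shape, adding a fourth tool is one new case arm and one handler, with no change to verb parsing.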
2. bl shell: interactive investigation REPL
bl shell [<case-id>] loops over the pending step queue, presents each step with diff and reasoning, and accepts y / N / explain / skip / abort keystrokes without re-invoking bl run for each step. Aimed at multi-step incident flows where the operator is at the terminal watching the curator work through a case.
The unattended path (bl_is_unattended) already handles the no-TTY case; bl shell is the TTY-present, operator-led counterpart.
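The keystroke loop could look roughly like the sketch below. It is an illustration only: the function name, the prompt text, and the printed decision lines are assumptions, and the meaning given to each answer (for example, that an empty answer counts as N) is a guess at the intended behaviour, not the shipped design.

```shell
# Walk a list of pending step ids, reading one answer per prompt from stdin.
# Prints one decision line per answer; returns 1 if the operator aborts.
review_steps() {
  local step answer
  for step in "$@"; do
    while :; do
      printf 'step %s [y/N/explain/skip/abort]? ' "$step" >&2
      # End of input is treated as abort so the loop never spins on EOF.
      IFS= read -r answer || answer=abort
      case "$answer" in
        y|Y)     echo "run $step";     break ;;
        explain) echo "explain $step"        ;;  # show reasoning, then re-prompt
        skip)    echo "skip $step";    break ;;
        abort)   echo "abort";         return 1 ;;
        *)       echo "decline $step"; break ;;  # anything else counts as N
      esac
    done
  done
}
```

A real implementation would presumably check for a TTY first and hand off to the unattended path when none is present.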
3. Multi-host fan-out: bl observe --fleet
bl observe --fleet <hostfile> fans out the observe verbs to N hosts over SSH, collects per-host evidence bundles locally, and merges them into a single multi-host case for the curator. The curator session receives a combined bundle with per-host labeled evidence streams; its 1M context window correlates cross-host signals without summarization loss.
Architecture decision pending: the original design named this as the callsite for Sonnet 4.6 hunters dispatched as Managed Agents callable_agents. That primitive was unavailable when the design was written. Revisit once callable_agents is stable.
Explicitly cut: no fleet daemon, no persistent fleet agent, no heartbeat protocol. Each host runs bl as a stateless CLI; orchestration is the operator's SSH access plus a merge script.
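The "SSH access plus a merge script" shape can be sketched as follows. Everything here is hypothetical: the function names, the FLEET_RUN override (included only so the transport can be swapped out), the one-file-per-host layout, and the tab-prefixed merge format are assumptions for illustration, not the planned bundle format.

```shell
# Run the same command on every host in a hostfile, one output file per host.
# $1 = hostfile (one host per line, '#' comments allowed), $2 = output dir,
# remaining args = remote command. FLEET_RUN defaults to ssh.
fleet_observe() {
  local hostfile=$1 outdir=$2 host
  shift 2
  mkdir -p "$outdir"
  while IFS= read -r host; do
    case "$host" in ''|'#'*) continue ;; esac
    # stdin comes from /dev/null so the transport cannot consume the hostfile.
    ${FLEET_RUN:-ssh} "$host" "$@" > "$outdir/$host.jsonl" < /dev/null &
  done < "$hostfile"
  wait   # this sketch does not collect per-host exit codes
}

# Merge per-host files into one stream, labelling each record with its host.
fleet_merge() {
  local outdir=$1 f host
  for f in "$outdir"/*.jsonl; do
    [ -e "$f" ] || continue
    host=${f##*/}
    host=${host%.jsonl}
    awk -v h="$host" '{ print h "\t" $0 }' "$f"
  done
}
```

Because each host is just a stateless command invocation, nothing persists on the remote side between runs, which matches the no-daemon constraint above.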
4. SaaS control plane and multi-tenancy
A hosted plane with per-tenant Managed Agents, per-tenant case retention and audit trail, web frontend for case review, and role-based access (operator / analyst / regulator). Commercial product build above the OSS bl CLI, not a CLI extension.
OSS-side prerequisites: stable state.json schema; signed releases (item 8); BL_REPO_URL env override (already present).
5. Additional defensive backends and trigger sources
Backends to add: firewalld, fail2ban, nft jump chains.
Trigger sources to add: bl trigger imunify, bl trigger modsec-audit.
6. Incident brief: HTML and PDF rendering
The bl_case_close_stage2_render function is written: it POSTs a wake event to the curator session requesting HTML and PDF render of the brief Markdown, then polls /v1/files?scope_id=<session_id> for up to 60 seconds. The curator sandbox env-create body cannot pre-install pandoc and weasyprint (the env body accepts only {name, config}, see Anthropic API Notes §1), so the agent installs them per-session via the bash tool when the render is requested.
Validation: tests/live/brief-render-live.bats exercises bl case close against a real session and confirms brief-CASE-*.{html,pdf} appear in the Files API response.
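The bounded poll described above follows a common pattern, sketched here as a generic helper. The helper name and the polling interval are assumptions. The check that queries the Files API is deliberately left as a caller-supplied command, since its request and response shape is not specified on this page.

```shell
# Retry a command until it succeeds or a deadline passes.
# $1 = timeout in seconds, $2 = interval in seconds, rest = command to run.
# Returns 0 on the first success, 1 if the deadline is reached first.
poll_until() {
  local timeout=$1 interval=$2 deadline
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}
```

A call such as `poll_until 60 2 brief_files_present "$session_id"` would match the 60-second budget, where brief_files_present stands in for whatever function checks the Files API listing.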
7. Session turn delta envelopes
Each curator session turn currently re-sends the full case YAML as the user-message content. Long-running cases inflate per-turn token cost.
Capabilities delivered: per-turn user message body carries only the diff since the last wake event; session-wake event body carries since_event_id; the case-log-note bridge (item 1) retires in the same pass.
8. Signed releases (GPG)
make release signs the assembled bl artifact with the project release key, emitting bl.sig. install.sh verifies the signature before placing the binary. Required prerequisite for the SaaS control plane (item 4).
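A rough sketch of the verify-before-install step, assuming bl.sig is a detached GPG signature and that the release public key is already in the installer's keyring. The function names and the destination argument are hypothetical, and the real install.sh may differ in structure.

```shell
# Signing side, shown as a comment because it needs the private release key:
#   gpg --detach-sign --output bl.sig bl

# Check a detached signature. Returns 2 if either file is missing,
# otherwise the exit status of gpg --verify.
verify_release() {
  local artifact=$1 sig=$2
  if [ ! -f "$artifact" ] || [ ! -f "$sig" ]; then
    echo "verify: missing artifact or signature" >&2
    return 2
  fi
  gpg --verify "$sig" "$artifact" 2>/dev/null
}

# Place the artifact only after its signature has been verified.
install_verified() {
  local artifact=$1 sig=$2 dest=$3
  verify_release "$artifact" "$sig" || {
    echo "install: signature check failed, refusing to install" >&2
    return 1
  }
  install -m 0755 "$artifact" "$dest"
}
```

The point of the ordering is that an unsigned or tampered artifact never reaches the destination path.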
9. Manifest rotation and agent retirement lifecycle
Three gaps:
- Agent version pinning. When bl setup --eval --promote bumps the agent, sessions referencing the previous version are not invalidated or migrated. After this lands, state.json tracks agent.version (CAS field); on --promote, sessions on the previous version receive a deprecation notice in the outbox.
- Retire-queue processing. bl_case_close_schedule_retire appends entries to retire-queue.jsonl for applied actions with a retire_hint duration. bl setup --retire-queue reads the queue, presents expired entries, and asks the operator whether to revoke each.
- Workspace drift reconciliation. bl_files_list_workspace is defined but unwired. bl setup --reset and --gc currently treat state.json as authoritative; any drift leaks orphan Files. Pickup order: --reset (highest value) → --check (new diagnostic) → --gc (storage cost only).
10. Role-swappable substrate and third-party skill extensibility
Three gaps:
- External skill directories. bl setup --sync only reads from BL_REPO_ROOT/skills/. No --skill-dir flag for an external path.
- Skill authorship validation. bl_setup_seed_skills checks the 1024-char description.txt cap but does not validate SKILL.md structure or warn on bodies that exceed the effective context window share.
- Substrate role config. The curator cannot be configured to assume a substrate at case-open time (e.g. bl consult --new --substrate nginx).
11. bl case note: manual annotation surface
bl case note "..." is listed in bl_help_case but not routed in the dispatcher. The operator currently has no first-class CLI surface for appending freeform annotations to the active case ledger without invoking bl consult --attach.
12. False-positive baseline tooling
_bl_defend_sig_fp_gate checks $BL_DEFEND_FP_CORPUS (default /var/lib/bl/fp-corpus) and bypasses the gate when the directory is missing. The baseline directory is never populated by bl setup.
Capabilities delivered: bl setup --sync populates /var/lib/bl/fp-corpus/ with a baseline set of clean-file samples; bl setup --gc prunes baseline entries older than 90 days.
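The gate and the prune step can be pictured with the sketch below. It is not the body of _bl_defend_sig_fp_gate. The function names are hypothetical, and matching a signature against the corpus with grep -E is an assumed stand-in for however the real gate evaluates signatures. Only the environment variable, its default path, and the 90-day window come from the text above.

```shell
# Return 0 if a candidate signature is safe to apply, 1 if it matches a
# known-clean sample. A missing corpus bypasses the gate, as described above.
fp_gate() {
  local pattern=$1 corpus=${BL_DEFEND_FP_CORPUS:-/var/lib/bl/fp-corpus}
  if [ ! -d "$corpus" ]; then
    echo "fp-gate: corpus missing, bypassing" >&2
    return 0
  fi
  if grep -rqE -- "$pattern" "$corpus"; then
    echo "fp-gate: pattern matches a clean sample" >&2
    return 1
  fi
  return 0
}

# Remove baseline samples whose mtime is older than roughly N days (default 90).
fp_prune() {
  local days=${1:-90} corpus=${BL_DEFEND_FP_CORPUS:-/var/lib/bl/fp-corpus}
  [ -d "$corpus" ] || return 0
  find "$corpus" -type f -mtime +"$days" -exec rm -f {} +
}
```

The sketch makes the current gap visible: until something populates the directory, every signature takes the bypass branch.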
13. Per-source dedup window configuration
When bl trigger imunify and bl trigger modsec-audit land (item 5), each needs its own dedup window config key. The _bl_load_blacklight_conf allowlist extends to cover them.
14. Live API verification across the verb surface
Live tests do not yet exist for: bl observe apache → real Sonnet 4.6 summary render; bl consult --new → real session creation → real pending step poll; bl run → real step execution → real result POST; bl case close → real brief render; bl defend modsec → real apachectl -t pre-flight.
These are the only ground-truth proof that the API shapes documented during development are correct end-to-end.
15. Operator config tree expansion — shipped in v0.5.2
The /etc/blacklight/blacklight.conf allowlist grew from 7 → 22 keys in the v0.5.2 release: log verbosity, the LLM kill switch, defend's ASN/CDN/CIDR-floor levers, clean's TTL/grace, observe's journal cap, scanner sig dirs, and the alt skill repo are now first-class conf knobs. New keys ship commented-out at their source defaults so unconfigured operators see no behaviour change. See Configuration Reference for the landed shape.
16. Portability and code quality hardening
mv -T portability (src/bl.d/83-clean.sh). mv -T is a GNU coreutils extension; BSD mv does not support it. Mitigation: a small rename(2) C helper, or document the BSD caveat.
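For comparison, a third and weaker option is a pure-shell guard, sketched below. It is not one of the two mitigations named above and is not equivalent to mv -T: the check and the move are separate steps, so it is not atomic, and it refuses an existing directory destination in cases where GNU mv -T would rename over an empty one. The function name is hypothetical.

```shell
# Move src to dst, but never *into* dst when dst is an existing directory
# (or a symlink to one). Approximates the common use of `mv -T`.
mv_no_target_dir() {
  local src=$1 dst=$2
  if [ -d "$dst" ]; then
    echo "mv: refusing to move into existing directory: $dst" >&2
    return 1
  fi
  mv -- "$src" "$dst"
}
```

Whether this is acceptable depends on whether the call site in 83-clean.sh relies on the atomic replace semantics, which this page does not state.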
local var=$(...) exit-code masking. Several older functions predate the two-statement form (local var, then var=$(...)) that the coding convention requires. A shellcheck SC2155 sweep surfaces the remaining instances.
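The masking is easy to reproduce in isolation. In the sketch below, the function names are illustrative and fails stands in for any command that can exit non-zero.

```shell
# Stand-in for a command that fails with a distinctive status.
fails() { return 3; }

# Combined form: $? reflects the `local` builtin, which succeeds,
# so the failure of fails() is lost.
masked() {
  local out=$(fails)
  echo "$?"
}

# Split form: declaration and assignment are separate statements,
# so the status of the command substitution is observable.
split() {
  local out rc=0
  out=$(fails) || rc=$?
  echo "$rc"
}

masked   # prints 0
split    # prints 3
```

Any error handling that follows the combined form, such as `|| return 1`, silently never fires, which is why the convention requires the split.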
17. Source-side log compaction and normalization
Attack patterns dominated by template-shaped repetition produce evidence files 10–100× larger than their information content. Today's only volume guards are mechanical (10k journal head-truncate, 64 MB triage cap, 100 MB / 500 MB bundle limits). None operate on content semantics.
Tiered rollout (build order = dependency order):
- Tier 0: File-level cross-run dedup. SHA-256 the new JSONL stream and skip upload+attach when it matches prev_sha256. Trivial; lands first.
- Tier 1+2: Exact dedup plus path normalization. A new _bl_obs_compact_stream awk pass with template grouping (ip+method+path_norm+status+ua for apache; rule_id+normalized_uri+action for modsec). Forensic anchor preserved via three raw samples.
- Tier 3: Burst compaction. Within a template group, runs of records with ts_delta < burst_gap (default 5s) collapse to one burst record. O(1) memory per in-flight burst.
- Tier 4: Cross-source IP correlation. Case-scoped ip-seen.tsv updated under flock; each record enriched with correlated_sources[].
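The Tier 3 burst rule can be illustrated with a small awk pass. This is a sketch under stated assumptions, not the planned _bl_obs_compact_stream. It assumes time-ordered input of tab-separated epoch seconds and an already computed template key, and it emits a made-up output layout of key, first timestamp, last timestamp, and count. The function name is hypothetical.

```shell
# Collapse runs of same-key records whose gap to the previous record is
# below burst_gap seconds. stdin: "<epoch_seconds>\t<template_key>".
# stdout: "<template_key>\t<first_ts>\t<last_ts>\t<count>" per burst.
compact_bursts() {
  local gap=${1:-5}
  awk -F '\t' -v gap="$gap" '
    function flush(k) {
      printf "%s\t%s\t%s\t%d\n", k, first[k], last[k], count[k]
    }
    {
      k = $2
      # A gap of burst_gap or more closes the in-flight burst for this key.
      if ((k in count) && ($1 - last[k]) >= gap) {
        flush(k)
        delete count[k]; delete first[k]; delete last[k]
      }
      if (!(k in count)) {
        first[k] = $1
        count[k] = 0
      }
      last[k] = $1
      count[k]++
    }
    END {
      # Emit bursts still open at end of input (order across keys is unspecified).
      for (k in count) flush(k)
    }
  '
}
```

State per in-flight burst is three scalars, which is where the O(1) memory claim comes from. This sketch only closes a burst when its key recurs or input ends, so a production pass would also want to expire idle keys.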
Compression expectations on a typical APSB25-94-class incident: Apache brute-force 5,000 → 8–20 records; ModSec rule storm 800 → 5–12; journal 10,000 → 50–200.
Schema impact: additive only.
Cross-references
- Phase boundaries: ship-when-ready, not calendar-bound.
- Where items move when shipped: repo CHANGELOG.
- Architecture: Architecture doc.
- What we cut to get here: Pivot Moments.
- Build cadence: Build Timeline.
Source of truth: FUTURE.md in the blacklight repo.