Architecture
The shape, precise enough to code against. The companion pivots doc answers why blacklight took this shape; this one answers what the shape is.
The pitch in one paragraph
blacklight is a portable bash Runner (bl) that turns any Linux host into an agent-directed incident-response surface. The Runner runs locally on bash 4.1+ with curl, awk, and jq. No daemons, no Python. Every investigation is a conversation between the operator, the Runner, and a Managed Agents session hosted in the operator's Anthropic workspace. That session runs Opus 4.7 with 1M context and the skills bundle mounted at creation. It decides what to look at next, authors the defensive payloads (ModSec rules, firewall entries, YARA sigs), and prescribes remediation (rogue cron stripping, .htaccess cleanup, quarantine). The Runner executes; the agent directs; the existing defensive primitives your host already runs (ModSec, APF, CSF, iptables, nftables, LMD, ClamAV, YARA) are the hands. Man-days of manual IR become agentic-minutes on the substrate the defender already owns.
Three named layers
The system is three named layers: Runner, Curator, Substrate. Each has its own surface, its own contract, and its own enforcement boundary.
┌─ Layer 01 / Runner / `bl` on the host (bash, ~1000 lines, one file) ──────┐
│                                                                            │
│  bl observe  bl consult  bl run  bl defend  bl clean  bl case  bl setup    │
│                                                                            │
│  Dependencies: bash ≥ 4.1, curl, awk, jq, grep, sed, tar, gzip.            │
│  No daemons. No services. Invoked per operator thought; exits when done.   │
│                                                                            │
└─────────────────────────────────────────────┬──────────────────────────────┘
                                              │ HTTPS + API key
                                              ↓
┌─ Layer 02 / Curator / Managed Agent session (Anthropic-hosted) ───────────┐
│                                                                            │
│  agent         bl-curator (Opus 4.7 + 1M context, Managed Agent session)   │
│  environment   bl-curator-env (apt: apache2, mod_security2, yara,          │
│                  jq, zstd, duckdb: installed once at env creation)         │
│  memory store  bl-skills  65 markdown files across 20 domains, ro          │
│  memory store  bl-case    hypothesis + evidence + pending + results        │
│  files         evidence bundles, shell samples, closed-case briefs         │
│                                                                            │
│  No local runtime process. The session lives in the Anthropic workspace.   │
│  The Runner reaches it via HTTPS on every invocation.                      │
│                                                                            │
└─────────────────────────────────────────────┬──────────────────────────────┘
                                              │ step directives
                                              ↓
┌─ Layer 03 / Substrate / defensive primitives on the host ─────────────────┐
│                                                                            │
│  apachectl + mod_security, APF, CSF, iptables, nftables, LMD, ClamAV,      │
│  YARA, Apache / nginx logs, journalctl, crontab, find, stat, cat -v        │
│                                                                            │
│  blacklight directs the Substrate; it does not install, replace, or        │
│  re-abstract it. The primitives predate blacklight and will outlive it.    │
│                                                                            │
└────────────────────────────────────────────────────────────────────────────┘
Layer boundary rules
- Runner never decides what action to take. It executes what the Curator prescribes (gated by safety tiers) and reports results.
- Curator never touches the host filesystem or primitives directly. It reasons, authors, prescribes. It never applies.
- Substrate is untouched by blacklight source code. No new rule engines, no new manifests, no new wire formats. Only native usage of existing primitives.
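A minimal sketch of what a Runner-side tier gate could look like. The tier names come from the glossary, and the read-only, auto, and destructive behavior follows the runtime flow. The function name bl_gate, its output strings, and the handling of suggested (ask the operator) and unknown (refuse) are assumptions made for illustration, not confirmed blacklight behavior.

```bash
#!/usr/bin/env bash
# Hypothetical tier gate: maps an action tier to what the Runner does next.
# Usage: bl_gate <tier> [--yes]
bl_gate() {
  local tier="$1" yes_flag="${2:-}"
  case "$tier" in
    read-only|auto)
      echo "exec"                 # auto-executed without a prompt
      ;;
    suggested)
      echo "confirm"              # ASSUMPTION: ask the operator first
      ;;
    destructive)
      # Stop, show the diff, and require --yes for this specific step.
      if [[ "$yes_flag" == "--yes" ]]; then
        echo "exec"
      else
        echo "diff-and-confirm"
      fi
      ;;
    *)
      echo "refuse"               # ASSUMPTION: unknown tiers never execute
      return 1
      ;;
  esac
}
```

The point of the sketch is that the Runner only gates; it never chooses the step. The decision input is the tier the Curator attached to the step, plus the operator's explicit consent.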
Runtime flow
blacklight investigations are operator-agent conversations, not batch dossier analyses. The canonical flow:
operator                    Runner (bl)                       Curator (Managed Agent)
────────                    ───────────                       ─────────────────────
$ bl consult \
    --new --trigger <hit> ┐
                          │ preflight workspace
                          │ create case record
                          │ POST session event
                                                              session.wake
                                                              read bl-skills/*
                                                              read bl-case/hypothesis
                                                              reason, emit 4 steps to
                                                                bl-case/pending/s-01..04
                            poll bl-case/pending
                            show proposed steps:
                              s-01 observe log apache --around … --window 6h
                              s-02 observe fs --mtime-cluster …
                              s-03 observe htaccess …
                              s-04 observe cron --user …
Accept? [Y/n] Y           ┐
                          │ exec each step locally
                          │ write result to
                          │   bl-case/results/s-01..04.json
                          │ POST wake event
                                                              read results
                                                              revise hypothesis
                                                              emit next batch:
                                                                s-05..07 observe
                                                                s-08 defend firewall
                                                                s-09 defend modsec
                                                                s-10 clean cron
                                                                s-11 clean htaccess
                            auto-exec read-only
                            auto-exec auto-tier
                            stop on destructive;
                              show diff; require
                              --yes per step
Confirm cron removal? y
                            exec, write result
                            …
                                                              propose close_case
                                                                when open_questions = 0
$ bl case close           ┐
                          │ archive brief to Files
                          │ precedent pointer in bl-case
                          │ retire firewall blocks @ T+30d
The mechanical choice is async step-emit over polled memory-store files, not synchronous SSE tool-result. The agent writes each proposed step as JSON to bl-case/pending/<id>.json. The Runner consumes pending steps in one of two modes: a continuous poll loop used by the bl consult foreground REPL (3-second tick, dedup against a seen-set, exit on end_turn or --timeout), or an on-demand single fetch via bl run --list for batched or async operator workflows. In both modes the Runner executes the step, writes the result to bl-case/results/<id>.json, and sends a wake event.
This avoids SSE bidirectional plumbing in bash, makes the case memory a self-documenting audit log, and keeps bl a short-lived command per invocation. Polling overhead (~3-9s per loop tick) is invisible against agent reasoning time.
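The poll loop described above can be sketched roughly as follows. This is an illustration under stated assumptions: fetch_pending_ids and run_step are hypothetical stand-ins supplied by the caller (the real Runner would list bl-case/pending/ over HTTPS), and the exit-on-end_turn condition is left out, so only the timeout ends the loop.

```bash
#!/usr/bin/env bash
# Hypothetical poll loop: run each pending step exactly once until timeout.
# Usage: poll_pending <tick-seconds> <timeout-seconds>
# Expects two caller-defined functions (both assumptions of this sketch):
#   fetch_pending_ids  prints one pending step id per line
#   run_step <id>      executes the step, writes the result, sends the wake
poll_pending() {
  local tick="${1:-3}" timeout="${2:-60}"
  local -A seen=()                       # dedup set (bash >= 4 assoc array)
  local deadline=$(( SECONDS + timeout ))
  local id
  while (( SECONDS < deadline )); do
    while IFS= read -r id; do
      [[ -n "$id" ]] || continue
      [[ -z "${seen[$id]:-}" ]] || continue   # already handled on an earlier tick
      seen[$id]=1
      run_step "$id"
    done < <(fetch_pending_ids)          # process substitution keeps 'seen' in this shell
    sleep "$tick"
  done
}
```

Reading the ids through process substitution rather than a pipe matters here: a pipe would run the inner loop in a subshell and the seen-set would be lost between ticks.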
Command namespace
Six runtime namespaces plus one setup command. All bash functions in a single bl script, dispatched by first argument.
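A minimal sketch of first-argument dispatch in a single script. The function names (bl_main, bl_observe, bl_consult) and the exit codes are assumptions for illustration; only the namespace list is taken from this document.

```bash
#!/usr/bin/env bash
# Hypothetical dispatcher: one function per namespace, selected by "$1".
bl_observe() { echo "observe: $*"; }
bl_consult() { echo "consult: $*"; }
# bl_run, bl_defend, bl_clean, bl_case, bl_setup would follow the same shape.

bl_main() {
  local ns="${1:-}"
  [[ $# -gt 0 ]] && shift
  case "$ns" in
    observe|consult|run|defend|clean|case|setup)
      if declare -F "bl_${ns}" >/dev/null; then
        "bl_${ns}" "$@"                  # hand the remaining args to the namespace
      else
        echo "bl: namespace '${ns}' not implemented in this sketch" >&2
        return 70
      fi
      ;;
    *)
      echo "usage: bl {observe|consult|run|defend|clean|case|setup} ..." >&2
      return 64
      ;;
  esac
}
```

In a real single-file script the last line would be a call such as bl_main "$@", so that bl observe file /path ends up in bl_observe with file /path as its arguments.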
bl observe: read-only evidence extraction
Auto-runs (no confirm). Emits JSONL to stdout and appends structured output to the current case.
bl observe file <path>                              stat + magic + sha256 + strings + file(1)
bl observe log apache --around <path>               time-window vhost log slice → JSONL
bl observe log modsec [--txn|--rule|--around]       ModSec audit A/B/F/H sections → JSONL
bl observe log journal --since <time>               journalctl extract → JSONL
bl observe cron --user <user> [--system]            crontab -l | cat -v (ANSI ESC[2J reveal)
bl observe proc --user <user> [--verify-argv]       argv[0] vs /proc/<pid>/exe basename
bl observe htaccess <dir> [--recursive]             injected directive flagging
bl observe fs --mtime-cluster <path> --window <N>s  cluster discovery
bl observe fs --mtime-since <date>                  retrospective sweep
bl observe firewall [--backend auto]                APF/CSF/iptables/nftables enumeration
bl observe sigs [--scanner lmd|clamav|yara]         loaded signature inventory
bl observe substrate                                12 substrate.category JSONL records
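To illustrate the cat -v reveal used by bl observe cron: a crontab entry followed by ANSI escapes can clear itself from the terminal when listed raw, while cat -v prints the escape bytes in caret notation. The sample entry and the helper name reveal are made up for this sketch.

```bash
#!/usr/bin/env bash
# A made-up crontab line followed by ESC[2J (clear screen) and ESC[H (cursor
# home). Printed raw to a terminal, the escapes wipe the entry from view.
sample=$'* * * * * /tmp/.x/run\033[2J\033[H'

# cat -v renders non-printing bytes in caret notation, so ESC shows up as ^[.
reveal() { cat -v; }

printf '%s\n' "$sample" | reveal
# prints: * * * * * /tmp/.x/run^[[2J^[[H
```

On a host the input would come from crontab -l rather than a variable.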
bl consult: session attach and case management
bl consult --new --trigger <path-or-event>  open a case
bl consult --attach <case-id>               tag observations to existing case
bl consult --sweep-mode --cve <id>          retrospective fleet posture (no case open)
bl run: execute agent-prescribed step
bl run <step-id> [--yes] [--dry-run]         execute a single step
bl run --batch s-01..s-07 [--yes-auto-tier]  batched contiguous run
bl run --list                                enumerate pending steps for current case
bl defend: apply agent-authored payload
bl defend modsec <rule-file-or-id>        configtest → symlink-swap → graceful
bl defend firewall <ip> [--backend auto]  CDN safe-list → apply → ledger entry
bl defend sig <sig-file>                  FP corpus gate → append → reload
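The configtest → symlink-swap → graceful sequence could be sketched as below. This is an assumption-laden illustration: swap_symlink and apply_rule are hypothetical names, and configtest and graceful_reload are caller-supplied functions (on a real host they would presumably wrap apachectl -t and apachectl graceful). The swap relies on mv -T from GNU coreutils.

```bash
#!/usr/bin/env bash
# Hypothetical symlink swap: repoint <link> at <new_target> in one rename.
# ln builds the new link under a temporary name; mv -T (GNU coreutils)
# renames it over the old link, so readers never see a missing file.
swap_symlink() {
  local new_target="$1" link="$2"
  local tmp="${link}.tmp.$$"
  ln -s "$new_target" "$tmp" || return 1
  mv -T "$tmp" "$link" || { rm -f "$tmp"; return 1; }
}

# Hypothetical apply sequence. configtest and graceful_reload must be
# defined by the caller; they are NOT part of this sketch.
apply_rule() {
  local new_rule="$1" link="$2"
  if ! configtest "$new_rule"; then
    echo "configtest failed; active rules left untouched" >&2
    return 1
  fi
  swap_symlink "$new_rule" "$link" || return 1
  graceful_reload
}
```

Running the configtest before the swap means a rejected rule never becomes the active target, so there is nothing to roll back on failure.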
bl clean: destructive remediation, diff-confirmed
bl clean htaccess <dir> [--patch <id>]      diff → backup → apply
bl clean cron --user <user> [--patch <id>]  same pattern, ANSI-aware
bl clean proc <pid> [--capture]             /proc snapshot → SIGTERM → SIGKILL
bl clean file <path> [--reason <str>]       quarantine, never delete
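A rough sketch of "quarantine, never delete" for bl clean file. Everything here is an assumption made for illustration: the function name quarantine_file, the hash-named destination, the chmod 000, and the manifest.tsv format are not taken from blacklight.

```bash
#!/usr/bin/env bash
# Hypothetical quarantine: move a file into a holding directory under its
# SHA-256, strip all permissions, and record where it came from.
# Usage: quarantine_file <path> <quarantine-dir> [reason]
quarantine_file() {
  local path="$1" qdir="$2" reason="${3:-unspecified}"
  local sum dest
  [[ -f "$path" ]] || { echo "not a regular file: $path" >&2; return 1; }
  sum="$(sha256sum -- "$path" | awk '{print $1}')"
  [[ -n "$sum" ]] || return 1
  dest="${qdir}/${sum}"
  mkdir -p -- "$qdir" || return 1
  mv -- "$path" "$dest" || return 1      # moved, not removed
  chmod 000 -- "$dest"                   # no read, write, or execute bits
  # Tab-separated manifest line: hash, original path, reason.
  printf '%s\t%s\t%s\n' "$sum" "$path" "$reason" >> "${qdir}/manifest.tsv"
  echo "$dest"
}
```

Keeping the original path in a manifest is what makes the action reversible: the sample stays available for analysis and a false positive can be restored.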
bl case: case lifecycle
bl case show [<case-id>]              hypothesis, evidence, pending, applied
bl case log [<case-id>] [--audit]     chronological ledger with fence-decode
bl case list [--open|--closed|--all]  workspace case roster
bl case close [<case-id>]             render brief → schedule retire-sweep
bl case reopen <case-id>              re-attach closed case to curator
bl setup: workspace bootstrap
bl setup          idempotent provision
bl setup --sync   diff local skills, push changed only
bl setup --check  dry-run preflight
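One way the "push changed only" half of bl setup --sync could be approached is a hash comparison against a manifest from the last sync. This is a sketch, not blacklight's method: the function name changed_skills and the manifest format (plain sha256sum output, taken from inside the skills directory) are assumptions, and sort -z and xargs -0 -r are GNU options.

```bash
#!/usr/bin/env bash
# Hypothetical change detector: print local skill files whose hash line is
# absent from a previously saved manifest.
# Assumed manifest format: "<sha256>  ./<relative path>" per line, i.e. the
# output of sha256sum run inside the skills directory.
# Usage: changed_skills <skills-dir> <manifest-file>
changed_skills() {
  local skills_dir="$1" manifest="$2"
  ( cd "$skills_dir" &&
    find . -type f -name '*.md' -print0 | sort -z | xargs -0 -r sha256sum ) |
    grep -v -x -F -f "$manifest" |       # drop lines identical to the manifest
    sed 's/^[0-9a-f]*  //'               # keep only the path
}
```

New files and edited files both show up, because neither has a matching hash line in the manifest.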
Dependencies
Tier 1: always present on the host
- bash ≥ 4.1
- coreutils (ls, cat, stat, find, sort, uniq, head, tail, wc, sha256sum, tar, gzip)
- curl
- awk (mawk or gawk; prefer gawk for associative arrays)
- sed
- grep (GNU preferred for -F -f patterns.txt Aho-Corasick speed)
Tier 2: ship as bl deps
- jq: single static binary, ~3 MB, portable back to CentOS 6. Non-optional.
- zstd: optional, runtime-detected. Falls back to gzip.
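Runtime detection with a gzip fallback might look like the following. The helper names pick_compressor and compress_stream are hypothetical; command -v, zstd -c, and gzip -c are standard.

```bash
#!/usr/bin/env bash
# Hypothetical helper: choose the compressor at runtime, not at install time.
pick_compressor() {
  if command -v zstd >/dev/null 2>&1; then
    echo "zstd"
  else
    echo "gzip"
  fi
}

# Compress stdin to stdout with whichever tool is available.
compress_stream() {
  case "$(pick_compressor)" in
    zstd) zstd -q -c ;;
    gzip) gzip -c ;;
  esac
}
```

Usage would be along the lines of tar -cf - evidence/ | compress_stream > bundle, with the caller choosing the file extension from pick_compressor's output.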
Tier 3: curator sandbox only
Inside the Anthropic-hosted environment, provisioned at env creation:
- apache2 + libapache2-mod-security2 + modsecurity-crs: for apachectl -t pre-flight of synthesized ModSec rules
- yara: for on-sandbox signature testing
- duckdb: for agentic SQL over JSONL
None of these Tier-3 deps are installed on the fleet host. The sandbox has them; the host does not.
Explicitly NOT required
- No Python on any host
- No Docker (operator can use docker for demo fixtures; bl runs native)
- No systemd requirement
- No database (SQLite or otherwise). All state lives in memory stores, files, and a small local ledger
- No web server or local HTTP listener. bl is a command, not a service
Non-goals
- Fleet-scope orchestration. bl is per-host. Fleet propagation rides the operator's existing primitive (Puppet, Ansible, Salt, Chef, manual SSH). blacklight generates the payload; the operator propagates.
- Continuous posture monitoring daemon. blacklight is trigger-bound by design. Periodic sweeps and trajectory analysis are deferred future work.
- Web frontend or dashboard. The terminal REPL is the operator surface. A rendered HTML brief is a post-close artifact.
- Replacing defensive primitives. blacklight directs apachectl, apf, csf, iptables, nftables, maldet, clamscan, yara. It does not re-implement any of them.
- Cross-CVE threat intelligence sharing. Future work.
- Windows / BSD support. Future work. The current build is Linux only.
Glossary
- Case: an investigation; carries hypothesis, evidence, actions, precedent. One per incident.
- Step: a single action the agent prescribes. Has an action tier, a verb, typed arguments, and a reasoning field. Written to bl-case/pending/.
- Action tier: one of read-only, auto, suggested, destructive, unknown. Determines gate behavior.
- Skill: an operator-voice markdown file in bl-skills. The programmable knowledge surface.
- Trigger: the first signal that opens a case (e.g. maldet quarantine, auditd critical event, ModSec rule fire).
- Curator: the Managed Agents session that owns a case.
- Synthesizer: the curator's synthesize_defense custom-tool emit surface for authoring a defensive payload.
- Intent reconstructor: the curator's reconstruct_intent custom-tool emit surface for analyzing a malware sample.
- Precedent: a closed case accessible to future cases via bl-archive/ (lives within the bl-case memory store in v2, not a separate store).
- Defense: any applied change to host state that reduces attack surface (ModSec rule, firewall entry, scanner signature).
- Remediation: any applied change to host state that removes attacker presence (file quarantine, cron strip, .htaccess edit, process kill).