Evidence Format
JSONL on the wire. Bundles are tar+gzip with a summary the curator reads first. Raw logs never enter context directly.
Every bl observe output is JSONL, one record per line. Fields vary by source but share a common preamble. Bundles compose the per-source JSONL streams, a summary.md first-read, and a MANIFEST.json for verification.
JSONL on the wire
{"ts":"2026-04-24T04:17:08Z","host":"example-host","source":"apache.transfer","record":{"client_ip":"203.0.113.42","method":"POST","path":"/pub/media/catalog/product/.cache/a.php","status":200,"path_class":"php_in_cache","is_post_to_php":true}}
Common preamble fields:
- ts: ISO-8601 UTC timestamp of the event the record describes
- host: the host this record was collected on
- source: the collector that emitted it (apache.transfer, modsec.audit, fs.mtime-since, etc.)
- record: source-specific structured data
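As an illustration of the preamble contract, the following is a minimal sketch of a reader that parses one JSONL line and checks that the four common fields are present. The function name `parse_record` and the strictness of the checks are assumptions made for this example, not part of bl.

```python
import json

# The four preamble fields every record is documented to share.
PREAMBLE_FIELDS = ("ts", "host", "source", "record")

def parse_record(line: str) -> dict:
    """Parse one JSONL line and check that the common preamble is present."""
    obj = json.loads(line)
    missing = [f for f in PREAMBLE_FIELDS if f not in obj]
    if missing:
        raise ValueError(f"missing preamble fields: {missing}")
    if not isinstance(obj["record"], dict):
        raise ValueError("record must be a JSON object")
    return obj

sample = (
    '{"ts":"2026-04-24T04:17:08Z","host":"example-host",'
    '"source":"apache.transfer","record":{"client_ip":"203.0.113.42",'
    '"method":"POST","path":"/pub/media/catalog/product/.cache/a.php",'
    '"status":200,"path_class":"php_in_cache","is_post_to_php":true}}'
)
rec = parse_record(sample)
print(rec["source"], rec["record"]["status"])
```

Because only the preamble is checked, the same reader works for every source; anything inside record is left to source-specific code.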
Apache transfer
Base fields: client_ip, method, path, status, bytes, ua, referer, site.
Plus derived fields the Runner computes pre-emit:
- path_class: one of php_in_cache, polyglot, static, admin, vendor, unknown
- is_post_to_php: bool
- status_bucket: 2xx, 3xx, 4xx, 5xx for stream-level histograms
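Two of the derived fields can be sketched directly from their names. The version below is an assumption about how they might be computed, not the Runner's actual code: in particular, ignoring the query string and matching `.php` case-insensitively are choices made for this example. The rules behind path_class are not specified here, so it is left out.

```python
from urllib.parse import urlsplit

def status_bucket(status: int) -> str:
    """Map an HTTP status code to its histogram bucket, e.g. 404 -> '4xx'."""
    if not 200 <= status <= 599:
        # Only 2xx-5xx buckets are documented; anything else is rejected here.
        raise ValueError(f"status outside the documented buckets: {status}")
    return f"{status // 100}xx"

def is_post_to_php(method: str, path: str) -> bool:
    """True for a POST whose path (query string ignored) ends in .php."""
    clean_path = urlsplit(path).path
    return method.upper() == "POST" and clean_path.lower().endswith(".php")

record = {
    "method": "POST",
    "path": "/pub/media/catalog/product/.cache/a.php",
    "status": 200,
}
record["status_bucket"] = status_bucket(record["status"])
record["is_post_to_php"] = is_post_to_php(record["method"], record["path"])
print(record["status_bucket"], record["is_post_to_php"])
```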
ModSec audit
A/B/F/H/Z section walker output. Fields: txn_id, client, uri, rule_id, action, phase, timestamp, matched_var, matched_value, severity, msg.
Filesystem
Two collectors share the fs source:
- fs.mtime-cluster: record is { path, mtime, ext, cluster_id, cluster_size }
- fs.mtime-since: record is { path, mtime, ext }
Cron
record: { user, system, line_n, raw, decoded, has_ansi_escape }.
The decoded field holds the line as cat -v would render it, which exposes ANSI escape sequences (such as ESC[2J) that attackers use to hide cron entries from crontab -l.
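The relationship between raw, decoded, and has_ansi_escape can be sketched as follows. This is an approximation of cat -v behaviour for ASCII control characters only (non-ASCII characters are passed through unchanged), and the helper names `cat_v` and `cron_fields` are hypothetical.

```python
import re

# Matches ANSI CSI sequences such as ESC[2J (clear screen).
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")

def cat_v(raw: str) -> str:
    """Render control characters in caret notation, as cat -v does."""
    out = []
    for ch in raw:
        code = ord(ch)
        if ch in ("\n", "\t"):
            out.append(ch)                    # cat -v leaves newline and tab alone
        elif code < 32:
            out.append("^" + chr(code + 64))  # ESC (27) becomes ^[
        elif code == 127:
            out.append("^?")                  # DEL
        else:
            out.append(ch)
    return "".join(out)

def cron_fields(raw: str) -> dict:
    """Build the three content-derived fields of a cron record."""
    return {
        "raw": raw,
        "decoded": cat_v(raw),
        "has_ansi_escape": bool(ANSI_CSI.search(raw)),
    }

fields = cron_fields("*/5 * * * * /tmp/.x\x1b[2J")
print(fields["decoded"], fields["has_ansi_escape"])
```

A terminal would act on the escape sequence instead of printing it; the caret form keeps it visible in the record.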
Process
record: { pid, user, ps_argv, exe_basename, argv_spoof }.
argv_spoof: true when argv[0] from ps -u differs from /proc/<pid>/exe basename, the gsocket persistence-class signal.
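The comparison behind argv_spoof might look like the sketch below. Whether bl compares the full argv[0] or only its basename is not stated here; this example assumes both sides are reduced to a basename so that an absolute path in argv[0] does not count as a mismatch.

```python
import os

def argv_spoof(ps_argv: list[str], exe_path: str) -> bool:
    """True when the name a process advertises differs from its executable."""
    if not ps_argv:
        return False  # nothing to compare (assumption: treat as not spoofed)
    advertised = os.path.basename(ps_argv[0])
    actual = os.path.basename(exe_path)
    return advertised != actual

# A process posing as a kernel worker while running a different binary.
print(argv_spoof(["[kworker/0:1]"], "/usr/bin/gs-netcat"))
# An ordinary daemon whose argv[0] matches its executable.
print(argv_spoof(["/usr/sbin/sshd", "-D"], "/usr/sbin/sshd"))
```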
Bundle shape
Evidence bundles (for bl consult --upload) are tar + gzip -5 (or zstd -3 if available). The archive name is bundle-<host>-<window>.tgz. Members:
- MANIFEST.json: host, window, sources, sha256s, bl version
- summary.md: 1–2 KB first-read; top IOCs, counts, hot paths
- transfer.log.jsonl: pre-parsed Apache / nginx access records
- modsec_audit.jsonl: pre-parsed ModSec audit events
- fs_anomalies.jsonl: mtime clusters, perm drift, suid changes
- system_messages.jsonl: journalctl extracts
MANIFEST.json carries every per-file sha256 plus the bl version that produced the bundle. Verification on the curator side: the Runner attaches the manifest to the upload event; the curator's first action is to read summary.md and confirm the manifest's record count matches the JSONL.
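A per-file sha256 check against the manifest could be sketched like this. The manifest layout used here (a top-level "sha256" object mapping member name to hex digest) is an assumption for the example; the real MANIFEST.json schema may differ. The sketch builds a small gzip bundle in memory so it runs without any files on disk, and it does not cover the zstd case.

```python
import hashlib
import io
import json
import tarfile

def verify_bundle(fileobj) -> list[str]:
    """Return the names of members whose sha256 differs from the manifest."""
    mismatched = []
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        manifest = json.load(tar.extractfile("MANIFEST.json"))
        for name, expected in manifest["sha256"].items():
            data = tar.extractfile(name).read()
            if hashlib.sha256(data).hexdigest() != expected:
                mismatched.append(name)
    return mismatched

def build_demo_bundle() -> io.BytesIO:
    """Create a tiny tar+gzip bundle with a matching manifest (demo data only)."""
    files = {
        "summary.md": b"# Evidence bundle: example-host\n",
        "transfer.log.jsonl": b'{"ts":"2026-04-24T04:17:08Z"}\n',
    }
    manifest = {
        "host": "example-host",
        "sha256": {n: hashlib.sha256(d).hexdigest() for n, d in files.items()},
    }
    files["MANIFEST.json"] = json.dumps(manifest).encode("utf-8")
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf

print(verify_bundle(build_demo_bundle()))
```

An empty list means every listed member hashed to the value recorded in the manifest.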
summary.md: the first-read convention
The first file the agent reads. ≤ 2 KB. Structured:
# Evidence bundle: <host> | <from> → <to>
## Trigger
<one-paragraph description of the artifact that prompted collection>
## Top-line findings
- <bullet list of ≤ 7 facts>
## Jump points
- <jq/grep expressions the agent can use to drill into the JSONL files>
## Attention-worthy
- <anomalies the pre-parse flagged>
The "Jump points" section is the key invention. Rather than dumping the whole bundle into context, the Runner pre-computes the queries that matter: "200s to PHP files in /pub/media/catalog/product/.cache/", "ModSec rule 920450 hits clustered around obs-0001 ts ± 90s". The curator picks one or two and tool-uses grep, jq, or duckdb to drill in. The bundle is hot storage, not context.
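To make the first jump point concrete, here is a rough Python equivalent of a query for "200s to PHP files in /pub/media/catalog/product/.cache/". In practice the curator would more likely run a jq or grep expression; the sample records and the function name `hot_php_hits` are invented for this illustration.

```python
import json

def hot_php_hits(lines, prefix="/pub/media/catalog/product/.cache/"):
    """Yield apache.transfer records that are 200s to .php files under prefix."""
    for line in lines:
        rec = json.loads(line)
        if rec.get("source") != "apache.transfer":
            continue
        body = rec.get("record", {})
        path = body.get("path", "")
        if body.get("status") == 200 and path.startswith(prefix) and path.endswith(".php"):
            yield rec

def make_line(path, status, source="apache.transfer"):
    """Build one demo JSONL line (demo data, not real evidence)."""
    return json.dumps({
        "ts": "2026-04-24T04:17:08Z",
        "host": "example-host",
        "source": source,
        "record": {"client_ip": "203.0.113.42", "path": path, "status": status},
    })

LINES = [
    make_line("/pub/media/catalog/product/.cache/a.php", 200),
    make_line("/pub/media/catalog/product/.cache/a.php", 404),
    make_line("/pub/media/catalog/product/x.jpg", 200),
    make_line("/index.php", 200),
]

hits = list(hot_php_hits(LINES))
print(len(hits), hits[0]["record"]["path"])
```

Only the matching records come back, which is the point of a jump point: the filter runs against the stored bundle and a handful of lines reach context.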
Why JSONL, not a binary format
Three reasons:
- Human-readable in the case ledger. bl case log is cat-able. Investigators reviewing a closed case see structured records, not opaque blobs.
- grep and jq-native. The curator's tool-use is pre-existing primitives. No custom parser. No schema-versioning headaches across bl releases.
- Streaming-friendly. Large collections (50k Apache lines) write incrementally. The Runner does not load the file before emitting it.
Compression
- Default: gzip -5, portable to the CentOS 6 / bash 4.1 baseline without EPEL.
- Upgrade path: zstd -3 if command -v zstd succeeds; ~1.3× smaller, faster compress.
- Detection: bl collect picks the best available codec; the extension is .tgz regardless (tar detects the codec from magic bytes on the decompress side).
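Both halves of this scheme, choosing a codec at write time and recognising it at read time, can be sketched in a few lines. bl itself is described as a shell tool, so this Python version only illustrates the idea; the function names are hypothetical. The magic bytes are the standard gzip and Zstandard frame headers.

```python
import gzip
import shutil

GZIP_MAGIC = b"\x1f\x8b"
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"

def pick_codec() -> str:
    """Prefer zstd when the binary is on PATH, else fall back to gzip."""
    return "zstd" if shutil.which("zstd") else "gzip"

def sniff_codec(head: bytes) -> str:
    """Identify the compression codec from the first bytes of an archive."""
    if head.startswith(ZSTD_MAGIC):
        return "zstd"
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    return "unknown"

print(pick_codec())
print(sniff_codec(gzip.compress(b"demo")[:4]))
```

Because detection reads the content and not the file name, a zstd-compressed archive named .tgz is still identified correctly.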
Sonnet 4.6 bundle summary
Where the heavy lifting actually happens. Sonnet 4.6 only renders the summary.md first-read on a single bundle. Every load-bearing reasoning step in an investigation (cross-stream correlation, hypothesis revision, defensive-payload authorship, sample intent reconstruction, brief writing) runs in the Opus 4.7 curator session at 1M context. The curator absorbs the full case state (mounted skills + reference files, the bl-case memstore, every per-case Files bundle, every prior step result) without a retriever or chunker. Sonnet here is a fast, cheap condenser that hands the curator a 2 KB index plus jump-point queries; the actual drill-down happens via the curator's tool-use over grep / jq / duckdb against the JSONL files. See Architecture · model assignments and PRD §5.1 for the full routing.
summary.md generation runs through Sonnet 4.6 by default: a bl_messages_call to the Messages API with prompts/bundle-summary-system.md as the system prompt. Sonnet treats log content as untrusted, keeps its output within the ≤ 2 KB budget, and formats the jump-points and attention-worthy sections. No anthropic-beta header; this is a plain /v1/messages call outside the Managed Agents surface.
Two bypasses keep the Runner deterministic:
- --no-llm-summary: skip Sonnet, fall back to deterministic _bl_obs_render_summary_deterministic.
- BL_DISABLE_LLM=1 env var: same effect, scoped to the shell. Tests use this. Cost-controlled environments use this.
If Sonnet returns 401 / 5xx / 429, the Runner falls back automatically. Bundle creation never blocks on a Messages API outage.
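The bypass and fallback behaviour amounts to a small decision procedure, sketched below. Everything named here (`render_summary`, `LLMUnavailable`, the two renderer callables) is hypothetical scaffolding for the example; only the --no-llm-summary flag, the BL_DISABLE_LLM=1 variable, and the fall-back-on-error behaviour come from the description above.

```python
import os

class LLMUnavailable(Exception):
    """Hypothetical error standing in for a 401 / 429 / 5xx response."""

def render_summary(stats, llm_render, deterministic_render,
                   no_llm_summary=False, env=None):
    """Pick the LLM or the deterministic renderer, never blocking on the LLM."""
    env = os.environ if env is None else env
    # Explicit bypasses: CLI flag or environment variable.
    if no_llm_summary or env.get("BL_DISABLE_LLM") == "1":
        return deterministic_render(stats)
    try:
        return llm_render(stats)
    except LLMUnavailable:
        # Automatic fallback when the API call fails.
        return deterministic_render(stats)

def fake_llm(stats):
    raise LLMUnavailable("429")

def fake_deterministic(stats):
    return f"# Evidence bundle: {stats['host']}\n"

print(render_summary({"host": "example-host"}, fake_llm, fake_deterministic, env={}))
```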
Stress test bundle
exhibits/fleet-01/ carries a deterministic, byte-identical, ~360k-token APSB25-94 forensic bundle (apache + modsec + fs + cron + proc + journal + maldet) with attack needles buried in realistic noise. The bundle is regeneratable from tools/dev/synth-corpus.sh --seed 42. Sources are documented; no operator-local data ever lands in the bundle.
This bundle exercises the full 1M-context curator turn: a realistic case that would not fit in 200k. It is the test that keeps the "1M context as one bundle" claim honest.
Memory-store size discipline
Memory-store entries have a hard 100 KB cap per file (Managed Agents spec). Under Path C / M13, blacklight uses one memory store per workspace (bl-case) plus the Skills + Files primitives for skill content; see Skills Architecture.
| Store | Access | Typical contents | Cap discipline |
|---|---|---|---|
| bl-case | read_write | hypothesis, evidence pointers, pending steps, applied actions, ledger; path-namespaced per case | per-file ≤ 100 KB; raw evidence offloaded to Files API |
| bl-skills | read_only | RETIRED in M13. Skill content moved to the Skills primitive (description-routed) + reference Files (mounted at /skills/<basename>). Older docs that name a bl-skills memstore predate Path C. | n/a |
Raw evidence bundles (.tgz packed) live in the Files API, not in memory stores. Memory stores carry pointers (evidence/evid-0001.md → {source, sha256, summary, file_id}). The curator's read_memory calls return the pointer; it then read_files the file_id to drill in. This keeps memory-store budgets small and re-readable across sessions.
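The pointer pattern can be illustrated with a small sketch that serialises the four pointer fields and enforces the per-file cap before the entry would be written. Interpreting 100 KB as 100 × 1024 bytes, the JSON serialisation, and the placeholder file_id value are all assumptions for this example.

```python
import json

# Assumption: "100 KB" is read as 100 * 1024 bytes.
MEMSTORE_CAP_BYTES = 100 * 1024

def make_pointer(source: str, sha256: str, summary: str, file_id: str) -> str:
    """Serialise an evidence pointer and refuse entries over the per-file cap."""
    body = json.dumps(
        {"source": source, "sha256": sha256, "summary": summary, "file_id": file_id},
        indent=2,
    )
    size = len(body.encode("utf-8"))
    if size > MEMSTORE_CAP_BYTES:
        raise ValueError(f"pointer is {size} bytes, over the {MEMSTORE_CAP_BYTES}-byte cap")
    return body

pointer = make_pointer(
    source="apache.transfer",
    sha256="0" * 64,                  # placeholder digest
    summary="200s to PHP under .cache/",
    file_id="file_EXAMPLE",           # hypothetical Files API id
)
print(len(pointer.encode("utf-8")) < MEMSTORE_CAP_BYTES)
```

The pointer stays a few hundred bytes regardless of how large the bundle is, which is what lets the memory store be re-read cheaply in later sessions.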