operating · Doc 06 of 8

Evidence Format

JSONL on the wire, bundle shape, summary.md convention. How raw logs reach the curator without raw logs reaching the curator.


JSONL on the wire. Bundles are tar+gzip with a summary the curator reads first. Raw logs never enter context directly.

Every bl observe output is JSONL, one record per line. Fields vary by source but share a common preamble. Bundles compose the per-source JSONL streams, a summary.md first-read, and a MANIFEST.json for verification.

JSONL on the wire

{"ts":"2026-04-24T04:17:08Z","host":"example-host","source":"apache.transfer","record":{"client_ip":"203.0.113.42","method":"POST","path":"/pub/media/catalog/product/.cache/a.php","status":200,"path_class":"php_in_cache","is_post_to_php":true}}

Common preamble fields:

  • ts: ISO-8601 UTC timestamp of the event the record describes
  • host: the host this record was collected on
  • source: the collector that emitted it (apache.transfer, modsec.audit, fs.mtime-since, etc.)
  • record: source-specific structured data
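A minimal sketch of consuming the wire format: read the stream line by line, parse each record, and bucket the source-specific payloads by the preamble's source field. Field names come from the preamble above; the helper name is illustrative.

```python
import json
from collections import defaultdict

def bucket_by_source(lines):
    """Group JSONL records by their `source` preamble field."""
    buckets = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        rec = json.loads(line)
        buckets[rec["source"]].append(rec["record"])
    return dict(buckets)

stream = [
    '{"ts":"2026-04-24T04:17:08Z","host":"example-host","source":"apache.transfer",'
    '"record":{"path":"/pub/media/catalog/product/.cache/a.php","status":200}}',
]
print(bucket_by_source(stream)["apache.transfer"][0]["status"])  # prints 200
```

Because each line is a complete JSON document, the same loop works unchanged over a 10-line test stream or a 50k-line collection.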

Apache transfer

client_ip, method, path, status, bytes, ua, referer, site

Plus derived fields the Runner computes pre-emit:

  • path_class: one of php_in_cache, polyglot, static, admin, vendor, unknown
  • is_post_to_php: bool
  • status_bucket: 2xx, 3xx, 4xx, 5xx for stream-level histograms
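A sketch of the pre-emit derivation. Only the field names and the path_class vocabulary come from the doc; the classification rules below are assumed for illustration, not the Runner's actual logic.

```python
def derive(record):
    """Illustrative derivation of path_class, is_post_to_php, status_bucket.
    Classification rules are assumptions; field names match the doc."""
    path = record["path"]
    if ".cache/" in path and path.endswith(".php"):
        record["path_class"] = "php_in_cache"
    elif ".php." in path:                       # e.g. shell.php.jpg double extension
        record["path_class"] = "polyglot"
    elif path.startswith("/admin"):
        record["path_class"] = "admin"
    elif path.startswith("/vendor"):
        record["path_class"] = "vendor"
    elif path.endswith((".css", ".js", ".png", ".jpg")):
        record["path_class"] = "static"
    else:
        record["path_class"] = "unknown"
    record["is_post_to_php"] = record["method"] == "POST" and ".php" in path
    record["status_bucket"] = f"{record['status'] // 100}xx"
    return record

r = derive({"method": "POST",
            "path": "/pub/media/catalog/product/.cache/a.php",
            "status": 200})
print(r["path_class"], r["is_post_to_php"], r["status_bucket"])
# php_in_cache True 2xx
```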

ModSec audit

A/B/F/H/Z section walker output:

txn_id, client, uri, rule_id, action, phase, timestamp,
matched_var, matched_value, severity, msg

Filesystem

Two collectors share the fs source:

# fs.mtime-cluster
record: { path, mtime, ext, cluster_id, cluster_size }

# fs.mtime-since
record: { path, mtime, ext }

Cron

record: { user, system, line_n, raw, decoded, has_ansi_escape }

The decoded field shows what the line looks like after cat -v reveals the ANSI ESC[2J escape sequences attackers use to hide cron entries from crontab -l output.
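A sketch of the decode step, assuming a cat -v-style reveal (the caret notation and the regex are assumptions; the record fields match the doc):

```python
import re

ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")  # CSI sequences like ESC[2J

def decode_cron_line(raw):
    """Replace the ESC byte with the visible '^[' marker, cat -v style,
    and flag lines that carry any ANSI escape."""
    decoded = ANSI_RE.sub(lambda m: "^[" + m.group(0)[1:], raw)
    return {"raw": raw, "decoded": decoded,
            "has_ansi_escape": bool(ANSI_RE.search(raw))}

# ESC[2J clears the terminal, so everything before it scrolls away under crontab -l
out = decode_cron_line("* * * * * /tmp/.x/run\x1b[2J")
print(out["has_ansi_escape"], out["decoded"])
```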

Process

record: { pid, user, ps_argv, exe_basename, argv_spoof }

argv_spoof: true when argv[0] reported by ps -u differs from the basename of /proc/<pid>/exe; this is the gsocket persistence-class signal.
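The comparison itself is a pure string check; reading /proc is the Linux-specific part. A sketch (function names are illustrative, and the "(deleted)" handling is an assumption about unlinked binaries):

```python
import os

def is_spoofed(ps_argv0, exe_path):
    """Pure comparison: argv[0] as reported by ps vs the basename of the
    resolved /proc/<pid>/exe link."""
    exe_base = os.path.basename(exe_path)
    if exe_base.endswith(" (deleted)"):  # binary unlinked from disk
        exe_base = exe_base[: -len(" (deleted)")]
    return os.path.basename(ps_argv0) != exe_base

def proc_record(pid, ps_argv0):
    """Build the record shape above for one pid (Linux-only; readlink
    raises on kernel threads, vanished pids, or missing permission)."""
    exe = os.readlink(f"/proc/{pid}/exe")
    return {"pid": pid, "ps_argv": ps_argv0,
            "exe_basename": os.path.basename(exe),
            "argv_spoof": is_spoofed(ps_argv0, exe)}

# gsocket-style hiding: argv rewritten to look like a kernel thread
print(is_spoofed("[kworker/0:1]", "/tmp/.x/gs-bld"))  # True
print(is_spoofed("sshd", "/usr/sbin/sshd"))           # False
```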

Bundle shape

Evidence bundles (for bl consult --upload) are tar + gzip -5 (or zstd -3 if available):

bundle-<host>-<window>.tgz
├── MANIFEST.json           (host, window, sources, sha256s, bl version)
├── summary.md              (1–2 KB first-read, top IOCs, counts, hot paths)
├── transfer.log.jsonl      (pre-parsed Apache/nginx access records)
├── modsec_audit.jsonl      (pre-parsed ModSec audit events)
├── fs_anomalies.jsonl      (mtime clusters, perm drift, suid changes)
└── system_messages.jsonl   (journalctl extracts)

MANIFEST.json carries every per-file sha256 plus the bl version that produced the bundle. Verification on the curator side: the Runner attaches the manifest to the upload event; the curator's first action is to read summary.md and confirm the manifest's record count matches the JSONL.

summary.md: the first-read convention

The first file the agent reads. ≤ 2 KB. Structured:

# Evidence bundle: <host> | <from> → <to>

## Trigger
<one-paragraph description of the artifact that prompted collection>

## Top-line findings
- <bullet list of ≤ 7 facts>

## Jump points
- <jq/grep expressions the agent can use to drill into the JSONL files>

## Attention-worthy
- <anomalies the pre-parse flagged>

The "Jump points" section is the key invention. Rather than dumping the whole bundle into context, the Runner pre-computes the queries that matter: "200s to PHP files in /pub/media/catalog/product/.cache/", "ModSec rule 920450 hits clustered around obs-0001 ts ± 90s". The curator picks one or two and tool-uses grep, jq, or duckdb to drill in. The bundle is hot storage, not context.
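In practice the curator would run the jump point as a jq or grep one-liner; the same filter, written out in Python for clarity (record fields per the Apache transfer schema above):

```python
import json

def jump_point(lines):
    """The first example jump point: 200s to PHP files under
    /pub/media/catalog/product/.cache/."""
    for line in lines:
        rec = json.loads(line)["record"]
        if (rec.get("status") == 200
                and rec.get("path", "").startswith("/pub/media/catalog/product/.cache/")
                and rec["path"].endswith(".php")):
            yield rec["path"]

stream = [
    '{"ts":"t","host":"h","source":"apache.transfer","record":'
    '{"path":"/pub/media/catalog/product/.cache/a.php","status":200}}',
    '{"ts":"t","host":"h","source":"apache.transfer","record":'
    '{"path":"/index.php","status":200}}',
]
print(list(jump_point(stream)))  # ['/pub/media/catalog/product/.cache/a.php']
```

The point of pre-computing these is that the curator never pages through the stream; it evaluates one narrow predicate and reads only the hits.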

Why JSONL, not a binary format

Three reasons:

  1. Human-readable in the case ledger. bl case log is cat-able. Investigators reviewing a closed case see structured records, not opaque blobs.
  2. grep and jq-native. The curator's tool-use is pre-existing primitives. No custom parser. No schema-versioning headaches across bl releases.
  3. Streaming-friendly. Large collections (50k Apache lines) write incrementally. The Runner does not load the file before emitting it.

Compression

  • Default: gzip -5, portable to the CentOS 6 / bash 4.1 baseline without EPEL.
  • Upgrade path: zstd -3 when command -v zstd succeeds: ~1.3× smaller and faster to compress.
  • Detection: bl collect picks the best available codec; the extension is .tgz regardless (the codec is detected from magic bytes on the decompress side).
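The detection rule above amounts to a PATH probe, the Python equivalent of command -v (function name illustrative):

```python
import shutil

def pick_codec():
    """zstd -3 when the binary is on PATH, else the gzip -5 baseline."""
    if shutil.which("zstd"):
        return ["zstd", "-3"]
    return ["gzip", "-5"]

print(pick_codec()[0])  # "zstd" or "gzip", depending on the host
```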

Sonnet 4.6 bundle summary

summary.md generation runs through Sonnet 4.6 by default: bl_messages_call issues the Messages API request with prompts/bundle-summary-system.md as the system prompt. Sonnet treats log content as untrusted, keeps output within the ≤ 2 KB budget, and formats the jump-points and attention-worthy sections.

Two bypasses keep the Runner deterministic:

  • --no-llm-summary: skip Sonnet, fall back to deterministic _bl_obs_render_summary_deterministic.
  • BL_DISABLE_LLM=1 env var: same effect, scoped to the shell. Tests use this. Cost-controlled environments use this.

If Sonnet returns 401 / 5xx / 429, the Runner falls back automatically. Bundle creation never blocks on a Messages API outage.
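The bypass-and-fallback logic, sketched with hypothetical stand-ins: call_sonnet and deterministic stand in for bl_messages_call and _bl_obs_render_summary_deterministic, and ApiError is an assumed error type carrying the HTTP status.

```python
import os

class ApiError(Exception):
    """Hypothetical stand-in for a Messages API failure."""
    def __init__(self, status):
        self.status = status

FALLBACK_STATUSES = {401, 429}  # plus any 5xx, per the rule above

def render_summary(bundle_stats, call_sonnet, deterministic, no_llm=False):
    """--no-llm-summary / BL_DISABLE_LLM skip the API entirely; API
    failures fall back so bundle creation never blocks on an outage."""
    if no_llm or os.environ.get("BL_DISABLE_LLM") == "1":
        return deterministic(bundle_stats)
    try:
        return call_sonnet(bundle_stats)
    except ApiError as e:
        if e.status in FALLBACK_STATUSES or e.status >= 500:
            return deterministic(bundle_stats)
        raise
```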

Stress corpus

exhibits/fleet-01/ carries a deterministic, byte-identical, ~360k-token APSB25-94 forensic bundle (apache + modsec + fs + cron + proc + journal + maldet) with attack needles buried in realistic noise. The corpus is regeneratable from tools/dev/synth-corpus.sh --seed 42. Sources are documented; no operator-local data ever lands in the corpus.

This bundle exercises the full 1M-context curator turn, a realistic case that wouldn't fit in 200k. It is the test that keeps the "1M context as one bundle" claim honest.

Memory-store size discipline

Memory-store entries have a hard 100 KB cap per file (Managed Agents spec). blacklight uses 2 of 8 available memory stores per session.

Store       Access       Typical contents                                                         Cap discipline
bl-skills   read_only    65 .md files                                                             ≤ 50 KB total bundle
bl-case     read_write   hypothesis, evidence pointers, pending steps, applied actions, ledger    per-file ≤ 100 KB; raw evidence offloaded to Files API

Raw evidence bundles (.tgz packed) live in the Files API, not in memory stores. Memory stores carry pointers (evidence/evid-0001.md → {source, sha256, summary, file_id}). The curator's read_memory calls return the pointer; it then read_files the file_id to drill in. This keeps memory-store budgets small and re-readable across sessions.
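The two-step drill-in, sketched with hypothetical stand-ins for the Managed Agents tool calls (read_memory, read_files) and an in-memory store and Files API:

```python
def drill_in(read_memory, read_files, pointer_path):
    """Resolve an evidence pointer: the memory store returns the small
    pointer record; the Files API holds the actual bytes."""
    ptr = read_memory(pointer_path)   # {"source", "sha256", "summary", "file_id"}
    return read_files(ptr["file_id"])  # raw bundle bytes, never stored in memory

# stand-in store and Files API; the sha256 value is illustrative
store = {"evidence/evid-0001.md": {"source": "apache.transfer",
                                   "sha256": "ab12...",
                                   "summary": "200s to .cache PHP",
                                   "file_id": "file-001"}}
files = {"file-001": b"<tgz bytes>"}
print(drill_in(store.__getitem__, files.__getitem__, "evidence/evid-0001.md"))
```

The pointer is cheap to re-read every session; the bytes are fetched only when a jump point justifies it.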