operating · Doc 06 of 8

Evidence Format

JSONL on the wire, bundle shape, summary.md convention. How raw logs reach the curator without raw logs reaching the curator.


JSONL on the wire. Bundles are tar+gzip with a summary the curator reads first. Raw logs never enter context directly.

Every bl observe output is JSONL, one record per line. Fields vary by source but share a common preamble. Bundles compose the per-source JSONL streams, a summary.md first-read, and a MANIFEST.json for verification.

JSONL on the wire

{"ts":"2026-04-24T04:17:08Z","host":"example-host","source":"apache.transfer","record":{"client_ip":"203.0.113.42","method":"POST","path":"/pub/media/catalog/product/.cache/a.php","status":200,"path_class":"php_in_cache","is_post_to_php":true}}

Common preamble fields:

  • ts: ISO-8601 UTC timestamp of the event the record describes
  • host: the host this record was collected on
  • source: the collector that emitted it (apache.transfer, modsec.audit, fs.mtime-since, etc.)
  • record: source-specific structured data
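A minimal sketch of consuming the wire format: read the stream line by line, parse each record, and bucket the source-specific payloads by the preamble's source field. Field names come from the preamble above; the helper name is illustrative.

```python
import json
from collections import defaultdict

def bucket_by_source(lines):
    """Group JSONL records by their `source` preamble field."""
    buckets = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        rec = json.loads(line)
        buckets[rec["source"]].append(rec["record"])
    return dict(buckets)

stream = [
    '{"ts":"2026-04-24T04:17:08Z","host":"example-host","source":"apache.transfer",'
    '"record":{"path":"/pub/media/catalog/product/.cache/a.php","status":200}}',
]
print(bucket_by_source(stream)["apache.transfer"][0]["status"])  # prints 200
```

Because each line is a complete JSON document, the same loop works unchanged over a 10-line test stream or a 50k-line collection.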

Apache transfer

client_ip, method, path, status, bytes, ua, referer, site

Plus derived fields the Runner computes pre-emit:

  • path_class: one of php_in_cache, polyglot, static, admin, vendor, unknown
  • is_post_to_php: bool
  • status_bucket: 2xx, 3xx, 4xx, 5xx for stream-level histograms
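A sketch of the pre-emit derivation. Only the field names and the path_class vocabulary come from the doc; the classification rules below are assumed for illustration, not the Runner's actual logic.

```python
def derive(record):
    """Illustrative derivation of path_class, is_post_to_php, status_bucket.
    Classification rules are assumptions; field names match the doc."""
    path = record["path"]
    if ".cache/" in path and path.endswith(".php"):
        record["path_class"] = "php_in_cache"
    elif ".php." in path:                       # e.g. shell.php.jpg double extension
        record["path_class"] = "polyglot"
    elif path.startswith("/admin"):
        record["path_class"] = "admin"
    elif path.startswith("/vendor"):
        record["path_class"] = "vendor"
    elif path.endswith((".css", ".js", ".png", ".jpg")):
        record["path_class"] = "static"
    else:
        record["path_class"] = "unknown"
    record["is_post_to_php"] = record["method"] == "POST" and ".php" in path
    record["status_bucket"] = f"{record['status'] // 100}xx"
    return record

r = derive({"method": "POST",
            "path": "/pub/media/catalog/product/.cache/a.php",
            "status": 200})
print(r["path_class"], r["is_post_to_php"], r["status_bucket"])
# php_in_cache True 2xx
```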

ModSec audit

A/B/F/H/Z section walker output:

txn_id, client, uri, rule_id, action, phase, timestamp,
matched_var, matched_value, severity, msg

Filesystem

Two collectors share the fs source:

# fs.mtime-cluster
record: { path, mtime, ext, cluster_id, cluster_size }

# fs.mtime-since
record: { path, mtime, ext }

Cron

record: { user, system, line_n, raw, decoded, has_ansi_escape }

The decoded field shows what the line looks like after cat -v reveals the ANSI ESC[2J escape sequences attackers use to hide cron entries from crontab -l output.
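A sketch of the decode step, assuming a cat -v-style reveal (the caret notation and the regex are assumptions; the record fields match the doc):

```python
import re

ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")  # CSI sequences like ESC[2J

def decode_cron_line(raw):
    """Replace the ESC byte with the visible '^[' marker, cat -v style,
    and flag lines that carry any ANSI escape."""
    decoded = ANSI_RE.sub(lambda m: "^[" + m.group(0)[1:], raw)
    return {"raw": raw, "decoded": decoded,
            "has_ansi_escape": bool(ANSI_RE.search(raw))}

# ESC[2J clears the terminal, so everything before it scrolls away under crontab -l
out = decode_cron_line("* * * * * /tmp/.x/run\x1b[2J")
print(out["has_ansi_escape"], out["decoded"])
```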

Process

record: { pid, user, ps_argv, exe_basename, argv_spoof }

argv_spoof: true when argv[0] reported by ps -u differs from the basename of /proc/<pid>/exe; this is the gsocket persistence-class signal.
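The comparison itself is a pure string check; reading /proc is the Linux-specific part. A sketch (function names are illustrative, and the "(deleted)" handling is an assumption about unlinked binaries):

```python
import os

def is_spoofed(ps_argv0, exe_path):
    """Pure comparison: argv[0] as reported by ps vs the basename of the
    resolved /proc/<pid>/exe link."""
    exe_base = os.path.basename(exe_path)
    if exe_base.endswith(" (deleted)"):  # binary unlinked from disk
        exe_base = exe_base[: -len(" (deleted)")]
    return os.path.basename(ps_argv0) != exe_base

def proc_record(pid, ps_argv0):
    """Build the record shape above for one pid (Linux-only; readlink
    raises on kernel threads, vanished pids, or missing permission)."""
    exe = os.readlink(f"/proc/{pid}/exe")
    return {"pid": pid, "ps_argv": ps_argv0,
            "exe_basename": os.path.basename(exe),
            "argv_spoof": is_spoofed(ps_argv0, exe)}

# gsocket-style hiding: argv rewritten to look like a kernel thread
print(is_spoofed("[kworker/0:1]", "/tmp/.x/gs-bld"))  # True
print(is_spoofed("sshd", "/usr/sbin/sshd"))           # False
```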

Bundle shape

Evidence bundles (for bl consult --upload) are tar + gzip -5 (or zstd -3 if available):

bundle-<host>-<window>.tgz
├── MANIFEST.json           (host, window, sources, sha256s, bl version)
├── summary.md              (1–2 KB first-read, top IOCs, counts, hot paths)
├── transfer.log.jsonl      (pre-parsed Apache/nginx access records)
├── modsec_audit.jsonl      (pre-parsed ModSec audit events)
├── fs_anomalies.jsonl      (mtime clusters, perm drift, suid changes)
└── system_messages.jsonl   (journalctl extracts)

MANIFEST.json carries every per-file sha256 plus the bl version that produced the bundle. Verification on the curator side: the Runner attaches the manifest to the upload event; the curator's first action is to read summary.md and confirm the manifest's record count matches the JSONL.

summary.md: the first-read convention

The first file the agent reads. ≤ 2 KB. Structured:

# Evidence bundle: <host> | <from> → <to>

## Trigger
<one-paragraph description of the artifact that prompted collection>

## Top-line findings
- <bullet list of ≤ 7 facts>

## Jump points
- <jq/grep expressions the agent can use to drill into the JSONL files>

## Attention-worthy
- <anomalies the pre-parse flagged>

The "Jump points" section is the key invention. Rather than dumping the whole bundle into context, the Runner pre-computes the queries that matter: "200s to PHP files in /pub/media/catalog/product/.cache/", "ModSec rule 920450 hits clustered around obs-0001 ts ± 90s". The curator picks one or two and tool-uses grep, jq, or duckdb to drill in. The bundle is hot storage, not context.
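In practice the curator would run the jump point as a jq or grep one-liner; the same filter, written out in Python for clarity (record fields per the Apache transfer schema above):

```python
import json

def jump_point(lines):
    """The first example jump point: 200s to PHP files under
    /pub/media/catalog/product/.cache/."""
    for line in lines:
        rec = json.loads(line)["record"]
        if (rec.get("status") == 200
                and rec.get("path", "").startswith("/pub/media/catalog/product/.cache/")
                and rec["path"].endswith(".php")):
            yield rec["path"]

stream = [
    '{"ts":"t","host":"h","source":"apache.transfer","record":'
    '{"path":"/pub/media/catalog/product/.cache/a.php","status":200}}',
    '{"ts":"t","host":"h","source":"apache.transfer","record":'
    '{"path":"/index.php","status":200}}',
]
print(list(jump_point(stream)))  # ['/pub/media/catalog/product/.cache/a.php']
```

The point of pre-computing these is that the curator never pages through the stream; it evaluates one narrow predicate and reads only the hits.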

Why JSONL, not a binary format

Three reasons:

  1. Human-readable in the case ledger. bl case log is cat-able. Investigators reviewing a closed case see structured records, not opaque blobs.
  2. grep and jq-native. The curator's tool-use is pre-existing primitives. No custom parser. No schema-versioning headaches across bl releases.
  3. Streaming-friendly. Large collections (50k Apache lines) write incrementally. The Runner does not load the file before emitting it.

Compression

  • Default: gzip -5, portable to the CentOS 6 / bash 4.1 baseline without EPEL.
  • Upgrade path: zstd -3 when command -v zstd succeeds: ~1.3× smaller and faster to compress.
  • Detection: bl collect picks the best available codec; the extension is .tgz regardless (the codec is detected from magic bytes on the decompress side).
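The detection rule above amounts to a PATH probe, the Python equivalent of command -v (function name illustrative):

```python
import shutil

def pick_codec():
    """zstd -3 when the binary is on PATH, else the gzip -5 baseline."""
    if shutil.which("zstd"):
        return ["zstd", "-3"]
    return ["gzip", "-5"]

print(pick_codec()[0])  # "zstd" or "gzip", depending on the host
```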

Sonnet 4.6 bundle summary

summary.md generation runs through Sonnet 4.6 by default: bl_messages_call issues the Messages API request with prompts/bundle-summary-system.md as the system prompt. Sonnet treats log content as untrusted, keeps output within the ≤ 2 KB budget, and formats the jump-points and attention-worthy sections.

Two bypasses keep the Runner deterministic:

  • --no-llm-summary: skip Sonnet, fall back to deterministic _bl_obs_render_summary_deterministic.
  • BL_DISABLE_LLM=1 env var: same effect, scoped to the shell. Tests use this. Cost-controlled environments use this.

If Sonnet returns 401 / 5xx / 429, the Runner falls back automatically. Bundle creation never blocks on a Messages API outage.
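The bypass-and-fallback logic, sketched with hypothetical stand-ins: call_sonnet and deterministic stand in for bl_messages_call and _bl_obs_render_summary_deterministic, and ApiError is an assumed error type carrying the HTTP status.

```python
import os

class ApiError(Exception):
    """Hypothetical stand-in for a Messages API failure."""
    def __init__(self, status):
        self.status = status

FALLBACK_STATUSES = {401, 429}  # plus any 5xx, per the rule above

def render_summary(bundle_stats, call_sonnet, deterministic, no_llm=False):
    """--no-llm-summary / BL_DISABLE_LLM skip the API entirely; API
    failures fall back so bundle creation never blocks on an outage."""
    if no_llm or os.environ.get("BL_DISABLE_LLM") == "1":
        return deterministic(bundle_stats)
    try:
        return call_sonnet(bundle_stats)
    except ApiError as e:
        if e.status in FALLBACK_STATUSES or e.status >= 500:
            return deterministic(bundle_stats)
        raise
```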

Stress corpus

exhibits/fleet-01/ carries a deterministic, byte-identical, ~360k-token APSB25-94 forensic bundle (apache + modsec + fs + cron + proc + journal + maldet) with attack needles buried in realistic noise. The corpus is regeneratable from tools/dev/synth-corpus.sh --seed 42. Sources are documented; no operator-local data ever lands in the corpus.

This bundle exercises the full 1M-context curator turn, a realistic case that wouldn't fit in 200k. It is the test that keeps the "1M context as one bundle" claim honest.

Memory-store size discipline

Memory-store entries have a hard 100 KB cap per file (Managed Agents spec). blacklight uses 2 of 8 available memory stores per session.

Store       Access       Typical contents                                                         Cap discipline
bl-skills   read_only    65 .md files                                                             ≤ 50 KB total bundle
bl-case     read_write   hypothesis, evidence pointers, pending steps, applied actions, ledger    per-file ≤ 100 KB; raw evidence offloaded to Files API

Raw evidence bundles (.tgz packed) live in the Files API, not in memory stores. Memory stores carry pointers (evidence/evid-0001.md → {source, sha256, summary, file_id}). The curator's read_memory calls return the pointer; it then read_files the file_id to drill in. This keeps memory-store budgets small and re-readable across sessions.
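The two-step drill-in, sketched with hypothetical stand-ins for the Managed Agents tool calls (read_memory, read_files) and an in-memory store and Files API:

```python
def drill_in(read_memory, read_files, pointer_path):
    """Resolve an evidence pointer: the memory store returns the small
    pointer record; the Files API holds the actual bytes."""
    ptr = read_memory(pointer_path)   # {"source", "sha256", "summary", "file_id"}
    return read_files(ptr["file_id"])  # raw bundle bytes, never stored in memory

# stand-in store and Files API; the sha256 value is illustrative
store = {"evidence/evid-0001.md": {"source": "apache.transfer",
                                   "sha256": "ab12...",
                                   "summary": "200s to .cache PHP",
                                   "file_id": "file-001"}}
files = {"file-001": b"<tgz bytes>"}
print(drill_in(store.__getitem__, files.__getitem__, "evidence/evid-0001.md"))
```

The pointer is cheap to re-read every session; the bytes are fetched only when a jump point justifies it.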