Resource Management

How compaction works

No matter how large an LLM's context window is, it is finite. Compaction is how Buffaly automatically reduces active timeline weight during long-running sessions, which prevents context collapse and lowers token costs.

The Reality of Compaction

Let's be honest: despite what AI marketing materials claim, no compaction algorithm is perfect. When you aggressively compress context, the model will inevitably drop nuances or temporarily lose the immediate thread of its work.

Buffaly expects this. The system is designed to compact aggressively to save tokens, while relying on architectural fallbacks—like the System 2 Watcher, Semantic Database (SemDB), and Task Artifacts—to cleanly recover state when the primary agent gets confused.

What compaction actually does

A session accumulates user messages, tool calls, logs, and answers. When the session's token count reaches the TriggerTokens threshold, compaction reduces the active context that must be replayed into the LLM on the next turn. It does not delete your data: the full timeline stays permanently stored in SQL Server.

What happens in active context:

Older middle details are summarized or dropped from the immediate prompt.
The most recent boundaries and steps are preserved.

What remains completely safe:

The full SQL Server timeline history.
Durable Task Artifacts, Plan.md, and Scratch.md.
Committed codebase and wiki changes.
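
The split between the active context and the durable history can be sketched as follows. This is an illustration only, not Buffaly's implementation: the `Entry` type, the `compact_active_context` function, the `keep_recent` parameter, the placeholder summary text, and the 10:1 summary size ratio are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One timeline item (hypothetical shape, not Buffaly's schema)."""
    role: str    # e.g. "user", "tool", "assistant"
    text: str
    tokens: int


def compact_active_context(timeline, keep_recent=3):
    """Return a smaller active context without mutating the full timeline.

    Older entries collapse into one placeholder summary entry; the most
    recent `keep_recent` entries are kept verbatim.
    """
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    if len(timeline) <= keep_recent:
        return list(timeline)
    older, recent = timeline[:-keep_recent], timeline[-keep_recent:]
    summary = Entry(
        role="summary",
        text=f"[{len(older)} earlier entries summarized]",
        # Assumed 10:1 reduction, purely for illustration.
        tokens=max(1, sum(e.tokens for e in older) // 10),
    )
    return [summary, *recent]


# The full timeline stands in for the durable history and is left untouched.
timeline = [Entry("user", f"message {i}", 100) for i in range(10)]
active = compact_active_context(timeline, keep_recent=3)
print(len(timeline), len(active), sum(e.tokens for e in active))
```

The point of the sketch is that compaction produces a new, smaller prompt view while the original timeline remains available for later retrieval.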

The 3 Compaction Methods

Buffaly supports three distinct provider methods for compressing context. You select one with the CompactionEngine setting.

Codex Compaction (Recommended)

Configured as CodexApi. This is generally the best-performing and most reliable compaction engine for long-running enterprise tasks. It intelligently preserves the operational narrative.

OpenAI Responses API

Configured as ResponsesApi. Uses provider-backed native summarization tools. Strong, but subject to provider-side logic updates.

Local Advanced Compactor

Configured as LocalAutomatic or LocalManual. A deterministic, local algorithm that forcefully archives and truncates state without relying on an external LLM call.

Settings and thresholds

Setting	Meaning
MaxConversationTokens	Absolute maximum token budget before forced failure (defaults to 100,000).
TriggerTokens	The token count where compaction is automatically triggered.
TargetFreeTokens	The desired amount of headroom to clear out during compaction.
TargetConversationTokens	The computed target size of the active conversation after compaction finishes.
CompactionEngine	Must exactly match one of the enum values: `CodexApi`, `ResponsesApi`, `LocalAutomatic`, or `LocalManual`.
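
One way to picture how these settings interact is sketched below. Everything here is hypothetical: the `CompactionSettings` class, its methods, and `parse_engine` are invented for illustration, the trigger and headroom numbers are example values rather than documented defaults, and the formula for the post-compaction target (maximum budget minus headroom) is an assumption, since the table only says the target is computed.

```python
from dataclasses import dataclass
from enum import Enum


class CompactionEngine(Enum):
    # Engine names as they appear in this document.
    CODEX_API = "CodexApi"
    RESPONSES_API = "ResponsesApi"
    LOCAL_AUTOMATIC = "LocalAutomatic"
    LOCAL_MANUAL = "LocalManual"


def parse_engine(name: str) -> CompactionEngine:
    """Exact, case-sensitive lookup; raises ValueError for unknown names."""
    return CompactionEngine(name)


@dataclass(frozen=True)
class CompactionSettings:
    engine: CompactionEngine
    max_conversation_tokens: int = 100_000  # default stated in the table
    trigger_tokens: int = 80_000            # example value only
    target_free_tokens: int = 40_000        # example value only

    def __post_init__(self):
        if not 0 < self.trigger_tokens <= self.max_conversation_tokens:
            raise ValueError("TriggerTokens must fit inside the maximum budget")
        if not 0 < self.target_free_tokens < self.max_conversation_tokens:
            raise ValueError("TargetFreeTokens must be smaller than the maximum budget")

    @property
    def target_conversation_tokens(self) -> int:
        # ASSUMPTION: target size = maximum budget - desired headroom.
        return self.max_conversation_tokens - self.target_free_tokens

    def should_compact(self, current_tokens: int) -> bool:
        """True once the session reaches the trigger threshold."""
        return current_tokens >= self.trigger_tokens


settings = CompactionSettings(engine=parse_engine("CodexApi"))
print(settings.should_compact(85_000), settings.target_conversation_tokens)
```

Validating the engine name up front mirrors the exact-match requirement in the table: a value such as `codexapi` is rejected rather than silently accepted.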

Recovery: What to do when the agent forgets

Because compaction removes immediate context, agents will sometimes stall or repeat actions after a heavy compaction cycle. Here is the strict escalation path for recovery:

1. Rely on the System 2 Watcher

Ideally, you don't do anything. If the primary agent gets off track post-compaction, the System 2 Watcher (the supervisory agent) will catch the deviation, validate the Plan, and automatically instruct the primary agent to correct its course.

2. Prompt a Semantic History Search

If the agent is truly lost, do not try to re-explain the entire task. Because all history is in SQL and the Semantic Database (SemDB), simply ask the agent to self-retrieve its past context.

"You just underwent compaction. Query your semantic session history to remember exactly how we configured the database connection, then resume the plan."

3. Fall back to Durable Artifacts

If the timeline is severely truncated, instruct the agent to re-anchor itself against the durable files, which compaction leaves untouched.

"Read Plan.md, Scratch.md, and task-01.md to regain your context, then tell me the next safe step."

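The escalation order above can be written down as data, for example in an operator runbook script. This is a minimal sketch: `RECOVERY_STEPS` and `recovery_step` are hypothetical names, and counting failed attempts is an assumed way of deciding when to escalate. The prompts are the ones quoted in this section.

```python
from typing import Optional, Tuple

# (step name, prompt to send) in escalation order; None means "send nothing".
RECOVERY_STEPS = [
    ("Rely on the System 2 Watcher", None),
    (
        "Prompt a Semantic History Search",
        "You just underwent compaction. Query your semantic session history "
        "to remember exactly how we configured the database connection, "
        "then resume the plan.",
    ),
    (
        "Fall back to Durable Artifacts",
        "Read Plan.md, Scratch.md, and task-01.md to regain your context, "
        "then tell me the next safe step.",
    ),
]


def recovery_step(failed_attempts: int) -> Tuple[str, Optional[str]]:
    """Pick the step for the number of recovery attempts that already failed.

    Stays on the last step once the list is exhausted.
    """
    index = min(max(failed_attempts, 0), len(RECOVERY_STEPS) - 1)
    return RECOVERY_STEPS[index]


name, prompt = recovery_step(0)
print(name, "->", prompt or "(wait, no prompt needed)")
```
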
How to verify compaction ran

Ask the agent for verified evidence rather than a status guess. A good diagnostic prompt looks like this:

"Inspect this session's compaction state. Report the configured compaction provider, the current token thresholds, the compaction epoch, and whether it is safe to continue."

Check lifecycle rows or logs mentioning "compaction start" or "success".
Look for an archive snapshot path for the pre-compaction epoch.
Verify that Plan, Scratch, and task artifacts are intact.
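
The log check can be automated with a simple scan. The sketch below assumes log lines are plain strings that contain the phrases "compaction start" and "success"; the sample lines, their timestamp format, and the `compaction_evidence` function are hypothetical and do not reflect Buffaly's actual log schema.

```python
def compaction_evidence(log_lines):
    """Return (saw_start, saw_success_after_start).

    Matching is case-insensitive substring search. A new "compaction start"
    resets the success flag so only the latest cycle is reported.
    """
    saw_start = False
    saw_success = False
    for line in log_lines:
        lowered = line.lower()
        if "compaction start" in lowered:
            saw_start = True
            saw_success = False
        elif saw_start and "success" in lowered:
            saw_success = True
    return saw_start, saw_success


# Hypothetical sample lines, for illustration only.
sample_log = [
    "10:02:11 tool call finished",
    "10:02:12 Compaction start (epoch 4)",
    "10:02:19 Compaction success (epoch 4)",
]
print(compaction_evidence(sample_log))
```

A start without a later success line is the case worth investigating before letting the agent continue.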
