Reduction

adk/middlewares/reduction

💡 This middleware was introduced in v0.8.0.

Overview

The reduction middleware limits the number of tokens that tool outputs occupy in Agent conversations. It operates in two phases:

Truncation: Runs as soon as a tool call returns. If a single output is longer than MaxLengthForTrunc, the full content is stored in the Backend and the message content is replaced with a truncated preview.
Clear: Runs before each model call (BeforeModelRewriteState). If the total token count exceeds MaxTokensForClear, the middleware iterates through the message history and offloads old tool arguments and results to the Backend.

Architecture

              Tool call returns result
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│     WrapInvokableToolCall / WrapStreamableToolCall          │
│     WrapEnhancedInvokableToolCall / WrapEnhancedStreamable  │
│                                                             │
│  Truncation (can be skipped via SkipTruncation)             │
│    Result length > MaxLengthForTrunc?                       │
│      Yes → Truncate content, save full content to Backend   │
│      No  → Return as-is                                     │
└─────────────────────────────────────────────────────────────┘
                          │
                          ▼
                    Result added to Messages
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                  BeforeModelRewriteState                    │
│                                                             │
│  Clear (can be skipped via SkipClear)                       │
│    Total tokens > MaxTokensForClear?                        │
│      Yes → ClearMessageRewriter preprocessing               │
│         → Old tool results stored to Backend, replaced      │
│           with file paths                                   │
│         → ClearAtLeastTokens minimum release check          │
│         → ClearPostProcess callback                         │
│      No  → Do nothing                                       │
└─────────────────────────────────────────────────────────────┘
                          │
                          ▼
                     Call Model

Generic System

This middleware follows the ADK standard generic pattern, supporting both *schema.Message and *schema.AgenticMessage:

// Generic config, M is constrained to adk.MessageType
type TypedConfig[M adk.MessageType] struct { ... }

// Backward-compatible alias
type Config = TypedConfig[*schema.Message]

Constructors are also available in both generic and non-generic forms:

func NewTyped[M adk.MessageType](ctx context.Context, config *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error)
func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error)

Configuration

TypedConfig[M] Main Configuration

Field	Type	Description
Backend	Backend	Storage backend. Required when SkipTruncation is false; can be nil when only doing Clear without offload.
SkipTruncation	bool	Skip the truncation phase.
SkipClear	bool	Skip the clear phase.
ReadFileToolName	string	Tool name for reading offloaded content. Default "read_file".
RootDir	string	Root directory for offloaded content. Default "/tmp". Truncated content is saved to {RootDir}/trunc/{tool_call_id}, cleared content to {RootDir}/clear/{tool_call_id}.
GenTruncOffloadFilePath	func(ctx, *ToolDetail) (string, error)	Custom truncation file path generator. When set, RootDir does not apply to truncation. Useful when tool_call_id is not unique.
GenClearOffloadFilePath	func(ctx, *ToolDetail) (string, error)	Custom clear file path generator. When set, RootDir does not apply to clear.
MaxLengthForTrunc	int	Character length above which truncation is triggered. Default 50000.
TruncExcludeTools	[]string	List of tool names to exclude from truncation.
TokenCounter	func(ctx, []M, []*schema.ToolInfo) (int64, error)	Token counting function. Defaults to a character_count/4 estimate; replacing it with a real tokenizer such as tiktoken-go/tokenizer is recommended.
MaxTokensForClear	int64	Token threshold above which clear is triggered. Default 160000.
ClearRetentionSuffixLimit	int	Number of most recent assistant message rounds that are never cleared. Default 1.
ClearAtLeastTokens	int64	Minimum number of tokens that a clear must release. If the clear would release fewer, it is not executed (this avoids needlessly breaking the prompt cache). Default 0.
ClearExcludeTools	[]string	List of tool names to exclude from clearing.
ClearMessageRewriter	func(ctx, M, []M) ([]M, error)	Message rewrite callback that runs before clearing. Its parameters are toolCallMsg and the corresponding toolResponseMsgs. Can be used to rewrite write_file/edit_file calls into system-reminders. Returning nil removes that message group.
ClearPostProcess	func(ctx, *adk.TypedChatModelAgentState[M]) context.Context	Callback invoked after clearing completes; it can save state or send notifications. Returns a potentially updated context.
ToolConfig	map[string]*ToolReductionConfig	Per-tool configuration, takes precedence over global settings.
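
The character_count/4 default mentioned for TokenCounter can be pictured with a small standalone sketch. It works on plain strings rather than the real []M and []*schema.ToolInfo parameters; estimateTokens is a hypothetical name, and counting runes with integer division is an assumption about details the table does not spell out.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// estimateTokens is a hypothetical stand-in for the body of a TokenCounter.
// It counts runes across all texts and divides by 4 using integer division.
// The middleware's own counting and rounding rules may differ.
func estimateTokens(texts []string) int64 {
	var chars int64
	for _, t := range texts {
		chars += int64(utf8.RuneCountInString(t))
	}
	return chars / 4
}

func main() {
	msgs := []string{"list the files in /tmp", "a.txt b.txt c.txt"}
	fmt.Println("estimated tokens:", estimateTokens(msgs))
}
```

An estimate like this is cheap but can be far off for non-English text or code, which is why a real tokenizer is the better choice when the thresholds matter.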

ToolReductionConfig Tool-level Configuration

type ToolReductionConfig struct {
    Backend        Backend
    SkipTruncation bool
    TruncHandler   func(ctx context.Context, detail *ToolDetail) (*TruncResult, error)
    SkipClear      bool
    ClearHandler   func(ctx context.Context, detail *ToolDetail) (*ClearResult, error)
}

TruncHandler / ClearHandler: when nil and not skipped, the global default handler is used.
Backend: independent storage backend for this tool, overrides the global Backend.
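
The fallback rule above can be sketched as follows. This is an illustration under stated assumptions, not the middleware's code: handler, toolConfig, and resolveTruncHandler are hypothetical names, and the handler signature is simplified to a string function.

```go
package main

import "fmt"

// handler is a simplified stand-in for TruncHandler. The real handler takes
// a context and *ToolDetail and returns *TruncResult.
type handler func(result string) string

// toolConfig mirrors only the truncation fields of ToolReductionConfig.
type toolConfig struct {
	SkipTruncation bool
	TruncHandler   handler
}

// resolveTruncHandler returns the handler to run for a tool, or nil when
// truncation is skipped for that tool.
func resolveTruncHandler(tool string, perTool map[string]*toolConfig, global handler) handler {
	cfg, ok := perTool[tool]
	if !ok || cfg == nil {
		return global // no per-tool entry: use the global default
	}
	if cfg.SkipTruncation {
		return nil
	}
	if cfg.TruncHandler != nil {
		return cfg.TruncHandler // per-tool handler takes precedence
	}
	return global // entry exists but has no handler: fall back
}

func main() {
	global := func(s string) string { return "global:" + s }
	custom := func(s string) string { return "custom:" + s }
	perTool := map[string]*toolConfig{
		"grep":      {TruncHandler: custom},
		"read_file": {SkipTruncation: true},
		"ls":        {},
	}
	for _, tool := range []string{"grep", "read_file", "ls", "curl"} {
		if h := resolveTruncHandler(tool, perTool, global); h != nil {
			fmt.Println(tool, "->", h("x"))
		} else {
			fmt.Println(tool, "-> skipped")
		}
	}
}
```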

ToolDetail Tool Details

type ToolDetail struct {
    ToolContext      *adk.ToolContext
    ToolArgument     *schema.ToolArgument
    ToolResult       *schema.ToolResult                       // non-streaming
    StreamToolResult *schema.StreamReader[*schema.ToolResult] // streaming
}

TruncResult Truncation Result

type TruncResult struct {
    NeedTrunc        bool
    ToolResult       *schema.ToolResult                       // Required when NeedTrunc && non-streaming
    StreamToolResult *schema.StreamReader[*schema.ToolResult] // Required when NeedTrunc && streaming
    NeedOffload      bool
    OffloadFilePath  string // Required when NeedOffload
    OffloadContent   string // Required when NeedOffload
}

ClearResult Clear Result

type ClearResult struct {
    NeedClear       bool
    ToolArgument    *schema.ToolArgument // Required when NeedClear
    ToolResult      *schema.ToolResult   // Required when NeedClear
    NeedOffload     bool
    OffloadFilePath string // Required when NeedOffload
    OffloadContent  string // Required when NeedOffload
}

Backend Interface

// Defined in reduction/internal, exported via type alias
type Backend interface {
    Write(context.Context, *filesystem.WriteRequest) error
}

filesystem.WriteRequest contains two fields: FilePath string and Content string.
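
Because the interface has a single Write method, a test double is easy to build. The sketch below is self-contained and therefore defines its own WriteRequest with the two fields described above; a real implementation would accept *filesystem.WriteRequest instead. memoryBackend, read, and offload are hypothetical names used only for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// WriteRequest is a local stand-in with the two fields described above.
// Real code would use filesystem.WriteRequest from the library.
type WriteRequest struct {
	FilePath string
	Content  string
}

// memoryBackend keeps offloaded content in a map, guarded by a mutex so
// that concurrent tool calls can write safely.
type memoryBackend struct {
	mu    sync.Mutex
	files map[string]string
}

func newMemoryBackend() *memoryBackend {
	return &memoryBackend{files: make(map[string]string)}
}

// Write stores the content under FilePath, overwriting any previous value.
func (b *memoryBackend) Write(_ context.Context, req *WriteRequest) error {
	if req == nil || req.FilePath == "" {
		return errors.New("write request needs a non-empty FilePath")
	}
	b.mu.Lock()
	defer b.mu.Unlock()
	b.files[req.FilePath] = req.Content
	return nil
}

// read is a test helper and is not part of the Backend interface.
func (b *memoryBackend) read(path string) (string, bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	content, ok := b.files[path]
	return content, ok
}

// offload imitates a caller that stores full content at a given path.
func offload(b *memoryBackend, path, content string) error {
	return b.Write(context.Background(), &WriteRequest{FilePath: path, Content: content})
}

func main() {
	b := newMemoryBackend()
	if err := offload(b, "/tmp/trunc/call_1", "full tool output"); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	content, ok := b.read("/tmp/trunc/call_1")
	fmt.Println(ok, content)
}
```

An in-memory backend like this is only suitable for tests, since the agent's read-file tool must be able to read whatever the Backend wrote.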

Creating the Middleware

Basic Usage

import "github.com/cloudwego/eino/adk/middlewares/reduction"

middleware, err := reduction.New(ctx, &reduction.Config{
    Backend: myBackend,
})

agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Model:       chatModel,
    Middlewares: []adk.ChatModelAgentMiddleware{middleware},
})

Generic Usage (AgenticMessage)

middleware, err := reduction.NewTyped[*schema.AgenticMessage](ctx, &reduction.TypedConfig[*schema.AgenticMessage]{
    Backend:      myBackend,
    TokenCounter: myAgenticTokenCounter,
})

agent, err := adk.NewTypedChatModelAgent(ctx, &adk.TypedChatModelAgentConfig[*schema.AgenticMessage]{
    Model:       chatModel,
    Middlewares: []adk.TypedChatModelAgentMiddleware[*schema.AgenticMessage]{middleware},
})

Custom Configuration

middleware, err := reduction.New(ctx, &reduction.Config{
    Backend:                   myBackend,
    RootDir:                   "/data/agent",
    MaxLengthForTrunc:         30000,
    MaxTokensForClear:         100000,
    ClearRetentionSuffixLimit: 2,
    ClearAtLeastTokens:        10000,
    TruncExcludeTools:         []string{"search_tool"},
    ClearExcludeTools:         []string{"read_file"},
    ClearMessageRewriter: func(ctx context.Context, toolCallMsg *schema.Message, toolResponseMsgs []*schema.Message) ([]*schema.Message, error) {
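        // Illustrative only: this replaces every tool-call group in the clear range
        // with a system-reminder. A real rewriter would first check the tool name
        // (for example write_file) before rewriting.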
        return []*schema.Message{schema.UserMessage("<system-reminder>file written</system-reminder>")}, nil
    },
    ClearPostProcess: func(ctx context.Context, state *adk.ChatModelAgentState) context.Context {
        log.Printf("Clear completed, messages: %d", len(state.Messages))
        return ctx
    },
    ToolConfig: map[string]*reduction.ToolReductionConfig{
        "grep":      {Backend: grepBackend},
        "read_file": {SkipClear: true},
    },
})

Truncation Only

middleware, err := reduction.New(ctx, &reduction.Config{
    Backend:   myBackend,
    SkipClear: true,
})

Clear Only

middleware, err := reduction.New(ctx, &reduction.Config{
    SkipTruncation:    true,
    MaxTokensForClear: 100000,
    // When Backend is nil, clearing still replaces content with placeholders but does not perform offload
})

How It Works

Truncation

Handled in WrapInvokableToolCall / WrapStreamableToolCall / WrapEnhancedInvokableToolCall / WrapEnhancedStreamableToolCall:

Tool returns result
Check TruncExcludeTools; skip if matched
Resolve the TruncHandler: the tool's entry in ToolConfig is used first, then the global default configuration
The TruncHandler reads the full output and checks whether the total length of all text parts exceeds MaxLengthForTrunc
If it does, the handler retains the first and last MaxLengthForTrunc/(textParts*2) characters as a preview and stores the full content in the Backend
The result returned to the agent carries a truncation notice with the file path of the full content

💡 For streaming tools, the default TruncHandler waits for the complete stream to be read before deciding whether to truncate. If you need strict incremental streaming behavior, provide a custom TruncHandler for that tool.
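
The head-and-tail preview described in the steps above can be sketched as a standalone function. This is a minimal sketch, assuming lengths are counted in runes and that the preview budget is applied to each text part; truncateParts and the notice wording are hypothetical, and the real default handler also deals with streaming and offload writes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateParts keeps the first and last maxLen/(len(parts)*2) characters of
// each text part when the combined length exceeds maxLen. It returns the
// (possibly shortened) parts and whether truncation happened.
func truncateParts(parts []string, maxLen int, path string) ([]string, bool) {
	total := 0
	for _, p := range parts {
		total += utf8.RuneCountInString(p)
	}
	if len(parts) == 0 || total <= maxLen {
		return parts, false
	}
	keep := maxLen / (len(parts) * 2)
	if keep < 0 {
		keep = 0
	}
	out := make([]string, len(parts))
	for i, p := range parts {
		r := []rune(p)
		if len(r) <= keep*2 {
			out[i] = p // short enough to keep whole
			continue
		}
		// Hypothetical notice text; the middleware has its own wording.
		notice := "\n[truncated; full content saved to " + path + "]\n"
		out[i] = string(r[:keep]) + notice + string(r[len(r)-keep:])
	}
	return out, true
}

func main() {
	parts := []string{"abcdefghijklmnopqrstuvwxyz"}
	out, cut := truncateParts(parts, 10, "/tmp/trunc/call_1")
	fmt.Println(cut)
	fmt.Println(out[0])
}
```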

Clear

Handled in BeforeModelRewriteState:

Use TokenCounter to calculate total tokens
Skip if not exceeding MaxTokensForClear
Determine the clear range: it starts at the first unprocessed assistant message and ends before the most recent ClearRetentionSuffixLimit assistant rounds, which are left untouched
If ClearMessageRewriter is configured, execute rewrite preprocessing on messages within the range first
Iterate through tool call messages in range, skipping ClearExcludeTools
Call ClearHandler for each tool call, replacing arguments and results
If ClearAtLeastTokens is set: operate on a copy first, compare token difference before and after clearing; abandon this clearing attempt if threshold not met
Once threshold is met, execute actual offload writes and update state.Messages
Call ClearPostProcess
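
The threshold, retention, and ClearAtLeastTokens checks in the steps above can be sketched on a simplified message model. Everything here is an assumption made for illustration: msg, clearOpts, clearEnd, and clearOld are hypothetical, each message carries a precomputed token count, and a cleared tool result is assumed to cost a fixed number of placeholder tokens.

```go
package main

import "fmt"

// msg is a simplified message: a role plus a precomputed token count.
type msg struct {
	Role   string // "user", "assistant", or "tool"
	Tokens int64
}

// clearOpts loosely mirrors MaxTokensForClear, ClearRetentionSuffixLimit,
// and ClearAtLeastTokens. PlaceholderTokens is an assumed cost of the text
// that replaces a cleared tool result.
type clearOpts struct {
	MaxTokens         int64
	RetainRounds      int
	AtLeast           int64
	PlaceholderTokens int64
}

func totalTokens(msgs []msg) int64 {
	var sum int64
	for _, m := range msgs {
		sum += m.Tokens
	}
	return sum
}

// clearEnd returns the exclusive end of the clearable range: the index of
// the retain-th assistant message counted from the end.
func clearEnd(msgs []msg, retain int) int {
	if retain <= 0 {
		return len(msgs)
	}
	seen := 0
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == "assistant" {
			seen++
			if seen == retain {
				return i
			}
		}
	}
	return 0 // fewer rounds than the retention limit: nothing to clear
}

// clearOld works on a copy and returns it only if enough tokens are freed.
func clearOld(msgs []msg, o clearOpts) ([]msg, bool) {
	if totalTokens(msgs) <= o.MaxTokens {
		return msgs, false
	}
	end := clearEnd(msgs, o.RetainRounds)
	draft := append([]msg(nil), msgs...)
	for i := 0; i < end; i++ {
		if draft[i].Role == "tool" && draft[i].Tokens > o.PlaceholderTokens {
			draft[i].Tokens = o.PlaceholderTokens
		}
	}
	freed := totalTokens(msgs) - totalTokens(draft)
	if freed <= 0 || freed < o.AtLeast {
		return msgs, false // not worth breaking the prompt cache
	}
	return draft, true
}

func main() {
	history := []msg{
		{"user", 10}, {"assistant", 5}, {"tool", 1000},
		{"assistant", 5}, {"tool", 800},
	}
	opts := clearOpts{MaxTokens: 1000, RetainRounds: 1, AtLeast: 500, PlaceholderTokens: 20}
	out, cleared := clearOld(history, opts)
	fmt.Println(cleared, totalTokens(history), totalTokens(out))
}
```

Working on a copy first is what makes the ClearAtLeastTokens check cheap to abandon: the original history is untouched until the clear is known to be worthwhile.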

Multi-language Support

The prompt text used for truncation and clear notices is available in Chinese and English; select the language with adk.SetLanguage:

adk.SetLanguage(adk.LanguageChinese)  // Chinese
adk.SetLanguage(adk.LanguageEnglish)  // English (default)

Notes

When SkipTruncation is false, Backend must be set
The default TokenCounter uses a character_count/4 estimate; replacing it with github.com/tiktoken-go/tokenizer is recommended
Already processed messages are marked via the Extra field _reduction_mw_processed and will not be processed again
Configuration in ToolConfig takes precedence over global settings; if a ToolConfig only sets SkipTruncation: false without providing a TruncHandler, it falls back to the default handler
GenTruncOffloadFilePath / GenClearOffloadFilePath are useful for scenarios where tool_call_id is not unique (e.g., retries), preventing file overwrites
ClearMessageRewriter executes after the clear range is determined but before per-tool clearing, suitable for compressing write/edit-type calls into brief prompts
Setting ClearAtLeastTokens to 0 means clearing executes whenever the threshold is exceeded; values greater than 0 avoid small clears that would break the prompt cache for little gain
The legacy APIs (NewClearToolResult, NewToolResultMiddleware) are deprecated; migrate to New / NewTyped
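
For the note on non-unique tool_call_id values, one way to keep offload paths from colliding is to append a counter. The sketch below shows only the path logic; pathGen and next are hypothetical names, and wiring this into GenTruncOffloadFilePath or GenClearOffloadFilePath would additionally require reading the call ID from *ToolDetail, which is not shown here.

```go
package main

import (
	"fmt"
	"path"
	"sync/atomic"
)

// pathGen produces offload paths that stay unique even when the same
// tool_call_id is seen more than once, for example after a retry.
type pathGen struct {
	root string
	seq  int64 // incremented atomically, so next is safe for concurrent use
}

// next returns {root}/{phase}/{toolCallID}-{n}, where n grows on every call.
func (g *pathGen) next(phase, toolCallID string) string {
	n := atomic.AddInt64(&g.seq, 1)
	return path.Join(g.root, phase, fmt.Sprintf("%s-%d", toolCallID, n))
}

func main() {
	g := &pathGen{root: "/data/agent"}
	fmt.Println(g.next("trunc", "call_1"))
	fmt.Println(g.next("trunc", "call_1")) // same ID, different path
}
```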


Last modified May 21, 2026: docs(eino): sync docs from feishu (2026-05-21) (#1545) (14c457e4)