Summarization

💡 This middleware was introduced in v0.8.0. Package path: github.com/cloudwego/eino/adk/middlewares/summarization

Overview

The Summarization middleware automatically calls a summary model to compress conversation history when the conversation token count exceeds a threshold, keeping long conversations coherent within the model’s context window. The middleware hooks into BeforeModelRewriteState, checking trigger conditions before each model call. When triggered, it executes: counting → summary generation (with retry/failover) → post-processing → state replacement.

Generic System

All core types and functions in this package provide both a Typed generic version (M adk.MessageType) and a non-generic alias (fixed to *schema.Message).

Generic Version	Non-generic Alias (= Typed[*schema.Message])
TypedConfig[M]	Config
NewTyped[M](ctx, *TypedConfig[M])	New(ctx, *Config)
TypedTokenCounterFunc[M]	TokenCounterFunc
TypedGenModelInputFunc[M]	GenModelInputFunc
TypedGetFailoverModelFunc[M]	GetFailoverModelFunc
TypedFinalizeFunc[M]	FinalizeFunc
TypedCallbackFunc[M]	CallbackFunc
TypedUserMessageFilterFunc[M]	UserMessageFilterFunc
TypedPreserveUserMessages[M]	PreserveUserMessages
TypedRetryConfig[M]	RetryConfig
TypedFailoverConfig[M]	FailoverConfig
TypedFailoverContext[M]	FailoverContext
TypedFinalizerBuilder[M]	FinalizerBuilder

Unless otherwise noted, type signatures in this document use the generic form M. When using non-generic aliases, M = *schema.Message.

Constructors

// Generic version — supports *schema.Message and *schema.AgenticMessage
func NewTyped[M adk.MessageType](ctx context.Context, cfg *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error)

// Non-generic version — equivalent to NewTyped[*schema.Message]
func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error)

TypedConfig[M] Configuration

Field	Type	Required	Default	Description
Model	model.BaseModel[M]	Yes	—	Model used to generate summaries
ModelOptions	[]model.Option	No	—	Options passed to the summary model
TokenCounter	TypedTokenCounterFunc[M]	No	Uses the total_tokens reported by the most recent assistant message as a baseline and estimates the messages added after it at ~4 chars/token	Custom token counting function
Trigger	*TriggerCondition	No	ContextTokens=160,000	Condition to trigger summarization
UserInstruction	string	No	Built-in prompt	Custom user-level summarization instruction, overrides the default instruction
TranscriptFilePath	string	No	—	Full conversation transcript file path, appended to the summary to remind the model where to find original context. Only effective when Finalize is not set
GenModelInput	TypedGenModelInputFunc[M]	No	sysInstruction → contextMsgs → userInstruction	Full control over constructing the summary model input
Finalize	TypedFinalizeFunc[M]	No	Built-in post-processing	Custom summary post-processing. When set, the middleware no longer performs any default post-processing
Callback	TypedCallbackFunc[M]	No	—	Called after Finalize with the before and after states (adk.TypedChatModelAgentState[M], passed by value). Treat both as read-only
EmitInternalEvents	bool	No	false	Whether to send internal events at key points
PreserveUserMessages	*TypedPreserveUserMessages[M]	No	Enabled: true	Preserve original user messages in the summary. Only effective when Finalize is not set
Retry	*TypedRetryConfig[M]	No	nil (no retry)	Retry strategy for the primary model summary generation
Failover	*TypedFailoverConfig[M]	No	nil	Failover strategy after primary model failure

💡 Finalize override semantics: Once a custom Finalize is set, the middleware will skip all default post-processing — PreserveUserMessages and TranscriptFilePath will no longer take effect. To reuse default post-processing logic in a custom Finalize, use the DefaultFinalizer function.

Sub-configuration Structs

TriggerCondition

Summarization is triggered when any condition is met.

type TriggerCondition struct {
    ContextTokens   int // Trigger when token count exceeds this threshold
    ContextMessages int // Trigger when message count exceeds this threshold
}

TypedPreserveUserMessages[M]

When enabled, replaces the <all_user_messages>...</all_user_messages> section in the summary with the most recent original user messages.

type TypedPreserveUserMessages[M adk.MessageType] struct {
    Enabled   bool
    MaxTokens int                        // Max tokens for preserved user messages; defaults to TriggerCondition.ContextTokens / 3
    Filter    TypedUserMessageFilterFunc[M] // Filter function, return false to exclude a message
}

TypedRetryConfig[M]

type TypedRetryConfig[M adk.MessageType] struct {
    MaxRetries  *int                                                            // Default 3
    ShouldRetry func(ctx context.Context, resp M, err error) bool              // Default: retry when err != nil
    BackoffFunc func(ctx context.Context, attempt int, resp M, err error) time.Duration // Default: exponential backoff + jitter
}
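The default BackoffFunc is described only as "exponential backoff + jitter". The following is a minimal sketch of that general technique, not the package's implementation: the function name `backoff`, the `base` and `max` parameters, and the 50% jitter range are all illustrative assumptions. It also assumes attempt numbers start at 1 and clamps smaller values.

```go
package main

import (
	"math/rand"
	"time"
)

// backoff returns an exponentially growing delay with random jitter.
// base, max, and the 50% jitter range are illustrative assumptions,
// not the middleware's documented defaults.
func backoff(attempt int, base, max time.Duration) time.Duration {
	if attempt < 1 {
		attempt = 1
	}
	d := base
	// Double the delay once per attempt after the first, stopping at max.
	for i := 1; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		d = max
	}
	// Add up to 50% random jitter so concurrent retries do not synchronize.
	jitter := time.Duration(rand.Int63n(int64(d)/2 + 1))
	return d + jitter
}
```

A function like this could be adapted to the BackoffFunc field by wrapping it in a closure with the signature shown above, passing through attempt and ignoring resp and err.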

TypedFailoverConfig[M]

type TypedFailoverConfig[M adk.MessageType] struct {
    MaxRetries     *int                                                            // Default 3
    ShouldFailover func(ctx context.Context, resp M, err error) bool              // Default: failover when err != nil
    BackoffFunc    func(ctx context.Context, attempt int, resp M, err error) time.Duration
    GetFailoverModel TypedGetFailoverModelFunc[M] // Returns (failoverModel model.BaseModel[M], failoverModelInputMsgs []M, failoverErr error)
}

TypedFailoverContext[M]

Context passed to the GetFailoverModel callback.

type TypedFailoverContext[M adk.MessageType] struct {
    Attempt           int  // Current failover attempt number, starting from 1
    SystemInstruction M    // System instruction (set internally by the middleware, not configurable)
    UserInstruction   M    // User instruction
    OriginalMessages  []M  // Original complete conversation
    LastModelResponse M    // Model response from the last attempt
    LastErr           error
}

TypedTokenCounterInput[M]

type TypedTokenCounterInput[M adk.MessageType] struct {
    Messages []M
    Tools    []*schema.ToolInfo
}
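To make the default TokenCounter behavior from the configuration table concrete, here is a rough sketch of a "baseline plus ~4 chars/token" estimate. It uses a hypothetical stand-in type `msg` rather than the real message types, counts bytes as a cheap proxy for characters, and rounds up. The actual estimator may differ in all of these details.

```go
package main

// msg is a hypothetical stand-in for a conversation message; it is not an eino type.
type msg struct {
	Role        string
	Content     string
	TotalTokens int // usage reported with an assistant response; 0 if absent
}

// estimateTokens takes the most recent assistant message that reports usage
// as the baseline, then adds ~4 characters per token for every message after it.
// With no such assistant message, the whole history is estimated at ~4 chars/token.
func estimateTokens(msgs []msg) int {
	baseline, start := 0, 0
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == "assistant" && msgs[i].TotalTokens > 0 {
			baseline, start = msgs[i].TotalTokens, i+1
			break
		}
	}
	total := baseline
	for _, m := range msgs[start:] {
		// len counts bytes; (n+3)/4 rounds up to whole tokens.
		total += (len(m.Content) + 3) / 4
	}
	return total
}
```

An estimate like this is cheap but approximate, which is why a tokenizer-accurate counter is preferable when the threshold sits close to the model's limit.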

Function Type Signature Reference

type TypedTokenCounterFunc[M adk.MessageType]      func(ctx context.Context, input *TypedTokenCounterInput[M]) (int, error)
type TypedGenModelInputFunc[M adk.MessageType]     func(ctx context.Context, sysInstruction, userInstruction M, originalMsgs []M) ([]M, error)
type TypedGetFailoverModelFunc[M adk.MessageType]  func(ctx context.Context, failoverCtx *TypedFailoverContext[M]) (model.BaseModel[M], []M, error)
type TypedFinalizeFunc[M adk.MessageType]          func(ctx context.Context, originalMessages []M, summary M) ([]M, error)
type TypedCallbackFunc[M adk.MessageType]          func(ctx context.Context, before, after adk.TypedChatModelAgentState[M]) error
type TypedUserMessageFilterFunc[M adk.MessageType] func(ctx context.Context, msg M) (bool, error)

DefaultFinalizer

DefaultFinalizer is a standalone factory function that returns a TypedFinalizeFunc[M] consistent with the middleware’s default post-processing logic. Use it when you need to reuse default logic (preserving user messages, appending transcript path, etc.) in a custom Finalize.

func DefaultFinalizer[M adk.MessageType](cfg *DefaultFinalizerConfig[M]) (TypedFinalizeFunc[M], error)

DefaultFinalizerConfig[M]

type DefaultFinalizerConfig[M adk.MessageType] struct {
    PreserveUserMessages *TypedPreserveUserMessages[M] // Default Enabled=true, MaxTokens=30000
    TranscriptFilePath   string
}

Example: Execute default post-processing first in a custom Finalize, then add a system message:

defaultFinalize, err := summarization.DefaultFinalizer[*schema.Message](&summarization.DefaultFinalizerConfig[*schema.Message]{
    TranscriptFilePath: "/path/to/transcript.txt",
})
if err != nil {
    // handle error
}

cfg := &summarization.Config{
    Model: yourModel,
    Finalize: func(ctx context.Context, originalMessages []*schema.Message, summary *schema.Message) ([]*schema.Message, error) {
        msgs, err := defaultFinalize(ctx, originalMessages, summary)
        if err != nil {
            return nil, err
        }
        // Add system message before the summary
        return append([]*schema.Message{schema.SystemMessage("your system prompt")}, msgs...), nil
    },
}

FinalizerBuilder

TypedFinalizerBuilder[M] provides a chainable API for building a TypedFinalizeFunc[M]. It supports chaining multiple handlers and an optional custom finalizer.

func NewTypedFinalizer[M adk.MessageType]() *TypedFinalizerBuilder[M]
func NewFinalizer() *FinalizerBuilder // = NewTypedFinalizer[*schema.Message]

func (b *TypedFinalizerBuilder[M]) PreserveSkills(config *PreserveSkillsConfig) *TypedFinalizerBuilder[M]
func (b *TypedFinalizerBuilder[M]) Custom(fn TypedFinalizeFunc[M]) *TypedFinalizerBuilder[M]
func (b *TypedFinalizerBuilder[M]) Build() (TypedFinalizeFunc[M], error)

Execution order: Handlers transform the summary in registration order → Custom determines the final output message list. If Custom is not set, returns []M{summary}.
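The execution order above can be illustrated with a small stand-alone sketch. The types `message`, `handler`, and `finalize` and the function `buildFinalizer` are hypothetical stand-ins written for this illustration. The real builder's handler signature is not shown in this document and may differ.

```go
package main

// message is a hypothetical stand-in for the generic message type M.
type message struct {
	Role, Content string
}

// handler transforms the summary; finalize produces the final message list.
type handler func(summary message) (message, error)
type finalize func(original []message, summary message) ([]message, error)

// buildFinalizer applies handlers in registration order, then lets custom
// decide the final output. Without custom, it returns just the summary.
func buildFinalizer(handlers []handler, custom finalize) finalize {
	return func(original []message, summary message) ([]message, error) {
		var err error
		for _, h := range handlers {
			if summary, err = h(summary); err != nil {
				return nil, err
			}
		}
		if custom == nil {
			return []message{summary}, nil
		}
		return custom(original, summary)
	}
}
```

Because the custom finalizer runs last, it sees the summary after every handler has modified it, which is what allows it to wrap that result with extra messages.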

PreserveSkills

Preserves skill content loaded by the Skill middleware after summary compression, ensuring the agent retains skill knowledge after context window compression.

type PreserveSkillsConfig struct {
    SkillToolName     string // Skill tool name, must match the Skill middleware. Default "skill"
    MaxSkills         *int   // Maximum number of skills to preserve. Default 5; 0 means disabled
    MaxTokensPerSkill *int   // Maximum tokens per skill, truncated if exceeded. Default 5000
    SkillsTokenBudget *int   // Total token budget for all skills. Default 25000
}
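The three limits interact, so here is one plausible reading of them as a sketch. Everything in it is an assumption made for illustration: the names `skillLimits` and `applyLimits`, the ~4 chars/token estimate, keeping skills in the order given, and shrinking the last skill to fit the remaining budget. The document does not specify which skills are kept first or how truncation is measured.

```go
package main

// skillLimits mirrors the three PreserveSkillsConfig limits as plain ints.
type skillLimits struct {
	MaxSkills, MaxTokensPerSkill, SkillsTokenBudget int
}

// defaultLimits uses the defaults documented above.
var defaultLimits = skillLimits{MaxSkills: 5, MaxTokensPerSkill: 5000, SkillsTokenBudget: 25000}

// charsPerToken is a rough estimate assumed for this sketch.
const charsPerToken = 4

// applyLimits keeps skills in the given order, truncating each to the
// per-skill limit and stopping once the count or total budget is exhausted.
// MaxSkills == 0 keeps nothing, matching "0 means disabled".
func applyLimits(skills []string, lim skillLimits) []string {
	var kept []string
	remaining := lim.SkillsTokenBudget
	for _, s := range skills {
		if len(kept) >= lim.MaxSkills || remaining <= 0 {
			break
		}
		limit := lim.MaxTokensPerSkill
		if remaining < limit {
			limit = remaining
		}
		r := []rune(s) // truncate on rune boundaries, not bytes
		if maxChars := limit * charsPerToken; len(r) > maxChars {
			r = r[:maxChars]
		}
		remaining -= (len(r) + charsPerToken - 1) / charsPerToken
		kept = append(kept, string(r))
	}
	return kept
}
```

Note that the documented defaults are consistent with each other: 5 skills at 5000 tokens each add up to the 25000-token total budget.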

Example:

finalizer, err := summarization.NewFinalizer().
    PreserveSkills(&summarization.PreserveSkillsConfig{}).
    Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) {
        return []*schema.Message{schema.SystemMessage("system prompt"), summary}, nil
    }).
    Build()
if err != nil {
    // handle error
}

cfg := &summarization.Config{
    Model:    yourModel,
    Finalize: finalizer,
}

Summarize Method

TypedMiddleware[M] exposes a Summarize method that can manually trigger a summarization outside of the middleware’s automatic trigger:

func (m *TypedMiddleware[M]) Summarize(ctx context.Context, state *adk.TypedChatModelAgentState[M]) ([]M, error)

This method runs the full summarization flow (generation → post-processing → Callback → events) without checking trigger conditions, and returns the message list that replaces the original conversation.

How It Works

Trigger condition check: First checks ContextMessages (message count), then calculates token count via TokenCounter and compares with ContextTokens. Triggered if either is met.
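The check order described above can be sketched as follows. This is an illustration with stand-in names (`triggerCondition`, `shouldSummarize`), not the middleware's code. It assumes that a zero threshold means "not set" and that the token count is skipped when the message-count condition already fires. Neither detail is confirmed by this document.

```go
package main

// triggerCondition mirrors the two thresholds of TriggerCondition.
type triggerCondition struct {
	ContextTokens   int
	ContextMessages int
}

// shouldSummarize checks the cheap message-count condition first and only
// calls countTokens when that condition did not already trigger.
// Zero thresholds are treated as unset (an assumption of this sketch).
func shouldSummarize(tc triggerCondition, numMessages int, countTokens func() (int, error)) (bool, error) {
	if tc.ContextMessages > 0 && numMessages > tc.ContextMessages {
		return true, nil
	}
	if tc.ContextTokens > 0 {
		n, err := countTokens()
		if err != nil {
			return false, err
		}
		return n > tc.ContextTokens, nil
	}
	return false, nil
}
```

Checking the message count first keeps the common path cheap, since counting tokens may involve a tokenizer call.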

Default post-processing (when Finalize is not set):

Replaces <all_user_messages>...</all_user_messages> in the summary with the most recent original user messages (controlled by PreserveUserMessages)
Appends TranscriptFilePath hint
Adds summary preamble and continuation instructions
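The first step, swapping the model-written `<all_user_messages>` section for the original user messages, can be sketched with plain strings. The function name `preserveUserMessages`, the choice to keep the surrounding tags, and the newline-joined layout are assumptions made for illustration. The middleware's actual output format is not specified here.

```go
package main

import (
	"regexp"
	"strings"
)

// userSection matches the whole <all_user_messages> block, including newlines.
var userSection = regexp.MustCompile(`(?s)<all_user_messages>.*?</all_user_messages>`)

// preserveUserMessages replaces the section written by the summary model
// with the original user messages, verbatim. A summary without the section
// is returned unchanged.
func preserveUserMessages(summary string, userMsgs []string) string {
	block := "<all_user_messages>\n" + strings.Join(userMsgs, "\n") + "\n</all_user_messages>"
	// ReplaceAllLiteralString avoids treating "$" in user text as a group reference.
	return userSection.ReplaceAllLiteralString(summary, block)
}
```

Replacing the section with the originals avoids relying on the summary model to paraphrase user intent accurately.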

Internal Events

When EmitInternalEvents = true, the middleware sends events via adk.TypedSendEvent:

Event Type	Trigger Timing	Carried Data
ActionTypeBeforeSummarize	After trigger condition is met, before calling the model	TypedBeforeSummarizeAction[M]{Messages} : original message list
ActionTypeGenerateSummary	After each model generation attempt (including retry/failover)	TypedGenerateSummaryAction[M]{Attempt, Phase, ModelResponse, GetError()}
ActionTypeAfterSummarize	After summary completion and Finalize	TypedAfterSummarizeAction[M]{Messages} : final message list

Events are wrapped in TypedCustomizedAction[M] and placed in the adk.AgentAction.CustomizedAction field. GenerateSummaryPhase has two values: GenerateSummaryPhasePrimary (primary model/retry) and GenerateSummaryPhaseFailover (failover).

Usage Examples

Minimal Configuration

mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,
})
if err != nil {
    // handle error
}

agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Model:       yourChatModel,
    Middlewares: []adk.ChatModelAgentMiddleware{mw},
})
if err != nil {
    // handle error
}

Custom Trigger + Retry + Failover

mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,
    Trigger: &summarization.TriggerCondition{
        ContextTokens:   100000,
        ContextMessages: 80,
    },
    TranscriptFilePath: "/path/to/transcript.txt",
    Retry: &summarization.RetryConfig{
        MaxRetries: ptrOf(2),
    },
    Failover: &summarization.FailoverConfig{
        MaxRetries: ptrOf(3),
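        GetFailoverModel: func(ctx context.Context, fctx *summarization.FailoverContext) (model.BaseModel[*schema.Message], []*schema.Message, error) {
            // Build the input explicitly (system instruction → conversation → user instruction),
            // since the callback must return a non-empty input list (see Notes).
            input := make([]*schema.Message, 0, len(fctx.OriginalMessages)+2)
            input = append(input, fctx.SystemInstruction)
            input = append(input, fctx.OriginalMessages...)
            input = append(input, fctx.UserInstruction)
            return backupModel, input, nil
        },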
    },
})
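The examples in this section call `ptrOf`, which this document does not define. It is assumed here to be a small user-side helper for filling the `*int` fields, along these lines:

```go
package main

// ptrOf returns a pointer to a copy of v. It is a user-defined helper
// assumed by the examples, not a function exported by the package.
func ptrOf[T any](v T) *T {
	return &v
}
```

Pointer fields let the configuration distinguish "not set" (nil, use the default) from an explicit zero value.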

FinalizerBuilder + PreserveSkills + DefaultFinalizer

defaultFinalize, err := summarization.DefaultFinalizer[*schema.Message](
    &summarization.DefaultFinalizerConfig[*schema.Message]{
        TranscriptFilePath: "/path/to/transcript.txt",
    },
)
if err != nil {
    // handle error
}

finalizer, err := summarization.NewFinalizer().
    PreserveSkills(&summarization.PreserveSkillsConfig{
        MaxSkills: ptrOf(3),
    }).
    Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) {
        msgs, err := defaultFinalize(ctx, origMsgs, summary)
        if err != nil {
            return nil, err
        }
        return append([]*schema.Message{schema.SystemMessage("system prompt")}, msgs...), nil
    }).
    Build()
if err != nil {
    // handle error
}

cfg := &summarization.Config{
    Model:    yourModel,
    Finalize: finalizer,
}

Notes

Set TranscriptFilePath: Providing a conversation transcript file path is strongly recommended, so the model can trace details back to the original records after summarization.
Adjust trigger threshold: Trigger.ContextTokens should be set to 80-90% of the model’s context window. The default value of 160,000 is suitable for models with 200k windows.
Custom TokenCounter: For production environments, it’s recommended to implement a counter that precisely matches the model’s tokenizer. The default estimator uses the most recent assistant message’s ResponseMeta.Usage.TotalTokens as a baseline and estimates incremental messages at ~4 chars/token.
Finalize override: Setting Finalize means PreserveUserMessages and TranscriptFilePath no longer take effect automatically. To reuse them, use DefaultFinalizer or FinalizerBuilder.
GetFailoverModel constraint: The callback must return a non-nil model and non-empty input message list.


Last modified May 21, 2026: docs(eino): sync docs from feishu (2026-05-21) (#1545) (14c457e4)