Summarization

💡 This middleware was introduced in v0.8.0. Package path: github.com/cloudwego/eino/adk/middlewares/summarization

Overview

The Summarization middleware calls a summary model to compress conversation history whenever the conversation exceeds a configured threshold (token count or message count), keeping long conversations coherent within the model's context window. It hooks into BeforeModelRewriteState and checks the trigger conditions before each model call. When triggered, it runs: counting → summary generation (with retry/failover) → post-processing → state replacement.

Generic System

All core types and functions in this package provide both a Typed generic version (M adk.MessageType) and a non-generic alias (fixed to *schema.Message).

| Generic Version | Non-generic Alias (= Typed[*schema.Message]) |
| --- | --- |
| TypedConfig[M] | Config |
| NewTyped[M](ctx, *TypedConfig[M]) | New(ctx, *Config) |
| TypedTokenCounterFunc[M] | TokenCounterFunc |
| TypedGenModelInputFunc[M] | GenModelInputFunc |
| TypedGetFailoverModelFunc[M] | GetFailoverModelFunc |
| TypedFinalizeFunc[M] | FinalizeFunc |
| TypedCallbackFunc[M] | CallbackFunc |
| TypedUserMessageFilterFunc[M] | UserMessageFilterFunc |
| TypedPreserveUserMessages[M] | PreserveUserMessages |
| TypedRetryConfig[M] | RetryConfig |
| TypedFailoverConfig[M] | FailoverConfig |
| TypedFailoverContext[M] | FailoverContext |
| TypedFinalizerBuilder[M] | FinalizerBuilder |

Unless otherwise noted, type signatures in this document use the generic form M. When using non-generic aliases, M = *schema.Message.

Constructors

// Generic version: supports *schema.Message and *schema.AgenticMessage
func NewTyped[M adk.MessageType](ctx context.Context, cfg *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error)

// Non-generic version: equivalent to NewTyped[*schema.Message]
func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error)

TypedConfig[M] Configuration

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| Model | model.BaseModel[M] | Yes | - | Model used to generate summaries |
| ModelOptions | []model.Option | No | - | Options passed to the summary model |
| TokenCounter | TypedTokenCounterFunc[M] | No | Estimates based on the most recent assistant message's total_tokens as baseline, incremental messages at ~4 chars/token | Custom token counting function |
| Trigger | *TriggerCondition | No | ContextTokens=160,000 | Condition to trigger summarization |
| UserInstruction | string | No | Built-in prompt | Custom user-level summarization instruction, overrides the default instruction |
| TranscriptFilePath | string | No | - | Full conversation transcript file path, appended to the summary to remind the model where to find original context. Only effective when Finalize is not set |
| GenModelInput | TypedGenModelInputFunc[M] | No | sysInstruction → contextMsgs → userInstruction | Full control over constructing the summary model input |
| Finalize | TypedFinalizeFunc[M] | No | Built-in post-processing | Custom summary post-processing. When set, the middleware no longer performs any default post-processing |
| Callback | TypedCallbackFunc[M] | No | - | Called after Finalize, with parameters before, after adk.TypedChatModelAgentState[M] (value types), read-only |
| EmitInternalEvents | bool | No | false | Whether to send internal events at key points |
| PreserveUserMessages | *TypedPreserveUserMessages[M] | No | Enabled: true | Preserve original user messages in the summary. Only effective when Finalize is not set |
| Retry | *TypedRetryConfig[M] | No | nil (no retry) | Retry strategy for the primary model summary generation |
| Failover | *TypedFailoverConfig[M] | No | nil | Failover strategy after primary model failure |

💡 Finalize override semantics: once a custom Finalize is set, the middleware skips all default post-processing, so PreserveUserMessages and TranscriptFilePath no longer take effect. To reuse the default post-processing logic inside a custom Finalize, use the DefaultFinalizer function.

Sub-configuration Structs

TriggerCondition

Summarization is triggered when any condition is met.

type TriggerCondition struct {
    ContextTokens   int // Trigger when token count exceeds this threshold
    ContextMessages int // Trigger when message count exceeds this threshold
}

TypedPreserveUserMessages[M]

When enabled, replaces the <all_user_messages>...</all_user_messages> section in the summary with the most recent original user messages.

type TypedPreserveUserMessages[M adk.MessageType] struct {
    Enabled   bool
    MaxTokens int                        // Max tokens for preserved user messages; defaults to TriggerCondition.ContextTokens / 3
    Filter    TypedUserMessageFilterFunc[M] // Filter function, return false to exclude a message
}
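
The replacement described above can be pictured as two steps: pick the most recent user messages that fit within MaxTokens, then swap them into the tagged section. The following is a minimal sketch of that idea using only the standard library. The helpers recentWithinBudget and replaceUserSection are hypothetical names, messages are modeled as plain strings, and the Filter hook is left out. Whether the middleware keeps the tags in the output is not stated in this document; the sketch keeps them.

```go
package main

import (
	"fmt"
	"strings"
)

// The tag names come from the description above; everything else is hypothetical.
const (
	openTag  = "<all_user_messages>"
	closeTag = "</all_user_messages>"
)

// recentWithinBudget walks from the newest message to the oldest and keeps
// messages until adding one more would exceed maxTokens. The result is
// returned in chronological order.
func recentWithinBudget(userMsgs []string, maxTokens int, count func(string) int) []string {
	var kept []string
	used := 0
	for i := len(userMsgs) - 1; i >= 0; i-- {
		n := count(userMsgs[i])
		if used+n > maxTokens {
			break
		}
		used += n
		kept = append(kept, userMsgs[i])
	}
	// kept is newest-first at this point; reverse it.
	for l, r := 0, len(kept)-1; l < r; l, r = l+1, r-1 {
		kept[l], kept[r] = kept[r], kept[l]
	}
	return kept
}

// replaceUserSection swaps the text between the tags for the given messages.
// The summary is returned unchanged when either tag is missing.
func replaceUserSection(summary string, userMsgs []string) string {
	start := strings.Index(summary, openTag)
	if start < 0 {
		return summary
	}
	rel := strings.Index(summary[start:], closeTag)
	if rel < 0 {
		return summary
	}
	body := "\n" + strings.Join(userMsgs, "\n") + "\n"
	return summary[:start+len(openTag)] + body + summary[start+rel:]
}

func main() {
	// Assumed rough counter: ~4 bytes per token, rounded up.
	count := func(s string) int { return (len(s) + 3) / 4 }
	history := []string{"first question", "second question", "third question"}
	kept := recentWithinBudget(history, 8, count)
	summary := "Summary so far.\n<all_user_messages>model-written recap</all_user_messages>"
	fmt.Println(replaceUserSection(summary, kept))
}
```

Walking from newest to oldest means that when the budget is tight, the oldest user messages are the ones dropped.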

TypedRetryConfig[M]

type TypedRetryConfig[M adk.MessageType] struct {
    MaxRetries  *int                                                            // Default 3
    ShouldRetry func(ctx context.Context, resp M, err error) bool              // Default: retry when err != nil
    BackoffFunc func(ctx context.Context, attempt int, resp M, err error) time.Duration // Default: exponential backoff + jitter
}

TypedFailoverConfig[M]

type TypedFailoverConfig[M adk.MessageType] struct {
    MaxRetries       *int                                                                    // Default 3
    ShouldFailover   func(ctx context.Context, resp M, err error) bool                       // Default: failover when err != nil
    BackoffFunc      func(ctx context.Context, attempt int, resp M, err error) time.Duration
    GetFailoverModel TypedGetFailoverModelFunc[M]                                            // Returns (failoverModel model.BaseModel[M], failoverModelInputMsgs []M, failoverErr error)
}

TypedFailoverContext[M]

Context passed to the GetFailoverModel callback.

type TypedFailoverContext[M adk.MessageType] struct {
    Attempt           int  // Current failover attempt number, starting from 1
    SystemInstruction M    // System instruction (set internally by the middleware, not configurable)
    UserInstruction   M    // User instruction
    OriginalMessages  []M  // Original complete conversation
    LastModelResponse M    // Model response from the last attempt
    LastErr           error
}

TypedTokenCounterInput[M]

type TypedTokenCounterInput[M adk.MessageType] struct {
    Messages []M
    Tools    []*schema.ToolInfo
}
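
The default estimation strategy described for TokenCounter (use the most recent assistant message's reported total as a baseline, then add roughly one token per four characters for everything after it) can be sketched with the standard library alone. The message type and the estimateTokens helper below are hypothetical simplifications; tool definitions are ignored, and the real estimator may differ in detail.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// message is a hypothetical stand-in for *schema.Message.
type message struct {
	Role        string // "user", "assistant", "tool", ...
	Content     string
	TotalTokens int // reported usage; 0 when the message carries none
}

const charsPerToken = 4 // rough ratio quoted in this document

// estimateByChars converts a character count to tokens, rounding up.
func estimateByChars(s string) int {
	return (utf8.RuneCountInString(s) + charsPerToken - 1) / charsPerToken
}

// estimateTokens takes the most recent assistant message that reports usage
// as the baseline (its total already covers everything up to and including
// that message) and estimates only the messages that follow it.
func estimateTokens(msgs []message) int {
	baseline, start := 0, 0
	for i := len(msgs) - 1; i >= 0; i-- {
		if msgs[i].Role == "assistant" && msgs[i].TotalTokens > 0 {
			baseline, start = msgs[i].TotalTokens, i+1
			break
		}
	}
	total := baseline
	for _, m := range msgs[start:] {
		total += estimateByChars(m.Content)
	}
	return total
}

func main() {
	msgs := []message{
		{Role: "user", Content: "hello"},
		{Role: "assistant", Content: "hi", TotalTokens: 120},
		{Role: "user", Content: "12345678"},
		{Role: "tool", Content: "abc"},
	}
	fmt.Println(estimateTokens(msgs)) // 120 + 2 + 1 = 123
}
```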

Function Type Signature Reference

type TypedTokenCounterFunc[M adk.MessageType]      func(ctx context.Context, input *TypedTokenCounterInput[M]) (int, error)
type TypedGenModelInputFunc[M adk.MessageType]     func(ctx context.Context, sysInstruction, userInstruction M, originalMsgs []M) ([]M, error)
type TypedGetFailoverModelFunc[M adk.MessageType]  func(ctx context.Context, failoverCtx *TypedFailoverContext[M]) (model.BaseModel[M], []M, error)
type TypedFinalizeFunc[M adk.MessageType]          func(ctx context.Context, originalMessages []M, summary M) ([]M, error)
type TypedCallbackFunc[M adk.MessageType]          func(ctx context.Context, before, after adk.TypedChatModelAgentState[M]) error
type TypedUserMessageFilterFunc[M adk.MessageType] func(ctx context.Context, msg M) (bool, error)

DefaultFinalizer

DefaultFinalizer is a standalone factory function that returns a TypedFinalizeFunc[M] consistent with the middleware's default post-processing logic. Use it when you need to reuse default logic (preserving user messages, appending transcript path, etc.) in a custom Finalize.

func DefaultFinalizer[M adk.MessageType](cfg *DefaultFinalizerConfig[M]) (TypedFinalizeFunc[M], error)

DefaultFinalizerConfig[M]

type DefaultFinalizerConfig[M adk.MessageType] struct {
    PreserveUserMessages *TypedPreserveUserMessages[M] // Default Enabled=true, MaxTokens=30000
    TranscriptFilePath   string
}

Example: Execute default post-processing first in a custom Finalize, then add a system message:

defaultFinalize, err := summarization.DefaultFinalizer[*schema.Message](&summarization.DefaultFinalizerConfig[*schema.Message]{
    TranscriptFilePath: "/path/to/transcript.txt",
})
if err != nil {
    // handle error
}

cfg := &summarization.Config{
    Model: yourModel,
    Finalize: func(ctx context.Context, originalMessages []*schema.Message, summary *schema.Message) ([]*schema.Message, error) {
        msgs, err := defaultFinalize(ctx, originalMessages, summary)
        if err != nil {
            return nil, err
        }
        // Add system message before the summary
        return append([]*schema.Message{schema.SystemMessage("your system prompt")}, msgs...), nil
    },
}

FinalizerBuilder

TypedFinalizerBuilder[M] provides a chainable API for building a TypedFinalizeFunc[M]. It supports chaining multiple handlers and an optional custom finalizer.

func NewTypedFinalizer[M adk.MessageType]() *TypedFinalizerBuilder[M]
func NewFinalizer() *FinalizerBuilder // = NewTypedFinalizer[*schema.Message]

func (b *TypedFinalizerBuilder[M]) PreserveSkills(config *PreserveSkillsConfig) *TypedFinalizerBuilder[M]
func (b *TypedFinalizerBuilder[M]) Custom(fn TypedFinalizeFunc[M]) *TypedFinalizerBuilder[M]
func (b *TypedFinalizerBuilder[M]) Build() (TypedFinalizeFunc[M], error)

Execution order: handlers transform the summary in registration order → Custom determines the final output message list. If Custom is not set, the built finalizer returns []M{summary}.

PreserveSkills

Preserves skill content loaded by the Skill middleware after summary compression, ensuring the agent retains skill knowledge after context window compression.

type PreserveSkillsConfig struct {
    SkillToolName     string // Skill tool name, must match the Skill middleware. Default "skill"
    MaxSkills         *int   // Maximum number of skills to preserve. Default 5; 0 means disabled
    MaxTokensPerSkill *int   // Maximum tokens per skill, truncated if exceeded. Default 5000
    SkillsTokenBudget *int   // Total token budget for all skills. Default 25000
}
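
Note that the documented defaults are mutually consistent: 5 skills × 5,000 tokens each equals the 25,000-token total budget. How the three limits interact is not spelled out here, so the following is only one plausible reading, written as a standard-library sketch. The skill type and selectSkills helper are hypothetical, and the sketch assumes that the most recently loaded skills are preferred and that selection stops once the budget would be exceeded.

```go
package main

import "fmt"

// skill is a hypothetical record of a loaded skill and its token count.
type skill struct {
	Name   string
	Tokens int
}

// selectSkills walks from the most recently loaded skill backwards, caps
// each skill at maxPerSkill tokens (modeling truncation), and stops when
// maxSkills is reached or the total budget would be exceeded.
func selectSkills(loaded []skill, maxSkills, maxPerSkill, budget int) []skill {
	var kept []skill
	used := 0
	for i := len(loaded) - 1; i >= 0 && len(kept) < maxSkills; i-- {
		s := loaded[i]
		if s.Tokens > maxPerSkill {
			s.Tokens = maxPerSkill
		}
		if used+s.Tokens > budget {
			break
		}
		used += s.Tokens
		kept = append(kept, s)
	}
	return kept
}

func main() {
	loaded := []skill{
		{Name: "pdf", Tokens: 7000},
		{Name: "sql", Tokens: 1200},
		{Name: "charts", Tokens: 4000},
	}
	// Documented defaults: 5 skills, 5000 tokens per skill, 25000 total.
	for _, s := range selectSkills(loaded, 5, 5000, 25000) {
		fmt.Println(s.Name, s.Tokens)
	}
}
```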

Example:

finalizer, err := summarization.NewFinalizer().
    PreserveSkills(&summarization.PreserveSkillsConfig{}).
    Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) {
        return []*schema.Message{schema.SystemMessage("system prompt"), summary}, nil
    }).
    Build()
if err != nil {
    // handle error
}

cfg := &summarization.Config{
    Model:    yourModel,
    Finalize: finalizer,
}

Summarize Method

TypedMiddleware[M] exposes a Summarize method that triggers summarization manually, outside of the middleware's automatic trigger:

func (m *TypedMiddleware[M]) Summarize(ctx context.Context, state *adk.TypedChatModelAgentState[M]) ([]M, error)

This method runs the full summarization flow (generation → post-processing → Callback → events) without checking trigger conditions, and returns the message list that replaces the original.

How It Works

Trigger condition check: The middleware first checks ContextMessages (message count), then computes the token count via TokenCounter and compares it with ContextTokens. Summarization is triggered if either threshold is exceeded.

Default post-processing (when Finalize is not set):

  1. Replaces <all_user_messages>...</all_user_messages> in the summary with the most recent original user messages (controlled by PreserveUserMessages)
  2. Appends TranscriptFilePath hint
  3. Adds summary preamble and continuation instructions

Internal Events

When EmitInternalEvents = true, the middleware sends events via adk.TypedSendEvent:

| Event Type | Trigger Timing | Carried Data |
| --- | --- | --- |
| ActionTypeBeforeSummarize | After trigger condition is met, before calling the model | TypedBeforeSummarizeAction[M]{Messages}: original message list |
| ActionTypeGenerateSummary | After each model generation attempt (including retry/failover) | TypedGenerateSummaryAction[M]{Attempt, Phase, ModelResponse, GetError()} |
| ActionTypeAfterSummarize | After summary completion and Finalize | TypedAfterSummarizeAction[M]{Messages}: final message list |

Events are wrapped in TypedCustomizedAction[M] and placed in the adk.AgentAction.CustomizedAction field. GenerateSummaryPhase has two values: GenerateSummaryPhasePrimary (primary model/retry) and GenerateSummaryPhaseFailover (failover).

Usage Examples

Minimal Configuration

mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,
})
if err != nil {
    // handle error
}

agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Model:       yourChatModel,
    Middlewares: []adk.ChatModelAgentMiddleware{mw},
})
if err != nil {
    // handle error
}

Custom Trigger + Retry + Failover

mw, err := summarization.New(ctx, &summarization.Config{
    Model: yourChatModel,
    Trigger: &summarization.TriggerCondition{
        ContextTokens:   100000,
        ContextMessages: 80,
    },
    TranscriptFilePath: "/path/to/transcript.txt",
    Retry: &summarization.RetryConfig{
        MaxRetries: ptrOf(2),
    },
    Failover: &summarization.FailoverConfig{
        MaxRetries: ptrOf(3),
        GetFailoverModel: func(ctx context.Context, fctx *summarization.FailoverContext) (model.BaseModel[*schema.Message], []*schema.Message, error) {
            return backupModel, nil, nil // Returning nil input will reuse the default input
        },
    },
})
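
The examples in this section call ptrOf, which this document does not define. A likely definition is a small generic helper that returns a pointer to its argument, which is needed because fields such as MaxRetries are *int and Go does not allow taking the address of a literal. The version below is an assumption about that helper, not code from the package.

```go
package main

import "fmt"

// ptrOf returns a pointer to a copy of v. It is an assumed helper for
// filling optional pointer fields such as MaxRetries *int.
func ptrOf[T any](v T) *T {
	return &v
}

func main() {
	maxRetries := ptrOf(2)
	fmt.Println(*maxRetries)
}
```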

FinalizerBuilder + PreserveSkills + DefaultFinalizer

defaultFinalize, err := summarization.DefaultFinalizer[*schema.Message](
    &summarization.DefaultFinalizerConfig[*schema.Message]{
        TranscriptFilePath: "/path/to/transcript.txt",
    },
)
if err != nil {
    // handle error
}

finalizer, err := summarization.NewFinalizer().
    PreserveSkills(&summarization.PreserveSkillsConfig{
        MaxSkills: ptrOf(3),
    }).
    Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) {
        msgs, err := defaultFinalize(ctx, origMsgs, summary)
        if err != nil {
            return nil, err
        }
        return append([]*schema.Message{schema.SystemMessage("system prompt")}, msgs...), nil
    }).
    Build()

cfg := &summarization.Config{
    Model:    yourModel,
    Finalize: finalizer,
}

Notes

  1. Set TranscriptFilePath: providing a conversation transcript file path is strongly recommended, so the model can trace details back to the original records after summarization.
  2. Adjust trigger threshold: set Trigger.ContextTokens to 80-90% of the model's context window. The default of 160,000 suits models with 200k-token windows.
  3. Custom TokenCounter: for production environments, implement a counter that precisely matches the model's tokenizer. The default estimator uses the most recent assistant message's ResponseMeta.Usage.TotalTokens as a baseline and estimates incremental messages at ~4 chars/token.
  4. Finalize override: setting Finalize means PreserveUserMessages and TranscriptFilePath no longer take effect automatically. To reuse them, use DefaultFinalizer or FinalizerBuilder.
  5. GetFailoverModel constraint: The callback must return a non-nil model and non-empty input message list.
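
The threshold guidance in note 2 is simple arithmetic: the default of 160,000 is 80% of a 200,000-token window. A small helper such as the hypothetical triggerThreshold below can derive a value for other window sizes; the resulting number would then be assigned to Trigger.ContextTokens.

```go
package main

import "fmt"

// triggerThreshold returns the given percentage of a context window,
// using integer arithmetic. It is an illustrative helper, not part of
// the summarization package.
func triggerThreshold(contextWindow, percent int) int {
	return contextWindow * percent / 100
}

func main() {
	fmt.Println(triggerThreshold(200000, 80)) // 160000, the documented default
	fmt.Println(triggerThreshold(128000, 90)) // 115200
}
```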