Chapter 5: Middleware (Cross-Cutting Concerns)

Goal of this chapter: understand the Middleware pattern and implement Tool error handling and ChatModel retry.

Why We Need Middleware

In Chapter 4 we added Tool capabilities to the Agent, enabling file system access. But in real-world scenarios, Tool errors or ChatModel errors are common, for example:

Tool errors: file not found, invalid arguments, insufficient permissions, etc.
ChatModel errors: API rate limiting (429), network timeouts, service unavailable, etc.

Problem 1: Tool Errors Interrupt the Entire Flow

When a Tool execution fails, the error propagates directly to the Agent, interrupting the entire conversation:

[tool call] read_file(file_path: "nonexistent.txt")
Error: open nonexistent.txt: no such file or directory
// Conversation interrupted, user must restart

Problem 2: Model Calls May Fail Due to Rate Limiting

When the model API returns a 429 (Too Many Requests) error, the entire conversation also interrupts:

Error: rate limit exceeded (429)
// Conversation interrupted

Desired Behavior

These errors often should not directly terminate the Agent flow. Instead, we want to pass the error information to the model and let it self-correct in the next turn. For example:

[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist. Let me first list the files in the current directory...
[tool call] glob(pattern: "*")

Middleware’s Role

The Middleware pattern can extend Tool and ChatModel behavior, making it ideal for solving this problem:

Middleware is an Agent interceptor: inserts custom logic before and after calls
Middleware can handle errors: converts errors into a format the model can understand
Middleware can implement retries: automatically retries failed operations
Middleware is composable: multiple Middlewares can be chained together

Simple analogy:

Agent = “business logic”
Middleware = “AOP aspect” (logging, retry, error handling, and other cross-cutting concerns)

Code Location

Entry code: cmd/ch05/main.go

Prerequisites

Same as Chapter 1: you need a configured and available ChatModel (OpenAI or Ark). Also, like Chapter 4, set PROJECT_ROOT:

export PROJECT_ROOT=/path/to/eino  # Eino core library root directory

Running

In the examples/quickstart/chatwitheino directory:

# Set project root directory
export PROJECT_ROOT=/path/to/your/project

go run ./cmd/ch05

Example output:

you> List the files in the current directory
[assistant] Let me list the files...
[tool call] list_files(directory: ".")

you> Read a non-existent file
[assistant] Trying to read the file...
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist...

Key Concepts

Middleware Interface

ChatModelAgentMiddleware is the middleware interface for Agents:

type ChatModelAgentMiddleware interface {
    // BeforeAgent is called before each agent run, allowing modification of
    // the agent's instruction and tools configuration.
    BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error)

    // BeforeModelRewriteState is called before each model invocation.
    // The returned state is persisted to the agent's internal state and passed to the model.
    BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

    // AfterModelRewriteState is called after each model invocation.
    // The input state includes the model's response as the last message.
    AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

    // WrapInvokableToolCall wraps a tool's synchronous execution with custom behavior.
    // This method is only called for tools that implement InvokableTool.
    WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error)

    // WrapStreamableToolCall wraps a tool's streaming execution with custom behavior.
    // This method is only called for tools that implement StreamableTool.
    WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error)

    // WrapEnhancedInvokableToolCall wraps an enhanced tool's synchronous execution.
    // This method is only called for tools that implement EnhancedInvokableTool.
    WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error)

    // WrapEnhancedStreamableToolCall wraps an enhanced tool's streaming execution.
    // This method is only called for tools that implement EnhancedStreamableTool.
    WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error)

    // WrapModel wraps a chat model with custom behavior.
    // This method is called at request time when the model is about to be invoked.
    WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error)
}

Design philosophy:

Decorator pattern: each Middleware wraps the original call, and can modify inputs, outputs, or errors
Onion model: requests pass through Middlewares from outer to inner; responses return from inner to outer
Composable: multiple Middlewares execute in sequence

Middleware Execution Order

Handlers (i.e., Middlewares) wrap in array order, forming an onion model:

Handlers: []adk.ChatModelAgentMiddleware{
    &middlewareA{},  // Outermost: wraps first, intercepts requests first, but WrapModel takes effect last
    &middlewareB{},  // Middle layer
    &middlewareC{},  // Innermost: wraps last
}

Execution order for Tool calls:

Request → A.Wrap → B.Wrap → C.Wrap → Actual Tool execution → C returns → B returns → A returns → Response

Practical advice: place safeToolMiddleware (error capture) at the innermost position (end of array) to ensure that interrupt errors thrown by other Middlewares propagate correctly outward.

SafeToolMiddleware

SafeToolMiddleware converts Tool errors into strings so the model can understand and handle them:

type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return "", err
            }
            // Convert error to string instead of returning an error
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}

Effect:

[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist, please check the file path...
// Conversation continues, model can adjust its strategy based on the error

ModelRetryConfig

ModelRetryConfig configures automatic retry for ChatModel:

type ModelRetryConfig struct {
    MaxRetries int                          // Maximum retry count
    IsRetryAble func(ctx context.Context, err error) bool  // Determines if an error is retryable
}

Usage (using DeepAgent as an example):

agent, err := deep.New(ctx, &deep.Config{
    // ...
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            // 429 rate limit errors are retryable
            return strings.Contains(err.Error(), "429") ||
                strings.Contains(err.Error(), "Too Many Requests") ||
                strings.Contains(err.Error(), "qpm limit")
        },
    },
})

Retry strategy:

Exponential backoff: retry intervals increase with each attempt
Configurable conditions: IsRetryAble determines which errors are retryable
Automatic recovery: no user intervention needed

Middleware Implementation

1. Implement SafeToolMiddleware

type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            // Interrupt errors are not converted; they need to propagate
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return "", err
            }
            // Other errors are converted to strings
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}

2. Implement Streaming Tool Error Handling

func (m *safeToolMiddleware) WrapStreamableToolCall(
    _ context.Context,
    endpoint adk.StreamableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.StreamableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) {
        sr, err := endpoint(ctx, args, opts...)
        if err != nil {
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return nil, err
            }
            // Return a single-chunk stream containing the error message
            return singleChunkReader(fmt.Sprintf("[tool error] %v", err)), nil
        }
        // Wrap the stream to capture errors within the stream
        return safeWrapReader(sr), nil
    }, nil
}

3. Configure the Agent to Use Middleware

This chapter continues using the DeepAgent introduced in Chapter 4, registering Middleware in its Handlers field:

agent, err := deep.New(ctx, &deep.Config{
    Name:           "Ch05MiddlewareAgent",
    Description:    "ChatWithDoc agent with safe tool middleware and retry.",
    ChatModel:      cm,
    Instruction:    agentInstruction,
    Backend:        backend,
    StreamingShell: backend,
    MaxIteration:   50,
    Handlers: []adk.ChatModelAgentMiddleware{
        &safeToolMiddleware{},  // Converts Tool errors to strings
    },
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            return strings.Contains(err.Error(), "429") ||
                strings.Contains(err.Error(), "Too Many Requests") ||
                strings.Contains(err.Error(), "qpm limit")
        },
    },
})

Note: Handlers field (in config) and “Middleware” (the concept discussed in docs) are the same thing — Handlers is the config field name, while ChatModelAgentMiddleware is the interface name.

Key code snippet (Note: this is a simplified snippet that cannot run directly; see cmd/ch05/main.go for the full code):

// SafeToolMiddleware captures Tool errors and converts them to strings
type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return "", err
            }
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}

// Configure DeepAgent (same as Chapter 4, adding Handlers and ModelRetryConfig)
agent, _ := deep.New(ctx, &deep.Config{
    ChatModel:      cm,
    Backend:        backend,
    StreamingShell: backend,
    MaxIteration:   50,
    Handlers: []adk.ChatModelAgentMiddleware{
        &safeToolMiddleware{},
    },
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            return strings.Contains(err.Error(), "429")
        },
    },
})

Middleware Execution Flow

┌─────────────────────────────────────────┐
│  User: read a non-existent file          │
└─────────────────────────────────────────┘
                   ↓
        ┌──────────────────────┐
        │  Agent analyzes intent│
        │  Decides to call      │
        │  read_file            │
        └──────────────────────┘
                   ↓
        ┌──────────────────────┐
        │  SafeToolMiddleware  │
        │  Intercepts Tool call │
        └──────────────────────┘
                   ↓
        ┌──────────────────────┐
        │  Execute read_file    │
        │  Returns error        │
        └──────────────────────┘
                   ↓
        ┌──────────────────────┐
        │  SafeToolMiddleware  │
        │  Converts error to    │
        │  string               │
        └──────────────────────┘
                   ↓
        ┌──────────────────────┐
        │  Return Tool Result   │
        │  "[tool error] ..."  │
        └──────────────────────┘
                   ↓
        ┌──────────────────────┐
        │  Agent generates reply│
        │  "Sorry, file not    │
        │   found..."          │
        └──────────────────────┘

Chapter Summary

Middleware: Agent interceptors that insert custom logic before and after calls
SafeToolMiddleware: converts Tool errors to strings so the model can understand and handle them
ModelRetryConfig: configures automatic ChatModel retry for handling temporary errors like rate limiting
Decorator pattern: Middleware wraps the original call and can modify inputs, outputs, or errors
Onion model: requests pass through Middlewares from outer to inner; responses return from inner to outer

Extended Thinking

Eino built-in Middlewares:

Middleware	Description
reduction	Tool output reduction — when tool output is too long, automatically truncates and offloads to the file system to prevent context overflow
summarization	Conversation history auto-summarization — automatically generates summaries to compress history when token count exceeds a threshold
skill	Skill loading middleware — enables the Agent to dynamically load and execute predefined skills

Middleware chain example:

import (
    "github.com/cloudwego/eino/adk/middlewares/reduction"
    "github.com/cloudwego/eino/adk/middlewares/summarization"
    "github.com/cloudwego/eino/adk/middlewares/skill"
)

// Create reduction middleware: manages tool output length
reductionMW, _ := reduction.New(ctx, &reduction.Config{
    Backend:           filesystemBackend,     // Storage backend
    MaxLengthForTrunc: 50000,                  // Max length per tool output
    MaxTokensForClear: 30000,                  // Token threshold to trigger cleanup
})

// Create summarization middleware: auto-compresses conversation history
summarizationMW, _ := summarization.New(ctx, &summarization.Config{
    Model: chatModel,                          // Model used for generating summaries
    Trigger: &summarization.TriggerCondition{
        ContextTokens: 190000,                 // Token threshold to trigger summarization
    },
})

// Combine multiple middlewares (conceptual example; when using DeepAgent, replace adk.NewChatModelAgent with deep.New)
agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Handlers: []adk.ChatModelAgentMiddleware{  // Note: the config field is named Handlers, conceptually equivalent to Middlewares
        summarizationMW,   // Outermost: conversation history summarization
        reductionMW,       // Middle layer: tool output reduction
    },
})

Feedback

Was this page helpful?

Please tell us how we can improve.

Last modified May 21, 2026: docs(eino): sync docs from feishu (2026-05-21) (#1545) (14c457e4)