Chapter 5: Middleware (Cross-Cutting Concerns)
Goal of this chapter: understand the Middleware pattern and implement Tool error handling and ChatModel retry.
Why We Need Middleware
In Chapter 4 we added Tool capabilities to the Agent, enabling file system access. But in real-world scenarios, Tool errors or ChatModel errors are common, for example:
- Tool errors: file not found, invalid arguments, insufficient permissions, etc.
- ChatModel errors: API rate limiting (429), network timeouts, service unavailable, etc.
Problem 1: Tool Errors Interrupt the Entire Flow
When a Tool execution fails, the error propagates directly to the Agent, interrupting the entire conversation:
[tool call] read_file(file_path: "nonexistent.txt")
Error: open nonexistent.txt: no such file or directory
// Conversation interrupted, user must restart
Problem 2: Model Calls May Fail Due to Rate Limiting
When the model API returns a 429 (Too Many Requests) error, the entire conversation also interrupts:
Error: rate limit exceeded (429)
// Conversation interrupted
Desired Behavior
These errors often should not directly terminate the Agent flow. Instead, we want to pass the error information to the model and let it self-correct in the next turn. For example:
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist. Let me first list the files in the current directory...
[tool call] glob(pattern: "*")
Middleware’s Role
The Middleware pattern can extend Tool and ChatModel behavior, making it ideal for solving this problem:
- Middleware is an Agent interceptor: inserts custom logic before and after calls
- Middleware can handle errors: converts errors into a format the model can understand
- Middleware can implement retries: automatically retries failed operations
- Middleware is composable: multiple Middlewares can be chained together
Simple analogy:
- Agent = “business logic”
- Middleware = “AOP aspect” (logging, retry, error handling, and other cross-cutting concerns)
Code Location
- Entry code: cmd/ch05/main.go
Prerequisites
Same as Chapter 1: you need a configured and available ChatModel (OpenAI or Ark). Also, like Chapter 4, set PROJECT_ROOT:
export PROJECT_ROOT=/path/to/eino # Eino core library root directory
Running
In the examples/quickstart/chatwitheino directory:
# Set project root directory
export PROJECT_ROOT=/path/to/your/project
go run ./cmd/ch05
Example output:
you> List the files in the current directory
[assistant] Let me list the files...
[tool call] list_files(directory: ".")
you> Read a non-existent file
[assistant] Trying to read the file...
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist...
Key Concepts
Middleware Interface
ChatModelAgentMiddleware is the middleware interface for Agents:
type ChatModelAgentMiddleware interface {
// BeforeAgent is called before each agent run, allowing modification of
// the agent's instruction and tools configuration.
BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error)
// BeforeModelRewriteState is called before each model invocation.
// The returned state is persisted to the agent's internal state and passed to the model.
BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)
// AfterModelRewriteState is called after each model invocation.
// The input state includes the model's response as the last message.
AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)
// WrapInvokableToolCall wraps a tool's synchronous execution with custom behavior.
// This method is only called for tools that implement InvokableTool.
WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error)
// WrapStreamableToolCall wraps a tool's streaming execution with custom behavior.
// This method is only called for tools that implement StreamableTool.
WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error)
// WrapEnhancedInvokableToolCall wraps an enhanced tool's synchronous execution.
// This method is only called for tools that implement EnhancedInvokableTool.
WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error)
// WrapEnhancedStreamableToolCall wraps an enhanced tool's streaming execution.
// This method is only called for tools that implement EnhancedStreamableTool.
WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error)
// WrapModel wraps a chat model with custom behavior.
// This method is called at request time when the model is about to be invoked.
WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error)
}
Design philosophy:
- Decorator pattern: each Middleware wraps the original call, and can modify inputs, outputs, or errors
- Onion model: requests pass through Middlewares from outer to inner; responses return from inner to outer
- Composable: multiple Middlewares execute in sequence
Middleware Execution Order
Handlers (i.e., Middlewares) wrap in array order, forming an onion model:
Handlers: []adk.ChatModelAgentMiddleware{
&middlewareA{}, // Outermost: wraps first, intercepts requests first, but WrapModel takes effect last
&middlewareB{}, // Middle layer
&middlewareC{}, // Innermost: wraps last
}
Execution order for Tool calls:
Request → A.Wrap → B.Wrap → C.Wrap → Actual Tool execution → C returns → B returns → A returns → Response
Practical advice: place safeToolMiddleware (error capture) at the innermost position (end of array) to ensure that interrupt errors thrown by other Middlewares propagate correctly outward.
SafeToolMiddleware
SafeToolMiddleware converts Tool errors into strings so the model can understand and handle them:
type safeToolMiddleware struct {
*adk.BaseChatModelAgentMiddleware
}
func (m *safeToolMiddleware) WrapInvokableToolCall(
_ context.Context,
endpoint adk.InvokableToolCallEndpoint,
_ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
result, err := endpoint(ctx, args, opts...)
if err != nil {
if _, ok := compose.IsInterruptRerunError(err); ok {
return "", err
}
// Convert error to string instead of returning an error
return fmt.Sprintf("[tool error] %v", err), nil
}
return result, nil
}, nil
}
Effect:
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist, please check the file path...
// Conversation continues, model can adjust its strategy based on the error
ModelRetryConfig
ModelRetryConfig configures automatic retry for ChatModel:
type ModelRetryConfig struct {
MaxRetries int // Maximum retry count
IsRetryAble func(ctx context.Context, err error) bool // Determines if an error is retryable
}
Usage (using DeepAgent as an example):
agent, err := deep.New(ctx, &deep.Config{
// ...
ModelRetryConfig: &adk.ModelRetryConfig{
MaxRetries: 5,
IsRetryAble: func(_ context.Context, err error) bool {
// 429 rate limit errors are retryable
return strings.Contains(err.Error(), "429") ||
strings.Contains(err.Error(), "Too Many Requests") ||
strings.Contains(err.Error(), "qpm limit")
},
},
})
Retry strategy:
- Exponential backoff: retry intervals increase with each attempt
- Configurable conditions:
IsRetryAbledetermines which errors are retryable - Automatic recovery: no user intervention needed
Middleware Implementation
1. Implement SafeToolMiddleware
type safeToolMiddleware struct {
*adk.BaseChatModelAgentMiddleware
}
func (m *safeToolMiddleware) WrapInvokableToolCall(
_ context.Context,
endpoint adk.InvokableToolCallEndpoint,
_ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
result, err := endpoint(ctx, args, opts...)
if err != nil {
// Interrupt errors are not converted; they need to propagate
if _, ok := compose.IsInterruptRerunError(err); ok {
return "", err
}
// Other errors are converted to strings
return fmt.Sprintf("[tool error] %v", err), nil
}
return result, nil
}, nil
}
2. Implement Streaming Tool Error Handling
func (m *safeToolMiddleware) WrapStreamableToolCall(
_ context.Context,
endpoint adk.StreamableToolCallEndpoint,
_ *adk.ToolContext,
) (adk.StreamableToolCallEndpoint, error) {
return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) {
sr, err := endpoint(ctx, args, opts...)
if err != nil {
if _, ok := compose.IsInterruptRerunError(err); ok {
return nil, err
}
// Return a single-chunk stream containing the error message
return singleChunkReader(fmt.Sprintf("[tool error] %v", err)), nil
}
// Wrap the stream to capture errors within the stream
return safeWrapReader(sr), nil
}, nil
}
3. Configure the Agent to Use Middleware
This chapter continues using the DeepAgent introduced in Chapter 4, registering Middleware in its Handlers field:
agent, err := deep.New(ctx, &deep.Config{
Name: "Ch05MiddlewareAgent",
Description: "ChatWithDoc agent with safe tool middleware and retry.",
ChatModel: cm,
Instruction: agentInstruction,
Backend: backend,
StreamingShell: backend,
MaxIteration: 50,
Handlers: []adk.ChatModelAgentMiddleware{
&safeToolMiddleware{}, // Converts Tool errors to strings
},
ModelRetryConfig: &adk.ModelRetryConfig{
MaxRetries: 5,
IsRetryAble: func(_ context.Context, err error) bool {
return strings.Contains(err.Error(), "429") ||
strings.Contains(err.Error(), "Too Many Requests") ||
strings.Contains(err.Error(), "qpm limit")
},
},
})
Note: Handlers field (in config) and “Middleware” (the concept discussed in docs) are the same thing — Handlers is the config field name, while ChatModelAgentMiddleware is the interface name.
Key code snippet (Note: this is a simplified snippet that cannot run directly; see cmd/ch05/main.go for the full code):
// SafeToolMiddleware captures Tool errors and converts them to strings
type safeToolMiddleware struct {
*adk.BaseChatModelAgentMiddleware
}
func (m *safeToolMiddleware) WrapInvokableToolCall(
_ context.Context,
endpoint adk.InvokableToolCallEndpoint,
_ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
result, err := endpoint(ctx, args, opts...)
if err != nil {
if _, ok := compose.IsInterruptRerunError(err); ok {
return "", err
}
return fmt.Sprintf("[tool error] %v", err), nil
}
return result, nil
}, nil
}
// Configure DeepAgent (same as Chapter 4, adding Handlers and ModelRetryConfig)
agent, _ := deep.New(ctx, &deep.Config{
ChatModel: cm,
Backend: backend,
StreamingShell: backend,
MaxIteration: 50,
Handlers: []adk.ChatModelAgentMiddleware{
&safeToolMiddleware{},
},
ModelRetryConfig: &adk.ModelRetryConfig{
MaxRetries: 5,
IsRetryAble: func(_ context.Context, err error) bool {
return strings.Contains(err.Error(), "429")
},
},
})
Middleware Execution Flow
┌─────────────────────────────────────────┐
│ User: read a non-existent file │
└─────────────────────────────────────────┘
↓
┌──────────────────────┐
│ Agent analyzes intent│
│ Decides to call │
│ read_file │
└──────────────────────┘
↓
┌──────────────────────┐
│ SafeToolMiddleware │
│ Intercepts Tool call │
└──────────────────────┘
↓
┌──────────────────────┐
│ Execute read_file │
│ Returns error │
└──────────────────────┘
↓
┌──────────────────────┐
│ SafeToolMiddleware │
│ Converts error to │
│ string │
└──────────────────────┘
↓
┌──────────────────────┐
│ Return Tool Result │
│ "[tool error] ..." │
└──────────────────────┘
↓
┌──────────────────────┐
│ Agent generates reply│
│ "Sorry, file not │
│ found..." │
└──────────────────────┘
Chapter Summary
- Middleware: Agent interceptors that insert custom logic before and after calls
- SafeToolMiddleware: converts Tool errors to strings so the model can understand and handle them
- ModelRetryConfig: configures automatic ChatModel retry for handling temporary errors like rate limiting
- Decorator pattern: Middleware wraps the original call and can modify inputs, outputs, or errors
- Onion model: requests pass through Middlewares from outer to inner; responses return from inner to outer
Extended Thinking
Eino built-in Middlewares:
| Middleware | Description |
| reduction | Tool output reduction — when tool output is too long, automatically truncates and offloads to the file system to prevent context overflow |
| summarization | Conversation history auto-summarization — automatically generates summaries to compress history when token count exceeds a threshold |
| skill | Skill loading middleware — enables the Agent to dynamically load and execute predefined skills |
Middleware chain example:
import (
"github.com/cloudwego/eino/adk/middlewares/reduction"
"github.com/cloudwego/eino/adk/middlewares/summarization"
"github.com/cloudwego/eino/adk/middlewares/skill"
)
// Create reduction middleware: manages tool output length
reductionMW, _ := reduction.New(ctx, &reduction.Config{
Backend: filesystemBackend, // Storage backend
MaxLengthForTrunc: 50000, // Max length per tool output
MaxTokensForClear: 30000, // Token threshold to trigger cleanup
})
// Create summarization middleware: auto-compresses conversation history
summarizationMW, _ := summarization.New(ctx, &summarization.Config{
Model: chatModel, // Model used for generating summaries
Trigger: &summarization.TriggerCondition{
ContextTokens: 190000, // Token threshold to trigger summarization
},
})
// Combine multiple middlewares (conceptual example; when using DeepAgent, replace adk.NewChatModelAgent with deep.New)
agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
Handlers: []adk.ChatModelAgentMiddleware{ // Note: the config field is named Handlers, conceptually equivalent to Middlewares
summarizationMW, // Outermost: conversation history summarization
reductionMW, // Middle layer: tool output reduction
},
})