
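State

State is a container for storing shared information during Agent and Tool execution. It provides a structured way to maintain conversation history, share data between tools, and store intermediate results throughout an agent's workflow.

Overview

When building agents that use multiple tools, you often need the tools to share information with each other. State solves this by providing centralized storage that all tools can read from and write to. For example, one tool might retrieve documents while another tool uses those documents to generate an answer.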

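State uses a schema-based approach in which you define:

  • what data can be stored,
  • the type of each piece of data,
  • how values are merged when they are updated.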
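The three points above can be illustrated with a small, library-free sketch. `MiniState`, `_replace`, and `_append` are hypothetical names invented for this illustration. This is not Haystack's implementation, only a sketch of the schema idea under the assumption that each key carries a type and a merge function.

```python
from typing import Any, Callable, Optional


def _replace(current: Any, new: Any) -> Any:
    """Merge rule that discards the old value."""
    return new


def _append(current: Optional[list], new: list) -> list:
    """Merge rule that concatenates lists."""
    return (current or []) + new


class MiniState:
    """Hypothetical stand-in for a schema-driven store (not the Haystack class)."""

    def __init__(self, schema: dict[str, dict[str, Any]]):
        self.schema = schema
        self._data: dict[str, Any] = {}

    def set(self, key: str, value: Any) -> None:
        if key not in self.schema:  # 1. which data may be stored
            raise KeyError(f"'{key}' is not declared in the schema")
        entry = self.schema[key]
        if not isinstance(value, entry["type"]):  # 2. the type of each value
            raise TypeError(f"'{key}' expects {entry['type'].__name__}")
        merge: Callable[[Any, Any], Any] = entry.get("handler", _replace)
        self._data[key] = merge(self._data.get(key), value)  # 3. how values merge

    def get(self, key: str, default: Any = None) -> Any:
        return self._data.get(key, default)


state = MiniState({
    "user_name": {"type": str},
    "documents": {"type": list, "handler": _append},
})
state.set("user_name", "Alice")
state.set("documents", ["doc-1"])
state.set("documents", ["doc-2"])
print(state.get("user_name"), state.get("documents"))  # Alice ['doc-1', 'doc-2']
```

Keeping the merge rule next to the type in the schema means callers only ever call `set()`; they do not need to know whether a key accumulates or overwrites.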

Supported Types

State supports standard Python types:

  • basic types: str, int, float, bool, dict,
  • list types: list, list[str], list[int], list[Document],
  • union types: Union[str, int], Optional[str],
  • custom classes and dataclasses.
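Since State assigns a default handler based on whether a field is a list type, it can help to see what parameterized types such as list[str] or Optional[str] look like at runtime. The sketch below uses only the standard `typing` module. `is_list_type` and `UserProfile` are hypothetical names for illustration, and this is not necessarily how Haystack inspects types internally.

```python
from dataclasses import dataclass
from typing import Optional, Union, get_args, get_origin


@dataclass
class UserProfile:  # hypothetical custom dataclass used as a schema type
    name: str
    age: int


def is_list_type(tp: object) -> bool:
    """Return True for list and for parameterized forms such as list[str]."""
    return tp is list or get_origin(tp) is list


schema = {
    "title": {"type": str},
    "tags": {"type": list[str]},
    "score": {"type": Union[str, int]},
    "nickname": {"type": Optional[str]},
    "profile": {"type": UserProfile},
}

# Only "tags" is reported as a list type
for key, entry in schema.items():
    print(key, is_list_type(entry["type"]))

# Optional[str] is shorthand for Union[str, None]
print(get_args(Optional[str]))
```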

Automatic Message Handling

State automatically includes a messages field to store conversation history. You don't need to define it in your schema.

from haystack.components.agents.state import State

# State automatically adds a messages field
state = State(schema={"user_id": {"type": str}})

# The messages field is available
print("messages" in state.schema)  # True
print(state.schema["messages"]["type"])  # list[ChatMessage]

# Access conversation history
messages = state.get("messages", [])

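By default, the messages field uses the list[ChatMessage] type and the merge_lists handler, which means new messages are appended to the conversation history.

Usage

Creating State

Create a State by defining a schema that specifies what data can be stored and the type of each value: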

from haystack.components.agents.state import State

# Define the schema
schema = {
    "user_name": {"type": str},
    "documents": {"type": list},
    "count": {"type": int}
}

# Create State with initial data
state = State(
    schema=schema,
    data={"user_name": "Alice", "documents": [], "count": 0}
)

Reading from State

Use the get() method to retrieve values:

# Get a value
user_name = state.get("user_name")

# Get a value with a default if key doesn't exist
documents = state.get("documents", [])

# Check if a key exists
if state.has("user_name"):
    print(f"User: {state.get('user_name')}")

Writing to State

Use the set() method to store or merge values:

# Set a value
state.set("user_name", "Bob")

# Set list values (these are merged by default)
state.set("documents", [{"title": "Doc 1", "content": "Content 1"}])

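Schema Definition

The schema defines what data can be stored and how values are updated. Each schema entry contains:

  • type (required): the Python type that defines what kind of data can be stored (for example, str, int, list)
  • handler (optional): a function that determines how a new value is merged with the existing value when set() is called

{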
    "parameter_name": {
        "type": SomeType,  # Required: Expected Python type for this field
        "handler": Optional[Callable[[Any, Any], Any]]  # Optional: Function to merge values
    }
}

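If you don't specify a handler, State automatically assigns a default handler based on the type.

Default Handlers

Handlers control how values are merged when set() is called on an existing key. State provides two default handlers:

  • merge_lists: concatenates lists (the default for list types)
  • replace_values: overwrites the existing value (the default for non-list types)

from haystack.components.agents.state import State
from haystack.components.agents.state.state_utils import merge_lists, replace_values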

schema = {
    "documents": {"type": list},  # Uses merge_lists by default
    "user_name": {"type": str},   # Uses replace_values by default
    "count": {"type": int}         # Uses replace_values by default
}

state = State(schema=schema)

# Lists are merged by default
state.set("documents", [1, 2])
state.set("documents", [3, 4])
print(state.get("documents"))  # Output: [1, 2, 3, 4]

# Other values are replaced
state.set("user_name", "Alice")
state.set("user_name", "Bob")
print(state.get("user_name"))  # Output: "Bob"

Custom Handlers

You can define custom handlers for specific merge behavior:

def custom_merge(current_value, new_value):
    """Custom handler that merges and sorts lists."""
    current_list = current_value or []
    new_list = new_value if isinstance(new_value, list) else [new_value]
    return sorted(current_list + new_list)

schema = {
    "numbers": {"type": list, "handler": custom_merge}
}

state = State(schema=schema)
state.set("numbers", [3, 1])
state.set("numbers", [2, 4])
print(state.get("numbers"))  # Output: [1, 2, 3, 4]

You can also override the handler for a single operation:

def concatenate_strings(current, new):
    return f"{current}-{new}" if current else new
    
schema = {"user_name": {"type": str}}
state = State(schema=schema)

state.set("user_name", "Alice")
state.set("user_name", "Bob", handler_override=concatenate_strings)
print(state.get("user_name"))  # Output: "Alice-Bob"
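Because a handler is just a callable that takes the current value and the new value, it can be written and checked in isolation before it is attached to a schema. The two handlers below, `merge_unique` and `add_numbers`, are hypothetical examples and not part of Haystack. They assume, as the handlers above do, that the current value may be empty the first time a key is set.

```python
from typing import Any, Optional


def merge_unique(current: Optional[list], new: Any) -> list:
    """Append new items, skipping ones already present (order preserved)."""
    merged = list(current or [])
    incoming = new if isinstance(new, list) else [new]
    for item in incoming:
        if item not in merged:
            merged.append(item)
    return merged


def add_numbers(current: Optional[int], new: int) -> int:
    """Accumulate a running total instead of replacing the value."""
    return (current or 0) + new


# Handlers are plain functions, so they can be checked without a State
assert merge_unique(None, ["a", "b"]) == ["a", "b"]
assert merge_unique(["a", "b"], ["b", "c"]) == ["a", "b", "c"]
assert add_numbers(None, 5) == 5
assert add_numbers(5, 3) == 8

# Either function could then be referenced from a schema entry
schema = {
    "tags": {"type": list, "handler": merge_unique},
    "total": {"type": int, "handler": add_numbers},
}
```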

Using State with an Agent

To use State with an Agent, define a State schema when creating the Agent. The Agent automatically manages State throughout its execution.

from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool

# Define a simple calculation tool
def calculate(expression: str) -> dict:
    """Evaluate a mathematical expression."""
    result = eval(expression, {"__builtins__": {}})
    return {"result": result}

# Create a tool that writes to state
calculator_tool = Tool(
    name="calculator",
    description="Evaluate basic math expressions",
    parameters={
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"]
    },
    function=calculate,
    outputs_to_state={"calc_result": {"source": "result"}}
)

# Create agent with state schema
agent = Agent(
    chat_generator=OpenAIChatGenerator(),
    tools=[calculator_tool],
    state_schema={"calc_result": {"type": int}}
)

# Run the agent
result = agent.run(
    messages=[ChatMessage.from_user("Calculate 15 + 27")]
)

# Access the state from results
calc_result = result["calc_result"]
print(calc_result)  # Output: 42
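The calculate function above relies on eval with empty builtins, which is convenient for a demo but is generally not regarded as a robust sandbox for model-generated input. As one possible alternative, here is a minimal sketch of a restricted arithmetic evaluator built on the standard `ast` module. `safe_calculate` is a hypothetical name. It supports only +, -, *, / and unary signs, and it returns the same {"result": ...} shape so that it could be passed as the tool function.

```python
import ast
import operator

# Whitelisted operators; any other syntax is rejected
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def _evaluate(node: ast.AST):
    """Recursively evaluate a parsed expression made of numbers and whitelisted operators."""
    if isinstance(node, ast.Constant):
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError(f"Unsupported expression element: {type(node).__name__}")


def safe_calculate(expression: str) -> dict:
    """Evaluate a basic arithmetic expression without calling eval()."""
    tree = ast.parse(expression, mode="eval")
    return {"result": _evaluate(tree.body)}


print(safe_calculate("15 + 27"))  # {'result': 42}
```

Names, function calls, and attribute access all raise ValueError here, so an input such as `__import__('os')` is rejected rather than executed.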

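Tools and State

Tools interact with State through two mechanisms: inputs_from_state and outputs_to_state.

Reading from State: inputs_from_state

Tools can automatically read values from State and use them as parameters. The inputs_from_state parameter maps State keys to tool parameter names.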

def search_documents(query: str, user_context: str) -> dict:
    """Search documents using query and user context."""
    return {
        "results": [f"Found results for '{query}' (user: {user_context})"]
    }

# Create tool that reads from state
search_tool = Tool(
    name="search",
    description="Search documents",
    parameters={
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "user_context": {"type": "string"}
        },
        "required": ["query"]
    },
    function=search_documents,
    inputs_from_state={"user_name": "user_context"}  # Maps state's "user_name" to the tool's input parameter "user_context"
)

# Define agent with state schema including user_name
agent = Agent(
    chat_generator=OpenAIChatGenerator(),
    tools=[search_tool],
    state_schema={
        "user_name": {"type": str},
        "search_results": {"type": list}
    }
)

# Initialize agent with user context
result = agent.run(
    messages=[ChatMessage.from_user("Search for Python tutorials")],
    user_name="Alice"  # All additional kwargs passed to Agent at runtime are put into State
)

When the tool is invoked, the Agent automatically retrieves the value from State and passes it to the tool function.
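The injection step described above can be pictured with a small stand-alone sketch. `inject_state_args` is a hypothetical helper written for this illustration, with a plain dict standing in for State. It assumes that arguments already supplied by the LLM are left untouched, which may differ from what the Agent actually does.

```python
from typing import Any


def inject_state_args(
    state_data: dict[str, Any],
    inputs_from_state: dict[str, str],
    llm_args: dict[str, Any],
) -> dict[str, Any]:
    """Fill tool arguments from state using a {state_key: parameter_name} mapping."""
    final_args = dict(llm_args)
    for state_key, param_name in inputs_from_state.items():
        # Assumption: arguments supplied by the LLM are kept as-is
        if param_name not in final_args and state_key in state_data:
            final_args[param_name] = state_data[state_key]
    return final_args


def search_documents(query: str, user_context: str) -> dict:
    """Search documents using query and user context."""
    return {"results": [f"Found results for '{query}' (user: {user_context})"]}


args = inject_state_args(
    state_data={"user_name": "Alice"},
    inputs_from_state={"user_name": "user_context"},
    llm_args={"query": "Python tutorials"},  # the LLM only supplies "query"
)
print(search_documents(**args))
# {'results': ["Found results for 'Python tutorials' (user: Alice)"]}
```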

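Writing to State: outputs_to_state

Tools can write their results back to State. The outputs_to_state parameter defines the mapping from tool outputs to State keys.

The structure of the mapping is {"state_key": {"source": "tool_result_key"}}.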

def retrieve_documents(query: str) -> dict:
    """Retrieve documents based on query."""
    return {
        "documents": [
            {"title": "Doc 1", "content": "Content about Python"},
            {"title": "Doc 2", "content": "More about Python"}
        ],
        "count": 2,
        "query": query
    }

# Create tool that writes to state
retrieval_tool = Tool(
    name="retrieve",
    description="Retrieve relevant documents",
    parameters={
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"]
    },
    function=retrieve_documents,
    outputs_to_state={
        "documents": {"source": "documents"},      # Maps tool's "documents" output to state's "documents"
        "result_count": {"source": "count"},       # Maps tool's "count" output to state's "result_count"
        "last_query": {"source": "query"}          # Maps tool's "query" output to state's "last_query"
    }
)

agent = Agent(
    chat_generator=OpenAIChatGenerator(),
    tools=[retrieval_tool],
    state_schema={
        "documents": {"type": list},
        "result_count": {"type": int},
        "last_query": {"type": str}
    }
)

result = agent.run(
    messages=[ChatMessage.from_user("Find information about Python")]
)

# Access state values from result
documents = result["documents"]
result_count = result["result_count"]
last_query = result["last_query"]
print(documents)      # List of retrieved documents
print(result_count)   # 2
print(last_query)     # The query string the LLM passed to the tool

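Each mapping can specify:

  • source: which field of the tool output to use
  • handler: an optional custom function for merging values

If source is omitted, the entire tool result is stored: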

from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool

def get_user_info() -> dict:
    """Get user information."""
    return {"name": "Alice", "email": "[email protected]", "role": "admin"}

# Tool that stores entire result
info_tool = Tool(
    name="get_info",
    description="Get user information",
    parameters={"type": "object", "properties": {}},
    function=get_user_info,
    outputs_to_state={
        "user_info": {}  # Stores entire result dict in state's "user_info"
    }
)

# Create agent with matching state schema
agent = Agent(
    chat_generator=OpenAIChatGenerator(),
    tools=[info_tool],
    state_schema={
        "user_info": {"type": dict}  # Schema must match the tool's output type
    }
)

# Run the agent
result = agent.run(
    messages=[ChatMessage.from_user("Get the user information")]
)

# Access the complete result from state
user_info = result["user_info"]
print(user_info)  # Output: {"name": "Alice", "email": "[email protected]", "role": "admin"}
print(user_info["name"])   # Output: "Alice"
print(user_info["email"])  # Output: "[email protected]"
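The mapping rules for outputs_to_state (an optional source, an optional handler, and storing the whole result when source is omitted) can be sketched without the library. `apply_outputs_to_state` and `replace` are hypothetical names, and a plain dict stands in for State. As a simplification, the sketch falls back to plain replacement when no handler is given, whereas a real State would be expected to use the handler from its schema.

```python
from typing import Any, Optional


def replace(current: Any, new: Any) -> Any:
    """Fallback merge rule for this sketch: overwrite the old value."""
    return new


def apply_outputs_to_state(
    state_data: dict[str, Any],
    tool_result: dict[str, Any],
    outputs_to_state: dict[str, dict[str, Any]],
) -> None:
    """Write tool output into state_data according to the mapping."""
    for state_key, config in outputs_to_state.items():
        source: Optional[str] = config.get("source")
        handler = config.get("handler", replace)
        # Without a source, the entire tool result is stored
        value = tool_result[source] if source is not None else tool_result
        state_data[state_key] = handler(state_data.get(state_key), value)


state_data = {"documents": ["old"]}
tool_result = {"documents": ["new"], "count": 1}

apply_outputs_to_state(
    state_data,
    tool_result,
    {
        "documents": {"source": "documents", "handler": lambda cur, new: (cur or []) + new},
        "result_count": {"source": "count"},
        "raw": {},  # no source: keep the whole result
    },
)
print(state_data["documents"])     # ['old', 'new']
print(state_data["result_count"])  # 1
print(state_data["raw"])           # {'documents': ['new'], 'count': 1}
```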

Combining Inputs and Outputs

Tools can both read from and write to State, which enables tool chaining:

from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool

def process_documents(documents: list, max_results: int) -> dict:
    """Process documents and return filtered results."""
    processed = documents[:max_results]
    return {
        "processed_docs": processed,
        "processed_count": len(processed)
    }

processing_tool = Tool(
    name="process",
    description="Process retrieved documents",
    parameters={
        "type": "object",
        "properties": {"max_results": {"type": "integer"}},
        "required": ["max_results"]
    },
    function=process_documents,
    inputs_from_state={"documents": "documents"},  # Reads documents from state
    outputs_to_state={
        "final_docs": {"source": "processed_docs"},
        "final_count": {"source": "processed_count"}
    }
)

agent = Agent(
    chat_generator=OpenAIChatGenerator(),
    tools=[retrieval_tool, processing_tool],  # Chain tools using state; retrieval_tool is defined in the outputs_to_state example above
    state_schema={
        "documents": {"type": list},
        "final_docs": {"type": list},
        "final_count": {"type": int}
    }
)

# Run the agent - tools will chain through state
result = agent.run(
    messages=[ChatMessage.from_user("Find and process 3 documents about Python")]
)

# Access the final processed results
final_docs = result["final_docs"]
final_count = result["final_count"]
print(f"Processed {final_count} documents")
print(final_docs)
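To see the data flow of such a chain without calling an LLM, the two steps can be wired together by hand. This is only a sketch: a plain dict stands in for State, the order of calls is fixed in code rather than chosen by the model, and `retrieve_documents` here returns five made-up placeholder documents.

```python
def retrieve_documents(query: str) -> dict:
    """Stand-in retriever that returns five placeholder documents."""
    docs = [{"title": f"Doc {i}", "content": f"About {query}"} for i in range(1, 6)]
    return {"documents": docs}


def process_documents(documents: list, max_results: int) -> dict:
    """Process documents and return filtered results."""
    processed = documents[:max_results]
    return {"processed_docs": processed, "processed_count": len(processed)}


state: dict = {}

# Step 1: the retrieval output is written to state ("documents" <- "documents")
retrieved = retrieve_documents("Python")
state["documents"] = retrieved["documents"]

# Step 2: the processing tool reads "documents" from state,
# while max_results is the only argument the LLM would have to provide
processed = process_documents(documents=state["documents"], max_results=3)
state["final_docs"] = processed["processed_docs"]
state["final_count"] = processed["processed_count"]

print(state["final_count"])  # 3
print([doc["title"] for doc in state["final_docs"]])  # ['Doc 1', 'Doc 2', 'Doc 3']
```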

Complete Example

This example shows a multi-tool agent workflow in which tools share data through State:

import math
from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool

# Tool 1: Calculate factorial
def factorial(n: int) -> dict:
    """Calculate the factorial of a number."""
    result = math.factorial(n)
    return {"result": result}

factorial_tool = Tool(
    name="factorial",
    description="Calculate the factorial of a number",
    parameters={
        "type": "object",
        "properties": {"n": {"type": "integer"}},
        "required": ["n"]
    },
    function=factorial,
    outputs_to_state={"factorial_result": {"source": "result"}}
)

# Tool 2: Perform calculation
def calculate(expression: str) -> dict:
    """Evaluate a mathematical expression."""
    result = eval(expression, {"__builtins__": {}})
    return {"result": result}

calculator_tool = Tool(
    name="calculator",
    description="Evaluate basic math expressions",
    parameters={
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"]
    },
    function=calculate,
    outputs_to_state={"calc_result": {"source": "result"}}
)

# Create agent with both tools
agent = Agent(
    chat_generator=OpenAIChatGenerator(),
    tools=[calculator_tool, factorial_tool],
    state_schema={
        "calc_result": {"type": int},
        "factorial_result": {"type": int}
    }
)

# Run the agent
result = agent.run(
    messages=[ChatMessage.from_user("Calculate the factorial of 5, then multiply it by 2")]
)

# Access state values from result
factorial_result = result["factorial_result"]
calc_result = result["calc_result"]

# Access conversation messages
for message in result["messages"]:
    print(f"{message.role}: {message.text}")