JsonSchemaValidator

Use this component to ensure that JSON content in an LLM-generated chat message conforms to a specific schema.


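Most common position in a pipeline	After a Generator
Mandatory run variables	"messages": A list of `ChatMessage` instances to validate. Only the last message in this list is validated.
Output variables	"validated": A list of messages, if the last message is valid; "validation_error": A list of messages, if the last message is invalid
API reference	Validators
GitHub link	https://github.com/deepset-ai/haystack/blob/main/haystack/components/validators/json_schema.py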

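Overview

JsonSchemaValidator checks the JSON content of a `ChatMessage` against a provided JSON Schema. If the message's JSON content follows the schema, the message is routed to the `validated` output. Otherwise, it is routed to the `validation_error` output. When validation fails, the component builds an error message from either the custom `error_template` you provide or a default template. These error `ChatMessage` objects can be used in Haystack recovery loops.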
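The routing described above can be illustrated with a small standard-library sketch. This is not Haystack's implementation: messages are plain strings rather than `ChatMessage` objects, the hypothetical helpers `find_error` and `route_last_message` understand only the `type` and `properties` keywords of JSON Schema, and the error wording is invented for illustration.

```python
import json
from typing import Any, Dict, List

# Simplified stand-in for a JSON Schema validator: it only understands
# "type" and "properties", which is far less than the real specification.
_TYPES = {"object": dict, "string": str, "integer": int, "array": list, "boolean": bool}


def find_error(value: Any, schema: Dict[str, Any], path: str = "$") -> str:
    """Return a description of the first mismatch, or "" if the value conforms."""
    expected = schema.get("type")
    if expected in _TYPES:
        # bool is a subclass of int in Python, so reject it explicitly for "integer".
        wrong_bool = expected == "integer" and isinstance(value, bool)
        if not isinstance(value, _TYPES[expected]) or wrong_bool:
            return f"{path}: expected {expected}"
    if isinstance(value, dict):
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                error = find_error(value[key], sub_schema, f"{path}.{key}")
                if error:
                    return error
    return ""


def route_last_message(messages: List[str], schema: Dict[str, Any]) -> Dict[str, List[str]]:
    """Validate only the last message and place the result under one of two keys."""
    last = messages[-1]
    try:
        error = find_error(json.loads(last), schema)
    except json.JSONDecodeError as exc:
        error = f"not valid JSON: {exc.msg}"
    if error:
        return {"validation_error": [f"Invalid JSON ({error}). Failing content: {last}"]}
    return {"validated": [last]}


schema = {"type": "object", "properties": {"name": {"type": "string"}, "age": {"type": "integer"}}}
good = route_last_message(['{"name": "John", "age": 30}'], schema)
bad = route_last_message(['{"name": "John", "age": "thirty"}'], schema)
print(good)
print(bad)
```

Only one of the two keys is ever present in the result, which is what lets a pipeline connect each output to a different downstream component.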

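Usage

In a pipeline

In this simple pipeline, the MessageProducer sends a list of chat messages to a Generator through a BranchJoiner. The messages produced by the Generator are sent to the JsonSchemaValidator, which sends any error ChatMessages back to the BranchJoiner to form a recovery loop.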

from typing import List

from haystack import Pipeline
from haystack import component
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.joiners import BranchJoiner
from haystack.components.validators import JsonSchemaValidator
from haystack.dataclasses import ChatMessage


@component
class MessageProducer:

    @component.output_types(messages=List[ChatMessage])
    def run(self, messages: List[ChatMessage]) -> dict:
        return {"messages": messages}


# The generator is asked for JSON output; the validator checks it against the schema.
p = Pipeline()
p.add_component("llm", OpenAIChatGenerator(model="gpt-4-1106-preview",
                                           generation_kwargs={"response_format": {"type": "json_object"}}))
p.add_component("schema_validator", JsonSchemaValidator())
p.add_component("branch_joiner", BranchJoiner(List[ChatMessage]))
p.add_component("message_producer", MessageProducer())

# Initial messages and validation errors both reach the generator through the joiner.
p.connect("message_producer.messages", "branch_joiner")
p.connect("branch_joiner", "llm")
p.connect("llm.replies", "schema_validator.messages")
# Recovery loop: invalid replies come back to the joiner as error messages.
p.connect("schema_validator.validation_error", "branch_joiner")

result = p.run(
    data={
        "message_producer": {
            "messages": [ChatMessage.from_user("Generate JSON for person with name 'John' and age 30")]
        },
        "schema_validator": {
            "json_schema": {
                "type": "object",
                "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
            }
        },
    }
)
print(result)

>> {'schema_validator': {'validated': [ChatMessage(_role=<ChatRole.ASSISTANT: 
>> 'assistant'>, _content=[TextContent(text='\n{\n  "name": "John",\n  "age": 30\n}')], 
>> _name=None, _meta={'model': 'gpt-4-1106-preview', 'index': 0, 'finish_reason': 'stop', 
>> 'usage': {'completion_tokens': 17, 'prompt_tokens': 20, 'total_tokens': 37, 
>> 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 
>> 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': 
>> {'audio_tokens': 0, 'cached_tokens': 0}}})]}}
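To make the recovery loop in this pipeline easier to follow, here is a plain-Python sketch of the same control flow without Haystack. Everything in it is hypothetical: `validate` is a hand-written check standing in for schema validation, `fake_llm` is a stub that returns canned replies instead of calling a model, and the `max_attempts` cap is a choice made for this sketch rather than a documented setting of the component.

```python
import json
from typing import Callable, List, Optional


def validate(reply: str) -> Optional[str]:
    """Hypothetical check: return an error description, or None if the reply is acceptable."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return "reply is not valid JSON"
    if not isinstance(data, dict) or not isinstance(data.get("age"), int):
        return "'age' must be an integer"
    return None


def run_with_recovery(generate: Callable[[List[str]], str], prompt: str, max_attempts: int = 3) -> str:
    """Call the generator until its reply validates or the attempt budget is spent."""
    messages = [prompt]
    for _ in range(max_attempts):
        reply = generate(messages)
        error = validate(reply)
        if error is None:
            return reply  # corresponds to the "validated" output
        # Corresponds to "validation_error": the error message goes back to the generator.
        messages = [f"{error}. Failing JSON: {reply}. Return corrected JSON only."]
    raise RuntimeError(f"no valid reply after {max_attempts} attempts")


# Stub generator for illustration: wrong on the first call, correct on the second.
canned_replies = iter(['{"name": "John", "age": "30"}', '{"name": "John", "age": 30}'])
seen_prompts: List[List[str]] = []


def fake_llm(messages: List[str]) -> str:
    seen_prompts.append(messages)
    return next(canned_replies)


final = run_with_recovery(fake_llm, "Generate JSON for person with name 'John' and age 30")
print(final)
```

The second call to the stub receives the error description rather than the original prompt, which mirrors how the error messages re-enter the generator through the joiner. Capping the number of attempts keeps a persistently failing generator from looping forever.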
