HuggingFaceLocalGenerator

HuggingFaceLocalGenerator provides an interface for generating text with a Hugging Face model that runs locally.


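Most common position in a pipeline	After a `PromptBuilder`
Mandatory init variables	"token": The Hugging Face API token. Can be set with the `HF_API_TOKEN` or `HF_TOKEN` environment variable.
Mandatory run variables	"prompt": A string containing the prompt for the LLM
Output variables	"replies": A list of strings with all the replies generated by the LLM
API reference	Generators
GitHub link	https://github.com/deepset-ai/haystack/blob/main/haystack/components/generators/hugging_face_local.py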

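Overview

Keep in mind that running LLMs locally may require a powerful machine. How powerful depends largely on the model you choose and on its number of parameters.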
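
To make the link between parameter count and hardware more concrete, here is a rough back-of-the-envelope sketch. It assumes memory is dominated by the model weights and ignores activations, caches, and framework overhead, so treat the result as a lower bound rather than a measurement. The function name and the example sizes are illustrative and are not part of Haystack.

```python
def estimate_weight_memory_gb(num_parameters: int, bytes_per_parameter: int = 4) -> float:
    """Rough size of the model weights in gigabytes (1 GB = 1e9 bytes).

    bytes_per_parameter: 4 for float32, 2 for float16/bfloat16, 1 for 8-bit weights.
    """
    return num_parameters * bytes_per_parameter / 1e9


# Illustrative sizes only; check the model card for the real parameter count.
for params in (1_000_000_000, 7_000_000_000):
    fp32 = estimate_weight_memory_gb(params, 4)
    fp16 = estimate_weight_memory_gb(params, 2)
    print(f"{params:,} parameters: about {fp32:.1f} GB in float32, {fp16:.1f} GB in float16")
```

Under these assumptions, halving the precision halves the weight footprint, which is often what decides whether a model fits on a given machine.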

📘
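Looking for chat completion?
This component is designed for text generation, not for chat. If you want to use Hugging Face LLMs for chat, consider using HuggingFaceLocalChatGenerator instead.

To authorize access to remote files, this component uses the HF_API_TOKEN environment variable by default. Alternatively, you can pass a Hugging Face API token at initialization with token: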

from haystack.utils import Secret
local_generator = HuggingFaceLocalGenerator(token=Secret.from_token("<your-api-key>"))
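
The lookup described above (an explicit token, otherwise an environment variable) can be pictured with a small standard-library sketch. This is not the component's implementation: `resolve_hf_token` is a hypothetical helper, and the order in which the two variables are checked is an assumption made for illustration.

```python
import os
from typing import Optional

# Variable names taken from the component description; the order is assumed.
TOKEN_ENV_VARS = ("HF_API_TOKEN", "HF_TOKEN")


def resolve_hf_token(explicit_token: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: prefer an explicit token, then the environment.

    Returns None when no token is available, which is fine for public models.
    """
    if explicit_token:
        return explicit_token
    for name in TOKEN_ENV_VARS:
        value = os.environ.get(name)
        if value:
            return value
    return None


token = resolve_hf_token()
print("token found" if token else "no token configured")
```

Keeping the token in an environment variable avoids hard-coding it in source files that may end up in version control.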

Streaming

This Generator supports streaming tokens from the LLM directly to the output. To enable it, pass a function to the streaming_callback init parameter.
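
As a minimal sketch of what such a callback could look like: the function below prints each piece of text as it arrives and keeps a copy. It assumes every streamed chunk exposes its text through a `content` attribute. `FakeChunk` is a hypothetical stand-in, so the example runs without loading a model; it is not a Haystack class.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed chunk; only `content` is used."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


collected: List[str] = []


def print_chunk(chunk) -> None:
    """Print each streamed piece immediately and remember it."""
    print(chunk.content, end="", flush=True)
    collected.append(chunk.content)


# Simulate a stream of three pieces arriving one after another.
for piece in ["Par", "is", "."]:
    print_chunk(FakeChunk(content=piece))
print()

# With Haystack installed, the same function would be passed at initialization:
# generator = HuggingFaceLocalGenerator(model="google/flan-t5-large",
#                                       streaming_callback=print_chunk)
```

Printing with `flush=True` makes the text appear as soon as each piece is produced instead of waiting for the output buffer to fill.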

Usage

On its own

from haystack.components.generators import HuggingFaceLocalGenerator

generator = HuggingFaceLocalGenerator(model="google/flan-t5-large",
                                      task="text2text-generation",
                                      generation_kwargs={
                                        "max_new_tokens": 100,
                                        "temperature": 0.9,
                                        })

generator.warm_up()
print(generator.run("Who is the best American actor?"))
# {'replies': ['john wayne']}

In a pipeline

from haystack import Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import HuggingFaceLocalGenerator
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack import Document

docstore = InMemoryDocumentStore()
docstore.write_documents([Document(content="Rome is the capital of Italy"), Document(content="Paris is the capital of France")])

generator = HuggingFaceLocalGenerator(model="google/flan-t5-large",
                                      task="text2text-generation",
                                      generation_kwargs={
                                        "max_new_tokens": 100,
                                        "temperature": 0.9,
                                        })

query = "What is the capital of France?"

template = """
Given the following information, answer the question.

Context: 
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{ query }}
"""
pipe = Pipeline()

pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", generator)
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")

res = pipe.run({
    "prompt_builder": {
        "query": query
    },
    "retriever": {
        "query": query
    }
})

print(res)
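
The pipeline returns a dictionary keyed by component name, so the generated text sits under the "llm" key used above, inside its "replies" list. The sketch below shows one way to pull out the first reply safely. `first_reply` is a hypothetical helper, and the reply text in `example_result` is a placeholder rather than real model output.

```python
from typing import Any, Dict, Optional


def first_reply(result: Dict[str, Dict[str, Any]], component: str = "llm") -> Optional[str]:
    """Return the first reply of the given component, or None if there is none."""
    replies = result.get(component, {}).get("replies", [])
    return replies[0] if replies else None


# Placeholder with the same shape as `res`; the reply text is made up.
example_result = {"llm": {"replies": ["<generated answer>"]}}

print(first_reply(example_result))
print(first_reply({}))  # no "llm" entry, so this prints None
```

Using `.get` with defaults keeps the lookup from raising a KeyError when a component produced no output.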

Additional references

🧑‍🍳 Cookbook

Updated about 1 year ago
