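Most common position in a pipeline	After a `PromptBuilder`
Mandatory run variables	“prompt”: A string containing the prompt for the LLM
Output variables	“replies”: A list of strings containing all replies generated by the LLM. “meta”: A list of dictionaries containing the metadata associated with each reply, such as token counts.
API reference	Ollama
GitHub link	https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/ollama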

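Overview

`OllamaGenerator` provides an interface for generating text with an LLM running on Ollama.

`OllamaGenerator` needs a `model` name and a `url` to work. By default, it uses the "orca-mini" model and the "http://localhost:11434" URL.

Ollama is a project focused on running LLMs locally. Internally, it uses the quantized GGUF format by default. This means LLMs can run on standard machines (even without a GPU) without a complex installation procedure.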

Streaming

This Generator supports streaming the tokens from the LLM directly into the output. To do so, pass a function to the `streaming_callback` init parameter.
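As a minimal sketch of what such a callback can look like: the function below prints each chunk as it arrives and keeps a copy of the text. It assumes the callback is called once per chunk with an object that exposes the text in a `content` attribute, as Haystack's `StreamingChunk` does. `FakeChunk`, `print_chunk`, and `collected` are hypothetical names used only to make the example runnable without an Ollama server.

```python
from dataclasses import dataclass


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streaming chunk; only `content` is modeled."""
    content: str


collected = []  # text pieces received so far


def print_chunk(chunk) -> None:
    """Print a chunk's text immediately and remember it for later use."""
    print(chunk.content, end="", flush=True)
    collected.append(chunk.content)


# Simulate three chunks arriving from a model.
for piece in ["Hel", "lo", "!"]:
    print_chunk(FakeChunk(piece))
print()

# With a running Ollama instance (assumed), the same function would be passed
# at initialization:
#
# from haystack_integrations.components.generators.ollama import OllamaGenerator
# generator = OllamaGenerator(model="zephyr", streaming_callback=print_chunk)
# generator.run("Who is the best American actor?")
```

Keeping the callback free of Haystack imports makes it easy to test in isolation before wiring it into the Generator.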

Usage

You need a running instance of Ollama. Installation instructions are available from the Ollama project.
A quick way to run Ollama is with Docker:

docker run -d -p 11434:11434 --name ollama ollama/ollama:latest
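Before using the Generator, it can help to confirm that the instance is reachable. The sketch below uses only the Python standard library and assumes the server listens on `http://localhost:11434` (the port published by the Docker command above) and answers a plain GET on its root path with HTTP 200. `ollama_is_reachable` is a hypothetical helper, not part of Ollama or Haystack.

```python
import urllib.error
import urllib.request


def ollama_is_reachable(base_url: str = "http://localhost:11434",
                        timeout: float = 2.0) -> bool:
    """Return True if a server answers a GET on base_url with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


print("Ollama reachable:", ollama_is_reachable())
```

A `False` result only means that nothing answered at that address; it does not say whether a model has been pulled yet.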

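You need to download or pull the desired LLM. The model library is available on the Ollama website.
If you are using Docker, you can, for example, pull the Zephyr model: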

docker exec ollama ollama pull zephyr

If you already have Ollama installed on your system, you can run:

ollama pull zephyr

👍
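Choose a specific version of a model
You can also specify a tag to choose a specific (quantized) version of your model. The available tags are shown in the model card in the Ollama model library. This is an example for Zephyr.
In this case, simply run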
# ollama pull model:tag
ollama pull zephyr:7b-alpha-q3_K_S

You also need to install the `ollama-haystack` package:

pip install ollama-haystack

On its own

Here is how `OllamaGenerator` works on its own:

from haystack_integrations.components.generators.ollama import OllamaGenerator

generator = OllamaGenerator(model="zephyr",
                            url="http://localhost:11434",
                            generation_kwargs={
                              "num_predict": 100,
                              "temperature": 0.9,
                              })

print(generator.run("Who is the best American actor?"))

# {'replies': ['I do not have the ability to form opinions or preferences.
# However, some of the most acclaimed american actors in recent years include
# denzel washington, tom hanks, leonardo dicaprio, matthew mcconaughey...'],
#'meta': [{'model': 'zephyr', ...}]}
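The returned value is a plain dictionary, so the replies and their metadata can be read with ordinary indexing. The sketch below uses a hand-written `result` shaped like the output above (shortened sample values, not real model output) and assumes that each entry in `meta` lines up with the reply at the same position.

```python
# Sample result shaped like the output shown above; the values are illustrative.
result = {
    "replies": ["I do not have the ability to form opinions or preferences."],
    "meta": [{"model": "zephyr"}],
}

# Pair each reply with its metadata entry.
for reply, meta in zip(result["replies"], result["meta"]):
    print(f"[{meta['model']}] {reply}")

# Most callers only need the first reply.
first_reply = result["replies"][0]
```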

In a Pipeline

from haystack_integrations.components.generators.ollama import OllamaGenerator

from haystack import Pipeline, Document
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore

template = """
Given the following information, answer the question.

Context: 
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{ query }}?
"""

docstore = InMemoryDocumentStore()
docstore.write_documents([Document(content="I really like summer"),
                          Document(content="My favorite sport is soccer"),
                          Document(content="I don't like reading sci-fi books"),
                          Document(content="I don't like crowded places"),])

generator = OllamaGenerator(model="zephyr",
                            url="http://localhost:11434",
                            generation_kwargs={
                              "num_predict": 100,
                              "temperature": 0.9,
                              })

pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", generator)
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")

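# Example question (an assumption for illustration); the template appends the "?".
query = "What do I like"

result = pipe.run({"prompt_builder": {"query": query},
                   "retriever": {"query": query}})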

print(result)

# {'llm': {'replies': ['Based on the provided context, it seems that you enjoy
# soccer and summer. Unfortunately, there is no direct information given about 
# what else you enjoy...'],
# 'meta': [{'model': 'zephyr', ...}]}}
