LocalWhisperTranscriber

Use LocalWhisperTranscriber to transcribe audio files with OpenAI's Whisper model, running on your local installation of Whisper.


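Most common position in a pipeline	As the first component in an indexing pipeline
Mandatory run variables	"sources": A list of paths or binary streams that you want to transcribe
Output variables	"documents": A list of documents
API reference	Audio
GitHub link	https://github.com/deepset-ai/haystack/blob/main/haystack/components/audio/whisper_local.py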

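Overview

The component needs to know which Whisper model to use. Specify it in the model parameter when you initialize the component. All transcription runs on the executing machine, and the audio is never sent to a third-party provider.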
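
As a minimal sketch of choosing the model at initialization, the helper below checks a model name before constructing the component. The helper name build_transcriber and the ASSUMED_WHISPER_MODELS set are illustrative assumptions, not part of Haystack; check the API documentation for the names your installed version accepts.

```python
# Assumed set of Whisper checkpoint names; verify against your installed version.
ASSUMED_WHISPER_MODELS = {"tiny", "base", "small", "medium", "large"}


def build_transcriber(model: str = "tiny"):
    """Hypothetical helper: validate the model name, then build the component."""
    if model not in ASSUMED_WHISPER_MODELS:
        raise ValueError(
            f"Unknown Whisper model {model!r}; "
            f"expected one of {sorted(ASSUMED_WHISPER_MODELS)}"
        )
    # Imported lazily so the name check works even before Haystack is installed.
    from haystack.components.audio import LocalWhisperTranscriber

    return LocalWhisperTranscriber(model=model)
```

Calling build_transcriber("tiny") returns a component that you can warm up and run. With this helper, a misspelled name such as "tinny" raises a ValueError before any model weights are loaded.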

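See our API documentation for the other optional parameters you can specify.

See the Whisper API documentation and the official Whisper GitHub repository for the supported audio formats and languages.

To use LocalWhisperTranscriber, first install torch and Whisper with the following commands: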

pip install 'transformers[torch]'
pip install -U openai-whisper
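
To confirm that the installation succeeded before loading a model, a quick check along these lines may help. It only tests whether the modules can be found for import. The function name missing_modules is a hypothetical helper, and "torch" and "whisper" are the import names assumed for the two packages installed above.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "torch" and "whisper" are the import names assumed for the two installs above.
missing = missing_modules(["torch", "whisper"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("torch and whisper are importable")
```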

Usage

On its own

Here is an example of how to use LocalWhisperTranscriber on its own:

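import requests
from haystack.components.audio import LocalWhisperTranscriber

# Download a sample audio file and save it locally
response = requests.get("https://ia903102.us.archive.org/19/items/100-Best--Speeches/EK_19690725_64kb.mp3")
response.raise_for_status()  # stop here if the download failed
with open("kennedy_speech.mp3", "wb") as file:
    file.write(response.content)

# Load the "tiny" Whisper model on this machine
transcriber = LocalWhisperTranscriber(model="tiny")
transcriber.warm_up()

# Transcribe the file and print the text of the first document
transcription = transcriber.run(sources=["./kennedy_speech.mp3"])
print(transcription["documents"][0].content)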
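
To transcribe several local files in one call, a sketch like the following could gather the paths for sources and pair each returned document with its file. The names collect_audio_files and transcribe_folder and the extension set are assumptions for illustration, and the sketch also assumes that the component returns one document per source, in the same order.

```python
from pathlib import Path

# Assumed extensions for illustration; see the Whisper docs for supported formats.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac"}


def collect_audio_files(folder):
    """Return sorted paths of files in `folder` whose extension looks like audio."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )


def transcribe_folder(folder, model="tiny"):
    """Hypothetical helper: transcribe every audio file found in `folder`."""
    # Imported lazily so collect_audio_files works without Haystack installed.
    from haystack.components.audio import LocalWhisperTranscriber

    sources = collect_audio_files(folder)
    transcriber = LocalWhisperTranscriber(model=model)
    transcriber.warm_up()
    documents = transcriber.run(sources=sources)["documents"]
    # Assumes one document per source, returned in the same order.
    return {path.name: doc.content for path, doc in zip(sources, documents)}
```

The result of transcribe_folder is a dictionary that maps each file name to its transcribed text.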

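In a pipeline

The following pipeline fetches an audio file from a specified URL and transcribes it. It first retrieves the audio file with LinkContentFetcher, then transcribes the audio into text with LocalWhisperTranscriber, and finally prints the transcription.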

from haystack.components.audio import LocalWhisperTranscriber
from haystack.components.fetchers import LinkContentFetcher
from haystack import Pipeline

# Build a pipeline with a fetcher followed by a local Whisper transcriber
pipe = Pipeline()
pipe.add_component("fetcher", LinkContentFetcher())
pipe.add_component("transcriber", LocalWhisperTranscriber(model="tiny"))

# Feed the fetched content into the transcriber
pipe.connect("fetcher", "transcriber")

# Run the pipeline on the audio URL and print the text of the first document
result = pipe.run(
    data={"fetcher": {"urls": ["https://ia903102.us.archive.org/19/items/100-Best--Speeches/EK_19690725_64kb.mp3"]}}
)
print(result["transcriber"]["documents"][0].content)

Additional References

🧑‍🍳 Cookbook: Multilingual RAG from a Podcast with Whisper, Qdrant and Mistral
