Module eval_run_result

EvaluationRunResult

Holds the inputs and outputs of an evaluation pipeline run and provides methods for inspecting them.

EvaluationRunResult.__init__

def __init__(run_name: str, inputs: dict[str, list[Any]],
             results: dict[str, dict[str, Any]])

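Initializes a new evaluation run result.

Parameters:

run_name: Name of the evaluation run.
inputs: Dictionary containing the inputs used for the run. Each key is the name of an input, and its value is a list of input values. All lists should have the same length.
results: Dictionary containing the results of the evaluators used in the evaluation pipeline. Each key is the name of a metric, and its value is a dictionary with the following keys:
'score': The aggregated score for the metric.
'individual_scores': A list of scores, one per input sample.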
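
The shapes expected for `inputs` and `results` can be illustrated with plain dictionaries. The following is a minimal sketch and not part of the library: `validate_run_data` is a hypothetical helper, and the metric name and sample values are made up. It checks the constraints described above, and additionally assumes that `individual_scores` holds exactly one score per input sample.

```python
from typing import Any


def validate_run_data(inputs: dict[str, list[Any]],
                      results: dict[str, dict[str, Any]]) -> int:
    """Hypothetical helper: check the documented shapes, return the sample count."""
    # Every input column must contain the same number of samples.
    lengths = {len(values) for values in inputs.values()}
    if len(lengths) != 1:
        raise ValueError("All input lists must have the same length.")
    n_samples = lengths.pop()

    for metric, outcome in results.items():
        # Each metric entry needs both documented keys.
        for key in ("score", "individual_scores"):
            if key not in outcome:
                raise ValueError(f"Metric '{metric}' is missing the '{key}' key.")
        # Assumption: one individual score per input sample.
        if len(outcome["individual_scores"]) != n_samples:
            raise ValueError(f"Metric '{metric}' must have {n_samples} individual scores.")
    return n_samples


# Made-up example data in the documented shape.
inputs = {
    "question": ["What is the capital of France?", "Who wrote Hamlet?"],
    "predicted_answer": ["Paris", "Christopher Marlowe"],
}
results = {
    "exact_match": {"score": 0.5, "individual_scores": [1, 0]},
}

n_samples = validate_run_data(inputs, results)
```

Dictionaries that pass such a check would then be handed to the constructor together with a `run_name`.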

EvaluationRunResult.aggregated_report

def aggregated_report(
    output_format: Literal["json", "csv", "df"] = "json",
    csv_file: Optional[str] = None
) -> Union[dict[str, list[Any]], "DataFrame", str]

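Generates a report with the aggregated score of each metric.

Parameters:

output_format: Output format of the report: "json", "csv", or "df". Defaults to "json".
csv_file: File path where the CSV output is saved. Must be provided if output_format is "csv".

Returns:

JSON or DataFrame containing the aggregated scores. If the output is written to a CSV file, a message confirming the successful write or an error message.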
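
To make the idea of an aggregated report concrete, here is a small sketch of how the per-metric `score` values could be collected into a column-oriented dictionary matching the `dict[str, list[Any]]` return type. This is illustrative only: `aggregate_scores` is a hypothetical function, and the column names `metrics` and `score` as well as the sample data are assumptions, not a statement of what the library returns.

```python
from typing import Any


def aggregate_scores(results: dict[str, dict[str, Any]]) -> dict[str, list[Any]]:
    """Illustrative only: gather each metric's aggregated score into two columns."""
    return {
        "metrics": list(results.keys()),  # assumed column name
        "score": [outcome["score"] for outcome in results.values()],  # assumed column name
    }


# Made-up results in the shape expected by the constructor.
results = {
    "exact_match": {"score": 0.5, "individual_scores": [1, 0]},
    "faithfulness": {"score": 0.75, "individual_scores": [1.0, 0.5]},
}

report = aggregate_scores(results)
```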

EvaluationRunResult.detailed_report

def detailed_report(
    output_format: Literal["json", "csv", "df"] = "json",
    csv_file: Optional[str] = None
) -> Union[dict[str, list[Any]], "DataFrame", str]

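Generates a report with the detailed scores of each metric.

Parameters:

output_format: Output format of the report: "json", "csv", or "df". Defaults to "json".
csv_file: File path where the CSV output is saved. Must be provided if output_format is "csv".

Returns:

JSON or DataFrame containing the detailed scores. If the output is written to a CSV file, a message confirming the successful write or an error message.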
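
A detailed report pairs each input sample with its per-metric score. The sketch below shows one way such a table could be assembled and written to CSV using only the standard library. It is an assumption-laden illustration, not the library's implementation: `detailed_columns` and `write_csv` are hypothetical helpers, the column layout (one column per input, one per metric) is assumed, and the returned message text is invented.

```python
import csv
import os
import tempfile
from typing import Any


def detailed_columns(inputs: dict[str, list[Any]],
                     results: dict[str, dict[str, Any]]) -> dict[str, list[Any]]:
    """Illustrative only: one column per input plus one column per metric."""
    columns = {name: list(values) for name, values in inputs.items()}
    for metric, outcome in results.items():
        columns[metric] = list(outcome["individual_scores"])
    return columns


def write_csv(columns: dict[str, list[Any]], csv_file: str) -> str:
    """Hypothetical helper: write the columns as CSV rows, return a confirmation."""
    with open(csv_file, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns.keys())          # header row
        writer.writerows(zip(*columns.values()))  # one row per sample
    return f"Data successfully written to {csv_file}"


# Made-up example data.
inputs = {
    "question": ["What is the capital of France?", "Who wrote Hamlet?"],
    "predicted_answer": ["Paris", "Christopher Marlowe"],
}
results = {
    "exact_match": {"score": 0.5, "individual_scores": [1, 0]},
}

columns = detailed_columns(inputs, results)

# Write to a temporary file and read it back to inspect the rows.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "detailed.csv")
    message = write_csv(columns, path)
    with open(path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
```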

EvaluationRunResult.comparative_detailed_report

def comparative_detailed_report(
        other: "EvaluationRunResult",
        keep_columns: Optional[list[str]] = None,
        output_format: Literal["json", "csv", "df"] = "json",
        csv_file: Optional[str] = None) -> Union[str, "DataFrame", None]

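Generates a report with the detailed scores of each metric from two evaluation runs, for comparison.

Parameters:

other: Results of the other evaluation run to compare against.
keep_columns: List of common column names to keep from the inputs of the evaluation runs being compared.
output_format: Output format of the report: "json", "csv", or "df". Defaults to "json".
csv_file: File path where the CSV output is saved. Must be provided if output_format is "csv".

Returns:

JSON or DataFrame containing a comparison of the detailed scores. If the output is written to a CSV file, a message confirming the successful write or an error message.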
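
The comparison can be pictured as placing the per-sample scores of two runs side by side, next to the shared input columns. The following sketch is a rough illustration under stated assumptions: `compare_runs` is a hypothetical function, the `<run_name>_<metric>` column naming is assumed, and keeping all input columns when `keep_columns` is `None` is a choice made for this example rather than documented behavior.

```python
from typing import Any, Optional


def compare_runs(name_a: str, inputs_a: dict[str, list[Any]],
                 results_a: dict[str, dict[str, Any]],
                 name_b: str, inputs_b: dict[str, list[Any]],
                 results_b: dict[str, dict[str, Any]],
                 keep_columns: Optional[list[str]] = None) -> dict[str, list[Any]]:
    """Illustrative only: put per-sample scores of two runs side by side."""
    # Assumption: both runs were evaluated on the same input columns.
    if inputs_a.keys() != inputs_b.keys():
        raise ValueError("Both runs must share the same input columns.")

    # Assumption for this sketch: keep every input column when none are named.
    keep = list(inputs_a) if keep_columns is None else keep_columns
    columns = {name: list(inputs_a[name]) for name in keep}

    # Prefix each metric column with its run name so the two runs stay distinct.
    for run_name, results in ((name_a, results_a), (name_b, results_b)):
        for metric, outcome in results.items():
            columns[f"{run_name}_{metric}"] = list(outcome["individual_scores"])
    return columns


# Made-up example data for two runs over the same questions.
inputs = {
    "question": ["What is the capital of France?", "Who wrote Hamlet?"],
    "ground_truth": ["Paris", "William Shakespeare"],
}
baseline = {"exact_match": {"score": 0.5, "individual_scores": [1, 0]}}
candidate = {"exact_match": {"score": 1.0, "individual_scores": [1, 1]}}

comparison = compare_runs("baseline", inputs, baseline,
                          "candidate", inputs, candidate,
                          keep_columns=["question"])
```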
