Class RunEvalConfig<T, U>

Configuration class for running evaluations on datasets.

Remarks

RunEvalConfig in LangSmith is a configuration class for running evaluations on datasets. Its primary purpose is to define the parameters and evaluators that will be applied during the evaluation of a dataset. This configuration can include various evaluators, custom evaluators, and different keys for inputs, predictions, and references.
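The remarks are easiest to see in a short combined sketch. The following is hypothetical: only the Criteria evaluator and the eval_llm property come from this page, while the "gpt-4" model identifier is an assumption.

const evalConfig = new RunEvalConfig(
  [new RunEvalConfig.Criteria("helpfulness")],
);
// Hypothetical model id; LLM-backed evaluators would use this model.
evalConfig.eval_llm = "gpt-4";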

Type Parameters

  • T - The type of evaluators.

  • U - The type of custom evaluators.

Hierarchy

  • RunEvalConfig


Properties

customEvaluators?: U[]

Custom evaluators to apply to a dataset run. Each evaluator is provided with a run trace containing the model outputs, as well as an "example" object representing a record in the dataset.
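As a hedged sketch, a function-style custom evaluator might look like the following. The Run and Example types from langsmith/schemas and the key/score result shape reflect common LangSmith usage but are assumptions rather than guarantees of this page, and the "output" field names are hypothetical.

import type { Run, Example } from "langsmith/schemas";

// Compare the run's output to the example's expected output and emit a
// binary score under the "exact_match" metric key (names hypothetical).
const exactMatch = async (run: Run, example?: Example) => ({
  key: "exact_match",
  score: run.outputs?.output === example?.outputs?.output ? 1 : 0,
});

const evalConfig = new RunEvalConfig();
evalConfig.customEvaluators = [exactMatch];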

eval_llm?: string

The language model specification for evaluators that require one.

evaluators?: (EvalConfig | T)[]

LangChain evaluators to apply to a dataset run. You can optionally specify these by name, or by configuring them with an EvalConfig object.
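A short sketch of both styles follows; the "qa" evaluator name is an assumption, while Criteria is documented below on this page.

const evalConfig = new RunEvalConfig();
evalConfig.evaluators = [
  "qa", // hypothetical built-in evaluator name
  new RunEvalConfig.Criteria({ conciseness: "Is the answer concise?" }),
];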

prepareData?: PrepareDataT

Convert the evaluation data into a format that can be used by the evaluator. By default, we pass the first value of run.inputs (the input), the first value of run.outputs (the prediction), and the first value of example.outputs (the reference).

Returns

The prepared data.
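A hedged sketch of a custom prepareData follows. The exact shape of PrepareDataT is not shown on this page, so the destructured { run, example } signature and the input/prediction/reference keys are assumptions modeled on the documented default; the field names (question, answer, expectedAnswer) are hypothetical.

import type { Run, Example } from "langsmith/schemas";

// Select specific fields rather than the first value of each map
// (signature and field names are assumptions).
const prepareData = ({ run, example }: { run: Run; example?: Example }) => ({
  input: run.inputs.question,
  prediction: run.outputs?.answer,
  reference: example?.outputs?.expectedAnswer,
});

const evalConfig = new RunEvalConfig();
evalConfig.prepareData = prepareData;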

Criteria: typeof __class = ...

Configuration to load a "CriteriaEvalChain" evaluator, which prompts an LLM to determine whether the model's prediction complies with the provided criteria.

Param

The criteria to use for the evaluator.

Param

The language model to use for the evaluator.

Returns

The configuration for the evaluator.

Example

const evalConfig = new RunEvalConfig(
  [new RunEvalConfig.Criteria("helpfulness")],
);

Example

const evalConfig = new RunEvalConfig(
  [new RunEvalConfig.Criteria(
    { "isCompliant": "Does the submission comply with the requirements of XYZ?" },
  )],
);

LabeledCriteria: typeof __class = ...

Configuration to load a "LabeledCriteriaEvalChain" evaluator, which prompts an LLM to determine whether the model's prediction complies with the provided criteria, incorporating a "ground truth" reference label from the dataset into its evaluation.

Param

The criteria to use for the evaluator.

Param

The language model to use for the evaluator.

Returns

The configuration for the evaluator.

Example

const evalConfig = new RunEvalConfig(
  [new RunEvalConfig.LabeledCriteria("correctness")],
);

Example

const evalConfig = new RunEvalConfig(
  [new RunEvalConfig.LabeledCriteria(
    { "mentionsAllFacts": "Does the submission include all facts provided in the reference?" },
  )],
);
