evaluatio.metrics.uer

Universal error metrics

This module provides utilities to compute the universal error rate (UER) and the universal edit distance between reference and hypothesis sequences. Unlike metrics.wer and metrics.cer, which tokenize strings by whitespace or by character, the functions here operate on arbitrary iterables of tokens, which makes this module suitable for subword-level or any other custom tokenization scheme.

Each function accepts any iterable of iterables and internally converts it to a format compatible with the underlying native bindings.
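The exact internal format is not described on this page. As a rough illustration of what such a conversion can look like, the sketch below materializes nested iterables (including generators) as lists of integer ids. The helper name `encode_sequences`, the shared `vocab` dictionary, and the requirement that tokens be hashable are all assumptions made for this example, not part of the documented API.

```python
from typing import Dict, Hashable, Iterable, List


def encode_sequences(
    sequences: Iterable[Iterable[Hashable]],
    vocab: Dict[Hashable, int],
) -> List[List[int]]:
    """Hypothetical helper: turn nested iterables into lists of integer ids.

    `vocab` is updated in place, so passing the same dictionary for
    references and hypotheses gives equal tokens equal ids.
    """
    encoded = []
    for seq in sequences:
        # setdefault assigns the next free id the first time a token is seen.
        encoded.append([vocab.setdefault(tok, len(vocab)) for tok in seq])
    return encoded


vocab: Dict[Hashable, int] = {}
# A generator of lists is consumed exactly once and stored as concrete lists.
refs = encode_sequences((s.split() for s in ["a b c", "b c"]), vocab)
hyps = encode_sequences([("a", "b"), ("d",)], vocab)
print(refs)  # [[0, 1, 2], [1, 2]]
print(hyps)  # [[0, 1], [3]]
```

Sharing one vocabulary across both inputs matters: comparing ids is only equivalent to comparing tokens when the mapping is the same on both sides.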

Note

Functions

universal_error_rate_per_pair

universal_error_rate_per_pair(references: Iterable[Iterable[Any]], hypotheses: Iterable[Iterable[Any]]) -> List[float]

Compute universal error rate (UER) for each reference-hypothesis pair.

Parameters

Returns

Raises

See-Also

metrics.cer.word_error_rate_per_pair : Character-tokenized string version.

metrics.wer.word_error_rate_per_pair : Whitespace-tokenized string version.
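To make the per-pair quantity concrete, here is a minimal pure-Python sketch. It assumes the conventional definition (edit distance divided by reference length) and is not the library's native implementation. The names `edit_distance` and `uer_per_pair_sketch` are hypothetical, and raising `ValueError` for mismatched input lengths or an empty reference is a choice made for this example only.

```python
from typing import Any, Iterable, List, Sequence


def edit_distance(ref: Sequence[Any], hyp: Sequence[Any]) -> int:
    """Levenshtein distance between two token sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,              # deletion
                curr[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),   # substitution, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def uer_per_pair_sketch(
    references: Iterable[Iterable[Any]],
    hypotheses: Iterable[Iterable[Any]],
) -> List[float]:
    """Hypothetical reference version: one error rate per pair."""
    refs = [list(r) for r in references]
    hyps = [list(h) for h in hypotheses]
    if len(refs) != len(hyps):
        raise ValueError("references and hypotheses differ in length")
    rates = []
    for ref, hyp in zip(refs, hyps):
        if not ref:
            # Assumption of this sketch: an empty reference is an error.
            raise ValueError("empty reference sequence")
        rates.append(edit_distance(ref, hyp) / len(ref))
    return rates


# Tokens can be subword strings, integer ids, or anything comparable with ==.
references = [["un", "##believ", "##able"], [1, 2, 3, 4]]
hypotheses = [["un", "##believ", "##ably"], [1, 3, 4, 5]]
print(uer_per_pair_sketch(references, hypotheses))  # first pair 1/3, second pair 0.5
```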

Note

universal_edit_distance_per_pair

universal_edit_distance_per_pair(references: Iterable[Iterable[Any]], hypotheses: Iterable[Iterable[Any]]) -> List[int]

Compute universal edit distance for each reference-hypothesis pair.

Parameters

Returns

See-Also

metrics.cer.word_edit_distance_per_pair : Character-tokenized string version.

metrics.wer.word_edit_distance_per_pair : Whitespace-tokenized string version.
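The following sketch illustrates what a per-pair edit distance over arbitrary tokens means, assuming unit-cost Levenshtein distance (insertions, deletions, and substitutions each cost 1). The function name `edit_distance_per_pair_sketch` is hypothetical, and the code is a plain-Python stand-in rather than the native implementation.

```python
from typing import Any, Iterable, List, Sequence


def edit_distance(ref: Sequence[Any], hyp: Sequence[Any]) -> int:
    """Unit-cost Levenshtein distance using two rolling rows."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = curr
    return prev[-1]


def edit_distance_per_pair_sketch(
    references: Iterable[Iterable[Any]],
    hypotheses: Iterable[Iterable[Any]],
) -> List[int]:
    """Hypothetical reference version: one integer distance per pair."""
    # Materialize each inner iterable so it can be indexed and measured.
    return [edit_distance(list(r), list(h)) for r, h in zip(references, hypotheses)]


references = [
    "kitten",                                    # a string is an iterable of characters
    [("DT", "the"), ("NN", "cat")],              # tokens may be tuples
]
hypotheses = [
    "sitting",
    [("DT", "the"), ("NN", "dog"), ("VB", "sat")],
]
print(edit_distance_per_pair_sketch(references, hypotheses))  # [3, 2]
```

Note that `zip` silently truncates to the shorter input; a production version would validate that both inputs contain the same number of sequences.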

Note

universal_error_rate

universal_error_rate(references: Iterable[Iterable[Any]], hypotheses: Iterable[Iterable[Any]]) -> float

Compute the corpus-level universal error rate (UER) over all pairs.

Parameters

Returns

See-Also

metrics.cer.word_error_rate : Character-tokenized string version.

metrics.wer.word_error_rate : Whitespace-tokenized string version.
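A corpus-level rate is usually not the mean of the per-pair rates. The sketch below assumes the common convention of dividing the total number of edits by the total number of reference tokens; whether this function uses exactly that convention is an assumption here. The name `corpus_uer_sketch` is hypothetical.

```python
from typing import Any, Iterable, Sequence


def edit_distance(ref: Sequence[Any], hyp: Sequence[Any]) -> int:
    """Unit-cost Levenshtein distance using two rolling rows."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = curr
    return prev[-1]


def corpus_uer_sketch(
    references: Iterable[Iterable[Any]],
    hypotheses: Iterable[Iterable[Any]],
) -> float:
    """Hypothetical reference version: total edits / total reference tokens."""
    total_edits = 0
    total_tokens = 0
    for ref, hyp in zip(references, hypotheses):
        ref, hyp = list(ref), list(hyp)
        total_edits += edit_distance(ref, hyp)
        total_tokens += len(ref)
    if total_tokens == 0:
        # Assumption of this sketch: a corpus with no reference tokens is an error.
        raise ValueError("references contain no tokens")
    return total_edits / total_tokens


references = [["a", "b", "c", "d"], ["e"]]
hypotheses = [["a", "b", "c", "d"], ["f"]]
# Per-pair rates would be 0.0 and 1.0 (mean 0.5), but the corpus has
# 1 edit over 5 reference tokens.
print(corpus_uer_sketch(references, hypotheses))  # 0.2
```

Under this convention long sequences weigh more than short ones, which is why the result differs from averaging the per-pair rates.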

Note

universal_error_rate_ci

universal_error_rate_ci(references: Iterable[Iterable[Any]], hypotheses: Iterable[Iterable[Any]], iterations: int, alpha: float) -> ConfidenceInterval

Estimate a confidence interval for the universal error rate via bootstrapping.

Parameters

Returns

See-Also

metrics.cer.word_error_rate_ci : Character-tokenized string version.

metrics.wer.word_error_rate_ci : Whitespace-tokenized string version.
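As an illustration of the bootstrapping idea, the sketch below resamples reference-hypothesis pairs with replacement and reads the interval off the sorted resampled rates (a simple percentile bootstrap). Several things are assumptions of this example rather than documented behavior: that `alpha` is the significance level (so `alpha=0.05` gives a 95% interval), the `seed` parameter, the plain `(lower, upper)` tuple returned in place of `ConfidenceInterval`, and the name `uer_ci_sketch`.

```python
import random
from typing import Any, Iterable, Sequence, Tuple


def edit_distance(ref: Sequence[Any], hyp: Sequence[Any]) -> int:
    """Unit-cost Levenshtein distance using two rolling rows."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = curr
    return prev[-1]


def uer_ci_sketch(
    references: Iterable[Iterable[Any]],
    hypotheses: Iterable[Iterable[Any]],
    iterations: int,
    alpha: float,
    seed: int = 0,  # hypothetical parameter, added for reproducibility
) -> Tuple[float, float]:
    """Hypothetical percentile-bootstrap interval for the corpus-level rate."""
    refs = [list(r) for r in references]
    hyps = [list(h) for h in hypotheses]
    if not refs or len(refs) != len(hyps):
        raise ValueError("need equally many, at least one, references and hypotheses")
    if iterations < 1 or any(not r for r in refs):
        raise ValueError("need iterations >= 1 and non-empty references")

    # Distances are computed once per pair; each resample only re-sums them.
    stats = [(edit_distance(r, h), len(r)) for r, h in zip(refs, hyps)]
    rng = random.Random(seed)
    estimates = []
    for _ in range(iterations):
        sample = rng.choices(stats, k=len(stats))  # draw pairs with replacement
        estimates.append(sum(d for d, _ in sample) / sum(n for _, n in sample))

    estimates.sort()
    lower = estimates[int((alpha / 2) * (iterations - 1))]
    upper = estimates[int((1 - alpha / 2) * (iterations - 1))]
    return lower, upper


references = [[1, 2, 3], [4, 5], [6]]
hypotheses = [[1, 2, 4], [4, 5], [7]]
lower, upper = uer_ci_sketch(references, hypotheses, iterations=1000, alpha=0.05)
print(f"95% interval: [{lower:.3f}, {upper:.3f}]")
```

Resampling whole pairs, rather than individual tokens, keeps each sequence intact, which matches the unit at which the edit distance is computed.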

Note