evaluatio.metrics.wer
Word-level error metrics
This module provides utilities to compute word error rate (WER) and
word-level edit distance between reference and hypothesis text sequences.
All computations operate on whitespace-tokenized words. For more
complex tokenization, see metrics.uer.
The functions accept any iterable of strings and internally convert them to a format compatible with the underlying native bindings.
Note
If a reference string is empty or contains no tokens, the corresponding WER is defined as inf.
These functions are thin wrappers around optimized native implementations.
Functions
word_error_rate_per_pair
word_error_rate_per_pair(references: Iterable[str], hypotheses: Iterable[str]) -> List[float]
Compute word error rate (WER) for each reference-hypothesis pair.
Parameters
references: Iterable[str]
Iterable of reference strings.
hypotheses: Iterable[str]
Iterable of hypothesis strings. Must be the same length as references.
Returns
List[float]
Word error rate for each pair of reference and hypothesis.
Raises
ValueError
If references and hypotheses have different lengths.
See Also
metrics.uer.universal_error_rate_per_pair : Type-agnostic version.
Note
Tokenization is performed by splitting on whitespace.
If a reference string is empty or contains no tokens, the resulting WER is
inf.
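The behaviour described above can be illustrated with a small pure-Python sketch. This is not the library's native implementation: the names `_edit_distance` and `wer_per_pair_sketch` are hypothetical, and the sketch assumes that per-pair WER is the word-level edit distance divided by the number of reference tokens.

```python
import math
from typing import Iterable, List, Sequence


def _edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance over tokens (hypothetical helper)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer_per_pair_sketch(references: Iterable[str], hypotheses: Iterable[str]) -> List[float]:
    refs, hyps = list(references), list(hypotheses)
    if len(refs) != len(hyps):
        raise ValueError("references and hypotheses must have the same length")
    rates = []
    for ref, hyp in zip(refs, hyps):
        ref_tokens, hyp_tokens = ref.split(), hyp.split()  # whitespace tokenization
        if not ref_tokens:
            # Empty reference: WER is defined as inf, as documented above.
            rates.append(math.inf)
        else:
            rates.append(_edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens))
    return rates


print(wer_per_pair_sketch(["the cat sat", ""], ["the cat sit", "hello"]))
# [0.3333333333333333, inf]
```

The first pair has one substitution over three reference words; the second pair has an empty reference and therefore yields inf.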
word_edit_distance_per_pair
word_edit_distance_per_pair(references: Iterable[str], hypotheses: Iterable[str]) -> List[int]
Compute word-level edit distance for each reference-hypothesis pair.
Parameters
references: Iterable[str]
Iterable of reference strings.
hypotheses: Iterable[str]
Iterable of hypothesis strings. Must be the same length as references.
Returns
List[int]
Word-level edit distance for each pair.
See Also
metrics.uer.universal_edit_distance_per_pair : Type-agnostic version.
Note
Tokenization is performed by splitting on whitespace.
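As a rough illustration of what a word-level edit distance measures, the sketch below computes the Levenshtein distance over whitespace-separated words. The function names are hypothetical and the code is a plain-Python stand-in, not the native implementation. Unlike the documented function, it does not check that both inputs have the same length.

```python
from typing import Iterable, List


def word_edit_distance_sketch(reference: str, hypothesis: str) -> int:
    """Minimum number of word insertions, deletions, and substitutions."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j]: distance between the reference prefix processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn the reference prefix into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def edit_distance_per_pair_sketch(references: Iterable[str], hypotheses: Iterable[str]) -> List[int]:
    return [word_edit_distance_sketch(r, h) for r, h in zip(references, hypotheses)]


distances = edit_distance_per_pair_sketch(
    ["the quick brown fox", "hello world"],
    ["the brown fox jumps", "hello world"],
)
print(distances)  # [2, 0]
```

In the first pair, "quick" is deleted and "jumps" is inserted, giving a distance of 2.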
word_error_rate
word_error_rate(references: Iterable[str], hypotheses: Iterable[str]) -> float
Compute the corpus-level word error rate (WER) over all pairs.
Parameters
references: Iterable[str]
Iterable of reference strings.
hypotheses: Iterable[str]
Iterable of hypothesis strings. Must be the same length as references.
Returns
float
Corpus level word error rate across all pairs.
Note
Tokenization is performed by splitting on whitespace.
Equivalent to common WER implementations (e.g., jiwer-based metrics).
If all reference strings are empty or contain no tokens, the resulting WER is
inf.
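A corpus-level rate is not the same as the mean of per-pair rates. The sketch below assumes the usual convention (total word errors divided by total reference words, which is how jiwer-style metrics are commonly defined). The helper names are hypothetical and this is not the native implementation.

```python
import math
from typing import Iterable, Sequence


def _edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance over tokens (hypothetical helper)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def corpus_wer_sketch(references: Iterable[str], hypotheses: Iterable[str]) -> float:
    total_errors = 0
    total_ref_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens, hyp_tokens = ref.split(), hyp.split()
        total_errors += _edit_distance(ref_tokens, hyp_tokens)
        total_ref_words += len(ref_tokens)
    if total_ref_words == 0:
        # No reference tokens anywhere: the rate is inf, as documented above.
        return math.inf
    return total_errors / total_ref_words


refs = ["a b c d", "x"]
hyps = ["a b c d", "y"]
print(corpus_wer_sketch(refs, hyps))  # 0.2
```

Here the per-pair rates would be 0.0 and 1.0 (mean 0.5), while the corpus-level rate is 1 error over 5 reference words, so longer references carry more weight.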
word_error_rate_ci
word_error_rate_ci(references: Iterable[str], hypotheses: Iterable[str], interations: int, alpha: float) -> ConfidenceInterval
Estimate a confidence interval for the word error rate via bootstrapping.
Parameters
references: Iterable[str]
Iterable of reference strings.
hypotheses: Iterable[str]
Iterable of hypothesis strings. Must be the same length as references.
interations: int
Number of bootstrap iterations.
alpha: float
Significance level for the confidence interval.
Returns
ConfidenceInterval
Estimated confidence interval for the corpus level word error rate.
Note
The bootstrapped metric corresponds to word_error_rate.
Tokenization is performed by splitting on whitespace.
If any reference string is empty or contains no tokens, the resulting WER can be
inf.
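To make the bootstrapping idea concrete, here is a minimal percentile-bootstrap sketch. Several things are assumptions rather than documented behaviour: resampling is done over reference-hypothesis pairs with replacement, the interval uses the percentile method, the result is returned as a plain `(lower, upper)` tuple instead of the library's ConfidenceInterval type, and the `seed` parameter and all function names are hypothetical.

```python
import math
import random
from typing import Iterable, Sequence, Tuple


def _edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance over tokens (hypothetical helper)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer_ci_sketch(
    references: Iterable[str],
    hypotheses: Iterable[str],
    iterations: int,
    alpha: float,
    seed: int = 0,
) -> Tuple[float, float]:
    refs = [r.split() for r in references]
    hyps = [h.split() for h in hypotheses]
    if len(refs) != len(hyps) or not refs or iterations < 1:
        raise ValueError("need equally long, non-empty inputs and iterations >= 1")
    # Per-pair statistics are computed once; each resample only re-sums them.
    errors = [_edit_distance(r, h) for r, h in zip(refs, hyps)]
    lengths = [len(r) for r in refs]
    rng = random.Random(seed)
    n = len(refs)
    samples = []
    for _ in range(iterations):
        idx = [rng.randrange(n) for _ in range(n)]  # sample pairs with replacement
        total_len = sum(lengths[i] for i in idx)
        total_err = sum(errors[i] for i in idx)
        samples.append(total_err / total_len if total_len else math.inf)
    samples.sort()
    lower = samples[int((alpha / 2) * (iterations - 1))]
    upper = samples[math.ceil((1 - alpha / 2) * (iterations - 1))]
    return lower, upper


refs = ["the cat sat", "hello world", "a b c d"]
hyps = ["the cat sit", "hello world", "a b d"]
print(wer_ci_sketch(refs, hyps, iterations=1000, alpha=0.05))
```

A resample that happens to draw only empty references has no reference tokens, which is how inf can appear among the bootstrap samples and, in turn, in the interval bounds.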