CER

`evaluatio.metrics.cer`

Character-level error metrics

This module provides utilities to compute character error rate (CER) and character-level edit distance between reference and hypothesis text sequences.

The functions accept any iterable of strings and internally convert them to a format compatible with the underlying native bindings.

Note

If a reference string is empty, the corresponding CER is defined as inf.
These functions are thin wrappers around optimized native implementations.

Functions

character_error_rate_per_pair

character_error_rate_per_pair(references: Iterable[str], hypotheses: Iterable[str]) -> List[float]

Compute character error rate (CER) for each reference-hypothesis pair.

Parameters

references : Iterable[str]
Iterable of reference strings.
hypotheses : Iterable[str]
Iterable of hypothesis strings. Must be the same length as references.

Returns

List[float]
Character error rate for each pair of reference and hypothesis.

Raises

ValueError
If references and hypotheses have different lengths.

See Also

metrics.uer.universal_error_rate_per_pair : Type-agnostic version.

Note

If a reference string is empty, the resulting CER for that pair is inf.
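
The behaviour described above can be sketched in pure Python. This is only an illustration, not the library's native implementation: it assumes the CER of a pair is the unit-cost Levenshtein distance divided by the number of characters in the reference, and that a "character" is a Python code point. The names `levenshtein` and `cer_per_pair_sketch` are hypothetical.

```python
import math
from typing import Iterable, List


def levenshtein(ref: str, hyp: str) -> int:
    """Unit-cost edit distance over characters (assumed definition)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete r from the reference
                curr[j - 1] + 1,         # insert h into the reference
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer_per_pair_sketch(references: Iterable[str], hypotheses: Iterable[str]) -> List[float]:
    # Materialize the iterables so their lengths can be compared.
    refs, hyps = list(references), list(hypotheses)
    if len(refs) != len(hyps):
        raise ValueError("references and hypotheses must have the same length")
    # An empty reference has no characters to normalize by, so its CER is inf.
    return [
        levenshtein(r, h) / len(r) if r else math.inf
        for r, h in zip(refs, hyps)
    ]
```

Because the hypothesis can contain more errors than the reference has characters, a per-pair CER above 1.0 is possible under this definition.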

character_edit_distance_per_pair

character_edit_distance_per_pair(references: Iterable[str], hypotheses: Iterable[str]) -> List[int]

Compute character-level edit distance for each reference-hypothesis pair.

Parameters

references : Iterable[str]
Iterable of reference strings.
hypotheses : Iterable[str]
Iterable of hypothesis strings. Must be the same length as references.

Returns

List[int]
Character-level edit distance for each pair of reference and hypothesis.

character_error_rate

character_error_rate(references: Iterable[str], hypotheses: Iterable[str]) -> float

Compute the corpus-level character error rate (CER) over all pairs.

Parameters

references : Iterable[str]
Iterable of reference strings.
hypotheses : Iterable[str]
Iterable of hypothesis strings. Must be the same length as references.

Returns

float
Corpus-level character error rate across all pairs.

Note

Equivalent to common CER implementations (e.g., jiwer-based metrics).
If all reference strings are empty, the resulting CER is inf.
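
A corpus-level rate is usually not the mean of the per-pair rates. The sketch below assumes the common pooled definition (total edits divided by total reference characters), which is what makes a result comparable to jiwer-style metrics. It works on precomputed per-pair statistics, and `corpus_cer_sketch` is a hypothetical name, not part of the library.

```python
import math
from typing import Sequence


def corpus_cer_sketch(edit_distances: Sequence[int], reference_lengths: Sequence[int]) -> float:
    """Pooled CER: total edits over total reference characters (assumed definition)."""
    total_chars = sum(reference_lengths)
    if total_chars == 0:
        # Every reference is empty, so there is nothing to normalize by.
        return math.inf
    return sum(edit_distances) / total_chars


# Pair 1: "abc" vs "abd"                  -> 1 edit,  3 reference characters
# Pair 2: "hello world" vs "hello world"  -> 0 edits, 11 reference characters
distances = [1, 0]
lengths = [3, 11]

corpus = corpus_cer_sketch(distances, lengths)  # 1 / 14
macro = sum(d / n for d, n in zip(distances, lengths)) / len(lengths)  # (1/3 + 0) / 2
```

Pooling weights each pair by its reference length, so short references with a high per-pair CER influence the corpus score less than they would in a plain average.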

character_error_rate_ci

character_error_rate_ci(references: Iterable[str], hypotheses: Iterable[str], iterations: int, alpha: float) -> ConfidenceInterval

Estimate a confidence interval for the character error rate via bootstrapping.

Parameters

references : Iterable[str]
Iterable of reference strings.
hypotheses : Iterable[str]
Iterable of hypothesis strings. Must be the same length as references.
iterations : int
Number of bootstrap iterations.
alpha : float
Significance level for the confidence interval; the interval has nominal coverage 1 - alpha (for example, alpha = 0.05 gives a 95% interval).

Returns

ConfidenceInterval
Estimated confidence interval for the corpus-level character error rate.

Note

The bootstrapped metric corresponds to character_error_rate.
If any reference string is empty, the resulting CER can be inf.
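
One way to picture the bootstrap is a percentile interval over resampled pairs. The sketch below makes several assumptions the documentation does not confirm: pairs are resampled with replacement, the pooled CER is recomputed for each resample, and the bounds are read off as empirical quantiles. It returns a plain `(lower, upper)` tuple because the fields of `ConfidenceInterval` are not described here. The function name and the `seed` parameter are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def bootstrap_cer_ci_sketch(
    edit_distances: Sequence[int],
    reference_lengths: Sequence[int],
    iterations: int,
    alpha: float,
    seed: int = 0,  # hypothetical parameter, used only to make the sketch reproducible
) -> Tuple[float, float]:
    rng = random.Random(seed)
    n = len(edit_distances)
    estimates: List[float] = []
    for _ in range(iterations):
        # Resample pair indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        edits = sum(edit_distances[i] for i in idx)
        chars = sum(reference_lengths[i] for i in idx)
        # A resample made up only of empty references yields inf.
        estimates.append(edits / chars if chars else math.inf)
    estimates.sort()
    # Percentile bounds at alpha/2 and 1 - alpha/2.
    lower = estimates[int((alpha / 2) * (iterations - 1))]
    upper = estimates[int((1 - alpha / 2) * (iterations - 1))]
    return lower, upper
```

Working from per-pair edit counts and reference lengths avoids recomputing edit distances in every iteration, since a resample only changes which pairs are pooled.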
