incerto.llm.LexicalSimilarity

class incerto.llm.LexicalSimilarity

Bases: object

Measure lexical similarity across samples.

Computes the exact match rate, token overlap, or edit distance to quantify how similar a set of generated responses is.

Methods

`__init__`()
`exact_match_rate`(responses[, normalize_fn])	Compute the fraction of responses that exactly match the most common response.
`pairwise_token_overlap`(responses[, normalize_fn])	Average pairwise token overlap (Jaccard similarity).

static exact_match_rate(responses, normalize_fn=None)

Compute the fraction of responses that exactly match the most common response.

Parameters:

responses (list[str]) – List of generated text responses
normalize_fn (Optional[Callable[[str], str]]) – Optional function to normalize responses before comparison

Return type:

float

Returns:

Exact match rate, a value between 0 and 1
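
The computation described above can be illustrated with a minimal standalone sketch. This is not the library's source: the function name `exact_match_rate_sketch` is hypothetical, and returning 0.0 for an empty list is an assumption, since the documentation does not state how that case is handled.

```python
from collections import Counter
from typing import Callable, List, Optional


def exact_match_rate_sketch(
    responses: List[str],
    normalize_fn: Optional[Callable[[str], str]] = None,
) -> float:
    """Fraction of responses equal to the most common (normalized) response."""
    if not responses:
        # Assumption: the documented behavior for empty input is unknown.
        return 0.0
    # Apply the optional normalizer before comparing strings.
    normalized = [normalize_fn(r) if normalize_fn else r for r in responses]
    # Count of the single most frequent response.
    _, top_count = Counter(normalized).most_common(1)[0]
    return top_count / len(normalized)


samples = ["Paris", "paris ", "London"]
raw_rate = exact_match_rate_sketch(samples)
normalized_rate = exact_match_rate_sketch(samples, lambda s: s.strip().lower())
```

Here the raw strings are all distinct, so the most common one accounts for a third of the samples. After stripping whitespace and lowercasing, two of the three agree, which shows why a `normalize_fn` can matter when responses differ only in surface form.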

static pairwise_token_overlap(responses, normalize_fn=None)

Average pairwise token overlap (Jaccard similarity).

Parameters:

responses (list[str]) – List of generated text responses
normalize_fn (Optional[Callable[[str], str]]) – Optional function to normalize responses before comparison

Return type:

float

Returns:

Average Jaccard similarity across all pairs of responses
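
For two token sets A and B, the Jaccard similarity is |A ∩ B| / |A ∪ B|. The sketch below averages that value over every unordered pair of responses. It is an illustration, not the library's implementation: the name `pairwise_token_overlap_sketch` is hypothetical, and whitespace tokenization, the return value of 1.0 for fewer than two responses, and the score of 1.0 for two empty token sets are all assumptions not confirmed by the documentation.

```python
from itertools import combinations
from typing import Callable, List, Optional, Set


def pairwise_token_overlap_sketch(
    responses: List[str],
    normalize_fn: Optional[Callable[[str], str]] = None,
) -> float:
    """Mean Jaccard similarity of token sets over all unordered response pairs."""
    normalized = [normalize_fn(r) if normalize_fn else r for r in responses]
    # Assumption: tokens are whitespace-separated words.
    token_sets: List[Set[str]] = [set(r.split()) for r in normalized]
    pairs = list(combinations(token_sets, 2))
    if not pairs:
        # Assumption: with fewer than two responses there is nothing to compare.
        return 1.0

    def jaccard(a: Set[str], b: Set[str]) -> float:
        union = a | b
        if not union:
            # Assumption: two empty token sets count as identical.
            return 1.0
        return len(a & b) / len(union)

    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


overlap = pairwise_token_overlap_sketch(["a b c", "a b d", "a b c"])
```

In this example the three pairs score 0.5, 1.0, and 0.5, so the average is two thirds. Unlike the exact match rate, this measure gives partial credit to responses that share most of their vocabulary.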
