incerto.llm.ContrastiveDecoding

class incerto.llm.ContrastiveDecoding

Bases: object

Uncertainty estimation via contrastive decoding, which compares an expert model against an amateur model.

Uses the difference between the predictions of a strong (expert) model and a weak (amateur) model to identify regions of high uncertainty.

__init__()

Methods

__init__()

compute_contrastive_score(expert_logits, ...)

Compute contrastive decoding score.

disagreement_score(expert_logits, amateur_logits)

Measure disagreement between expert and amateur.

static compute_contrastive_score(expert_logits, amateur_logits, alpha=0.5)

Compute contrastive decoding score.

Score = expert_prob - alpha * amateur_prob

Parameters:
  • expert_logits (Tensor) – Logits from expert/strong model

  • amateur_logits (Tensor) – Logits from amateur/weak model

  • alpha (float) – Weight applied to the amateur model's contribution (default: 0.5)

Return type:

Tensor

Returns:

Contrastive scores
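
The formula above can be illustrated with a minimal standalone sketch. This is not the library's implementation: it assumes that expert_prob and amateur_prob are obtained by applying a softmax to the respective logits, and it uses plain Python lists in place of Tensors so the arithmetic stays visible. The helper names softmax and contrastive_score_sketch are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a 1-D list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_score_sketch(expert_logits, amateur_logits, alpha=0.5):
    """Illustrative only: expert_prob - alpha * amateur_prob per vocabulary entry.

    Assumption: probabilities come from a softmax over the logits.
    """
    expert_prob = softmax(expert_logits)
    amateur_prob = softmax(amateur_logits)
    return [p_e - alpha * p_a for p_e, p_a in zip(expert_prob, amateur_prob)]


# The expert is confident in entry 0; the amateur is uniform.
expert_logits = [2.0, 0.5, -1.0]
amateur_logits = [0.0, 0.0, 0.0]
scores = contrastive_score_sketch(expert_logits, amateur_logits, alpha=0.5)
```

Under this assumption, each probability vector sums to 1, so the scores sum to 1 - alpha. Entries that the expert favors more strongly than the amateur receive the highest scores, and entries where the amateur's weighted probability exceeds the expert's become negative.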

static disagreement_score(expert_logits, amateur_logits)

Measure disagreement between expert and amateur.

Parameters:
  • expert_logits (Tensor) – Logits from expert model

  • amateur_logits (Tensor) – Logits from amateur model

Return type:

Tensor

Returns:

Disagreement score (KL divergence)
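
As a rough illustration of a KL-based disagreement score, the sketch below computes KL(expert || amateur) from two logit vectors. The direction of the divergence and the absence of any reduction over batch or sequence dimensions are assumptions, since the documentation above does not specify them. Plain Python lists stand in for Tensors, and the helper names log_softmax and kl_disagreement_sketch are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D list of logits."""
    peak = max(logits)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_total for x in logits]


def kl_disagreement_sketch(expert_logits, amateur_logits):
    """Illustrative only: KL(expert || amateur) over one vocabulary distribution.

    Assumption: the expert distribution is the reference (first argument of KL).
    """
    log_p = log_softmax(expert_logits)
    log_q = log_softmax(amateur_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


# Identical distributions give zero disagreement.
same = kl_disagreement_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])

# Models that prefer different entries give a positive score.
different = kl_disagreement_sketch([4.0, 0.0, 0.0], [0.0, 0.0, 4.0])
```

Because KL divergence is asymmetric, swapping the two arguments generally changes the result, so it is worth confirming which direction the library uses before comparing scores across tools.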