Selective Prediction Guide#

Selective prediction (also called prediction with rejection) lets a model abstain on inputs where it is uncertain, improving accuracy on the predictions it does make.

Why Selective Prediction#

Key idea: Don’t predict when uncertain

Benefits:
  • Higher accuracy on accepted predictions

  • Explicit uncertainty communication

  • Safer deployment in critical applications

Trade-off:
  • Coverage: Fraction of samples where prediction is made

  • Risk: Error rate on predictions that are made

Goal: Maximize accuracy while maintaining acceptable coverage.

Basic Concepts#

Selective classifier:

A pair (f, g), where f is the prediction function and g is the selection function that decides whether to abstain

Coverage:

φ = P(g(x) = 1) = fraction of samples where a prediction is made

Selective risk:

R(f, g) = E[ℓ(f(x), y) | g(x) = 1] = error rate on accepted samples

Selective accuracy:

Accuracy on samples where prediction is made
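
These quantities are easy to compute by hand. A minimal sketch with made-up data (plain Python, independent of incerto):

```python
# Toy illustration of coverage and selective risk; the data are invented.
predictions = [0, 1, 1, 2, 0, 2]
labels      = [0, 1, 2, 2, 1, 2]
accepted    = [True, True, False, True, False, True]  # g(x) = 1 means "predict"

# Coverage: fraction of samples where a prediction is made (4 of 6 here).
coverage = sum(accepted) / len(accepted)

# Selective risk: error rate restricted to the accepted samples.
accepted_errors = [p != y for p, y, a in zip(predictions, labels, accepted) if a]
selective_risk = sum(accepted_errors) / len(accepted_errors)
selective_accuracy = 1.0 - selective_risk
```

In this toy example the two rejected samples happen to be exactly the mispredicted ones, so the selective risk drops to zero at 2/3 coverage.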

Methods#

Softmax Threshold#

Best for: Simple baseline, post-hoc application

Abstain when max softmax probability < threshold:

from incerto.sp import SoftmaxThreshold

# Wrap your trained model
selector = SoftmaxThreshold(model)
selector.eval()

# Forward pass with confidence scores
with torch.no_grad():
    logits, confidence = selector(test_data, return_confidence=True)

predictions = logits.argmax(dim=-1)

# Set threshold and reject low-confidence samples
threshold = confidence.quantile(0.2)  # reject bottom 20%
rejected = selector.reject(confidence, threshold)
selected = ~rejected

accuracy = (predictions[selected] == labels[selected]).float().mean()
coverage = selected.float().mean()

print(f"Coverage: {coverage:.2%}")
print(f"Selective accuracy: {accuracy:.2%}")
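
The mechanism itself is small enough to write out directly. A dependency-free sketch of maximum-softmax-probability (MSP) thresholding for a single sample's logits (illustrative helper names, not the incerto API):

```python
import math

def msp_confidence(logits):
    """Maximum softmax probability (MSP) for one sample's logits."""
    m = max(logits)                               # shift for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)

def abstain(logits, threshold):
    """Abstain when the MSP falls below the threshold."""
    return msp_confidence(logits) < threshold
```

A flat logit vector gives MSP = 1/num_classes (maximum uncertainty), while a strongly peaked one approaches 1.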

Advantages:
  • Simple, fast

  • No retraining needed

  • Interpretable

Disadvantages:
  • Threshold requires tuning

  • May not be optimal

Deep Gambler#

Best for: Learning when to abstain during training

Adds an extra abstain logit and trains with the gambler’s loss:

from incerto.sp import DeepGambler

# Create model with abstain head
gambler = DeepGambler(backbone, num_classes=10, num_features=128)

# Training loop
for inputs, labels in train_loader:
    optimizer.zero_grad()
    logits = gambler(inputs)  # shape: (batch, num_classes + 1)
    loss = gambler.gambler_loss(logits, labels, reward=2.2)
    loss.backward()
    optimizer.step()

# Inference — confidence is 1 - P(abstain)
logits, confidence = gambler(test_data, return_confidence=True)
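
For intuition, the gambler's loss itself is simple: with a softmax over num_classes + 1 outputs (the last entry being abstain), the per-sample loss is -log(p_label + p_abstain / reward). A minimal per-sample sketch (plain Python; not the incerto implementation):

```python
import math

def gambler_loss_single(probs, label, reward):
    """Gambler's loss for one sample: -log(p_label + p_abstain / reward).

    probs: softmax over num_classes + 1 outputs, last entry = abstain.
    reward: the payoff o with 1 < o <= num_classes; a smaller reward
    makes abstaining more attractive.
    """
    return -math.log(probs[label] + probs[-1] / reward)

# A confident correct prediction yields a small loss; shifting mass to the
# abstain slot hedges the bet on a hard sample, at some cost.
confident = gambler_loss_single([0.85, 0.05, 0.05, 0.05], label=0, reward=2.2)
hedged = gambler_loss_single([0.10, 0.10, 0.10, 0.70], label=0, reward=2.2)
```

Because abstaining pays out 1/reward of a correct bet, the model learns to route probability mass to the abstain slot exactly on inputs where its class bets are likely to lose.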

Self-Adaptive Training (SAT)#

Best for: Improving calibration during training for better selective prediction

Trains the model with adaptive soft labels that blend the ground-truth labels with the model's own predictions:

from incerto.sp import SelfAdaptiveTraining

sat = SelfAdaptiveTraining(
    backbone,
    num_classes=10,
    alpha_start=0.0,
    alpha_end=0.9,
    warmup_epochs=5,
)

# Training loop
for epoch in range(total_epochs):
    alpha = sat.get_alpha(epoch, total_epochs)

    for inputs, labels in train_loader:
        optimizer.zero_grad()
        logits = sat(inputs)
        loss = sat.sat_loss(logits, labels, alpha)
        loss.backward()
        optimizer.step()

# Inference — uses MSP confidence like SoftmaxThreshold
logits, confidence = sat(test_data, return_confidence=True)
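
The soft-label blend at the heart of SAT fits in a few lines: with blending weight alpha, the training target is (1 - alpha) * one_hot + alpha * model_probs. A sketch (illustrative names, not the incerto API):

```python
def sat_target(one_hot, model_probs, alpha):
    """Blend the ground-truth one-hot label with the model's predictions.

    alpha ramps from alpha_start to alpha_end over training, so early
    epochs trust the labels and later epochs increasingly trust the
    model's own (hopefully converged) predictions.
    """
    return [(1.0 - alpha) * t + alpha * p for t, p in zip(one_hot, model_probs)]

# Blends to approximately [0.1, 0.85, 0.05] — still a valid distribution.
target = sat_target([0.0, 1.0, 0.0], [0.2, 0.7, 0.1], alpha=0.5)
```

Since both inputs are probability distributions, the blended target always sums to 1, and samples the model finds ambiguous end up with softened targets, which improves confidence ranking for selection.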

SelectiveNet#

Best for: Learning a dedicated selection function

Adds a selection head g(x) that outputs a selection probability:

from incerto.sp import SelectiveNet

snet = SelectiveNet(backbone, num_classes=10, num_features=128)

# Training loop — use the SelectiveNet loss
for inputs, labels in train_loader:
    optimizer.zero_grad()
    logits, selection = snet(inputs, return_confidence=True)
    loss = snet.selective_loss(logits, labels, selection, coverage_target=0.8)
    loss.backward()
    optimizer.step()

# Inference — confidence comes from the selection head g(x)
logits, confidence = snet(test_data, return_confidence=True)
rejected = snet.reject(confidence, threshold=0.5)
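
The SelectiveNet objective (Geifman &amp; El-Yaniv, 2019) is the selection-weighted empirical risk plus a quadratic penalty when empirical coverage falls below the target. A numerical sketch (plain Python; function and parameter names are illustrative, not the incerto API; lam=32 follows the paper):

```python
def selectivenet_objective(losses, selection, coverage_target, lam=32.0):
    """Selection-weighted risk plus a coverage-shortfall penalty.

    losses: per-sample task losses; selection: g(x) scores in [0, 1].
    """
    n = len(losses)
    emp_coverage = sum(selection) / n
    sel_risk = sum(l * s for l, s in zip(losses, selection)) / (emp_coverage * n)
    penalty = lam * max(0.0, coverage_target - emp_coverage) ** 2
    return sel_risk + penalty

# Rejecting the one hard sample lowers the risk term but pays a small
# coverage penalty — here a much better trade than accepting everything.
losses = [0.1, 0.2, 2.0, 0.1]
partial = selectivenet_objective(losses, [1.0, 1.0, 0.0, 1.0], coverage_target=0.8)
full = selectivenet_objective(losses, [1.0, 1.0, 1.0, 1.0], coverage_target=0.8)
```

The penalty is what keeps the selection head from collapsing to rejecting everything: below the target coverage, the quadratic term grows faster than the risk shrinks.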

Complete Workflow#

import torch
from incerto.sp import SoftmaxThreshold, coverage, risk, aurc

# 1. Train model normally
model = train_model(train_loader)

# 2. Wrap with selective predictor
selector = SoftmaxThreshold(model)
selector.eval()

# 3. Get predictions and confidence on validation set
with torch.no_grad():
    logits, confidence = selector(val_data, return_confidence=True)
predictions = logits.argmax(dim=-1)

# 4. Evaluate at different thresholds
for threshold in [0.7, 0.8, 0.9, 0.95]:
    rejected = selector.reject(confidence, threshold)
    selected = ~rejected

    cov = coverage(rejected)
    sel_acc = (predictions[selected] == val_labels[selected]).float().mean()

    print(f"Threshold {threshold}: coverage={cov:.2%}, accuracy={sel_acc:.2%}")

# 5. Compute AURC
sorted_conf, idx = confidence.sort(descending=True)
sorted_errors = (predictions[idx] != val_labels[idx]).float()
score = aurc(sorted_conf, sorted_errors)
print(f"AURC: {score:.4f}")

Metrics#

Coverage-Risk Curve:

Plot risk vs. coverage across thresholds

import matplotlib.pyplot as plt
from incerto.sp import plot_risk_coverage

fig, ax = plt.subplots()
plot_risk_coverage(logits, labels, confidence, ax=ax, show_aurc=True)

Area Under Risk-Coverage Curve (AURC):

Lower is better (perfect = 0)

from incerto.sp import aurc

sorted_conf, idx = confidence.sort(descending=True)
sorted_errors = (predictions[idx] != labels[idx]).float()
score = aurc(sorted_conf, sorted_errors)

Best Practices#

  1. Tune threshold on validation data

    Never use test data for threshold selection

  2. Consider deployment constraints

    What coverage rate is acceptable?

  3. Combine with calibration

    Calibrated models have better selection

  4. Monitor in production

    Track coverage and accuracy over time

  5. Plan for abstention

    What happens when the model abstains? (Human review, fallback model, etc.)
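
The fallback step in item 5 is worth wiring up explicitly. A minimal dispatch sketch (the route names, helper name, and threshold are application-specific placeholders, not part of incerto):

```python
def handle_prediction(prediction, confidence, threshold=0.9):
    """Accept automatically above the threshold, otherwise escalate."""
    if confidence >= threshold:
        return ("auto", prediction)
    return ("human_review", None)  # e.g. enqueue for expert labeling
```

Whatever the fallback is, making it an explicit code path keeps abstentions observable, so coverage and escalation rates can be monitored in production.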

Trade-offs#

High threshold (e.g., 0.95):
  • Lower coverage (~70%)

  • Higher accuracy on accepted samples

  • More abstentions

Low threshold (e.g., 0.7):
  • Higher coverage (~95%)

  • Lower accuracy on accepted samples

  • Fewer abstentions

Choose based on:
  • Cost of errors vs. cost of abstention

  • Availability of fallback (human expert, simpler model)

  • Application requirements

References#

  1. Chow, “An optimum character recognition system using decision functions” (1957)

  2. Geifman & El-Yaniv, “Selective Classification for Deep Neural Networks” (NeurIPS 2017)

  3. Geifman & El-Yaniv, “SelectiveNet: A Deep Neural Network with a Rejection Option” (ICML 2019)

  4. Ziyin et al., “Deep Gamblers: Learning to Abstain with Portfolio Theory” (NeurIPS 2019)

  5. Huang et al., “Self-Adaptive Training: beyond Empirical Risk Minimization” (NeurIPS 2020)

See Also#