Selective Prediction Guide#
Selective prediction (also called prediction with rejection) allows models to abstain from uncertain predictions, improving accuracy on predictions that are made.
Why Selective Prediction#
Key idea: Don’t predict when uncertain
- Benefits:
  - Higher accuracy on accepted predictions
  - Explicit uncertainty communication
  - Safer deployment in critical applications
- Trade-off:
  - Coverage: fraction of samples on which a prediction is made
  - Risk: error rate on the predictions that are made
Goal: Maximize accuracy while maintaining acceptable coverage.
Basic Concepts#
- Selective classifier:
  A pair (f, g), where f makes predictions and g decides whether to abstain (g(x) = 1 means predict, g(x) = 0 means abstain)
- Coverage:
  φ = P(g(x) = 1), the fraction of samples on which a prediction is made
- Selective risk:
  R_φ = E[ℓ(f(x), y) | g(x) = 1], the error rate on accepted samples
- Selective accuracy:
  Accuracy on the accepted samples; for 0-1 loss this equals 1 − R_φ

The short sketch below computes these quantities for a toy batch.
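As a concrete illustration (plain PyTorch, no incerto calls assumed), the snippet computes coverage and selective risk from a batch of predictions, labels, and an abstention mask:

import torch

# Toy batch: 6 predictions, their labels, and an abstention mask
preds = torch.tensor([2, 0, 1, 1, 2, 0])
labels = torch.tensor([2, 0, 1, 0, 2, 1])
abstain = torch.tensor([False, False, False, True, False, True])

accepted = ~abstain
coverage = accepted.float().mean()  # φ = 4/6 ≈ 0.67
selective_risk = (preds[accepted] != labels[accepted]).float().mean()  # errors among accepted: 0/4
selective_accuracy = 1.0 - selective_risk

print(f"Coverage: {coverage:.2%}, selective risk: {selective_risk:.2%}")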
Methods#
Softmax Threshold#
Best for: Simple baseline, post-hoc application
Abstain when max softmax probability < threshold:
from incerto.sp import SoftmaxThreshold

selector = SoftmaxThreshold(threshold=0.9)

# Make predictions
logits = model(test_data)
predictions, abstention = selector.predict(logits)
# abstention[i] is True where the model abstains; index with ~abstention to keep accepted samples

# Evaluate on the accepted samples (labels holds the ground truth for test_data)
coverage = (~abstention).float().mean()
accepted_preds = predictions[~abstention]
accepted_labels = labels[~abstention]
accuracy = (accepted_preds == accepted_labels).float().mean()

print(f"Coverage: {coverage:.2%}")
print(f"Selective accuracy: {accuracy:.2%}")
- Advantages:
  - Simple and fast
  - No retraining needed
  - Interpretable
- Disadvantages:
  - Threshold requires tuning on held-out data
  - Softmax confidence can be overconfident, so the resulting accept/reject rule may be far from optimal
Monte Carlo Dropout#
Best for: Bayesian uncertainty estimates
Use dropout uncertainty for selection:
from incerto.bayesian import MCDropout
from incerto.sp import UncertaintyBasedSelection

# Enable dropout during inference
mc_dropout = MCDropout(model, n_samples=10)

# Get predictions with uncertainty
result = mc_dropout.predict(test_data)

# Select based on epistemic uncertainty
selector = UncertaintyBasedSelection(threshold=0.1)
predictions, abstention = selector.predict(
    result['mean'],
    result['epistemic'],
)
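To make the epistemic score less of a black box, one common choice is the BALD-style mutual information between the predictive distribution and the stochastic forward passes. The sketch below computes it directly from stacked MC softmax samples; it is an assumption that this resembles what result['epistemic'] contains, and mc_probs is a hypothetical tensor built for illustration:

import torch

def mutual_information(mc_probs: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """BALD-style epistemic score from MC softmax samples of shape [S, N, C]."""
    mean_probs = mc_probs.mean(dim=0)                                        # [N, C] predictive distribution
    total = -(mean_probs * (mean_probs + eps).log()).sum(dim=-1)             # entropy of the mean
    expected = -(mc_probs * (mc_probs + eps).log()).sum(dim=-1).mean(dim=0)  # mean entropy per sample
    return total - expected  # high values = the MC passes disagree

# Hypothetical example: 10 MC passes, 4 inputs, 3 classes
mc_probs = torch.softmax(torch.randn(10, 4, 3), dim=-1)
scores = mutual_information(mc_probs)
abstention = scores > 0.1  # same spirit as UncertaintyBasedSelection(threshold=0.1)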
Entropy-Based Selection#
Best for: Information-theoretic approach
Abstain on high-entropy predictions:
from incerto.sp import EntropyThreshold
selector = EntropyThreshold(threshold=0.5)
logits = model(test_data)
predictions, abstention = selector.predict(logits)
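For reference, the quantity being thresholded here is the Shannon entropy of the softmax distribution, measured in nats (its maximum is ln(num_classes)). A minimal hand-rolled version, assuming EntropyThreshold behaves roughly like this, would be:

import torch

probs = torch.softmax(logits, dim=-1)
entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)  # per-sample entropy in nats
manual_abstention = entropy > 0.5   # mirrors EntropyThreshold(threshold=0.5)
manual_predictions = probs.argmax(dim=-1)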
Self-Adaptive Training (SAT)#
Best for: Training models for selective prediction
Train model to output selection scores:
from incerto.sp import SelfAdaptiveTraining

model_with_selection = SelfAdaptiveTraining(
    base_model,
    num_classes=10,
)

# Training loop
for inputs, labels in train_loader:
    optimizer.zero_grad()
    logits, selection_scores = model_with_selection(inputs)

    # SAT loss combines classification and selection
    loss = model_with_selection.sat_loss(
        logits,
        selection_scores,
        labels,
        coverage=0.8,  # Target 80% coverage
    )
    loss.backward()
    optimizer.step()

# Inference with learned selection
logits, selection_scores = model_with_selection(test_data)
predictions = logits.argmax(dim=-1)
abstention = selection_scores < threshold  # threshold chosen on validation data to hit the target coverage
Complete Workflow#
import torch
from incerto.sp import SoftmaxThreshold, selective_risk, coverage_rate
# 1. Train model normally
model = train_model(train_loader)
# 2. Choose selection strategy
selector = SoftmaxThreshold(threshold=0.95)
# 3. Evaluate on validation set to choose threshold
val_logits, val_labels = get_predictions(model, val_loader)
predictions, abstention = selector.predict(val_logits)
coverage = (~abstention).float().mean()
accepted = ~abstention
selective_acc = (predictions[accepted] == val_labels[accepted]).float().mean()
print(f"Coverage: {coverage:.2%}")
print(f"Selective accuracy: {selective_acc:.2%}")
# 4. Adjust threshold to achieve desired coverage
# Try different thresholds...
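# Illustrative sketch (not part of the incerto API): pick the threshold as the
# validation-confidence quantile that yields a target coverage, e.g. 80%.
target_coverage = 0.80
val_conf = torch.softmax(val_logits, dim=-1).max(dim=-1).values
chosen_threshold = torch.quantile(val_conf, 1 - target_coverage).item()
selector = SoftmaxThreshold(threshold=chosen_threshold)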
# 5. Deploy with chosen threshold
def predict_with_abstention(x):
    logits = model(x)
    pred, abstain = selector.predict(logits)
    if abstain.item():  # assumes x is a single sample
        return None  # Abstain - defer to human
    return pred
Metrics#
- Coverage-Risk Curve:
  Plot selective risk (error rate on accepted samples) against coverage
from incerto.sp import SoftmaxThreshold, plot_coverage_risk_curve, selective_risk

thresholds = [0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99]
coverages, risks = [], []

for threshold in thresholds:
    selector = SoftmaxThreshold(threshold=threshold)
    preds, abstention = selector.predict(logits)
    coverage = (~abstention).float().mean()
    risk = selective_risk(preds[~abstention], labels[~abstention])
    coverages.append(coverage.item())
    risks.append(risk.item())

# Plot
import matplotlib.pyplot as plt
plt.plot(coverages, risks)
plt.xlabel('Coverage')
plt.ylabel('Selective Risk')
plt.title('Coverage-Risk Curve')
plt.show()
- Area Under Risk-Coverage Curve (AURC):
Lower is better (perfect = 0)
from incerto.sp import aurc
score = aurc(coverages, risks)
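If you want to see the arithmetic behind this number, AURC can be approximated from the sweep above with a trapezoidal integral over the coverage axis (a sketch only; incerto's aurc may use a different estimator, such as averaging risk over all confidence-ordered coverage levels):

import torch

cov = torch.tensor(coverages)
rsk = torch.tensor(risks)
order = torch.argsort(cov)  # integrate in order of increasing coverage
approx_aurc = torch.trapz(rsk[order], cov[order])
print(f"Approximate AURC: {approx_aurc:.4f}")  # lower is better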
Best Practices#
- Tune the threshold on validation data
  - Never use test data for threshold selection
- Consider deployment constraints
  - What coverage rate is acceptable?
- Combine with calibration
  - Well-calibrated confidence scores make threshold-based selection more reliable
- Monitor in production
  - Track coverage and selective accuracy over time
- Plan for abstention
  - Decide what happens when the model abstains (human review, fallback model, etc.)
- Use multiple signals
  - Combine softmax confidence, entropy, and MC dropout scores for better selection (see the sketch after this list)
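As a small illustration of the last point, the sketch below accepts a sample only when two cheap signals agree (high max-softmax confidence and low entropy); the 0.9 and 0.5 cut-offs are placeholders to tune on validation data, not library defaults:

import torch

probs = torch.softmax(logits, dim=-1)
confidence = probs.max(dim=-1).values
entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)

# Accept only when both signals agree the sample is "easy"
accept = (confidence >= 0.9) & (entropy <= 0.5)
abstention = ~accept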
Trade-offs#
- High threshold (e.g., 0.95):
  - Lower coverage (e.g., ~70%, depending on the model and data)
  - Higher accuracy on accepted samples
  - More abstentions
- Low threshold (e.g., 0.7):
  - Higher coverage (e.g., ~95%)
  - Lower accuracy on accepted samples
  - Fewer abstentions
- Choose based on:
  - Cost of errors vs. cost of abstention
  - Availability of a fallback (human expert, simpler model)
  - Application requirements
References#
Geifman & El-Yaniv, “Selective Classification for Deep Neural Networks” (NeurIPS 2017)
Geifman & El-Yaniv, “SelectiveNet: A Deep Neural Network with a Rejection Option” (ICML 2019)
Mozannar & Sontag, “Consistent Estimators for Learning to Defer” (NeurIPS 2020)
See Also#
- Selective Prediction - Complete API reference
- Calibration Guide - Calibration for better confidence
- Conformal Prediction Guide - Prediction sets with guarantees
- Out-of-Distribution Detection Guide - OOD detection