Active Learning Guide#
Active learning reduces labeling costs by strategically selecting which samples to label: instead of sampling at random, it queries the most informative examples.
Why Active Learning#
- Labeling is expensive:
  - Medical image annotation requires expert radiologists
  - NLP tasks need careful human review
  - Robotics needs real-world interaction
Active learning can often achieve the same performance with 10-100x less labeled data.
Core Idea#
1. Train a model on a small labeled set
2. Query strategy: select the most informative unlabeled samples
3. Get labels for the selected samples (human annotation)
4. Add them to the training set and retrain
5. Repeat until the budget is exhausted or performance is adequate
Acquisition Functions#
Uncertainty Sampling#
Best for: Starting point, simple and effective
Query samples where the model is most uncertain:
from incerto.active import entropy_acquisition, UncertaintySampling
strategy = UncertaintySampling(
    model,
    acquisition_fn=entropy_acquisition,
)
# Query most uncertain samples
query_indices = strategy.query(
    unlabeled_pool,
    n_samples=100,
)
# Label these samples
labeled_samples = label_samples(unlabeled_pool[query_indices])
- Variants (the sketch below shows how each score is computed from the softmax probabilities):
  - Least confidence: query samples with the lowest maximum class probability
  - Margin sampling: query samples with the smallest difference between the top-2 class probabilities
  - Entropy: query samples with the highest predictive entropy
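As a rough sketch, the three quantities behind these variants can be computed directly from the softmax probabilities with plain PyTorch; `model` and `unlabeled_data` are placeholders, as elsewhere on this page:
import torch
import torch.nn.functional as F
logits = model(unlabeled_data)                      # (N, C) class logits
probs = F.softmax(logits, dim=-1)
# Least confidence: 1 - max class probability (higher = more uncertain)
least_conf_scores = 1.0 - probs.max(dim=-1).values
# Margin: gap between the two largest probabilities (smaller = more uncertain)
top2 = probs.topk(2, dim=-1).values
margin_scores = top2[:, 0] - top2[:, 1]
# Entropy: -sum_c p_c log p_c (higher = more uncertain)
entropy_scores = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)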
Entropy Acquisition#
import torch
import torch.nn.functional as F
from incerto.active import entropy_acquisition
# Compute entropy for each sample
logits = model(unlabeled_data)
probs = F.softmax(logits, dim=-1)
entropy_scores = entropy_acquisition(probs)
# Higher entropy = more uncertain = higher priority
k = 100  # number of samples to query
top_k_indices = torch.argsort(entropy_scores, descending=True)[:k]
Least Confidence#
import torch
import torch.nn.functional as F
from incerto.active import least_confidence_acquisition
logits = model(unlabeled_data)
probs = F.softmax(logits, dim=-1)
confidence_scores = least_confidence_acquisition(probs)
# Lower confidence = higher priority
k = 100  # number of samples to query
top_k_indices = torch.argsort(confidence_scores)[:k]
Margin Sampling#
import torch
import torch.nn.functional as F
from incerto.active import margin_acquisition
logits = model(unlabeled_data)
probs = F.softmax(logits, dim=-1)
margin_scores = margin_acquisition(probs)
# Smaller margin = more uncertain = higher priority
k = 100  # number of samples to query
top_k_indices = torch.argsort(margin_scores)[:k]
BALD (Bayesian Active Learning by Disagreement)#
Best for: When using Bayesian methods (MC Dropout, ensembles)
Query samples with the highest mutual information between the model's prediction and its parameters:
from incerto.active import BALDAcquisition
from incerto.bayesian import MCDropout
# Use MC Dropout for Bayesian uncertainty
mc_dropout = MCDropout(model, n_samples=10)
strategy = BALDAcquisition(mc_dropout)
query_indices = strategy.query(unlabeled_pool, n_samples=100)
Intuition: query where predictions under different plausible model weights disagree most (high epistemic uncertainty).
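As a minimal sketch of the underlying score (not necessarily how BALDAcquisition is implemented internally), BALD is the predictive entropy minus the expected entropy over stochastic forward passes; here `probs_mc` is an assumed (T, N, C) tensor of softmax probabilities from T MC Dropout passes:
import torch
# probs_mc: (T, N, C) softmax probabilities from T stochastic forward passes
mean_probs = probs_mc.mean(dim=0)                                                    # (N, C)
predictive_entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
expected_entropy = -(probs_mc * probs_mc.clamp_min(1e-12).log()).sum(dim=-1).mean(dim=0)
bald_scores = predictive_entropy - expected_entropy  # mutual information per sample
query_indices = torch.argsort(bald_scores, descending=True)[:100]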
- Advantages:
  - Theoretically motivated
  - Considers model (epistemic) uncertainty
  - Often outperforms plain entropy
- Disadvantages:
  - Requires a Bayesian model
  - More computationally expensive
Reference: Houlsby et al., “Bayesian Active Learning for Classification and Preference Learning” (2011)
Complete Active Learning Loop#
import torch
from incerto.active import UncertaintySampling, entropy_acquisition
# Initial setup
labeled_data = initial_labeled_set # Small labeled set
unlabeled_pool = large_unlabeled_set
budget = 1000 # Number of labels we can afford
# Initial model
model = train_model(labeled_data)
# Active learning loop
n_queries = budget // 100 # Query 100 samples at a time
for round_idx in range(n_queries):
    print(f"Round {round_idx + 1}/{n_queries}")
    # 1. Select samples to label
    strategy = UncertaintySampling(
        model,
        acquisition_fn=entropy_acquisition,
    )
    query_indices = strategy.query(
        unlabeled_pool,
        n_samples=100,
    )
    # 2. Get labels (human annotation or oracle)
    query_samples = unlabeled_pool[query_indices]
    query_labels = get_labels(query_samples)  # Human labeling
    # 3. Add to labeled set
    labeled_data.add(query_samples, query_labels)
    # 4. Remove from unlabeled pool
    unlabeled_pool.remove(query_indices)
    # 5. Retrain model
    model = train_model(labeled_data)
    # 6. Evaluate
    accuracy = evaluate(model, test_set)
    print(f"Accuracy: {accuracy:.2%}")
    print(f"Labeled samples: {len(labeled_data)}")
Practical Tips#
- Batch mode:
Query multiple samples at once for efficiency
# Query 100 samples in batch
query_indices = strategy.query(unlabeled_pool, n_samples=100)
- Diversity:
Combine uncertainty with diversity to avoid querying similar samples
from incerto.active import diverse_batch_query
# Select diverse AND uncertain samples
query_indices = diverse_batch_query(
model,
unlabeled_pool,
n_samples=100,
diversity_weight=0.5
)
- Cold start:
Begin with random sampling or stratified sampling
# Initial random sample
initial_size = 100
initial_indices = torch.randperm(len(dataset))[:initial_size]
labeled_data = dataset[initial_indices]
- Stopping criteria:
Stop when performance plateaus or budget exhausted
if accuracy > target_accuracy:
    print("Target accuracy reached!")
    break
if len(labeled_data) >= max_budget:
    print("Budget exhausted!")
    break
Evaluation#
- Learning curve:
Plot accuracy vs. number of labeled samples
import matplotlib.pyplot as plt
plt.plot(n_labeled_samples, accuracies, label='Active')
plt.plot(n_labeled_samples, random_accuracies, label='Random')
plt.xlabel('Number of Labeled Samples')
plt.ylabel('Test Accuracy')
plt.legend()
- Area Under Learning Curve (AULC):
Higher is better
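One simple way to compute it, assuming the n_labeled_samples and accuracies lists from the learning-curve plot above, is trapezoidal integration normalised by the labeled-sample range:
import torch
sizes = torch.tensor(n_labeled_samples, dtype=torch.float)
accs = torch.tensor(accuracies, dtype=torch.float)
# Trapezoidal area under the learning curve, normalised so the result lies roughly in [0, 1]
aulc = torch.trapz(accs, sizes) / (sizes[-1] - sizes[0])
print(f"AULC: {aulc.item():.3f}")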
- Reduction ratio:
How much data saved to reach target accuracy
# E.g., active learning reaches 95% accuracy with 1000 samples
# Random sampling needs 5000 samples for same accuracy
# Reduction ratio = 5000 / 1000 = 5x
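A small helper for computing the ratio from two learning curves; this is a sketch, assuming the active and random runs were evaluated at the same labeled-set sizes as in the plot above:
def labels_to_reach(sizes, accs, target):
    """Smallest labeled-set size at which a learning curve reaches the target accuracy."""
    for n, acc in zip(sizes, accs):
        if acc >= target:
            return n
    return None  # target never reached
target_accuracy = 0.95
n_active = labels_to_reach(n_labeled_samples, accuracies, target_accuracy)
n_random = labels_to_reach(n_labeled_samples, random_accuracies, target_accuracy)
if n_active is not None and n_random is not None:
    print(f"Reduction ratio: {n_random / n_active:.1f}x")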
Best Practices#
- Start with uncertainty sampling
Simple, effective baseline
- Use batch queries
Query 50-100 samples at a time for efficiency
- Consider diversity
Prevent querying redundant samples
- Retrain frequently
Model needs to adapt to new labels
- Use Bayesian methods when possible
BALD often outperforms simple uncertainty
- Compare to random baseline
Always benchmark against random sampling (a minimal sketch follows this list)
- Monitor labeling quality
Human labels may be noisy or biased
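For the random baseline, only the acquisition step changes; this sketch assumes the same unlabeled_pool as in the loop above:
import torch
# Random baseline: draw the query batch uniformly from the pool
query_indices = torch.randperm(len(unlabeled_pool))[:100]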
Common Pitfalls#
- ❌ Querying only hardest samples
Can lead to noisy/outlier labels
- ❌ Not using diversity
Queries may be redundant
- ❌ Infrequent retraining
Model doesn’t benefit from new labels
- ❌ Wrong initial set
Cold start matters - use stratified sampling
- ❌ Ignoring label noise
Uncertain samples may have unreliable labels
Advanced Topics#
- Query by committee:
Use ensemble disagreement instead of single-model uncertainty (see the sketch after this list)
- Expected model change:
Query samples that change model most
- Expected error reduction:
Query samples that reduce expected error most
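A minimal sketch of query-by-committee using vote entropy; `committee` here is an assumed list of independently trained models, not an incerto API:
import torch
import torch.nn.functional as F
def vote_entropy(committee, unlabeled_data, n_classes):
    # Each committee member casts a hard vote for every sample
    votes = torch.stack([m(unlabeled_data).argmax(dim=-1) for m in committee])    # (M, N)
    vote_probs = F.one_hot(votes, n_classes).float().sum(dim=0) / len(committee)  # (N, C)
    # High vote entropy = committee disagrees = informative sample
    return -(vote_probs * vote_probs.clamp_min(1e-12).log()).sum(dim=-1)
scores = vote_entropy(committee, unlabeled_data, n_classes=10)
query_indices = torch.argsort(scores, descending=True)[:100]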
References#
Settles, “Active Learning Literature Survey” (2009)
Houlsby et al., “Bayesian Active Learning for Classification and Preference Learning” (2011)
Gal et al., “Deep Bayesian Active Learning with Image Data” (ICML 2017)
Ash et al., “Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds” (ICLR 2020)
See Also#
Active Learning - Complete API reference
Bayesian Deep Learning Guide - Bayesian uncertainty for BALD
Selective Prediction Guide - Selective prediction