Skelf-Research / open source

Intelligent pairwise comparisons. Better rankings with fewer votes.

Compere is a Python library and FastAPI service that picks which pair to compare next using a Multi-Armed Bandit, then turns the verdicts into Elo ratings you can read. Built for RLHF data collection, A/B testing, leaderboards, and eval ranking.

What it does

Two algorithms, one honest answer.

Pair selection · UCB1

UCB(i) = win_rate(i) + c · √(ln N / n_i)

c = 1.414 (default), N = total comparisons, n_i = comparisons involving entity i. New entities receive a large initial weight so they get surveyed before they get scored.
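
A minimal Python sketch of this score, for intuition rather than as compere's actual code. The function names, the layout of `stats`, and the rule of pairing the two highest-scoring entities are assumptions made for illustration; the text above only specifies the per-entity formula and the large initial weight for new entities.

```python
import math

def ucb_score(wins, n_i, total, c=1.414):
    """UCB(i) = win_rate(i) + c * sqrt(ln N / n_i)."""
    if n_i == 0:
        # Assumption: model the "large initial weight" as +infinity so
        # entities with no comparisons are always surveyed first.
        return float("inf")
    return wins / n_i + c * math.sqrt(math.log(total) / n_i)

def pick_pair(stats, c=1.414):
    """stats maps entity id -> (wins, comparisons involving that entity).

    Returns the two ids with the highest UCB scores. Pairing the top two
    is a hypothetical selection rule, not necessarily what compere does.
    """
    # Each comparison involves two entities, so N is half the summed counts.
    # max(1, ...) guards against log(0) on inconsistent input.
    total = max(1, sum(n for _, n in stats.values()) // 2)
    ranked = sorted(stats, key=lambda e: ucb_score(*stats[e], total, c), reverse=True)
    return ranked[0], ranked[1]

stats = {"a": (8, 10), "b": (3, 10), "c": (1, 2), "d": (0, 0)}
pair = pick_pair(stats)
print(pair)
```

Here "d" has never been compared and "c" has only two comparisons, so both outrank the well-sampled "a" despite its higher win rate: the exploration term dominates until an entity has been seen often enough.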

Rating update · Elo

new_rating = old_rating + K · (actual − expected)
expected = 1 / (1 + 10^((opp − rating) / 400))

K = 32 by default. Initial rating 1500. The same Elo formulation you know from chess; nothing fancier is claimed.
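
The update is small enough to check by hand. The sketch below assumes actual is 1 for the selected entity and 0 for the other, as in chess; the function names are illustrative and are not compere's API.

```python
def expected_score(rating, opp):
    """expected = 1 / (1 + 10^((opp - rating) / 400))."""
    return 1.0 / (1.0 + 10 ** ((opp - rating) / 400))

def elo_update(rating, opp, actual, k=32):
    """new_rating = old_rating + K * (actual - expected)."""
    return rating + k * (actual - expected_score(rating, opp))

# Two new entities start at 1500; the first one is selected.
winner_before, loser_before = 1500.0, 1500.0
winner_after = elo_update(winner_before, loser_before, actual=1.0)
loser_after = elo_update(loser_before, winner_before, actual=0.0)
print(winner_after, loser_after)  # 1516.0 1484.0
```

With equal ratings the expected score is 0.5, so the winner gains K/2 = 16 points and the loser gives up the same amount. An upset against a higher-rated opponent moves both ratings further, because expected is then further from actual.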

Note on the field: compere does not implement Bradley-Terry, Thurstone, or TrueSkill. UCB plus Elo was chosen because both are interpretable end-to-end: you can explain a rating change to a stakeholder in two sentences.

Who uses it

Eval & RLHF teams

Collect preference data over model outputs without showing annotators every pair. UCB concentrates votes on uncertain comparisons; you ship a reward signal sooner.

A/B and content ranking ops

Rank designs, headlines, or product photos against each other. The Elo board is sortable, replayable, and an honest function of the votes it received.

Taste-graph builders

Turn “A or B?” clicks into a ranked catalog. SQLite by default; PostgreSQL when you outgrow it; same code on either.

Researchers and tinkerers

Import compere as a library and call create_entity / create_comparison / get_ratings directly. No server needed for offline studies.

Quick start

Install, run, vote.

pip install compere
compere --port 8090

# get the next pair to compare (UCB picks it)
curl localhost:8090/mab/next_comparison

# record a verdict
curl -X POST localhost:8090/comparisons/ \
  -H "Content-Type: application/json" \
  -d '{"entity1_id":1,"entity2_id":2,"selected_entity_id":1}'

# read the leaderboard
curl localhost:8090/ratings

Interactive API docs are served at /docs by FastAPI. The full HTTP surface is the eleven endpoints listed in the API reference.
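
The same three calls can be made from Python with only the standard library. This sketch builds the requests from the curl commands above; the helper names are hypothetical, and it assumes the server answers with JSON without assuming any particular response fields.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8090"  # port from the quick start above

def next_pair_request(base_url=BASE_URL):
    """GET the next pair to compare."""
    return urllib.request.Request(f"{base_url}/mab/next_comparison", method="GET")

def vote_request(entity1_id, entity2_id, selected_entity_id, base_url=BASE_URL):
    """POST a verdict, using the same JSON body as the curl example."""
    body = json.dumps({
        "entity1_id": entity1_id,
        "entity2_id": entity2_id,
        "selected_entity_id": selected_entity_id,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/comparisons/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ratings_request(base_url=BASE_URL):
    """GET the leaderboard."""
    return urllib.request.Request(f"{base_url}/ratings", method="GET")

def send(request):
    """Send a request and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `send(next_pair_request())` fetches a pair, `send(vote_request(1, 2, 1))` records a verdict, and `send(ratings_request())` reads the board. Check /docs for the actual response schemas before relying on specific fields.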

How compere compares

Honest, narrow comparisons against tools that overlap on one axis or another.