How the F3K Pilot Rating System Works

The Idea

Every F3K pilot has a single number — their rating — that reflects how strong they are right now. Win against tough opponents, your rating climbs. Lose to weaker pilots, it drops. Stop competing for a while, it gradually fades.

The system borrows from chess ELO but adapts it for F3K's unique structure: multiple pilots competing simultaneously in a flight group, where seconds of flight time determine who outperformed whom.


The Match Unit: Flight Groups

In F3K competitions, pilots are assigned to flight groups. Everyone in a group flies the same task, at the same time, under the same weather conditions. This is what makes the rating system fair — we never compare pilots who flew in different conditions.

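Each group becomes one "match" for the rating system. A typical competition has 7–14 rounds with 2–4 groups each, so a weekend event produces roughly 14–56 group matches in total. Each pilot flies in one group per round, which gives every pilot 7–14 rating updates per competition.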


How a Single Group Updates Ratings

When a group finishes flying, the system compares every pilot against every other pilot in that group. For a group of 8 pilots, that's 28 pairwise matchups in total, 7 for each pilot.

Worked Example

Four pilots compete in a Task B (Best 2 flights, 4 min max, 10 min working time):

Pilot   Flight Times   Total (seconds)   Rating Before
Alice   3:58 + 3:55    473               1650
Bob     3:52 + 3:47    459               1580
Carol   3:41 + 3:38    439               1520
Dave    3:10 + 2:55    365               1490

For each pair, the system computes an expected score and an actual score, then combines them into a rating change for each pilot:

1. Expected Score — what should happen based on current ratings:

E = 1 / (1 + 10^((opponent_rating - your_rating) / 400))

Alice vs Bob: E = 1 / (1 + 10^((1580-1650)/400)) = 0.60
Alice vs Carol: E = 1 / (1 + 10^((1520-1650)/400)) = 0.68
Alice vs Dave: E = 1 / (1 + 10^((1490-1650)/400)) = 0.72

Alice is expected to beat all three — she's the highest rated.

2. Actual Score — what did happen, with margin of victory:

S = 1 / (1 + e^(-(your_points - their_points) / 50))

Alice vs Bob (473 - 459 = +14): S = 0.57 (slight win)
Alice vs Carol (473 - 439 = +34): S = 0.66 (solid win)
Alice vs Dave (473 - 365 = +108): S = 0.90 (dominant win)

3. Rating Change:

delta = K × (S_total - E_total) / sqrt(N - 1)

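Here K is the K-factor (24 in this example), N is the number of pilots in the group (4), and S_total and E_total are the pilot's actual and expected scores summed over all opponents.

Alice's totals: S_total = 0.57 + 0.66 + 0.90 = 2.13, E_total = 0.60 + 0.68 + 0.72 = 2.00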

delta = 24 × (2.13 - 2.00) / sqrt(3) = 24 × 0.13 / 1.73 = +1.8

Alice performed slightly better than expected — she gains 1.8 points. Her rating moves from 1650 to 1651.8.

What about Dave?

Dave lost all three matchups, but the system considers how much he was expected to lose:

Dave vs Alice: E = 0.28, S = 0.10
Dave vs Bob: E = 0.37, S = 0.13
Dave vs Carol: E = 0.46, S = 0.19

S_total = 0.42, E_total = 1.11

delta = 24 × (0.42 - 1.11) / 1.73 = -9.6

Dave loses 9.6 points. He was expected to lose, but he lost by larger margins than his rating predicted.
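The three steps above can be sketched in Python for the whole group at once. This is a minimal illustration under stated assumptions, not the system's actual code: the function name group_deltas and the dictionary layout are hypothetical, and every pilot is given K = 24 as in the worked example.

```python
import math

def group_deltas(ratings, points, k=24.0, scale=50.0):
    """Rating change for every pilot in one flight group.

    ratings: {pilot: rating before the group}
    points:  {pilot: total seconds scored in the group}
    k and scale follow the worked example (K = 24, margin scale 50).
    """
    n = len(ratings)
    deltas = {}
    for me in ratings:
        s_total = 0.0
        e_total = 0.0
        for other in ratings:
            if other == me:
                continue
            # Expected score from the rating gap.
            e_total += 1.0 / (1.0 + 10 ** ((ratings[other] - ratings[me]) / 400))
            # Actual score from the margin in seconds.
            s_total += 1.0 / (1.0 + math.exp(-(points[me] - points[other]) / scale))
        # Group size normalization: divide by sqrt(N - 1).
        deltas[me] = k * (s_total - e_total) / math.sqrt(n - 1)
    return deltas

ratings = {"Alice": 1650, "Bob": 1580, "Carol": 1520, "Dave": 1490}
points = {"Alice": 473, "Bob": 459, "Carol": 439, "Dave": 365}

deltas = group_deltas(ratings, points)
for pilot, delta in deltas.items():
    print(f"{pilot}: {delta:+.1f}")
```

With unrounded intermediate values the sketch gives about +1.9 for Alice (the +1.8 above comes from rounding each S and E to two decimals first) and about -9.6 for Dave. Because all four pilots share the same K here, the four changes sum to zero.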


Margin of Victory Matters

The system doesn't treat a win as just a win. Beating someone by 200 seconds counts more than beating them by 5 seconds.

The margin sigmoid converts point differences into scores between 0 and 1:

Point Difference   Actual Score   Meaning
+200               0.98           Dominant win
+100               0.88           Strong win
+50                0.73           Clear win
+25                0.62           Moderate edge
+10                0.55           Slight edge
0                  0.50           Draw
-50                0.27           Clear loss
-200               0.02           Dominant loss

This means maxing out a flight (landing exactly at the target time) matters — every second counts toward your rating.
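The table values follow directly from the actual-score formula. The short Python sketch below recomputes them; the function name actual_score and the scale parameter name are illustrative assumptions, while the value 50 is the divisor from the formula.

```python
import math

def actual_score(your_points, their_points, scale=50.0):
    """Margin-aware score in (0, 1); 0.5 means the two pilots tied."""
    return 1.0 / (1.0 + math.exp(-(your_points - their_points) / scale))

# Recompute the table: point difference -> actual score.
for diff in (200, 100, 50, 25, 10, 0, -50, -200):
    print(f"{diff:+5d}  {actual_score(diff, 0):.2f}")
```

The two sides of a matchup always sum to 1, so a pilot who scores 0.73 against an opponent leaves that opponent with 0.27.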


Beating Stronger Pilots Rewards You More

The expected score is based on the rating gap between you and your opponent. When you beat someone rated much higher than you, your expected score was low — so the surprise factor is large.

Example: The Upset

Scenario: A 1500-rated pilot beats a 1700-rated pilot by 30 seconds.

Expected score against the 1700 pilot: E = 1 / (1 + 10^((1700-1500)/400)) = 0.24
Actual score (30-second margin): S = 1 / (1 + e^(-30/50)) = 0.65

Surprise = S - E = 0.65 - 0.24 = +0.41

Scenario: The same 1500-rated pilot beats a 1480-rated pilot by 30 seconds.

Expected score: E = 1 / (1 + 10^((1480-1500)/400)) = 0.53
Actual score: S = 0.65 (same margin, same score)

Surprise = 0.65 - 0.53 = +0.12

The same flight performance (winning by 30 seconds) generates roughly 3.4 times the rating gain (0.41 / 0.12) when it's against a pilot rated 200 points above you than when it's against one rated 20 points below you.

This is why competing in strong groups at major events is the fastest path to a high rating — and why dominating weak local fields produces only modest gains.
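The two upset scenarios can be checked with a few lines of Python. This is a sketch for illustration only; the helper names expected_score, actual_score, and surprise are assumptions and do not come from the rating system itself.

```python
import math

def expected_score(your_rating, opponent_rating):
    """Elo-style expected score from the rating gap."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - your_rating) / 400))

def actual_score(margin_seconds, scale=50.0):
    """Margin-aware actual score for a given winning (or losing) margin."""
    return 1.0 / (1.0 + math.exp(-margin_seconds / scale))

def surprise(your_rating, opponent_rating, margin_seconds):
    """How far the result exceeded (or fell short of) expectation."""
    return actual_score(margin_seconds) - expected_score(your_rating, opponent_rating)

upset = surprise(1500, 1700, 30)    # beating a much stronger pilot
routine = surprise(1500, 1480, 30)  # beating a slightly weaker pilot
print(f"vs 1700: {upset:+.2f}")
print(f"vs 1480: {routine:+.2f}")
print(f"ratio:   {upset / routine:.2f}")
```

With unrounded values the ratio comes out slightly above the figure obtained from the two-decimal numbers, but the conclusion is the same: the identical 30-second win is worth several times more against the stronger opponent.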


K-Factor: How Fast Ratings Move

New pilots need their ratings to adjust quickly to find their true level. Established pilots should have more stable ratings. The K-factor controls this:

Experience Level   Groups Played   K-Factor   Character
Provisional        0–15            48         Rating moves fast: finding your level
Intermediate       16–50           36         Settling in
Established        51+             24         Stable: only big performances shift it

A provisional pilot gaining the same "surprise" as an established pilot will move their rating twice as fast (48/24 = 2x).
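As a sketch, the K-factor lookup could be written as below. The function name k_factor and the parameter groups_played are hypothetical, and the sketch assumes the count passed in is the number of groups flown before the group being rated.

```python
def k_factor(groups_played: int) -> int:
    """K-factor by experience level, following the table above."""
    if groups_played <= 15:
        return 48  # Provisional
    if groups_played <= 50:
        return 36  # Intermediate
    return 24      # Established

# The same "surprise" moves a provisional rating twice as far.
surprise = 0.41
print(k_factor(3) * surprise)    # provisional pilot
print(k_factor(200) * surprise)  # established pilot
```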

What this looks like in practice

A new pilot enters their first competition rated 1500. In a group of 8 with average rating 1550, they score well, finishing 2nd. Their K-factor of 48 might give them +15 points per round. After a 10-round competition, they could jump from 1500 to about 1650.

An established pilot (200+ groups) in the same situation with K=24 would gain about +7 per round — moving from 1550 to 1620 over the same event.


Group Size Normalization

Groups range from 3 to 16 pilots. Without normalization, a 16-pilot group would move ratings far more than a 3-pilot group (because there are more pairwise comparisons). The system divides by sqrt(N-1) to keep the impact comparable:

Group Size   Normalization Factor   Effect
3 pilots     sqrt(2) = 1.41         Moderate per-group impact
6 pilots     sqrt(5) = 2.24         Slightly diluted per matchup
10 pilots    sqrt(9) = 3.00         Many matchups, each counts less
16 pilots    sqrt(15) = 3.87        Lots of information, gentle per-pair

Larger groups provide more statistical signal but each individual matchup within them has proportionally less weight.
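One way to see the effect is to compare the largest rating change a single group could produce with and without the divisor. Each pairwise S - E lies strictly between -1 and 1, so S_total - E_total can never exceed N - 1. The sketch below uses that bound; the function names are illustrative assumptions and K = 24 is taken from the worked example.

```python
import math

def normalization(group_size: int) -> float:
    """The sqrt(N - 1) divisor applied to each group's rating change."""
    return math.sqrt(group_size - 1)

def max_delta(k: float, group_size: int, normalize: bool = True) -> float:
    """Upper bound on |delta| for one group: k * (N - 1), optionally normalized."""
    matchups = group_size - 1
    divisor = normalization(group_size) if normalize else 1.0
    return k * matchups / divisor

for n in (3, 6, 10, 16):
    print(n, round(normalization(n), 2),
          round(max_delta(24, n, normalize=False), 1),
          round(max_delta(24, n), 1))
```

With the divisor, the bound grows with sqrt(N - 1) rather than with N - 1, so a 16-pilot group can still move a rating further than a 3-pilot group, but by a much smaller factor.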


Inactivity Decay

A rating should reflect current ability, not historical peak. If a pilot stops competing, their rating gradually decreases:

  • Grace period: 6 months of inactivity before any decay begins
  • Decay rate: 3 points per month after the grace period
  • Floor: Rating cannot fall below the midpoint between 1500 and your peak rating (1500 plus 50% of the distance from 1500 to your peak)

Decay Example

A pilot with a peak rating of 1800 stops competing:

Floor calculation:

floor = 1500 + (peak - 1500) × 0.5
     = 1500 + (1800 - 1500) × 0.5
     = 1650

After 6 months: no change (grace period)
After 12 months: 1800 - (6 × 3) = 1782
After 18 months: 1800 - (12 × 3) = 1764
After 24 months: 1800 - (18 × 3) = 1746
After 56 months: would calculate to 1650, which hits the floor and stops

The floor ensures that a pilot who was once world-class doesn't decay all the way to beginner level. Their history is acknowledged — but they still need to prove current form by competing.
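The decay rules can be sketched as a single function. The name decayed_rating and its parameters are hypothetical. The sketch also makes one assumption the rules above do not state: if a pilot's rating is already below the floor when they stop competing, decay leaves it unchanged rather than raising it to the floor.

```python
def decayed_rating(rating, peak, months_inactive,
                   grace_months=6, points_per_month=3.0, baseline=1500.0):
    """Rating after a period of inactivity, following the rules above."""
    floor = baseline + (peak - baseline) * 0.5
    decay_months = max(0, months_inactive - grace_months)
    decayed = rating - points_per_month * decay_months
    # Assumption: decay never lifts a rating that is already below the floor.
    return min(rating, max(floor, decayed))

# Reproduce the example: peak and current rating both 1800.
for months in (6, 12, 18, 24, 56):
    print(months, decayed_rating(1800, 1800, months))
```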

Decay is Real, Not Cosmetic

When a pilot returns after a break, their rating has already been reduced. They re-enter the system at their decayed rating and must earn their way back up through results. This ensures the leaderboard always reflects who is performing well now.


Ranked vs Unranked

To appear as a ranked pilot on the leaderboard, you need:

  • At least 30 group appearances (roughly 3–4 full competitions)
  • At least 4 distinct competitions

Until these thresholds are met, a pilot has a rating (visible in their profile and history) but doesn't receive a numbered rank on the leaderboard. This prevents a single strong weekend from placing someone at #1.
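Expressed as code, the eligibility check is a simple conjunction of the two thresholds. The function and parameter names below are illustrative assumptions.

```python
def is_ranked(group_appearances: int, distinct_competitions: int) -> bool:
    """True when a pilot qualifies for a numbered leaderboard rank."""
    return group_appearances >= 30 and distinct_competitions >= 4

# One very busy weekend is not enough: many groups, but a single competition.
print(is_ranked(group_appearances=14, distinct_competitions=1))
# Four competitions and 30+ groups meets both thresholds.
print(is_ranked(group_appearances=36, distinct_competitions=4))
```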


Round Batching

Within a competition round, pilots are split across multiple groups. The system processes all groups in a round using the same pre-round ratings, then applies all changes at once. This means:

  • Group A and Group B in the same round don't influence each other's calculations
  • Your round 3 performance is evaluated against ratings that include rounds 1 and 2
  • No pilot gets an advantage from the order groups are processed
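A minimal sketch of this batching, assuming a hypothetical process_round function and K = 24 for every pilot: ratings are frozen in a snapshot, every group's changes are computed from that snapshot, and the changes are applied in one step. The pilots Erin and Frank and their scores are made up for the illustration.

```python
import math

def group_deltas(ratings, points, k=24.0, scale=50.0):
    """Rating changes for one group, computed from the given ratings."""
    n = len(points)
    deltas = {}
    for me in points:
        surplus = 0.0
        for other in points:
            if other != me:
                expected = 1.0 / (1.0 + 10 ** ((ratings[other] - ratings[me]) / 400))
                actual = 1.0 / (1.0 + math.exp(-(points[me] - points[other]) / scale))
                surplus += actual - expected
        deltas[me] = k * surplus / math.sqrt(n - 1)
    return deltas

def process_round(ratings, groups):
    """Compute every group against pre-round ratings, then apply all changes."""
    snapshot = dict(ratings)  # frozen pre-round ratings
    pending = {}
    for points in groups:
        pending.update(group_deltas(snapshot, points))
    # Pilots who did not fly this round keep their rating.
    return {pilot: rating + pending.get(pilot, 0.0) for pilot, rating in ratings.items()}

ratings = {"Alice": 1650, "Bob": 1580, "Carol": 1520,
           "Dave": 1490, "Erin": 1500, "Frank": 1610}
group_a = {"Alice": 473, "Bob": 459, "Carol": 439}
group_b = {"Dave": 365, "Erin": 402, "Frank": 455}

after = process_round(ratings, [group_a, group_b])
print(after)
```

Swapping the order of group_a and group_b in the call gives the same result, which is the property the batching is meant to guarantee.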

Tie Dampening

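In some tasks (especially Task A, Last Flight), large portions of a group may achieve the maximum score. When over 80% of pilots tie at the top, the system reduces the K-factor:

If >80% tied at max: effective_K = K × max(0.3, 1.0 - tie_fraction)

A group where 90% tie: 1.0 - 0.90 = 0.10, raised to the 0.3 floor, so effective_K = K × 0.3 (minimal rating movement)
A group where 85% tie: 1.0 - 0.85 = 0.15, also raised to the 0.3 floor, so effective_K = K × 0.3

Because the rule only applies above 80%, 1.0 - tie_fraction is always below 0.2, so the 0.3 floor sets the multiplier in every dampened group.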

This prevents large rating swings from rounds with little meaningful differentiation.
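The dampening rule can be sketched as follows. The function name effective_k and its parameters are hypothetical, and the sketch assumes a pilot counts as "tied at max" when their score reaches the task's maximum score.

```python
def effective_k(k, scores, max_score, threshold=0.8, floor=0.3):
    """K-factor after tie dampening for one group."""
    tied = sum(1 for score in scores if score >= max_score)
    tie_fraction = tied / len(scores)
    if tie_fraction > threshold:
        return k * max(floor, 1.0 - tie_fraction)
    return k  # at or below the threshold: no dampening

# 9 of 10 pilots reach the 300-second maximum: K is dampened.
print(effective_k(24, [300] * 9 + [280], max_score=300))
# Exactly 80% tied is not "over 80%", so K is unchanged.
print(effective_k(24, [300] * 8 + [280, 270], max_score=300))
```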


Peak Rating

Every pilot has a peak rating — the highest rating they've ever achieved. This never decays. It represents the best form a pilot has shown and serves as a historical record, separate from the current (potentially decayed) rating.


Summary

Mechanism                     Purpose
Pairwise within groups        Fair comparison (same conditions)
Margin-aware scoring          Every second matters
Expected vs actual            Upsets rewarded, beating weaker pilots less so
K-factor scaling              New pilots settle fast, veterans stay stable
Group size normalization      Consistent impact regardless of group size
Inactivity decay with floor   Reflects current form, respects history
Round batching                No processing-order advantage
Tie dampening                 Prevents noise from maxed-out tasks
Ranked thresholds             Proven track record required

The result: a single number that tells you how a pilot stacks up against the global F3K field right now — updated after every competition, grounded in head-to-head performance, and resistant to gaming through weak fields or cherry-picked events.