Project Overview

This site aggregates college sailing results, ratings, and probabilistic forecasts to analyze performance across regattas, teams, skippers, and crews.

Site statistics:
  • 29,502 competitors
  • 7,607 skippers
  • 10,957 crews
  • 188 schools
  • 1,975 regattas
  • 31,864 races
Data Pipeline
  1. Scrape TechScore: Regattas, divisions, races, finish orders, and sailor rosters are scraped and normalized.
  2. Load Database: CSVs and scrape outputs populate schools, competitors, regattas, races, and race results. Missing sailors are auto-created and assigned to an “Unknown School” when needed.
  3. Attach PMFs: Precomputed probability mass functions (PMFs) from analysis are linked to skipper race results.
  4. Serve Pages: Flask routes query the ORM and render Jinja templates and charts.
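The "missing sailors are auto-created" fallback in step 2 can be sketched as a get-or-create helper. A plain dict stands in for the real database here; the field names are illustrative, not the project's actual ORM:

```python
# Illustrative get-or-create for missing sailors (pipeline step 2).
# A plain dict stands in for the real database; field names are assumptions.
def get_or_create_competitor(db, sailor_id, name, school=None):
    """Return the competitor record, creating it with a placeholder
    "Unknown School" when the sailor is not already in the database."""
    if sailor_id not in db["competitors"]:
        db["competitors"][sailor_id] = {
            "name": name,
            "school": school or "Unknown School",
        }
    return db["competitors"][sailor_id]

db = {"competitors": {}}
row = get_or_create_competitor(db, "US-123", "A. Sailor")
# row["school"] == "Unknown School"
```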
Database Layout
  • schools: name, token, conference
  • competitors: sailor id, name, school, skipper/crew flags
  • regattas: name, date, location
  • races: regatta id, division, race number
  • race_results: race id, skipper competitor, crew competitor, finish position
  • ratings: weekly skipper ratings by competitor and week
  • crew_ratings: weekly crew ratings and performance metrics
  • race_result_probabilities: PMF JSON aligned to skipper race_result_id
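Two of these tables can be sketched with in-memory sqlite; any column names beyond those listed above (e.g. skipper_competitor_id) are assumptions about the actual schema:

```python
import sqlite3

# In-memory sketch of the races and race_results tables. Column names
# beyond the layout above are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE races (
    id INTEGER PRIMARY KEY,
    regatta_id INTEGER NOT NULL,
    division TEXT NOT NULL,
    race_number INTEGER NOT NULL
);
CREATE TABLE race_results (
    id INTEGER PRIMARY KEY,
    race_id INTEGER NOT NULL REFERENCES races(id),
    skipper_competitor_id INTEGER NOT NULL,
    crew_competitor_id INTEGER,
    finish_position INTEGER NOT NULL
);
""")
conn.execute("INSERT INTO races VALUES (1, 10, 'A', 1)")
conn.execute("INSERT INTO race_results VALUES (1, 1, 100, 200, 3)")
finish = conn.execute(
    "SELECT finish_position FROM race_results WHERE race_id = 1"
).fetchone()[0]
# finish == 3
```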
Skipper Rating Methodology

Skipper ratings are derived using a Plackett-Luce model with transition loss minimization to capture relative performance across regattas.

Plackett-Luce Model

The Plackett-Luce model treats each race as a ranking over competitors, where the probability of competitor i finishing in position j depends on their skill parameter θᵢ relative to all remaining competitors. This captures the sequential nature of sailing finishes—each position depends on who has already finished ahead.
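The sequential factorization described above can be sketched in a few lines (the helper name is illustrative): at each position, the finisher is chosen in proportion to exp(θ) among the boats still racing.

```python
import math

def ranking_probability(theta_in_finish_order):
    """Plackett-Luce probability of an observed finish order: the boat in
    each position is chosen with probability proportional to exp(theta)
    among the boats that have not yet finished."""
    strengths = [math.exp(t) for t in theta_in_finish_order]
    prob = 1.0
    for i in range(len(strengths)):
        prob *= strengths[i] / sum(strengths[i:])
    return prob

# Two equally skilled boats: either finish order has probability 1/2.
ranking_probability([0.0, 0.0])  # → 0.5
```

Because each factor is a probability over the remaining boats, the probabilities of all possible finish orders sum to one.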

Transition Loss Minimization

To ensure ratings evolve smoothly over time, we minimize a transition loss that penalizes large week-to-week changes in skill parameters. This regularization prevents overfitting to individual race results while allowing ratings to adapt to genuine performance trends. The objective balances fit to observed rankings with temporal consistency.
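As a sketch of the temporal term, assuming a squared week-to-week difference (the exact transition-loss form is not specified here), the penalty added to the ranking loss looks like:

```python
# Sketch of the temporal regularizer. The squared week-to-week difference
# is an assumed form; the full objective adds this to the ranking loss.
def transition_penalty(theta_by_week, lam=1.0):
    """Penalty on week-to-week jumps in one skipper's skill trajectory."""
    return lam * sum(
        (curr - prev) ** 2
        for prev, curr in zip(theta_by_week, theta_by_week[1:])
    )

# A gradual rise is penalized less than an abrupt jump to the same endpoint.
transition_penalty([0.0, 0.5, 1.0])  # 0.5
transition_penalty([0.0, 0.0, 1.0])  # 1.0
```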

Weekly Updates

Ratings are updated weekly using all available race data up to that point, with more recent results weighted more heavily. The model accounts for varying regatta sizes, division strengths, and field compositions to produce comparable ratings across different competitive contexts.
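One common way to weight recent results more heavily is exponential decay by age; the half-life form and value below are illustrative assumptions, not the site's documented scheme:

```python
def recency_weight(weeks_ago, half_life=8.0):
    """Weight for a race result that is `weeks_ago` weeks old; halves
    every `half_life` weeks. The half-life value is an assumption."""
    return 0.5 ** (weeks_ago / half_life)

recency_weight(0)   # 1.0  (this week's races count fully)
recency_weight(8)   # 0.5  (one half-life ago)
```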

Example: Skill Values vs Win Probability

The Plackett-Luce model uses an exponential (softmax) transformation: P(boat i wins) = exp(θᵢ) / Σⱼ exp(θⱼ)

Consider a 4-boat race with skill parameters θ = [3.0, 2.5, 2.0, 1.5]. The model calculates:

  • Boat A (θ=3.0): e^3.0 = 20.09, so 20.09/(20.09+12.18+7.39+4.48) = 45.5% win chance
  • Boat B (θ=2.5): e^2.5 = 12.18, so 12.18/(20.09+12.18+7.39+4.48) = 27.6% win chance
  • Boat C (θ=2.0): e^2.0 = 7.39, so 7.39/(20.09+12.18+7.39+4.48) = 16.7% win chance
  • Boat D (θ=1.5): e^1.5 = 4.48, so 4.48/(20.09+12.18+7.39+4.48) = 10.2% win chance

With negative and mixed skills θ = [-1, 1, 0.5, 0.2], the exponential transformation gives e^θ = [0.37, 2.72, 1.65, 1.22], and the win probabilities are:

  • Boat A (θ=-1): 0.37/(0.37+2.72+1.65+1.22) = 6.2% win chance
  • Boat B (θ=1): 2.72/(0.37+2.72+1.65+1.22) = 45.6% win chance
  • Boat C (θ=0.5): 1.65/(0.37+2.72+1.65+1.22) = 27.7% win chance
  • Boat D (θ=0.2): 1.22/(0.37+2.72+1.65+1.22) = 20.5% win chance

The exponential transformation ensures that small skill differences create meaningful competitive advantages. A 0.5 point increase in skill (3.0→3.5) would boost Boat A's win probability from 45.5% to 57.9%, while a 0.5 point decrease (3.0→2.5) drops it to 33.6%, tied with Boat B.
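The softmax arithmetic in these examples can be checked in a few lines of Python:

```python
import math

def win_probs(thetas):
    """Softmax of skill parameters: each boat's probability of winning."""
    exps = [math.exp(t) for t in thetas]
    total = sum(exps)
    return [e / total for e in exps]

[round(p * 100, 1) for p in win_probs([3.0, 2.5, 2.0, 1.5])]
# → [45.5, 27.6, 16.7, 10.2]
[round(p * 100, 1) for p in win_probs([-1.0, 1.0, 0.5, 0.2])]
# → [6.2, 45.6, 27.7, 20.5]
```

Note that only the differences between skill values matter: shifting every θ by the same constant leaves the probabilities unchanged.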

Crew Rating Methodology

Crew ratings are derived using a fundamentally different approach that captures the crew's contribution to team performance beyond what the skipper's skill alone would predict.

Weighted Sum Approach

Crew ratings are calculated as a weighted combination of two components:

  • Skipper Skill Component: The baseline performance expected from the skipper's established rating
  • Performance Residual Component: The difference between actual team performance and what the skipper's skill alone would predict

Residual Analysis

The performance residual represents the crew's contribution to team success. When a team performs better than the skipper's skill rating would suggest, the positive residual indicates the crew is adding value. Conversely, negative residuals suggest the crew may be limiting the team's potential performance.
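A minimal sketch of the weighted sum, assuming a simple linear residual and an illustrative weight (the actual weights and residual definition are not specified here):

```python
def crew_rating(skipper_skill, team_performance, w=0.5):
    """Weighted combination of the skipper-skill baseline and the
    performance residual. The weight w and the linear residual are
    illustrative assumptions."""
    residual = team_performance - skipper_skill  # crew's added (or lost) value
    return w * skipper_skill + (1 - w) * residual

# Team beats the skipper-only baseline → positive residual lifts the rating.
crew_rating(1.0, 1.6)  # 0.8
# Team underperforms the baseline → negative residual drags it down.
crew_rating(1.0, 0.4)  # 0.2
```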

Interpretation

This methodology recognizes that crew performance is inherently contextual—it's measured relative to the skipper they're sailing with. A crew rating reflects not just raw sailing ability, but their ability to enhance (or detract from) their skipper's performance. This approach captures the collaborative nature of sailing where crew and skipper work together as a team.

Important Limitations

Crew ratings rely on the assumption that the average crew on a team has a skill value roughly equal to the average skipper. This creates important limitations:

  • Team Context Dependency: If every crew on a team is an Olympian, then every skipper's baseline is set sailing with an Olympian crew, and no single result stands out as significant
  • Relative Performance: If most crews on a team are weak and one is strong, that crew may appear better than they actually are when compared with crews from other teams
  • Cross-Team Comparison: Crew ratings are not directly comparable across teams; they indicate how much a crew enhances or reduces performance relative to the other crews on their own team

Crew ratings are most meaningful when comparing crews within the same team or when teams have similar overall crew skill levels. They should be interpreted as relative performance indicators rather than absolute skill measures.

Rating Distributions

[Charts: distributions of latest skipper ratings and crew ratings]

Percentiles (Overall)

Percentile Skipper Rating Crew Rating
1% -2.998 -3.793
5% -2.355 -2.849
10% -1.971 -2.354
25% -1.324 -1.420
50% -0.369 -0.299
75% 0.726 0.887
90% 1.578 1.849
95% 1.968 2.371
99% 2.627 3.190
Percentiles by Conference

Skippers

Conference 1% 5% 10% 25% 50% 75% 90% 95% 99%
All Conferences -2.998 -2.355 -1.971 -1.324 -0.369 0.726 1.578 1.968 2.627
MAISA -3.093 -2.525 -2.119 -1.384 -0.369 0.834 1.635 2.046 2.601
MCSA -2.879 -2.290 -1.985 -1.462 -0.722 0.151 0.861 1.261 1.747
NEISA -2.879 -2.165 -1.742 -0.842 0.328 1.239 1.917 2.236 2.777
NWICSA -2.333 -1.989 -1.626 -0.982 -0.352 0.281 0.792 1.138 1.844
PCCSC -2.942 -2.253 -1.978 -1.495 -0.639 0.610 1.589 2.017 2.730
SAISA -3.052 -2.378 -2.035 -1.345 -0.357 0.553 1.449 1.862 2.501
SEISA -2.677 -2.371 -2.032 -1.517 -0.609 0.242 1.391 1.887 2.399
Unknown -2.967 -2.653 -2.280 -1.630 -0.952 -0.129 0.596 1.148 1.819

Crews

Conference 1% 5% 10% 25% 50% 75% 90% 95% 99%
All Conferences -3.793 -2.849 -2.354 -1.420 -0.299 0.887 1.849 2.371 3.190
MAISA -3.833 -2.935 -2.461 -1.479 -0.191 1.040 1.925 2.393 3.248
MCSA -3.921 -2.928 -2.427 -1.629 -0.683 0.242 1.110 1.615 2.375
NEISA -3.691 -2.616 -2.070 -0.961 0.391 1.535 2.353 2.766 3.395
NWICSA -3.163 -2.142 -1.640 -1.015 -0.281 0.392 1.135 1.500 2.359
PCCSC -3.576 -2.744 -2.274 -1.504 -0.519 0.706 1.797 2.399 3.348
SAISA -3.700 -2.979 -2.455 -1.410 -0.272 0.744 1.669 2.115 3.052
SEISA -3.997 -2.902 -2.436 -1.669 -0.616 0.538 1.501 1.923 2.831
Unknown -3.995 -3.257 -2.780 -1.968 -1.105 -0.129 0.866 1.422 2.420
Skipper Rating Distributions by Conference

Each curve shows a normal approximation for the distribution of latest skipper ratings within a conference. The peak aligns near the conference mean.

Competitors by Class Year

Counts of competitors by normalized graduation year.