A data project by Holden Comeau

Twenty-three million rows of competitive cycling.
One question worth answering.

RaceProof is a population-scale analysis of competitive fitness on Zwift, the world's largest online cycling platform. Built on 850,000+ events and 340,000+ unique competitors, the project measures the relationship between racing behavior, performance outcomes, and subscriber retention.


Background

I have spent fifteen years in commercial technology across sport, media, and consumer data. In that time I have co-founded ventures, led multinational teams, and built data products tied directly to revenue.

This site is built on a 23-million-row Postgres database I assembled to answer a commercial question for a live sponsorship engagement. The analysis maps subscriber lifetime value, seasonal retention behavior, and competitive population segmentation across nearly a decade of participation data.

Prior work includes 2.5 years building a proprietary performance verification and anomaly detection system for competitive cycling, combining anti-cheat methodology with longitudinal power analysis. That system secured a multi-year exclusivity contract with cycling's international governing body. I also engineered a consumer data product reaching 20M+ active lifestyle consumers, helped launch Strava's commercial Challenges feature alongside its revenue team, co-founded a data analytics company, and led a 15-person multinational team.

I am also an athlete: a three-time NCAA All-American, a U.S. Olympic Trials qualifier, a world #1 ranked cycling esports competitor with a 40% win rate across 2,000+ events, and coach to a future Olympian. I have nearly a decade of dual-recorded 1 Hz power data from my own racing, which serves as the calibration dataset for the analytical methods in this project.

By the numbers

3x: NCAA All-American
40%: Win rate across 2,000+ competitive cycling events
20M+: Active consumers reached through data products
2.5 yrs: Building performance verification and anomaly detection systems
23M+: Race result rows across 850K events and 340K riders
World #1: Ranked cycling esports competitor
15 yrs: Commercial technology in sport, media, and consumer data
10 yrs: Dual-recorded 1 Hz power data from personal racing

The analysis

Four chapters, one thesis.

Each dashboard explores a different dimension of competitive cycling on Zwift. Together they build an evidence-based case for racing as the platform's highest-value engagement model.

01

State of Zwift Racing (Draft)

Growth, seasonality, rider distribution, and macro participation trends across nearly a decade of competitive events.

02

User Behavior (Draft)

Rider lifecycle, new vs. returning segmentation, engagement by tenure, W/kg distribution, FTP trends, and DNF patterns.

03

Competitive Fitness (Draft)

The dose-response relationship between racing frequency and measurable performance gains across the population.

04

Event Intelligence (Draft)

Supply and demand mapping, peak-hour misalignment, field sizes, and participation structure by event type.

Origin

The analytical approach behind RaceProof didn't start on Zwift. A decade ago, while building a guest analytics product at lululemon, I was working with digital commerce and demand data, trying to understand what drives guest loyalty beyond simple transaction frequency. The question that kept surfacing was whether certain behaviors, at certain intensities, created fundamentally different customer trajectories.

Running a similar analysis on fitness participation data, I found a threshold effect: roughly three sessions per week was the inflection point where engagement behavior shifted from casual and churn-prone to committed and self-reinforcing. Below the threshold, participants treated fitness as an obligation. Above it, something changed. They identified with the activity. They stayed.

That same pattern is now visible across hundreds of thousands of Zwift racers and nearly a decade of competitive cycling data.

RaceProof exists because the question scaled. The dataset is larger, the domain is different, and the analytical tools have matured. But the core insight is the same: the relationship between behavioral intensity and measurable outcomes is not linear. There is a point where more becomes different, where engagement stops being a metric and starts being a mechanism. Finding that point and proving it with data is what this project does.