A data project by Holden Comeau
Twenty-three million rows of competitive cycling.
One question worth answering.
RaceProof is a population-scale analysis of competitive fitness on Zwift,
the world's largest online cycling platform. Built on 850,000+ events and
340,000+ unique competitors, the project measures how racing behavior,
performance outcomes, and subscriber retention relate to one another.
Background
Fifteen years in commercial technology across sport, media, and consumer data.
Co-founded ventures, led multinational teams, and built data products that
connect to revenue.
This site is built on a 23-million-row Postgres database I assembled
to answer a commercial question for a live sponsorship engagement. The analysis
maps subscriber lifetime value, seasonal retention behavior, and competitive
population segmentation across nearly a decade of participation data.
Prior work includes 2.5 years building a proprietary performance verification
and anomaly detection system for competitive cycling, combining anti-cheat
methodology with longitudinal power analysis. That system secured a multi-year
exclusivity contract with cycling's international governing body. I also engineered
a consumer data product reaching 20M+ active lifestyle consumers,
helped launch Strava's commercial Challenges feature alongside their revenue team,
co-founded a data analytics company, and led a 15-person multinational team.
I am also an athlete: a three-time NCAA All-American, a U.S. Olympic
Trials qualifier, a world No. 1-ranked cycling esports competitor with a 40% win rate
across 2,000+ events, and coach to a future Olympian. I carry nearly a decade of
dual-recorded 1Hz power data from my own racing, which serves as the calibration
dataset for the analytical methods in this project.
Origin
The analytical approach behind RaceProof didn't start on Zwift. A decade ago, while building
a guest analytics product at lululemon, I was working with digital commerce
and demand data, trying to understand what drives guest loyalty beyond simple transaction
frequency. The question that kept surfacing was whether certain behaviors, at certain
intensities, created fundamentally different customer trajectories.
Running a similar analysis on fitness participation data, I found a threshold effect:
roughly three sessions per week was the inflection point where engagement
behavior shifted from casual and churn-prone to committed and self-reinforcing. Below the
threshold, participants treated fitness as an obligation. Above it, something changed.
They identified with the activity. They stayed.
That same pattern is now visible across hundreds of thousands of Zwift racers
and nearly a decade of competitive cycling data.
RaceProof exists because the question scaled. The dataset is larger, the domain is
different, and the analytical tools have matured. But the core insight is the same:
the relationship between behavioral intensity and measurable outcomes is not linear.
There is a point where more becomes different, where engagement stops being a metric
and starts being a mechanism. Finding that point and proving it with data is what
this project does.