Opinion: Results-Based Categorization (Race Categories, Part 3)

February 13, 2020

LAST UPDATED April 4, 2020

In part 1 of this series, I laid out the current Zwift race categorization situation. This included a discussion of the standard FTP-based categories, how enforcing rules post-race falls short, and a look at some actual participation numbers to show how currently-published race results are far from ideal. We ended that post with some ideas for improving the race experience for everyone, including the simple solution of using a rider’s saved FTP to determine their race category. Read part 1 here >

Part 2 followed, where we looked at some ideas for further improving race categorization. This included basing rider categories on more complete rider phenotypes, enabling real-time detection of category violations, and more. The big idea here was an iterative improvement on race categories and enforcement, so when the final phase rolls out, we’ll already have many of the tools in place to make it work. Read part 2 here >

Now let’s look at the final phase: results-based categorization.

Why Results-Based Categories Are Needed

The vast majority of Zwift races currently use the same categorization scheme, which is based on your FTP. Back in part 1 I said: “While some may disagree, I think ZwiftPower’s categories do a good job of breaking riders into groups that can compete well against each other.” And I honestly believe that. The biggest problem with our current w/kg-based categories isn’t the category setup, it’s the sandbaggers who race with impunity below their category, negatively affecting the race experience for legit riders.

But here’s the thing: the sandbagger issue is going to be resolved. That’s what my first two posts were all about, and if Zwift implements just a few of the suggestions found there (and it sounds like they will), most of the sandbagging problem will go away.

Once that happens, we’ll have a functional FTP-based categorization system on our hands. And that’s when we’ll start to notice its shortfalls!

Here are a few of the common complaints regarding FTP-based race categories:

Favors heavier riders: a 100kg rider’s 3.2w/kg (320 watts) is much faster on flat ground than a 75kg rider’s 3.2w/kg (240 watts).
No skill component: a highly-skilled rider could dominate a category, winning every race without ever bumping against the w/kg ceiling. Conversely, an unskilled rider near the middle or bottom of their category’s power limits won’t stand a chance of winning, but they are forced to race in that category anyway.
Forever losing: if a rider happens to have power which places them at the bottom of a category (for example, a B with an FTP of 3.3w/kg), they stand very little chance of ever winning a race, yet can never drop to the C category.
Hard to understand: “I just want to race, I don’t know which category I’m in!” The current system presents a barrier to entry because it requires riders to know their FTP w/kg: a number which many (most?) cyclists don’t know.
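
To make the first and third complaints concrete, here is a quick sketch of the arithmetic. The category floors below are an assumption on my part (they are the w/kg breakpoints commonly quoted for the A/B/C/D categories, and are not spelled out in this post), and the function names are purely illustrative.

```python
# Assumed w/kg floors for each category (not taken from this post).
CATEGORY_FLOORS = [("A", 4.0), ("B", 3.2), ("C", 2.5), ("D", 0.0)]


def ftp_watts(w_per_kg, weight_kg):
    """Convert a relative FTP (w/kg) into raw watts for a given body weight."""
    return w_per_kg * weight_kg


def category_for(w_per_kg):
    """Return the first category whose (assumed) floor the rider meets."""
    for name, floor in CATEGORY_FLOORS:
        if w_per_kg >= floor:
            return name
    return "D"


heavy = ftp_watts(3.2, 100)  # the 100kg rider from the first complaint
light = ftp_watts(3.2, 75)   # the 75kg rider at the same w/kg
print(f"Same w/kg, but the heavier rider has {heavy - light:.0f} more watts")

# A 3.3w/kg rider clears the assumed B floor, so they sit at the bottom of B.
print(category_for(3.3))
```

Both riders land in the same category, even though one of them brings far more raw power to a flat course.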

A results-based system, if properly implemented, would eliminate each of these complaints. It really is the way forward for Zwift racing, which is why I’m sure it will happen eventually. Let’s look at how such a system could work.

Functionality “Must Haves” and “Should Haves”

The biggest challenge in creating a proper results-based system is that it’s complex. No ranking system will be perfect, and while we have good models in real-world cycling and gaming, Zwift presents some unique challenges that aren’t accounted for in those systems.

So the way to begin is to outline what a workable categorization system would accomplish. Here are my proposed requirements – the must-haves which would be required in order for the system to function on a basic level:

Simple Startup: this is Zwift. Newbies should be able to hop into their first race, have a great time, and get a result on their very first try.
No Sandbagging: controls must be in place so sandbaggers don’t spoil the race experience for others.
Immediate Results: as soon as I cross the line, I should be able to see my result, and how it has affected my overall ranking. All disqualifications should be automatic and already in place before the race ends.
Strength of Field Included: points earned or lost must take into account both the rider’s finishing position and the strength of the overall field. It’s not as simple as receiving X points for placing in Xth place. Example: an average B racer who gets 4th place in a race that includes 50 riders, many of whom are stronger than she is, should see more of a ranking improvement from her result than when she wins a race against 5 riders who are all weaker than she is.
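
The strength-of-field requirement is the least obvious of the four, so here is a minimal sketch of one way it could work. It borrows the expected-score formula from Elo-style rating systems and treats a race as a set of head-to-head matchups against everyone else in the field. The ratings, the K value of 32, and the function names are all made up for illustration; this is not how ZwiftPower or USAC calculate points.

```python
def expected_score(rating, rival_rating):
    """Elo-style probability that `rating` beats `rival_rating`."""
    return 1.0 / (1.0 + 10 ** ((rival_rating - rating) / 400.0))


def rating_change(rider, beaten, lost_to, k=32.0):
    """Rating change for one race, scaled by how strong the field was.

    `beaten` and `lost_to` are lists of rival ratings. The rider gains
    when she beats a larger share of the field than her rating predicts.
    """
    rivals = beaten + lost_to
    actual = len(beaten) / len(rivals)
    expected = sum(expected_score(rider, r) for r in rivals) / len(rivals)
    return k * (actual - expected)


rider = 1500  # hypothetical rating for an average B racer

# 4th of 50: three stronger riders finish ahead, 46 riders finish behind,
# and 20 of those 46 are rated higher than she is.
big_field = rating_change(
    rider, beaten=[1600] * 20 + [1450] * 26, lost_to=[1700] * 3
)

# 1st of 6: she beats five riders who are all much weaker than she is.
small_field = rating_change(rider, beaten=[1300] * 5, lost_to=[])

print(f"4th in a strong field of 50: {big_field:+.1f}")
print(f"1st in a weak field of 6:    {small_field:+.1f}")
```

With these made-up numbers the 4th place earns the bigger bump, which is the behavior the requirement asks for. Averaging the matchups (rather than summing them) keeps a single huge race from swinging a rating too far, but that is a design choice, not a given.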

And here are my additional wishes – the “should haves” which would really make the system great, but aren’t required in order for it to function.

Easy Downgrades: a rider who is coming back from injury or sickness should be able to race in an easier category, provided their involvement doesn’t ruin the experience for others.
Race Up if Desired: riders should be able to race in a more challenging category, if that’s what they want to do.
Flexible Categories: when we have races ranging from 5 riders in a category to several hundred, it makes sense to develop a system that allows for flexibility in terms of the number of categories and their breakpoints. When rider counts are high, race organizers could choose to have more/narrower categories, making the competition even fiercer.
Racer Dashboard: racers need easy access to a few key stats which show where they rank in the Zwift universe and how their latest results have affected their rankings. Ideally this would be available in-game, in Companion, and on zwift.com.

Structuring the Categorization System

Spoiler alert: I’m not going to attempt to lay out the details of a categorization system in this post. Even if I were an expert on categorization systems (I’m not), it’s too complex an idea to detail in a single blog post. I will, however, reference a few systems and what makes them interesting.

USAC Points

The system used by USA Cycling has obviously been proven in the real world. ZwiftPower uses this system for its ranking points (with a minor modification for small race fields), so this system is actually already in use by Zwifters. Read how it works on ZwiftPower >

Elo Rankings

The Elo rating system (named after its creator, Arpad Elo) was originally designed for chess players, but is now commonly used in modified forms for ranking players in major sports and video games. This seems to be the most commonly-used ranking system in gaming. Read more about Elo >
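
For anyone who hasn’t seen it, the core of the classic two-player version fits in a few lines. This is only a sketch of the textbook formula: the 400-point scale and K value of 32 are conventional chess defaults, not anything Zwift has proposed, and a race version would need to handle many riders at once.

```python
def expected_score(rating_a, rating_b):
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return both new ratings after one head-to-head result.

    `score_a` is 1.0 if A won, 0.0 if A lost, and 0.5 for a draw.
    Whatever A gains, B loses, so the total number of points is unchanged.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# An underdog (1400) beats a favorite (1600): a big swing.
underdog, favorite = elo_update(1400, 1600, score_a=1.0)

# The favorite wins as expected: a much smaller swing.
fav_after, dog_after = elo_update(1600, 1400, score_a=1.0)

print(f"Upset:    underdog gains {underdog - 1400:.1f}")
print(f"Expected: favorite gains {fav_after - 1600:.1f}")
```

The key property is that an upset moves ratings much more than an expected result does.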

TrueSkill Ranking System

Developed by Microsoft Research and used for game matchmaking on Xbox Live, TrueSkill has been used to rank players in many different big-name games from Halo to Forza Motorsport. It works well in multiplayer and team scenarios and includes a number quantifying the degree of uncertainty the system has about a gamer’s skill. This would seem to be an important metric in a sport like cycling, where newbies arrive often and one’s ability can change based on fitness level, injury, etc. Read more about TrueSkill >

Cycligent Virtual Rankings

This was the first true results-based race ranking system for Zwifters. Rolled out in January 2017, it ceased operations thanks to GDPR, and not long afterward CVR decided to create its own virtual cycling platform (CVRCade). This system’s structure was very complex, but how it all worked actually made sense if you could wrap your head around it. Read more about it in this ZwiftInsider post.

Dynamic Categories

Things could get pretty wild if Zwift allowed for dynamic categories – that is, categories which aren’t locked in place. Think about it: what if the number of categories in an event could dynamically change based on the number of participants signed up, with the system automatically grouping riders by rank into categories of, say, 40+ riders?

Nobody wants to race against just a few riders, but in an event with 500 signups, why not have (for example) 6 categories instead of 4? This allows riders in each category to compete more closely with each other, so more riders feel they’ve “got a chance” and will therefore push harder for glory.

It could work in the other direction as well: if a race only had 20 participants, maybe it would only have 1 category.
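
Here is a rough sketch of what that grouping logic might look like. Everything in it is hypothetical: the function name, the 40-rider minimum, and the cap of 6 categories are just the example numbers from above, and the rankings are assumed to be “higher is better.”

```python
def dynamic_categories(rankings, min_size=40, max_categories=6):
    """Split riders into ranked categories of at least `min_size` riders.

    `rankings` maps rider name -> ranking score (higher is stronger).
    Returns a list of categories, strongest first. A field too small
    to fill two categories ends up as a single category.
    """
    ranked = sorted(rankings, key=rankings.get, reverse=True)
    if not ranked:
        return []

    # As many categories as the field can fill, within the cap.
    count = max(1, min(max_categories, len(ranked) // min_size))

    # Spread riders as evenly as possible; the strongest categories
    # absorb any remainder one rider at a time.
    base, extra = divmod(len(ranked), count)
    categories, start = [], 0
    for i in range(count):
        size = base + (1 if i < extra else 0)
        categories.append(ranked[start:start + size])
        start += size
    return categories


big_event = {f"rider{i}": 1000 + i for i in range(500)}
small_event = {f"rider{i}": 1000 + i for i in range(20)}

print([len(c) for c in dynamic_categories(big_event)])
print([len(c) for c in dynamic_categories(small_event)])
```

With 500 signups this produces 6 categories of 83 or 84 riders each, while the 20-rider event collapses into a single category.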

This is all just brainstorming, of course. But I do think there are many interesting possibilities here which could let Zwift racing stand out from outdoor racing in meaningful ways. Perhaps we’ll end up with some races being based on “stable” categories, and others being dynamic. Why not?

When

Based on the recent Minterview, we know Zwift is close to implementing some controls to reduce the negative effects of sandbaggers. That’s great news! Eric Min made it sound like those changes would roll out in the next month or two.

What about results-based categorization? When can we realistically expect that to happen? I can only guess at a timeline, but my guess is that it’s not going to happen any time soon. ZwiftHQ doesn’t seem to have this as a high-priority item, so I’d say we’re looking at 6-18 months before such a system rolls out.

Your Thoughts

How important is it for Zwift racing to switch to results-based categories? Got any great ideas for how such a system would work? Share your thoughts below!
