This is the next installment in the series looking at Zwift racing development, following on from my articles “Zwift – What’s Next”, “Matchmaking: How a Simple System Could Revolutionize Zwift Racing”, and “How can Zwift Develop a Platform for Fair Racing”.
These articles garnered a significant community response, and from the comments it felt like there was consensus on the direction we would like to see things move. I contacted Zwift to see if I could share these ideas more directly, and in parallel a dialogue was opened on the Zwift Forums with input from Zwift’s Head of Content and Programming, Mark Cote.
It’s great news that Zwift are now engaging with the community directly on this topic – speaking to Mark it is clear that this approach will help Zwift be more accurate as they develop a solution, as well as help us all understand what is feasible and some of the challenges the development team will face as they work through this.
So what are Zwift planning to do? What are the big challenges they will face, and what can we do as a community of racers to help?
Let’s start by looking at what has been shared on the forum thread to this point. Discussion has generally revolved around two development opportunities – a results-based ranking system with matchmaking is generally considered to be the ‘utopian’ ideal, but the more pressing issue seems to be category enforcement at race entry and improvements to better split the categories to disincentivize sandbagging.
“We want fair competition, just like all of you.
Ultimately, we believe in an ELO style ranking system, like has been raised by many of you. This is not a simple add given our current infrastructure, but ultimately this is where we look to go.
If you were to make a power-based system for category recommendations or auto categorization:
– What data would you use to establish a rider’s performance?
– How would you categorize riders?”
Mark Cote
This was a great question to get the community thinking. First of all, it is clear that a rankings/matchmaking system is a significant development: an entirely new backend management system and significant changes to the user experience would be needed. Even so, this long-term vision is shared by both Zwift and the community. The question then becomes one for the short to medium term: how can we improve the category system as it exists today?
Mark then went on to share some of the current thinking:
“Thanks all for the thoughtful feedback. Since you asked where our current work and thinking is. With ZwiftPower starting here:
1. Data to Establish Rider’s Performance
We’ve been migrating the Power Duration Curves server side over the past many months and building a microservice that allows us to call power data based on a few different attributes. The power curve has a ton of data from 1s to 2hrs and allows us to pull really any data across this array. ZwiftPower takes this off of your last three races. We’re currently considering a maximal aggregation of power curve array values over the past 30 days. This would include more data than is currently considered for your category.
2. How Would You Categorize Riders?
With regards to categories, we’ve discussed the balance of race density relative to power bands. On one end, we want races to have a good field size and we want that field to be fairly matched. i.e. It would be awesome if autocategorization was enabled for ALL events and there were infinite bands, tied to pack density, but this is super difficult to do in the short term.
So right now, we have 4 categories (potentially +1 for A+). The feedback we’ve generally heard is that D’s could almost be split into three while C/B/A could each respectively be split. This might be too much, so your feedback on the 4 categories and the W/kg splits is what we’re asking for feedback on.”
This is some really interesting insight into the progress the Zwift team are making. Storing a rider’s power curve data allows for a much broader understanding of their strengths and weaknesses, which could allow for more even splits across categories. Probably the biggest benefit of this approach is the impact on sandbagging: if the arbitrary line separating categories becomes more dynamic, it becomes very difficult to manage your performance to deliberately stay in a lower category.
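To make the “maximal aggregation” idea a little more concrete, here is a minimal sketch of how a best-of-30-days power curve could be built from individual rides. This is only my illustration, not Zwift’s implementation: the function name `best_power_curve`, the data layout (a dictionary of duration in seconds to best average watts per ride) and the sample numbers are all assumptions.

```python
from datetime import date, timedelta


def best_power_curve(rides, today, window_days=30):
    """Return the element-wise maximum of all ride curves inside the window.

    rides: list of (ride_date, curve) pairs, where curve maps a duration in
    seconds to the best average watts held for that duration on that ride.
    (Hypothetical layout, chosen for illustration only.)
    """
    cutoff = today - timedelta(days=window_days)
    best = {}
    for ride_date, curve in rides:
        # Ignore rides that fall outside the rolling window.
        if not (cutoff <= ride_date <= today):
            continue
        for duration, watts in curve.items():
            # Keep the highest value seen for each duration.
            if watts > best.get(duration, 0):
                best[duration] = watts
    return best


rides = [
    (date(2021, 6, 1), {5: 900, 60: 450, 1200: 250}),
    (date(2021, 6, 20), {5: 800, 60: 480, 1200: 240}),
    (date(2021, 4, 1), {5: 1200, 60: 600, 1200: 300}),  # too old, ignored
]
curve = best_power_curve(rides, today=date(2021, 6, 30))
print(curve)  # {5: 900, 60: 480, 1200: 250}
```

Because every duration keeps its best value from any ride in the window, a rider cannot hide a strong sprint or a strong 20-minute effort simply by riding easy in their last few races.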
The key thing is that this category is enforced up front and from a UI perspective, a rider’s category is made clear in-game. This has also been confirmed as the preferred approach by Zwift:
“By auto categorization, we fix it on the front end, not penalize on the backend – this is generally preferred.”
Following more interesting discussion from the community, Mark updated with further progress:
“We’re working on next steps right now. There are a few mega initiatives going on the software team right now that are in front of Competition Fairness so I can’t confirm timing. As this is looking to be in the near term queue, Flint and I have been engaging along with a few other PMs on calls with some of you on this forum to gather feedback and thoughts. There is active work on Power Curves and outlining the systems and rules for upcoming competition series, some by Zwift and some in partnership with WTRL. There is upcoming work on recommended and auto categorization, so this feedback will fuel those directions.
I hope this gives some clarity – I know we all want details and a launch date, but we do not have that yet.”
I caught up with Mark to discuss some further ideas and how the current work is progressing, and he was able to share the below. This is really exciting progress and I am sure you will agree with me that this dialogue with the community is a welcome change. I will hopefully be able to share more articles in the future with some insight from the Zwift development team.
“The team at Zwift is working hard to bring some of the most requested features to life, such as the recent release of route badge achievements tracking. Our forum conversations on fairness in competition shows how aligned we are with the vision of the community, but there are other priorities ahead of this work. Fortunately, with a great partner in WTRL, we are hoping to test some of these early ideas within upcoming events.”
WTRL released the Zwift Classics race schedule this week, and it appears that these races will use a 6-class system instead of the 4-category system used in the past. Is this WTRL testing some early auto-categorization ideas? We’ll get our answer soon, as the series begins July 13th.
This is good news.
Frankly I’d be happy if they aligned the Cat colour codes with the power spectrum.
I don’t know why, but the color code really bothers me. I think I first noticed it on a long race up Ventoux when I didn’t have anything else to focus on and the fact that the colors don’t follow the color spectrum just drove me batty. Yellow, blue, green, red? What now? Blue, green, yellow, red, would have made more sense, though they could have thrown A as orange and A+ as red instead of black and it would work even better.
Glad I’m not the only person who gets annoyed by that. It should totally be Red (A+), Orange (A), Yellow (B), Green (C), Blue (D). Or, if they’re going to add more Cats, then keep A+ as Black and have White as the lowest at the other end of the spectrum, which would give 4 main Cats rainbow colour coded with the extremes bookending them in Black and White.
I think it is essential not to forget those who can’t produce a lot of power, to make these events/races inclusive for all. I think you will need at least 6 power groups. From my personal perspective, a “just” B rider producing 3.3 W/kg won’t get close to a fellow B rider at 3.9 W/kg. Being up front with classes rather than penalising after the race is the way to go. Looking forward to the changes. Do we really need women and men in separate events in a virtual world? Perhaps just have separate women’s results.…
I agree with the 6 power groups as I’m a “just” B rider also. Can’t compete with 3.8/3.9 riders.
I don’t think it’s quite that simple though, I’m one of the 3.8w/kg B riders, but at 66/67kg I really struggle to hold on to the back of the 100kg riders in crits on the flats/downhills.
With you on this one!
Since I got promoted to ‘just’ B, racing lost a lot of fun. Now my fitness levels have declined a bit and I’m back in C. Riding with the front group is so much better than soloing at the back. Hopefully having more power groups will help.
Also, I am glad Zwift is finally recognizing the problem, although it took them well over a year since they first said sandbagging is a priority for them… (I remember Eric Min talking about it in January or February 2020.)
Personally I’d like to see ZP (or any in-house alternative) integrated into the game and a results-based category system implemented. The ratings are already there on ZwiftPower and surely it must be possible to use this in order to categorise. I completely get that there are a number of strong riders who won’t as yet have ranking points due to the requirement of having to connect with ZP, but if this data was available for all by integrating it into the game, it seems like a no-brainer to use data that already exists.
I think that’s the long term aim. Integrating something like that into the core client is a huge piece of work though, when we look at how long other seemingly simple changes have taken.
Yep understood and completely recognise from the article that that is the long term goal. It’s a shame it is quite so cumbersome! I’ve personally experienced how demoralising being a *very very* low A rider can be with the sheer size (often 4-5.5) of the category and a lot of our riders experience similar in various categories, but I think especially with upgrades to B and A.
Same here, I stopped racing once I became an ‘A’ category. I was struggling to be competitive in a ‘B’ category race (top 20 at best in ZRL) but have now been put into no man’s land. It’s very difficult to categorise due to the vast range of abilities in Zwift, but a lot or most of the races I see or can race in are relatively flat or power courses which don’t suit me. Maybe a system which uses pure power output or W/kg depending on the course profile variability?
I was exactly the same as you and then waited to go back down to Cat B. For a while I was worried about being put back up, but I realised that I could avoid that by INCREASING my weight – just 2-3 kg above my real weight. This keeps my W/kg suppressed enough to stay in Cat B and still gives me a great challenge. For those of you just falling into the bottom of the next category and knowing that’s the best you’ll realistically get to, I totally recommend increasing your weight in Zwift. You can then compete fairly…
Categorization based on race rankings and being automatically dumped into a category (you don’t get to choose) that is determined based on what the event coordinator sets (they should be able to choose between (1) a range of race rankings, (2) size of each category, or (3) number of categories) would pretty much be my ideal and it sounds like they may be thinking along those lines. Automatically putting races in categories would decrease sandbagging and backend disqualifications (so you don’t have to go running over to ZP to figure out what your “real” finishing position was) – and probably…
How is it possible to participate in races when the riders use different smart trainers? How is it possible to have accurate results if the riders participate in the race from different places? We know how easy it is to transmit data from the trainer to another device and then on to the Zwift platform. The only solution is certified Zwift centers with equal equipment and live streaming.
Don’t throw the baby out with the bathwater.
Do some riders cheat? Sure. But most don’t do it. And even if someone does cheat, so what? As long as the racing is enjoyable, I don’t care if someone is only fooling themselves.
If they can make it so that circumstances like I saw yesterday, with A riders joining B and C grade races, are no longer possible, then that’s a great start.
This depends on your goal. For pro eSport you need comparability and you can already see examples of how they do that. For enjoyable racing, your line can be lower. Here we are talking about 1) Preventing sandbaggers affecting race outcomes by blowing apart the field from the gun 2) Making racing more accessible to more people. Let me give you a concrete example: I’ve been that C cat rider promoted to B. Being right at the bottom of a category is tough! But what makes it tougher is if there are sandbaggers who can drive big accelerations. I don’t…
Power classes are just the wrong solution full stop. I mean, sure, by all means make it a little bit less broken than it is at present by preventing people entering the wrong category, but trying to make the power classes more accurate or relevant is just turd-polishing. If it isn’t based on race performance, it’s designed to fail from the start.
I agree completely. The question is, would we rather have no changes whilst Zwift work on a proper robust rankings system, or some interim measures whilst they still work on a system in the background? I hope they do put the resources into a full rankings system, and I hope they communicate progress as it develops so it hits the mark.
The implementation of any specific result ranking system is orthogonal to using a result ranking system in the Zwift client and backend. There already is a ranking system implemented in ZwiftPower. It could be used to test and roll out changes in the Zwift client and backend. This could (and should) be done to allow event organizers to select between ZwiftPower Power Categories or ZwiftPower Individual Ranking. Implementation of a better (Elo-based hopefully) ranking system could then be done, and each event could select between ZwiftPower Power Categories, ZwiftPower Individual Ranking or ZwiftPower Ranking Elo Ranking etc. As all of…
Exactly. It was a bad system to implement from the start. It should have always been results based.
If it’s results based, the lower cats will still have a stream of riders showing up just to get their promotion. It should be a combination of both results and power data for initial categorisation.
If they make racing more fair, and therefore more fun, there will be more people doing it and filling up the categories.
Yes. You can see D riders racing along with Bs in many races. I’ve encountered this in real life too. Strong riders who believe they should start at the bottom and work their way up. It’s just a misunderstanding of Zwift’s category based racing in some cases I think. It must really make D racing very hard for genuine D riders.
In a lot of ELO systems, you get placed at the median when you start, so the bottom categories would only be filled with riders filtering down. Some more “serious” races could have extra requirements for the number of completed events so that miscategorized newbies wouldn’t ruin the experience for others.
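For readers who have not met Elo before, the sketch below shows the standard two-player Elo update stretched to a bike race by treating the finishing order as a set of head-to-head results. The pairwise extension, the starting rating of 1500, the K-factor of 32 and the function names are all assumptions made for illustration; nothing here is confirmed as the way Zwift or ZwiftPower would do it.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation that rider A beats rider B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_ratings(ratings, finish_order, k=32, start_rating=1500):
    """Return a new ratings dict after one race.

    ratings: dict of rider -> rating before the race.
    finish_order: list of riders, winner first.
    New riders are seeded at start_rating (an assumed "median" value).
    """
    new_ratings = dict(ratings)
    n = len(finish_order)
    if n < 2:
        return new_ratings  # nothing to compare against
    before = {r: ratings.get(r, start_rating) for r in finish_order}
    for i, rider in enumerate(finish_order):
        delta = 0.0
        for j, rival in enumerate(finish_order):
            if i == j:
                continue
            actual = 1.0 if i < j else 0.0  # 1 if rider finished ahead
            delta += actual - expected_score(before[rider], before[rival])
        # Average over opponents so field size does not inflate the swing.
        new_ratings[rider] = before[rider] + k * delta / (n - 1)
    return new_ratings


after = update_ratings({}, ["winner", "loser"])
print(after)  # {'winner': 1516.0, 'loser': 1484.0}
```

Two brand new riders both start at 1500, so the winner gains exactly what the loser gives up. Beating a much higher-rated rider moves you up more than beating someone rated below you, which is what makes deliberately racing down unattractive in a system like this.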
I definitely agree that initial race placement should be based on power profile (let’s say, being a D starts you out at a ZP ranking of 600, C level power starts you out at a ZP ranking of 500, etc.) and then you move up or down from there as your results indicate.
Can you explain? I don’t see how results can be compared across the variety of Zwift events. I understand why FTP and W/Kg isn’t perfect but I don’t understand how results based could be better.
What’s annoying is when the cats don’t move people who are clearly out of category. I’ve often seen B riders over 4.2 w/kg in their ZP profile and they’re still B! How does that happen? We need consistency, and not just 20-minute power, as some don’t race over 20 minutes so they stay B even though they’re over 5 w/kg.
For one, the current system isn’t only based on w/kg but also on total watts.
I was still cat A for probably 6 months after getting to 4.6 w/kg simply because I needed to get above 4.9 w/kg to reach an FTP of 300 watts.
If your weight is 58kg, you don’t move from B to A unless you get above 4.3 w/kg because of the 250+ watts requirement.
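The interaction between the W/kg limit and the raw watts floor is easier to see in code. This is a minimal sketch, not ZwiftPower’s actual logic: the 250 W floor for A, 200 W for B and 300 W for A+ come from comments in this thread, while the W/kg limits of 4.0, 3.2 and 2.5, the 150 W floor for C and the function name `category` are assumptions for illustration.

```python
# (category, minimum W/kg, minimum watts), strongest first.
# Some of these values are assumptions; see the note above.
CATEGORY_FLOORS = [
    ("A+", 4.6, 300),
    ("A", 4.0, 250),
    ("B", 3.2, 200),
    ("C", 2.5, 150),
]


def category(ftp_watts, weight_kg):
    """Return the highest category where BOTH floors are met, else D."""
    wkg = ftp_watts / weight_kg
    for name, min_wkg, min_watts in CATEGORY_FLOORS:
        if wkg >= min_wkg and ftp_watts >= min_watts:
            return name
    return "D"


print(category(244, 58))  # B: about 4.2 W/kg, but under the 250 W floor
print(category(255, 58))  # A: about 4.4 W/kg and over 250 W
print(category(290, 62))  # A: about 4.7 W/kg, but under 300 W for A+
```

A light rider therefore has to clear a higher W/kg than the headline number before moving up, which matches the 58 kg example above.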
And most zwift racers are 36kg
If a rider enters a lower cat then they should be shadow-banned for that race. This means they will see everything and race it as normal, BUT they will not get results, and the rest of the cat who are correctly seeded will not see that shadowed rider (they will remain invisible the entire race). It’s always the guys who race lower cats and claim to be “resting” or say they “haven’t ridden for a week”, the usual excuses, who are then the ones powering at the front, splitting the cat and dropping riders the whole race. These guys are what make…
I would just like this as an option. I’m just about in the A cat, but found racing in B much more fun. I’d quite happily race Bs if I could see them, but they couldn’t see me and I didn’t appear in results.
It’s still a more fun workout than trying to find an A race I can compete in.
What I would most like to see in racing is separate start times for each category
This already happens. It’s configurable for an event, so up to the event organiser to decide if it is a mass start or not.
https://zwiftpower.com/profile.php?z=62302 10-0 in D
Yeah, he’s a well known sandbagger.
When you see all that RED in his Normalized Power, it’s a dead giveaway he’s sandbagging.
This guy needs to be removed, as he’s just a troll in Zwift.
Read the rule book, #Noob.
Clearly a loser in their normal category trying to feel better about themself in a lower category. Oh wait, is that you?
I have been the king of B and C, and am now king of D at 100 kg. I ate my way there; I did the work.
I don’t think D cat needs further separation, and this coming from someone who just upgraded to C and was dead in D cat multiple times. Yes the power disparity between a bottom D and a top D is gargantuan, especially if you factor in that there are a lot of heavyweights in D so that you are often outpowered by 100w which is massive at that level. However there are not many D’s to begin with and people generally don’t stay in D for very long. I did the TFC Mad Monday series, which was split in 8…
One of the things the person from Zwift alluded to was a long term goal of setting a desired field density. That would be awesome. Whether they use race rankings (my preference) or some better power profile that takes into account things other than 20 minute power, shortly before the race starts, they just automatically break the entire race into fields grouping it so that each field has X number of racers who are all in the same grouping of rankings (or power). So you’re with the 20 (or 50 or whatever) people who are nearest to you in power…
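The “desired field density” idea described above can be sketched in a few lines: sort everyone who signed up by some strength score, then cut the list into consecutive fields of the target size. This is only an illustration under my own assumptions. The function name `split_into_fields`, the “higher score means stronger” convention and the rule that folds a very small last group into the previous one are all hypothetical choices.

```python
def split_into_fields(scores, field_size):
    """Group riders into consecutive fields of roughly field_size.

    scores: dict of rider -> strength score (higher = stronger; this could
    be a race ranking or a power-based number).
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    fields = [
        ordered[i:i + field_size]
        for i in range(0, len(ordered), field_size)
    ]
    # Avoid a tiny last field: if it is under half the target size,
    # fold it into the field above it.
    if len(fields) > 1 and len(fields[-1]) * 2 < field_size:
        tail = fields.pop()
        fields[-1].extend(tail)
    return fields


signups = {"a": 10, "b": 9, "c": 8, "d": 7, "e": 6, "f": 5, "g": 4}
print(split_into_fields(signups, 3))
# [['a', 'b', 'c'], ['d', 'e', 'f', 'g']]
```

Because the cut points depend on who actually turns up, there is no fixed category boundary for a rider to sit just underneath.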
This is the dream. See the matchmaking article
I agree and think it’s a dream that a lot of us share. I would prefer it as a toggle-able option that event organizers could choose, so that someone like James Bailey (above) could choose that all of his races would be in categories set by 100-level chunks of race rankings so that the folks who raced in each of the 7 different occurrences of the Herd Summer Racing Series each weekend in each category would be at roughly the same level as people racing in the same category in a different occurrence of that race that weekend and people would…
Glad to hear that the Zwift team are looking into this! I completely agree that the end goal should be a results based ranking system. If an intermediary solution is an update to the power based system then it has to factor in raw power as well as power to weight ratio (beyond the one cut for each category of requiring 250W for A, 200W for B etc.). Light weight riders, exceeding these minimum raw power outputs, can’t compete. Be it on a flat or hilly course, heavier riders go faster for the same w/kg, which the current system is…
What is wrong with USAC/UCI Ranking? Win races and CAT UP! End of story.
I bounce between A/A+. I’m a legit 68kg rider from Australia. I find in A grade 99% of racers are also legit and it’s generally pretty easy to spot a cheat at the top end so it doesn’t bother us too much. When we’re all flying along at 4.5w/kg and a cheat bursts off the front and sits on 6w/kg we generally bag him out and flag, if they make it through the race they’re never on ZwiftPower. If you’re serious about results that’s where you look. I’m amazed how few C and D grade riders use ZwiftPower, it’s…
Yeah, but up in the As, you don’t really have to deal with sandbaggers splitting your field and blowing things up because you’re already at the top (unless you count an A+ sneaking into an A-only field). There really need to be mandatory minimums for which category you can choose (or they should make you race as a ghost no one else can see if you choose to race below your category). At least in the D’s and then C’s where I’ve raced, sandbaggers are a huge problem and actual cheaters are much less visible. Sure, we should just let…
This is the first category D. Can anyone explain?
No 20 min race in the last 90 days, that’s why there is a D there. The one ride he did that was more than 20 min was at 122 watts or so. 95%
Thanks, then that’s right; however, it is strange.
A race is a race. How about everyone is equal at the start, and your category and placing for each is set in the finishing results. In short, your effort determines your category. I don’t know about the rest of you, but some days I feel like an A, while others I’d be struggling just holding wheel in B. There again, there would be the sandbagging issue, so perhaps we should be watching the avg heart rate of each rider more than avg power. If you’re putting in the effort, and laboring, you wouldn’t really be sandbagging, right?
I hope they do *finally* do something to improve the Cat system and sandbagging and people joining the wrong Cat in races, however I really don’t understand why we are STILL at a point where people can even choose to join lower Cats! I understand they don’t want to put off the more casual racer/rider however surely it would be very very very simple to simply have the existing Cats (for now) plus 1 new Cat of “NR” for “Non Ranking” or whatever you want to call it whereby when joining a race you can freely join the NR Cat…