Recently, I ran into an annoying glitch: every now and then, all other Zwifters would vanish from the screen, and the “Zwifters nearby” panel on the right-hand side would disappear as well.
This seemed to happen in almost every Tour de Zwift event that my son or I rode. It felt like a disaster: you can no longer draft, you lose your speed, and by the time other riders reappear you find yourself dropped and far behind.
Zwift attributes this to a poor Internet connection. Although this makes sense, it did not sound like the whole story. What does “poor Internet connection” mean and how do I fix it? I could browse the Internet on the same computer that runs Zwift while my son was riding and complaining about the issue. Ride Ons were coming through while other riders were gone. If the Internet was gone, nothing should work!
This warranted an investigation. I would like to share this with other Zwifters, since it may help you if you’re encountering similar issues.
Using Zwiftalizer to Spot Network Problems
First of all, I found that uploading my Zwift log file to Zwiftalizer.com is a quick and easy way to test if there were any significant internet dropouts. Zwiftalizer generates a graph for network performance at the very bottom of the report, way below the graphs with ANT+ and Bluetooth signals. If your Internet connection had outages, the network performance graph will show multiple red bars.
The example below is a simulation of a poor Internet connection which I created by disconnecting the network cable for short periods of time and connecting it back. My Zwift log did not record a 2-second disconnect, but was able to detect and save in the log file every disconnect that lasted 4 seconds or longer. Every time the network was interrupted for more than 6 seconds, all other riders would disappear by the end of the 6th second. A 4-second interruption had no visible impact on Zwift.
But what if this log does not show any network errors? What if it shows a perfectly straight horizontal line? Below is the log from a Tour de Zwift ride when I lost all other Zwifters for nearly 10 minutes – and yet, nothing appears to be wrong with my Internet!
The Latency Dead End
I sent a message to Zwift support and asked for guidance. They brought to my attention that they found UDP warning messages in the Zwift log file and recommended that I check my network latency using a generic Internet speed testing web site.
I was still asking myself how latency could be relevant to this issue (after all, Zwift may run for as long as 6 seconds without any incoming data before riders disappear, which would be a huge latency!) but I decided to check anyway. And I had an even better idea: my firewall log shows the IP address of every server Zwift talks to. Why not ping one of those servers (preferably one that sends UDP data)?
I was lucky to find that one of several cloud computing servers to which Zwift talks during a session was accepting pings (the others did not). I found no issues with latency: it was around 23 ms for a round trip to a Zwift server (this is a good, low number). So, the latency hypothesis was a dead end. Yet, I found something else: 2.1% of pings were lost along the way and never came back – which was a bit alarming! (In retrospect, it turned out that packets were lost within my home network and I could get the same result by pinging any random web site).
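To make the two numbers from this test concrete, here is a minimal Python sketch of how round-trip latency and ping loss can be summarized from a list of ping results. The function name summarize_pings and the sample data are illustrative assumptions, not output from PingPlotter; the sample is simply built to contain 2.1% lost pings at about 23 ms.

```python
from statistics import mean
from typing import Optional, Sequence


def summarize_pings(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize ping results; None marks a ping that never came back."""
    sent = len(rtts_ms)
    replies = [r for r in rtts_ms if r is not None]
    lost = sent - len(replies)
    return {
        "sent": sent,
        "lost": lost,
        "loss_pct": 100.0 * lost / sent if sent else 0.0,
        "avg_rtt_ms": mean(replies) if replies else None,
    }


# Hypothetical sample: 1000 pings, 21 of them lost, the rest at 23 ms.
sample = [23.0] * 979 + [None] * 21
print(summarize_pings(sample))
```

Low average latency and non-zero loss can coexist, which is why a speed test that reports only latency may miss the problem.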
An Important Discovery
There was one more unexpected and important finding, this time from the log of the firewall running on my PC: my own firewall was rejecting some of the data packets coming from Zwift servers! The fraction of rejected packets varied. Most of the time it was low, a fraction of a percent, but in the moments before losing other Zwifters it was particularly high: I estimated that as much as 4.6% of Zwift’s data packets were rejected!
With this information in hand, a picture of the root cause of my problem started emerging. My Internet connection was stable if measured on the scale of seconds and minutes, but a fraction of data packets were getting lost. It seems Zwift is sufficiently robust to continue operating without issues as long as the fraction of lost packets remains below a certain threshold, perhaps a couple of percent. But once the cumulative losses become high enough, the problem appears!
It became clear that I needed to find a way to fix one or both of the two sources of data loss: the issue caused by my wireless network and the issue caused by my firewall.
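The idea of a loss threshold can be sketched in a few lines of Python. This only illustrates the hypothesis above and is not how Zwift actually works: the 200-packet window, the 2% threshold, and the made-up packet trace are all assumptions.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def windowed_loss(outcomes: Iterable[bool], window: int = 200) -> Iterator[Tuple[int, float]]:
    """Yield (packet index, loss fraction over the last `window` packets)."""
    recent = deque(maxlen=window)
    lost_in_window = 0
    for i, received in enumerate(outcomes):
        if len(recent) == window and not recent[0]:
            lost_in_window -= 1          # the oldest packet is about to fall out
        recent.append(received)
        if not received:
            lost_in_window += 1
        yield i, lost_in_window / len(recent)


THRESHOLD = 0.02  # assumed "couple of percent" tolerance, not a known Zwift value

# Hypothetical trace: 1 packet in 200 lost at first, then 1 in 20 during a bad spell.
trace = [i % 200 != 0 for i in range(1, 1001)] + [i % 20 != 0 for i in range(1, 401)]
first_bad = next((i for i, loss in windowed_loss(trace) if loss > THRESHOLD), None)
print("loss first exceeds the threshold at packet", first_bad)
```

In this trace the connection never drops completely, yet the windowed loss climbs past the threshold shortly after the bad spell begins, which matches the pattern of a "stable" connection with disappearing riders.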
Fixing My Network Losses
I have a fairly complex wireless system at my home. In my home office, I use a separate router in the bridge mode to be able to connect my work phone which requires ethernet. The same bridge feeds Internet via a short ethernet cable to the computer which runs Zwift.
I ran a sequence of tests and found that data loss only happens when I use that bridge. If I connect my PC directly to the cable modem using a long ethernet cable (instead of wi-fi), my latency drops by about 2-3 ms, while packet losses disappear. If I connect the PC directly to wi-fi to circumvent the bridge router, packet loss also disappears. Hence, the problem with my home wi-fi turned out to be easy to solve. I did not even need to hardwire my PC to the cable modem! My best guess is that this happened because my bridge router sits under the desk in the office, in an area with a relatively poor wi-fi signal (around -73 dBm, per WiFi Analyzer).
The second source of losses, related to the firewall, is a little harder to understand…
Fixing Firewall Losses
TCP vs UDP: a Zwift Primer
Before I explain what I changed to fix that second issue, let me take a step back and discuss a couple of technical questions, so that you are on the same page with me. First of all, how is it possible that a fraction of data packets gets lost, but the log file (graphed by Zwiftalizer) shows a stable internet connection? This is where it becomes technical.
Zwift uses two Internet connection protocols. They do not disclose what data they send and how they send them, but it appears that the TCP protocol, the more reliable of the two, is likely used for things like connections to databases, retrieving routes, uploading rides and pictures, text messaging, and Ride-Ons. The second protocol used by Zwift is UDP, which Zwift appears to use to send your current position and power to the servers and to get information about nearby riders back from the server. These data packets are sent back and forth continuously, so Zwift can show the true positions of other Zwifters at every moment.
Zwiftalizer analyzes and graphs only the status of TCP connections. Since TCP requires two-way negotiations with confirmation of data delivery and includes the option of resending lost data packets, Zwift very quickly learns about connectivity issues and creates an error message in the log.
UDP, on the other hand, is one-way communication without confirmation of data receipt. Therefore, it may take longer for Zwift to notice an issue. When it gets data at a slower pace, or gets none at all for a short time, it cannot determine if the network is not working, a Zwift server is not working, or if data packets are just delayed because a server is overloaded and cannot keep up with the demand.
A comparison of my firewall log with Zwift’s log suggests that each time Zwift suspects the UDP connection is poor, it tries connecting to a backup server. If that does not help to restore full connectivity, it creates a somewhat obscure warning message in the log file which looks like this:
This warning message is the indication of a potentially significant UDP data flow interruption. If one or two data packets are lost every now and then, Zwift does not seem to care.
These messages are a valuable tool for troubleshooting. Just open your log file in any text editor and search for “UDP”.
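If you prefer a script to a text editor, the same search can be sketched in Python. The function names below are hypothetical, and the log path is left as a parameter since it depends on where your copy of Zwift keeps its log files.

```python
from typing import Iterable, List, Tuple


def find_udp_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, text) for every line that mentions UDP."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if "UDP" in line
    ]


def print_udp_lines(log_path: str) -> None:
    """Print every UDP-related line of the given log file with its line number."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for number, text in find_udp_lines(log):
            print(f"{number}: {text}")


# Example call (replace the argument with the path to your own log file):
# print_udp_lines("Log.txt")
```

The line numbers make it easier to compare the timing of these warnings with entries in a firewall log.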
Stateful Firewalls: the Basics
Back to the firewall. How is it possible that a firewall would reject some packets but not others, if they come from the same server? I have a hypothesis for that. Many, if not all, consumer and small business-grade firewalls by default allow only traffic that they can associate with an application. They maintain so-called state tables (also known as session tables) for TCP and UDP to quickly decide if they should pass or block each incoming data packet, without scrutinizing the contents of the data packets for signs of malware.
When the local Zwift program sends a UDP data packet to an external server, the server’s IP address and port are added to the state table. When the firewall gets an incoming UDP packet, it searches the state table for a matching IP address and port. If it finds a matching entry, it decides that it is a legitimate data packet and lets it through.
If there was no traffic for a certain time (which can be very short, perhaps just a few seconds or less – the higher the firewall security level, the shorter the time), the entry is deleted from the state table. This is a safety measure, so that no other program could hijack the traffic.
After that, the firewall will start rejecting further incoming UDP packets until a new UDP entry is created in the state table. This matches what I see in the firewall log – UDP packets from remote port 3022 are rejected as “IP traffic”, which means the firewall can no longer associate them with a specific application on the computer. This is a way to say that there is no matching entry in the state table.
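The hypothesis above can be illustrated with a toy model in Python. This is a sketch of the general idea only, not the logic of any real firewall product: the class name, the 5-second idle timeout, and the server address (taken from the IP range reserved for documentation) are all assumptions.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (remote IP, remote port)


class UdpStateTable:
    """Toy model of a stateful firewall's UDP state table."""

    def __init__(self, idle_timeout_s: float) -> None:
        self.idle_timeout_s = idle_timeout_s
        self._last_seen: Dict[Endpoint, float] = {}

    def outbound(self, remote: Endpoint, now: float) -> None:
        """An outgoing packet creates or refreshes the entry for `remote`."""
        self._last_seen[remote] = now

    def inbound_allowed(self, remote: Endpoint, now: float) -> bool:
        """Allow an incoming packet only if a fresh matching entry exists."""
        last = self._last_seen.get(remote)
        if last is None or now - last > self.idle_timeout_s:
            self._last_seen.pop(remote, None)   # expired entries are deleted
            return False
        self._last_seen[remote] = now           # assumed: any traffic keeps the entry alive
        return True


fw = UdpStateTable(idle_timeout_s=5.0)       # made-up timeout
server = ("192.0.2.10", 3022)                # placeholder address, not a real Zwift server
fw.outbound(server, now=0.0)
print(fw.inbound_allowed(server, now=1.0))   # entry is fresh
print(fw.inbound_allowed(server, now=9.0))   # 8 s of silence: entry has expired
fw.outbound(server, now=9.5)
print(fw.inbound_allowed(server, now=10.0))  # a new outgoing packet restores the entry
```

Run as written, this prints True, False, True: the same server is accepted, rejected, and accepted again, depending only on how long the entry sat idle.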
Setting Up a Firewall for Zwift UDP Traffic
Zwift knows about the potential impact of issues with UDP traffic and provides the following guidance to users:
Most users ignore these recommendations: “I installed Zwift, it works, so everything is good with the firewall, right?”
As I learned the hard way, apparently not! In my case, my firewall did not let ALL traffic through. It let MOST of the traffic through, with some exceptions. The good news is, it is easy to fix. Instead of relying on generic application-related rules to recognize Zwift-related traffic, one can create a special firewall rule to let it all through.
A special note is due here: Zwift could have done a better job explaining what needs to be done.
For one thing, they only made a reference to the router firewall, but did not mention a firewall that runs on the computer. A firewall on the router may block Zwift completely, but it is very unlikely that it would create intermittent issues. Router firewalls, to the best of my knowledge, do not use application traffic rules. If Zwift works most of the time, your router firewall is “out of the equation” – but the firewall on your computer is not!
Secondly, the Zwift team was not sufficiently precise about which port to open. They left one word out. Internet traffic runs between a remote port on the server and a local port on the computer. They usually have different numbers. Zwift meant to say that your firewall should allow all incoming UDP traffic from remote port 3022. The local port can be almost anything: the Zwift application changes it every time it resets the UDP connection, and it is never 3022.
Each firewall has its own interface, and the sequence of steps required to create a new rule varies from one to the next. Windows Defender Firewall is very common because it comes standard with Windows. Defender makes it fairly difficult to configure the rule required for Zwift, but here is how it is done:
- Open Windows Defender Firewall
- Click on “Advanced Settings”
- Click on “Inbound Rules”
- Click on “New Rule”
- Pick the second radio button on the list, “Port”, and click “Next >”
- Pick radio buttons “UDP” and “All Local Ports” and click on “Next >”
- Pick radio button “Allow the connection” and click on “Next >”
- Leave “Domain”, “Private” and “Public” boxes checked and click on “Next >”
- Type in a description of your new rule into the “Name” field. For example, you could type “Zwift – allow inbound UDP traffic from remote port 3022”. Click “Finish”.
Now, your newly created rule will appear in the list of Inbound Rules. But we are not done yet! We still need to add the external port to the rule:
- Right-click on this new rule which you just created.
- Choose Properties from the context menu. This will open a tabbed window.
- Find the tab “Protocols and Ports”. Everything in this tab looks like it is greyed out, but in fact you can edit it.
- Change Remote Port to “Specific Ports” and type in “3022” on the line below. Then click OK.
The picture below shows what you should see on this last step. If you use a firewall from a different vendor, you should see something similar, although the steps to get there may differ.
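For readers who prefer the command line, the rule described above can probably also be created with the netsh tool that ships with Windows. The Python sketch below only builds and prints the command; the rule name and function names are made up for illustration, and the command should be checked against your own Windows version before you rely on it. Running it for real requires an administrator prompt.

```python
import subprocess
from typing import List


def build_zwift_rule_command(
    rule_name: str = "Zwift-inbound-UDP-remote-3022",  # hypothetical name, no spaces
    remote_port: int = 3022,
) -> List[str]:
    """Build a netsh command that mirrors the GUI steps above."""
    return [
        "netsh", "advfirewall", "firewall", "add", "rule",
        f"name={rule_name}",
        "dir=in",                      # inbound rule
        "action=allow",                # "Allow the connection"
        "protocol=UDP",
        f"remoteport={remote_port}",   # remote port only; local ports stay unrestricted
    ]


def apply_rule() -> None:
    """Create the rule; needs Windows and an elevated (administrator) prompt."""
    subprocess.run(build_zwift_rule_command(), check=True)


# Print the command for review instead of running it.
print(" ".join(build_zwift_rule_command()))
```

The command deliberately sets no local port, so the rule keeps matching even though Zwift changes its local port every time it resets the UDP connection.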
Wrapping It Up
I run Zwift on a computer with a third-party firewall. I cannot speak to every existing firewall, and I cannot tell whether all of them, or many of them, are likely to create this issue. The expiration time of a UDP state table entry may vary widely from one firewall to another. The firewall log is an easy way to find out whether any traffic from remote port 3022 has been rejected.
My experience so far has been that implementation of this rule alone, even without a change to my wi-fi configuration, seems to have fixed the issue. Before the change, we saw rider drop-outs in almost every TdZ race. After the rule was put in place: four races, several rides, and no issues!
So here we are: one persistent and intermittent issue, a week of investigations, two root causes, fixes found for both of them, and everything is now running smoothly. Back to racing!
P.S. A year ago, Eric published an article on a similar topic: how to ride all alone in Zwift. He intentionally blocked UDP connections by blocking outgoing traffic to remote port 3022. This stopped the local Zwift client from establishing a UDP connection with the servers and thereby stopped both the sharing of his information with other riders and of other riders’ information with him. In my case, I intentionally unblocked incoming traffic from remote port 3022.
There is no harm in also explicitly allowing outgoing traffic headed to remote UDP port 3022, but I did not see any firewall issues with outgoing traffic and did not see a need for yet another rule.
P.P.S. The tools which I used for troubleshooting:
- Zwift log files
- PingPlotter 5 (This is commercial software. It automatically sends pings every few seconds, plots a graph of latency, and marks lost pings. A fully functional trial version is available.)
- Firewall log files
- “WiFi Analyzer” program on my Android phone to check signal levels in my home wi-fi network. Similar programs are available for iOS as well.