It's the Latency, Stupid

Stuart Cheshire, May 1996.

(Revised periodically)

Copyright © Stuart Cheshire 1996-2001

Years ago David Cheriton at Stanford taught me something that seemed very obvious at the time -- that if you have a network link with low bandwidth then it's an easy matter of putting several in parallel to make a combined link with higher bandwidth, but if you have a network link with bad latency then no amount of money can turn any number of them into a link with good latency.

It's now many years later, and this obvious fact seems lost on most companies making networking hardware and software for the home. I think it's time it was explained again in writing.

Fact One: Making more bandwidth is easy.

Imagine you live in a world where the only network connection you can get to your house is a 33kbit/sec modem running over a telephone line. Imagine that this is not enough for your needs. You have a problem.

The solution is easy. You can get two telephone lines, and use them together in parallel, giving you a total of 66kbit/sec. If you need even more you can get ten telephone lines, giving you 330kbit/sec. Sure, it's expensive, and having ten modems in a pile is inconvenient, and you may have to write your own networking software to share the data evenly between the ten lines, but if it was important enough to you, you could get it done.

It may not be cheap, but at least it's possible.

People with ISDN lines can already do this. It's called "bonding" and it uses two 56 (or 64) kbit/sec ISDN channels in parallel to give you a combined throughput of 112 (or 128) kbit/sec.

Fact Two: Once you have bad latency you're stuck with it.

If you want to transfer a large file over your modem it might take several seconds, or even minutes. The less data you send, the less time it takes, but there's a limit. No matter how small the amount of data, for any particular network device there's always a minimum time that you can never beat. That's called the latency of the device. For a typical Ethernet connection the latency is usually about 0.3ms (milliseconds -- thousandths of a second). For a typical modem link the latency is usually about 100ms, about 300 times worse than Ethernet.

If you wanted to send ten characters over your 33kbit/sec modem link you might think the total transmission time would be:

80 bits / 33000 bits per second = 2.4ms.

but it isn't. It takes 102.4ms because of the 100ms latency introduced by the modems at each end of the link.

If you want to send a large amount of data, say 100K, then that takes 25 seconds, and the 100ms latency isn't very noticeable, but if you want to send a smaller amount of data, say 100 bytes, then the latency is more than the transmission time.
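The arithmetic above can be sketched in a few lines of Python. This is only an illustration using the figures from this article (a 33kbit/sec modem with 100ms latency); the function names are made up for the example, and "100K" is assumed here to mean 100,000 bytes.

```python
# Figures taken from the text: a 33 kbit/sec modem with 100 ms of latency.
LINK_BITS_PER_SEC = 33_000
LATENCY_SEC = 0.100


def transmission_time(num_bytes, bits_per_sec=LINK_BITS_PER_SEC):
    """Time spent clocking the bits onto the line, in seconds."""
    return num_bytes * 8 / bits_per_sec


def delivery_time(num_bytes, bits_per_sec=LINK_BITS_PER_SEC, latency=LATENCY_SEC):
    """Fixed latency plus transmission time, in seconds."""
    return latency + transmission_time(num_bytes, bits_per_sec)


def break_even_bytes(bits_per_sec=LINK_BITS_PER_SEC, latency=LATENCY_SEC):
    """Message size at which transmission time equals the latency."""
    return latency * bits_per_sec / 8


# Sizes are illustrative: ten characters, 100 bytes, and "100K" (assumed 100,000 bytes).
for size in (10, 100, 100_000):
    print(f"{size:>7} bytes: transmit {transmission_time(size) * 1000:9.1f} ms, "
          f"deliver {delivery_time(size) * 1000:9.1f} ms")
print(f"break-even size: {break_even_bytes():.1f} bytes")
```

With these figures, any message smaller than roughly 400 bytes spends more time waiting on latency than it spends being transmitted.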

Why would you care about this? Why do small pieces of data matter? For most end-users it's the time it takes to transfer big files that annoys them, not small files, so they don't even think about latency when buying products. In fact if you look at the boxes modems come in, they proudly proclaim "14.4 kbps", "28.8 kbps" and "33.6 kbps", but they don't mention the latency anywhere. What most end-users don't know is that in the process of transferring those big files their computers have to send back and forth hundreds of little control messages, so the performance of small data packets directly affects the performance of everything else they do on the network.

Now, imagine the same scenario as before. You live in a world where the only network connection you can get to your house is a modem running over a telephone line. Your modem has a latency of 100ms, but you're doing something that needs lower latency. Maybe you're trying to do computer audio over the net. 100ms may not sound like very much, but it's enough to cause a noticeable delay and echo in voice communications, which makes conversation difficult. Maybe you're trying to play an interactive game over the net. The game only sends tiny amounts of data, but that 100ms delay is making the interactivity of the game decidedly sluggish.

What can you do about this?

Nothing.

You can compress the data, but it doesn't help. It was already small to start with, and that 100ms latency is still there.

You can get 80 phone lines in parallel, and send one single bit over each phone line, but that 100ms latency is still there.

Once you've got yourself a device with bad latency there's absolutely nothing you can do about it (except throw out the device and get something else).
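Here is a rough sketch of why extra phone lines don't help. It assumes an idealised setup in which the bits are split evenly across identical 33kbit/sec lines with no overhead for splitting or reassembly; the function name and parameters are invented for the illustration.

```python
def bonded_delivery_time(num_bits, num_lines, bits_per_sec=33_000, latency=0.100):
    """Delivery time when the bits are split evenly across identical lines.

    Every line still imposes its own fixed latency, so only the
    transmission term shrinks as lines are added.
    """
    bits_per_line = -(-num_bits // num_lines)  # ceiling division
    return latency + bits_per_line / bits_per_sec


# Ten characters (80 bits) sent over 1, 2, 10 and 80 lines in parallel.
for lines in (1, 2, 10, 80):
    t = bonded_delivery_time(80, lines)
    print(f"{lines:>2} line(s): {t * 1000:.2f} ms")
```

However many lines are added, the result never drops below the 100ms floor. Only the small transmission term gets any smaller.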

Fact Three: Current consumer devices have appallingly bad latency.

A typical Ethernet card has a latency less than 1ms. The Internet backbone as a whole also has very good latency. Here's a real-world example:

So the Internet is doing pretty well. It may get better with time, but we know it can never beat the speed of light. In other words, that 85ms round-trip time to Boston might reduce a bit, but it's never going to beat 43ms. The speed's going to get a bit better, but it's not going to double. We're already within a factor of two of the theoretical optimum. I think that's pretty good. Not many technologies can make that claim.

Compare this with a modem. Suppose you're 18km from your ISP (Internet Service Provider). At the speed of light in fibre (or the speed of electricity in copper, which is about the same) the latency should be:

18000m / (180 x 10^6 m/s) = 0.1ms

The latency over your modem is actually over 100ms. Modems are currently operating at a level that's 1000 times worse than the speed of light. They have a long way to go before they get close to what the rest of the Internet is achieving.
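The comparison can be checked with a short calculation. This is a sketch using only the figures quoted here (18km to the ISP, a signal speed of 180 x 10^6 m/s, and 100ms of modem latency); the constant and function names are invented for the example.

```python
SIGNAL_SPEED_M_PER_SEC = 180e6   # figure used in the text for fibre or copper
DISTANCE_TO_ISP_M = 18_000       # 18 km, as in the example
MODEM_LATENCY_SEC = 0.100        # modem latency quoted in the text


def propagation_delay(distance_m, speed=SIGNAL_SPEED_M_PER_SEC):
    """One-way time for a signal to cover distance_m, in seconds."""
    return distance_m / speed


ideal = propagation_delay(DISTANCE_TO_ISP_M)
print(f"propagation delay: {ideal * 1000:.1f} ms")
print(f"modem latency is about {MODEM_LATENCY_SEC / ideal:.0f}x the propagation delay")
```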

Of course no modem link is ever going to have a latency of 0.1ms. I'm not expecting that. The important issue is the total end-to-end transmission delay for a packet -- the time from the moment the software makes the call to send the packet to the moment the last bit of the packet arrives at the destination and the packet is delivered to the software at the receiving end. The total end-to-end transmission delay is made up of fixed latency (including the speed-of-light propagation delay), plus the transmission time. For a 36 byte packet the transmission time is 10ms (the time it takes to send 288 bits at a rate of 28800 bits per second). When the actual transmission time is about 10ms, working to make the latency 0.1ms would be silly. All that's needed is that the latency should not be so huge that it completely overshadows the transmission time. For a modem that has a transmission rate of 28.8kb/s, a sensible latency target to aim for is about 5ms.

Fact Four: Making limited bandwidth go further is easy.

If you know you have limited bandwidth, there are many techniques you can use to reduce the problem.

Compression

If you know you have limited bandwidth, compression is one easy solution.

You can apply general purpose compression, such as gzip, to the data.

Even better, you can apply data-specific compression, because that gets much higher compression ratios. For example, still pictures can be compressed with JPEG, Wavelet compression, or GIF. Moving pictures can be compressed with MPEG, Motion JPEG, Cinepak, or one of the other QuickTime codecs. Audio can be compressed with uLaw, and English text files can be compressed with dictionary-based compression algorithms.

All of these compression techniques trade off use of CPU power in exchange for lower bandwidth requirements. There's no equivalent way to trade off use of extra CPU power to make up for poor latency.

All modern modems have compression algorithms built-in. Unfortunately, having your modem do compression is nowhere near as good as having your computer do it. Your computer has a powerful, expensive, fast processor in it. Your modem has a feeble, cheap, slow processor in it. There's no way your modem can compress data as well or as quickly as your computer can. In addition, in order to compress data, your modem has to hold on to the data until it has a block that's big enough to compress effectively. That adds latency, and once added, there's no way for you to get rid of latency. Also, the modem doesn't know what kind of data you are sending, so it can't use the superior data-specific compression algorithms. In fact, since most images and sounds on Web pages are compressed already, the modem's attempts to compress the data a second time are futile, and just add more latency without giving any benefit.

This is not to say that having a modem do compression never helps. In the case where the host software at the endpoints is not very smart, and doesn't compress its data appropriately, then the modem's own compression can compensate somewhat for that deficiency and improve throughput. The point is that modem compression only helps dumb software, and it actually hurts smart software by adding extra delay. For someone planning to write dumb software this is no problem. For anyone planning to write smart software this should be a big cause for concern.

Bandwidth-conscious code

Another way to cope with limited bandwidth is to write programs that take care not to waste bandwidth.

For example, to reduce packet size, wherever possible Bolo uses bytes instead of 16-bit or 32-bit words.

For many kinds of interactive software like games, it's not important to carry a lot of data. What's important is that when the little bits of data are delivered, they are delivered quickly. Bolo was originally developed running over serial ports at 4800 bps and could support 8 players that way. Over 28.8 modems it can barely support 2 players with acceptable response time. Why? A direct-connect serial port at 4800 bps has a transmission delay of 2ms per byte, and a latency that is also 2ms. To deliver a typical ten byte Bolo packet takes 22ms. A 28800 bps modem has a transmission delay of 0.28ms per byte, but a latency of 100ms, 50 times worse than the 4800 bps serial connection. Over the 28.8 modem, it takes 103ms to deliver a ten byte packet.
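The Bolo comparison is easy to reproduce. The sketch below uses the per-byte delays and latencies exactly as quoted above; the dictionary layout and function name are just one way to write it down.

```python
# Per-byte transmission delay and fixed latency, in milliseconds,
# exactly as quoted in the text.
LINKS = {
    "4800 bps serial": {"ms_per_byte": 2.0, "latency_ms": 2.0},
    "28.8 modem": {"ms_per_byte": 0.28, "latency_ms": 100.0},
}


def packet_delivery_ms(num_bytes, ms_per_byte, latency_ms):
    """Fixed latency plus per-byte transmission delay, in milliseconds."""
    return latency_ms + num_bytes * ms_per_byte


# A typical ten byte Bolo packet over each link.
for name, link in LINKS.items():
    t = packet_delivery_ms(10, **link)
    print(f"{name}: {t:.1f} ms for a ten byte packet")
```

The modem moves each byte about seven times quicker, yet the packet as a whole arrives more than four times later.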

Send less data

A third way to cope with limited bandwidth is simply to send less data.

If you don't have enough bandwidth to send high resolution pictures, you can use lower resolution.

If you don't have enough bandwidth to send colour images, you can send black and white images, or send images with dramatically reduced colour detail (which is actually what NTSC television does).

If you don't have enough bandwidth to send 30 frames per second, you can send 15fps, or 5fps, or fewer.

Of course these tradeoffs are not pleasant, but they are possible. You can either choose to pay more money to run multiple circuits in parallel for more bandwidth, or you can choose to send less data to stay within the limited bandwidth you have available.

If the latency is not good enough to meet your needs you don't have the same option. Running multiple circuits in parallel won't make your latency any better, and sending less data won't improve it either.

Caching

One of the most effective techniques throughout all areas of computer science is caching, and that is just as true in networking.

If you visit a Web site, your Web browser can keep a copy of the text and images on your computer's hard disk. If you visit the same site again, all your Web browser has to do is check that the copies it has stored are up to date -- i.e. check that the copies on the Web server haven't been changed since the date and time the previous copies were downloaded and cached on the local disk.

Checking the date and time a file was last modified is a tiny request to send across the network. This kind of request is so small that the throughput of your modem makes no difference -- latency is all that matters.

Recently companies have started providing CDROMs of entire Web sites to speed Web browsing. When browsing these Web sites, all the Web browser has to do is check the modification date of each file it accesses to make sure that the copy on the CDROM is up to date. It only has to download files that have changed since the CDROM was made. Since most of the large files on a Web site are images, and since images on a Web site change far less frequently than the HTML text files, in most cases very little data has to be transferred.

Since for the most part the Web browser is only doing small modification date queries to the Web server, the performance the user experiences is entirely dominated by the latency of the connection, and the throughput is virtually irrelevant.
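To get a feel for this, here is a toy model of a browser checking a page's cached files one after another. The assumptions are mine, not measurements: 20 files per page, 100 bytes exchanged per check, and one fixed latency cost per check. The modem and ISDN latencies are the ones quoted elsewhere in this article, and the "10x capacity" link is purely hypothetical.

```python
def revalidation_time(num_files, bits_per_sec, latency_sec, bytes_per_check=100):
    """Total time to check num_files cached files one after another.

    Each check is modelled as one fixed latency cost plus the time to
    transmit a small request and reply (bytes_per_check in total).
    """
    per_check = latency_sec + bytes_per_check * 8 / bits_per_sec
    return num_files * per_check


# (description, capacity in bits/sec, latency in seconds)
links = [
    ("28.8 modem, 100 ms latency", 28_800, 0.100),
    ("10x capacity, 100 ms latency (hypothetical)", 288_000, 0.100),
    ("64k ISDN, 10 ms latency", 64_000, 0.010),
]

for name, rate, latency in links:
    print(f"{name}: {revalidation_time(20, rate, latency):.2f} s for 20 checks")
```

Under these assumptions, ten times the capacity trims the total only modestly, while the ISDN link's lower latency cuts it several-fold.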

Another analogy

Even smart people have trouble fully grasping the implications of these latency issues. It's subtle stuff.

The Cable TV industry is hyping "cable modems" right now, claiming that they're "1000 times 'faster' than a telephone modem." Given the lack of public awareness of the importance of latency, I wouldn't be in the least surprised if many of them have latency that is just as bad, or maybe even worse, than telephone modems. (The results from some early prototype cable modems, however, look quite promising. Let's hope the production ones are as good.)

Another device in a similar position is the DirecPC satellite dish, which is supposed to be "14 times faster than a 28.8 modem". Is it really? Here are some excerpts of what Lawrence Magid had to say about it in his article in the San Jose Mercury News (2nd February 1997):

The system is expensive, requires a relatively elaborate installation and configuration and, in the end, doesn't necessarily speed up your access to the World Wide Web.

I set up two nearly identical PCs side by side. One was connected to the Net at 28.8kbps and the other with DirecPC. In most cases the satellite system displayed Web pages a bit faster than the one with a modem, but not by much.

In some cases, the modem-equipped PC was faster, especially with sites that don't have a great deal of graphics.

Alluring as its promise may be, DirecPC for now doesn't offer spectacular advantages for normal Web surfing, even though it does carry a spectacular price.

Do we see a pattern starting to emerge yet?

Part of the problem here is misleading use of the word "faster".

Would you say that a Boeing 747 is three times "faster" than a Boeing 737? Of course not. They both cruise at around 500 miles per hour. The difference is that the 747 carries 500 passengers whereas the 737 only carries 150. The Boeing 747 is three times bigger than the Boeing 737, not faster.

Now, if you wanted to go from New York to London, the Boeing 747 is not going to get you there three times faster. It will take just as long as the 737.

In fact, if you were really in a hurry to get to London quickly, you'd take Concorde, which cruises around 1350 miles per hour. It only seats 100 passengers though, so it's actually the smallest of the three. Size and speed are not the same thing.

On the other hand, if you had to transport 1500 people and you only had one aeroplane to do it, the 747 could do it in three trips where the 737 would take ten, so you might say the Boeing 747 can transport large numbers of people three times faster than a Boeing 737, but you would never say that a Boeing 747 is three times faster than a Boeing 737.

That's the problem with communications devices today. Manufacturers say "speed" when they mean "capacity". The other problem is that as far as the end-user is concerned, the thing they want to do is transfer large files quicker. It may seem to make sense that a high-capacity slow link might be the best thing for the job. What the end-user doesn't see is that in order to manage that file transfer, their computer is sending dozens of little control messages back and forth. The thing that makes computer communication different from television is interactivity, and interactivity depends on all those little back-and-forth messages.

The phrase "high-capacity slow link" that I used above probably looked very odd to you. Even to me it looks odd. We've been used to wrong thinking for so long that correct thinking looks odd now. How can a high-capacity link be a slow link? High-capacity means fast, right? It's odd how that's not true in other areas. If someone talks about a "high-capacity" oil tanker, do you immediately assume it's a very fast ship? I doubt it. If someone talks about a "large-capacity" truck, do you immediately assume it's faster than a small sports car?

We have to start making that distinction again in communications. When someone tells us that a modem has a speed of 28.8 kbit/sec we have to remember that 28.8 kbit/sec is its capacity, not its speed. Speed is a measure of distance divided by time, and 'bits' is not a measure of distance.

I don't know how communications came to be this way. Everyone knows that when you buy a hard disk you should check what its seek time is. The maximum transfer rate is something you might also be concerned with, but the seek time is definitely more important. Why does no one think to ask what a modem's 'seek time' is? The latency is exactly the same thing. It's the minimum time between asking for a piece of data and getting it, just like the seek time of a disk, and it's just as important.

Lessons to learn

ISDN has a latency of about 10ms. Its throughput may be twice that of a modem, but its latency is ten times better, and that's the key reason why browsing the web over an ISDN link feels so much better than over a modem. If you have the option of ISDN, and a good ISP that supports it, and it is not too expensive in your area, then get it.

One of the reasons that telephone modems have such poor latency is that they don't know what you're doing with your computer. An external modem is usually connected through a serial port. It has no idea what you are doing, or why. All it sees is an unstructured stream of bytes coming down the serial port.

Ironically, the Apple Geoport telecom adapter, which has suffered so much criticism, may offer an answer to this problem. The Apple Geoport telecom adapter connects your computer to a telephone line, but it's not a modem. All of the functions of a modem are performed by software running on the Mac. The main reason for all the criticism is that running this extra software takes up memory and slows down the Mac, but it could also offer an advantage that no external modem could ever match. Because when you use the Geoport adapter the modem software is running on the same CPU as your TCP/IP software and your Web browser, it could know exactly what you are doing. When your Web browser sends a TCP packet, there's no need for the Geoport modem software to mimic the behaviour of current modems. It could take that packet, encode it, and start sending it over the telephone line immediately, with almost zero latency.

Sending 36 bytes of data, a typical game-sized packet, over an Apple Geoport telecom adapter running at 28.8kb/s could take as little as 10ms, making it as fast as ISDN, and ten times faster than the current best modem you can buy. For less than the price of a typical modem the Geoport telecom adapter would give you Web browsing performance close to that of ISDN. Even better, all the people who already own Apple Geoport telecom adapters wouldn't need to buy anything at all -- they'd just need a software upgrade. Even better, Microsoft wouldn't be able to just copy it for Windows like they do with everything else they see on the Mac, because Wintel clones don't have anything like a Geoport for Microsoft to use. What a PR triumph for Apple that would be! It really would show that Apple is the company that understands the Internet. I know that in practice there would be other factors that prevent us from getting the delay all the way down to 10ms, but I'm confident that we could get a long way towards that goal.

So far Apple has shown no interest in making use of this opportunity.

Bandwidth Still Matters

Having said all this, you should not conclude that I believe that bandwidth is unimportant. It is very important, but in a way that most people do not think of. Bandwidth is important not only for its own sake, but also for its effect on overall latency. As I said above, the important issue is the total end-to-end transmission delay for a packet.

Many people believe that having a private 64kb/sec ISDN connection is just as good as, or even better than, having a 1/150 share of a 10Mb/sec Ethernet. Telephone companies argue that ISDN is just as good as new technologies like cable modems, because while cable modems have much higher bandwidth, that bandwidth is shared between lots of users, so the average works out the same. This idea, that you can average packets as if they were a fluid in a pipe, is flawed, as the following example will show:

Say we have a game where the state of the virtual world amounts to 40K of data. We have a game server, and in this simple example, the game server transmits the entire current game state to the player once every 10 seconds. That's 40K every 10 seconds, or an average of 4K/sec, or 32kb/sec. That's only half the capacity of a 64kb/sec ISDN line, and 150 users doing this on an Ethernet is only half the capacity of the Ethernet. So far so good. Both links are running at only 50% capacity, so the performance should be the same, right? Wrong. On the Ethernet, when the server sends the 40K to a player, the player can receive that data as little as 32ms later (320kb / 10Mb/sec). If the server is not the only machine sending packets on the Ethernet, then there could be contention for the shared medium, but even in that case the average delay before the player receives the data is only 64ms. On the ISDN line, when the server sends the 40K to a player, the player receives that data 5 seconds later (320kb / 64kb/sec). In both cases the users have the same average bandwidth, but the actual performance is very different. In the Ethernet case the player receives the data almost instantly, but in the ISDN case, by the time the player gets the game information it is already 5 seconds out of date.
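The numbers in this example can be worked through in a short sketch. It follows the text in treating 40K as 320kb, and it models only the best case on the Ethernet (no contention); the names are invented for the illustration.

```python
STATE_BITS = 320_000          # 40K of game state, written as 320kb in the text
UPDATE_INTERVAL_SEC = 10      # the server sends the full state every 10 seconds


def average_rate(bits, interval_sec):
    """Long-run average bit rate needed by one player."""
    return bits / interval_sec


def burst_delivery_time(bits, link_bits_per_sec):
    """Time for one lump of data to arrive at the link's full rate."""
    return bits / link_bits_per_sec


# (description, link capacity in bits/sec, number of players sharing it)
links = [
    ("private 64kb/sec ISDN", 64_000, 1),
    ("shared 10Mb/sec Ethernet", 10_000_000, 150),
]

per_player = average_rate(STATE_BITS, UPDATE_INTERVAL_SEC)
for name, capacity, players in links:
    load = players * per_player / capacity
    delay = burst_delivery_time(STATE_BITS, capacity)
    print(f"{name}: {load:.0%} loaded, each update takes {delay:.3f} s to arrive")
```

Both links come out about half loaded, but the time for a single update to arrive differs by a factor of more than 150.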

The standard mistake is to assume that a 40K chunk every ten seconds and a uniform rate of 4K/second are the same thing. They're not. If they were then ISDN, ATM, and all the other telephone company schemes would be good ideas. The telephone companies assume that all communications are like the flow of fluid in a pipe. You just tell them the rate of flow you need, and they tell you how big the pipe has to be. Audio streams, like voice, are like the flow of fluid in a pipe, but computer data is not. Computer data comes in lumps. The standard mistake is to say that if I want to send 60K of data once per minute, that's exactly the same as sending 1K per second. It's not. A 1K per second connection may be sufficient *capacity* to carry the amount of data you're sending, but that doesn't mean it will deliver the 60K lump of data in a timely fashion. It won't. By the time the lump finishes arriving, it will be one minute old. Just because you don't send data very often doesn't mean you want it delivered late. You may only write to your aunt once a year, but that doesn't mean that on the occasions when you do write her a letter you'd like it to take a year to be delivered.

The conclusion here is obvious. If you're given the choice between a low bandwidth private connection, or a small share of a larger bandwidth connection, take the small share.

Again, this is painfully obvious outside the computer world. If a politician said they would build either a large shared freeway, or a million tiny separate private footpaths, one reserved for each citizen, which would you vote for?

Survey

A lot of people have sent me e-mail disputing what I've said here. A lot of people have sent me e-mail simply asserting that their modem isn't slow at all, and the slow performance they see is due to the rest of the Internet being slow, not their modem link.

To try to get to the truth of the matter, I'm going to do a small-scale survey. If you think your modem has low latency, please try an experiment for me. Run a "traceroute" to some destination a little way away. On the West coast of the US lcs.mit.edu might be a good host to trace to. From the East coast of the US you can trace to core-gateway.stanford.edu. In other places, pick a host of your choice (or use one of those two if you like).

On Unix, you can run a trace by typing "traceroute" followed by the name of the destination host (if you have traceroute installed). On the Mac, get Peter Lewis's Mac TCP Watcher and click the "Trace" button. On Windows '95, you have to open a DOS window and type a command like in Unix, except on Windows '95 the "traceroute" command is called "TRACERT". Jack Richard wrote a good article about traceroute for Boardwatch Magazine.

When you get your trace, send it to me, along with any other relevant information, like what brand of modem you're using, what capacity of modem (14.4/28.8/33k/64k ISDN, etc.), whether it is internal or external, what speed serial port (if applicable), who your Internet Service Provider is, etc.

I'll collect results and see if any interesting patterns emerge. If any particular brands of modems and/or ISPs turn out to have good latency, I'll report that.

To start things off, here's my trace (times are round-trip times in seconds):

Name:  Stuart Cheshire
Modem: No modem (Quadra 700 built-in Ethernet)
ISP:   BBN (Bolt, Beranek and Newman)

Hop      Min    Avg    Max    IP              Name
 1  3/3  0.003  0.003  0.004  36.186.0.1      jenkins-gateway.stanford.edu
 2  3/3  0.003  0.006  0.013  171.64.1.161    core-gateway.stanford.edu
 3  3/3  0.004  0.004  0.004  171.64.1.34     sunet-gateway.stanford.edu
 4  3/3  0.003  0.003  0.004  198.31.10.3     su-pr1.bbnplanet.net
 5  3/3  0.004  0.004  0.005  4.0.1.89        paloalto-br1.bbnplanet.net
 6  2/3  0.006  0.006  0.007  4.0.1.62        oakland-br1.bbnplanet.net
 7  3/3  0.036  0.036  0.037  4.0.1.134       denver-br1.bbnplanet.net
 8  3/3  0.036  0.160  0.406  4.0.1.190       denver-br2.bbnplanet.net
 9  3/3  0.056  0.058  0.059  4.0.1.130       chicago1-br1.bbnplanet.net
10  3/3  0.056  0.058  0.059  4.0.1.194       chicago1-br2.bbnplanet.net
11  3/3  0.076  0.077  0.078  4.0.1.126       boston1-br1.bbnplanet.net
12  3/3  0.076  0.076  0.076  4.0.1.182       boston1-br2.bbnplanet.net
13  3/3  0.077  0.077  0.078  4.0.1.158       cambridge1-br2.bbnplanet.net
14  3/3  0.080  0.081  0.083  199.94.205.1    cambridge1-cr1.bbnplanet.net
15  3/3  0.080  0.145  0.212  192.233.149.202 cambridge2-cr2.bbnplanet.net
16  3/3  0.079  0.081  0.084  192.233.33.3    ihtfp.mit.edu
17  3/3  0.083  0.096  0.104  18.168.0.6      b24-rtr-fddi.mit.edu
18  3/3  0.082  0.082  0.084  18.10.0.1       radole.lcs.mit.edu
19  3/3  0.082  0.085  0.089  18.26.0.36      mintaka.lcs.mit.edu

You can see it took my Mac (Quadra 700 running Open Transport) 3ms to get to jenkins-gateway. This is not particularly fast. With a good Ethernet interface it would be less than 1ms. From there, it took 1ms to get to paloalto-br1 (near to Stanford) and another 2ms to get to oakland-br1 (across the bay from San Francisco).

From oakland-br1 to denver-br1 took 30ms, from denver-br1 to chicago1-br1 took 20ms, and from chicago1-br1 to boston1-br1 took another 20ms.

The last stretch from boston1-br1 to mintaka.lcs.mit.edu took another 6ms.

So to summarise where the time's going, there's 6ms spent at each end, and 70ms spent on the long-haul getting across the country. Remember those are round-trip times -- the one-way times are half as much.
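The breakdown can be reproduced mechanically from the trace. The sketch below copies the minimum round-trip times (converted to milliseconds) for the hops discussed in the text and subtracts each from the next; the "my Mac" starting entry at 0ms and the function name are additions for the illustration.

```python
# Minimum round-trip times in milliseconds, copied from the "Min" column
# of the trace for the hops discussed in the text.
MIN_RTT_MS = [
    ("my Mac", 0),
    ("jenkins-gateway", 3),
    ("paloalto-br1", 4),
    ("oakland-br1", 6),
    ("denver-br1", 36),
    ("chicago1-br1", 56),
    ("boston1-br1", 76),
    ("mintaka.lcs.mit.edu", 82),
]


def segment_times(hops):
    """Return (from, to, extra round-trip ms) for each consecutive pair."""
    return [
        (a_name, b_name, b_ms - a_ms)
        for (a_name, a_ms), (b_name, b_ms) in zip(hops, hops[1:])
    ]


for src, dst, ms in segment_times(MIN_RTT_MS):
    print(f"{src} -> {dst}: {ms} ms round trip, {ms / 2} ms one way")
```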

Now, let's find out what the breakdown looks like when we try the same experiment with a modem. Send in your results! Hopefully we'll find at least one brand of modem that has good latency.

Note: October 1997. Now that I've got a decent collection of results, please only send me your results if they're a lot faster (or slower) than what's already on the list. Also, please send me results only for consumer technologies. If your company has a T-1 Internet connection, or if you are a student in University housing with a connection even faster than that, then it's not a great surprise to find that your connection has good latency. My goal here is to find what consumer technologies are available that offer good latency.

Are we there yet?

The good news is that since I first wrote this rant I've started to see a shift in awareness in the industry. Here are a couple of examples:

From Red Herring, June 1997, page 83, Luc Hatlestad wrote:

Matthew George is the vice president of technology at Engage... To Mr George, latency issues are more about modems than about network bandwidth. "Network latency in and of itself is not material to game playing; at least 70 to 90 percent of latency problems we see are due to the end points: the modems," he says.

From MacWeek, 12th May 1997, page 31, Larry Stevens wrote about the new 56k modems:

Greg Palen, director of multimedia at Galzay Marketing Group, a digital communications, prepress and marketing company in Kansas City, Kan., is one of those taking a wait-and-see attitude. "We can always use more bandwidth, but modem speed is not the primary issue at this point. The main issue is latency."

Some modem makers are finally starting to care about latency. One major modem manufacturer has contacted me, and we've been investigating where the time goes. It seems that there is room for improvement, but unfortunately modems will never be able to match ISDN. The problem is that over a telephone line, electrical signals get "blurred" out. In order to decode just one single bit, a 33.6kb/s modem needs to take not just a single reading of the voltage on the phone line at that instant, but that single reading plus another 79 like it, spaced 1/6000 of a second apart. A mathematical function of those 80 readings gives the actual result. This process is called "line equalization". Better line equalization allows higher data rates, but the more "taps" the equalizer has the more delay it adds. The V.34 standard also specifies particular scrambling and descrambling of the data, which also take time. According to this company, the theoretical best round-trip delay for a 14.4kb/s modem (no compression or error recovery) should be 40ms, and for a 33.6kb/s modem 64ms. The irony here is that as the capacity goes up, the best-case latency gets worse instead of better. For a small packet, it would be faster for your modem to send it at 14.4kb/s than at 33.6kb/s!

I don't know what the theoretical optimum for a 56kb/s modem is. The sample rate with these is 16000 times per second (62.5us between samples) but I don't know how many taps the equalizer has.
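For what it's worth, the sample timings quoted above are easy to check. The sketch below only computes the spacing between samples and the stretch of time covered by a run of readings. How much of that window actually shows up as modem delay depends on the equalizer design, which isn't specified here, so treat the window as an illustration rather than a latency figure. The function names are invented for the example.

```python
def sample_spacing_us(samples_per_sec):
    """Time between consecutive samples, in microseconds."""
    return 1e6 / samples_per_sec


def reading_window_ms(num_readings, samples_per_sec):
    """Time between the first and last of num_readings evenly spaced readings."""
    return (num_readings - 1) * 1000 / samples_per_sec


# 33.6kb/s modem: 80 readings spaced 1/6000 of a second apart (from the text).
print(f"33.6k: 80 readings span {reading_window_ms(80, 6000):.1f} ms")

# 56kb/s modem: 16000 samples per second (from the text); tap count unknown.
print(f"56k: {sample_spacing_us(16000):.1f} us between samples")
```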

Page maintained by Stuart Cheshire