Network World invited 18 vendors to participate in this test: Airespace (now being purchased by Cisco), Aruba, Avaya, Bluesocket, Chantry (purchased by Siemens after our tests concluded), Cisco, Colubris, Enterasys Networks, Extreme Networks, Foundry Networks, Legra, Meru Networks, Nortel, Proxim, Reefedge, Symbol, Trapeze and Vernier.

Only four vendors - Aruba, Chantry, Cisco, and Colubris - accepted our invitation.

We tested VOIP over wireless LANs (WLAN) in terms of voice quality and roaming times. For this project, the system under test (SUT) consisted of two 802.11b-capable access points and (optionally) a switch or router connecting the access points.

Besides the SUT, the test bed consisted of up to 14 WLAN handsets and the SVP Server H.323 gateway from SpectraLink Corp. as well as test and measurement equipment from VeriWave. We also used an impairment generator from Spirent Communications in some roaming tests.

VeriWave developed a VOIP Analysis Suite especially for this project. This application describes audio quality using R-value (see Part 1), an ITU-specified metric derived from measurements of packet loss, jitter and delay. The VeriWave software computes R-value from direct measurements of those metrics. The tool also measures roaming times.

As noted in the R-value specification, there is a very strong correlation between R-values and subjective voice scoring methods such as mean opinion scores (MOS).
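For readers curious about the underlying arithmetic, the ITU's E-model reduces to a fairly compact calculation once one-way delay and packet loss are known. The Python sketch below is a simplified illustration built on published E-model defaults for a G.711 call; it is not VeriWave's actual algorithm, and the delay and loss terms are the commonly used simplified forms rather than the full specification.

    # Simplified E-model sketch: R-value from one-way delay and packet loss,
    # then the standard mapping from R-value to an estimated mean opinion score.
    # Constants (93.2 base, G.711 Ie=0, Bpl=4.3) are published defaults;
    # this is illustrative only, not the VeriWave implementation.

    def r_value(delay_ms, loss_pct):
        id_ = 0.024 * delay_ms                 # delay impairment (simplified form)
        if delay_ms > 177.3:
            id_ += 0.11 * (delay_ms - 177.3)
        ie, bpl = 0.0, 4.3                     # G.711 equipment impairment and loss robustness
        ie_eff = ie + (95 - ie) * loss_pct / (loss_pct + bpl)
        return 93.2 - id_ - ie_eff

    def mos_from_r(r):
        if r < 0:
            return 1.0
        if r > 100:
            return 4.5
        return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6

    r = r_value(50, 1.0)                         # 50ms delay, 1% loss
    print(round(r, 1), round(mos_from_r(r), 2))  # roughly 74.1 and 3.78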

We measured voice quality under a variety of conditions: with QoS disabled and enabled; with background data traffic present and absent; and with various numbers of concurrent calls (see Part 2). We also tested roaming, with all calls sent through a single access point or across two access points and through a switch (see Part 3).

Starting simple
The simplest tests were baselines involving a single access point with QoS disabled. We brought up one call and measured audio quality over a 30-second period. Because the SpectraLink handsets use G.711 codecs, the amount of bandwidth consumed by the handsets is a constant regardless of the amount of noise on the line. Because the amount of data used by one call is relatively small - about 67 kbit/s in each direction, or 134 kbit/s total on one 802.11b channel - we did not conduct one-call tests with background data. In this test, we measured R-value, average and maximum delay, and jitter.
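To show where a per-call figure in that neighbourhood comes from, the back-of-the-envelope arithmetic below assumes a 30ms packetisation interval (the rate at which the SpectraLink phones send frames) and standard RTP/UDP/IP header sizes. It is our own illustration, not SpectraLink's published specification, and it ignores 802.11 header and acknowledgement overhead.

    # Per-call bandwidth arithmetic for G.711 (illustrative; 30ms interval assumed)
    CODEC_RATE = 64_000                        # G.711 payload, bits per second
    INTERVAL = 0.030                           # seconds of audio per packet

    payload_bytes = CODEC_RATE * INTERVAL / 8  # 240 bytes of audio per packet
    rtp_stream = (payload_bytes + 12) * 8 / INTERVAL            # + RTP header: ~67.2 kbit/s
    with_udp_ip = (payload_bytes + 12 + 8 + 20) * 8 / INTERVAL  # + UDP/IP: ~74.7 kbit/s

    print(round(rtp_stream / 1000, 1), "kbit/s per direction (RTP stream)")
    print(round(with_udp_ip / 1000, 1), "kbit/s per direction before 802.11 overhead")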

We re-ran the test with six and seven concurrent calls, both with and without background data present. In tests with data, we configured the VeriWave test instrument to generate a stream of User Datagram Protocol (UDP) packets at a rate of 1 Mbit/s.
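A software stand-in can approximate that background load for informal testing, though it won't match the timing precision of a hardware instrument. The sketch below sends fixed-size UDP packets at a steady 1 Mbit/s; the destination address, port and packet size are placeholders of our choosing.

    # Minimal fixed-rate UDP sender (rough stand-in for the 1 Mbit/s background stream)
    import socket, time

    DEST = ("192.0.2.10", 5000)                # placeholder address and port
    PACKET_SIZE = 1250                         # bytes; 100 packets/s x 1250 B x 8 = 1 Mbit/s
    RATE_BPS = 1_000_000

    interval = PACKET_SIZE * 8 / RATE_BPS      # 10ms between packets at this size
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = bytes(PACKET_SIZE)

    next_send = time.monotonic()
    while True:
        sock.sendto(payload, DEST)
        next_send += interval
        time.sleep(max(0.0, next_send - time.monotonic()))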

We then enabled QoS enforcement and repeated all the various test cases. Once these were complete, we re-ran all tests - both with and without QoS - using two access points, with half the handsets on each access point.

To test roaming, we began by having all handsets associate with one access point and bringing up calls. We then powered up a second access point and verified that no handsets "pre-roamed" prior to launching our test. Once the test measurement was under way, we powered off the first access point, thus forcing all handsets to roam to the second access point.

The VeriWave test equipment calculated roaming time at the application layer. It noted the interval between the last packet seen on a call on the first access point and the first packet seen from the other phone on the second access point. This application-layer focus meant the access points not only had to re-associate handsets but also resynchronise with the SpectraLink call server.
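Conceptually, that roaming number can be pulled out of a packet capture by pairing the last frame seen for a call on the old access point with the first frame seen on the new one. The sketch below shows the bookkeeping on hypothetical capture records; the field names are ours, not VeriWave's.

    # Application-layer roam time from per-packet records (field names are hypothetical)
    from dataclasses import dataclass

    @dataclass
    class Pkt:
        timestamp: float                       # seconds
        call_id: str
        ap: str                                # access point the frame was captured on

    def roam_time(packets, call_id, old_ap, new_ap):
        pkts = [p for p in packets if p.call_id == call_id]
        last_old = max(p.timestamp for p in pkts if p.ap == old_ap)
        first_new = min(p.timestamp for p in pkts
                        if p.ap == new_ap and p.timestamp > last_old)
        return first_new - last_old

    capture = [Pkt(10.000, "call-1", "ap1"), Pkt(10.030, "call-1", "ap1"),
               Pkt(10.210, "call-1", "ap2"), Pkt(10.240, "call-1", "ap2")]
    print(round(roam_time(capture, "call-1", "ap1", "ap2") * 1000), "ms")   # 180 ms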

QoS enforcement: What happened?
The QoS test was the most surprising. Three of the four vendors in this test failed to protect voice traffic in the presence of data, even using QoS mechanisms specifically intended to do so. Why did these QoS mechanisms fail, even with the vendors' best experts configuring them?

There are a number of factors that might explain our results.

  • Timing is everything. The importance of keeping delay and jitter low can't be overemphasised when it comes to voice traffic. Some QoS mechanisms work in terms of bandwidth and frame loss; if a given traffic class consumes more than a set amount of bandwidth, packets belonging to that class get dropped. That's not sufficient for voice. Even "strict priority" mechanisms, which always service a given traffic class first regardless of the consequences for other classes, only work properly if they receive high-priority packets in the first place. That's not so easy to do with 802.11 traffic. The IEEE protocols involve large amounts of management traffic and also require that every data frame be acknowledged. At the same time, the SpectraLink phones send out one frame about every 30ms and seem to falter anytime four or five packets in a row are dropped. These twin constraints make it critical that WLAN systems nail down the timely delivery of as much traffic as possible. In our tests, only the Aruba system did that (see the strict-priority sketch after this list).
  • It's not about the bandwidth. We threw less than 3 Mbit/s of traffic at each system - and that includes both voice and data. Given that four vendors cracked the 6 Mbit/s mark in previous tests, the initial suspicions of some vendors that we simply overloaded the systems with data don't hold up. This test emphasised timely servicing of high-priority traffic, not high data rates.
  • Access points aren't hot rods. The typical "thin access point" consists of a relatively modest CPU, a limited amount of RAM and firmware. By themselves, those components aren't enough to ensure timely traffic delivery. Instead, vendors must rely on precise scheduling mechanisms in their switches (as Aruba does), reduce the number of concurrent calls (which improves performance, as we will discuss when covering tests with two access points), or deploy voice-only WLANs that don't allow data clients (not a practical alternative for most corporations).
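To make the distinction in the first bullet concrete, here is a minimal sketch of strict-priority queuing, in which voice frames are always serviced before data no matter how much data is waiting. It illustrates the general mechanism only - not any vendor's scheduler - and, as noted above, it helps only if voice frames are classified and queued promptly in the first place.

    # Minimal strict-priority scheduler: voice is always dequeued before data.
    from collections import deque

    class StrictPriorityQueue:
        def __init__(self):
            self.voice = deque()               # high-priority class
            self.data = deque()                # best-effort class

        def enqueue(self, frame, is_voice):
            (self.voice if is_voice else self.data).append(frame)

        def dequeue(self):
            if self.voice:                     # voice first, regardless of data backlog
                return self.voice.popleft()
            return self.data.popleft() if self.data else None

    q = StrictPriorityQueue()
    q.enqueue("data-1", is_voice=False)
    q.enqueue("voice-1", is_voice=True)
    print(q.dequeue())                         # voice-1, even though data-1 arrived first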