Based on recent events, I have been able to spend a tremendous amount of time working with and thinking about Quality of Service (QoS), and exactly what it means to us as network engineers. Modern networks support many types of traffic beyond what we would normally consider “data types” like file sharing, email, or web traffic. As a result, QoS has transitioned from a useful tool for overcoming bandwidth deficits to an almost surgical implement that allows us to guarantee the delivery of sensitive traffic like voice and video.
That left me thinking about exactly what kind of communication obstacles we are trying to overcome with QoS, and what the methodology behind QoS deployment is. It only makes sense to understand what necessitates QoS in the first place before we try to understand what the QoS suite of features actually does.
Shortage of Bandwidth
Bandwidth, or the lack of it, is the single largest contributing factor leading to the need for QoS in a modern network. Even today, when we have access to high-speed internetwork connections like Metro Ethernet or fiber-optic links to the service provider, we still find ourselves having to prioritize, filter, and guarantee one type of packet over others. Yes, an infinite amount of bandwidth could cure most if not all of our network problems; but bandwidth, like money, is not in infinite supply. In the networking world, bandwidth equates to throughput: the maximum amount of data that can be transported across a given medium in a fixed amount of time, usually measured in bits per second (bps). Some technologies operate at fixed rates (like FastEthernet or GigabitEthernet), whereas others have variable rates (like Frame Relay or ATM). Initially, it would be easy to think of bandwidth as the sole contributor to needing Quality of Service, but there are other values that are part of the equation. This leads us to the next concept.
Delay is defined as the time it takes for packets to be delivered end-to-end across the network. Delay is the broadest of the topics that we will be discussing, because there are so many types of delay and so many places in the network where delay is generated. What is more important to note is that all the different types of delay add up. The sum of all the delays in the data path is referred to as end-to-end delay, or sometimes as total network latency. We will take a critical look at each of the individual types of delay:
Serialization delay is the amount of time needed for an interface to encode data onto the physical medium (sometimes called the “wire”). It is a fixed value determined by the speed of the interface and the amount of data that has to traverse it. As an example, it would take a 64 kbps serial connection 2 seconds to serialize 128,000 bits of data. If our network has 10 of these serial connections between the source and the destination, there would be 20 seconds of serialization delay added to our end-to-end delay. But serialization delay is not the only factor we have to consider.
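The serialization arithmetic can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and it assumes that “64K” means 64,000 bits per second and that every hop has to serialize the full 128,000 bits before the next hop can start.

```python
def serialization_delay_seconds(bits: int, link_bps: int) -> float:
    """Time needed to clock `bits` onto a link running at `link_bps` bits per second."""
    if link_bps <= 0:
        raise ValueError("link speed must be positive")
    return bits / link_bps


# The example from the text: 128,000 bits over a 64 kbps serial link.
per_hop = serialization_delay_seconds(128_000, 64_000)

# Ten identical serial links in the path, each adding the same delay.
total_ten_hops = per_hop * 10

print(per_hop, total_ten_hops)  # 2.0 20.0
```

The same function shows why faster links help: on a 100 Mbps interface the identical 128,000 bits would take 1.28 milliseconds per hop.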
Where serialization delay is how fast data can be placed on the “wire,” propagation delay is the latency induced by how fast a single bit of that data can travel across the “wire.” The exact value differs from one type of media to another, but the formula has the same shape in every case: the physical length of the media in meters is divided by the speed at which the signal travels through it, a value that approximates the speed of light. The important thing is that we now know that it takes time to place packets on the physical media, and it takes time for those packets to travel across the physical media to the next device.
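As a rough sketch of that formula, the snippet below divides the length of the link by the signal speed. The velocity factor of 0.67 (signals traveling at about two-thirds of the speed of light) is an assumption commonly used for copper and fiber, not a figure from this article, and the function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in a vacuum


def propagation_delay_seconds(length_m: float, velocity_factor: float = 0.67) -> float:
    """Time for one bit to travel `length_m` meters of media.

    `velocity_factor` is the assumed fraction of the speed of light at
    which the signal moves through the media (0.67 is an assumption).
    """
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity_factor must be in (0, 1]")
    return length_m / (SPEED_OF_LIGHT_M_PER_S * velocity_factor)


# A 1,000 km link works out to roughly 5 milliseconds one way.
delay_ms = propagation_delay_seconds(1_000_000) * 1000
print(round(delay_ms, 2))
```

Unlike serialization delay, this value does not change with interface speed; only a shorter path or a faster medium reduces it.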
Packets cannot be encoded instantly, or transported across the physical layer topology instantly; they certainly cannot be processed instantly. Our routers need time to route data, and our switches need time to switch data. The amount of time it takes to process data on a switch or a router depends on a number of factors including, but not limited to, CPU speed, memory, and whether the data is fast-switched, process-switched, or CEF-switched. All these values add up to how much time it takes for a device to physically move a packet from an inbound port to an outbound port. Other values, like how congested the links needed to move the packets are, or the overall size of a device's routing table, also come into play. As an example, it will take a router longer to “parse” a long routing table than a shorter one; thus it will take longer to find which interface to use to send the packet out. Where serialization delay was additive across all the interfaces, forwarding delay is additive across all devices in the data path.
Sometimes, even in the best designed networks in the world, packets simply arrive in less than satisfactory ways. Sometimes this means they arrive out of order, or there is a significant variation in the delay associated with some packets. The variable delay value is referred to as “jitter,” and it can have a devastating impact on applications that rely on the timely delivery of a continuous stream of IP packets. Applications that fall in this category include voice over IP (VoIP) and video conferencing. Imagine what would happen if the packets for a VoIP call arrived out of order and played out that way. There are solutions like “de-jitter” buffers that can be used to queue up the packets long enough for the router to put them in the right order, or to play them back in a uniform fashion, but even these take time and thus add to our delay. This leaves us with the last form of delay we will discuss.
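To make jitter and the de-jitter buffer a little more concrete, here is a minimal sketch. Both functions are illustrative assumptions rather than how any particular router or codec does it: jitter is measured as the average change in delay between consecutive packets, and the buffer simply holds a few packets so it can release them in sequence order.

```python
import heapq


def mean_jitter_ms(delays_ms):
    """Average absolute change in delay between consecutive packets."""
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0


def dejitter(packets, depth):
    """Yield (sequence, payload) pairs in sequence order.

    Up to `depth` packets are held back before the lowest sequence
    number is released, which is exactly where the extra delay comes from.
    """
    heap = []
    for seq, payload in packets:
        heapq.heappush(heap, (seq, payload))
        if len(heap) > depth:
            yield heapq.heappop(heap)
    while heap:  # drain whatever is left at the end of the stream
        yield heapq.heappop(heap)


delays = [20, 35, 25, 60]                      # one-way delay per packet, in ms
arrivals = [(1, "a"), (3, "c"), (2, "b"), (4, "d")]
print(mean_jitter_ms(delays))                  # 20.0
print(list(dejitter(arrivals, depth=2)))       # packets back in order 1, 2, 3, 4
```

A deeper buffer can repair worse reordering, but every extra packet it holds is more time the listener waits, which is the trade-off described above.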
We discussed processing/forwarding delay; now we need to look at what happens when the router and its interfaces are congested with traffic. This means that traffic has arrived to be sent, but it has to wait for traffic that was previously queued to be serialized onto the “wire.” In scenarios where data is being serialized out an interface and additional data arrives that needs to exit that same interface, we encounter a situation where data will be dropped. This process is referred to as “tail-drop,” and it can cause other issues if that data is something like TCP traffic, where the sender will resend the data and cause even more congestion. This is where we will discuss our first QoS mechanism. Software queuing, or fancy queuing, is where the device sets aside a portion of memory to store packets while they wait to be transmitted. The size or depth of this queue is not infinite, and if it fills up, we will see tail drops; but right now we need to understand that just like all the other processes we have discussed, this one also takes time, and as such introduces delay. I know you just said, “But I thought QoS was meant to combat delay, not cause it?” My answer to that is, “Everything in networking comes at a cost!” Yes, using queuing will introduce additional delay in our end-to-end latency, but that small amount of delay may serve to prevent something more devastating or annoying like packet loss.
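A finite software queue with tail-drop can be sketched in a few lines. The class and method names below are hypothetical, and the waiting-time estimate assumes every queued packet is the same size, which real traffic is not.

```python
from collections import deque


class TailDropQueue:
    """A FIFO software queue of fixed depth that drops new arrivals when full."""

    def __init__(self, depth: int):
        self.depth = depth
        self.packets = deque()
        self.dropped = 0

    def enqueue(self, packet) -> bool:
        """Store the packet, or tail-drop it if the queue is already full."""
        if len(self.packets) >= self.depth:
            self.dropped += 1
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Hand the oldest packet to the interface, or None if the queue is empty."""
        return self.packets.popleft() if self.packets else None

    def waiting_time_seconds(self, packet_bits: int, link_bps: int) -> float:
        """Queuing delay a new arrival would see, assuming equal-size packets."""
        return len(self.packets) * packet_bits / link_bps


queue = TailDropQueue(depth=3)
results = [queue.enqueue(n) for n in range(5)]   # five arrivals, room for three
print(results, queue.dropped)
print(queue.waiting_time_seconds(12_000, 64_000))
```

The last line is the cost mentioned above: the three packets that were saved from being dropped now stand between a new arrival and the wire, so that arrival waits about half a second on a 64 kbps link.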
Packet loss occurs as a result of congestion. We discussed the idea of “tail-drop” and how it can occur during periods of congestion when there are no software queues to hold packets, or when those software queues become full. The loss of packets is not a good thing, but some applications are more resilient than others and have mechanisms that allow them to recover from the loss of a large number of packets. Other applications, like voice, can be brought to their knees if they lose more than 1% of their packets. So again, we may need to offer voice “unfair preferential treatment” over other data types. This will be the recurring theme in QoS: take from one to give to another. But keep in mind that even this process can and will add delay of some type.
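One way to picture “take from one to give to another” is a strict-priority scheduler, where voice is always sent before anything else. This is just one illustrative scheme, not a mechanism this article has introduced, and the function and packet names are invented for the example.

```python
from collections import deque


def strict_priority_schedule(voice, data, slots):
    """Send up to `slots` packets, always draining voice before data.

    Returns the packets sent plus whatever is still waiting in each queue.
    """
    voice_q, data_q = deque(voice), deque(data)
    sent = []
    for _ in range(slots):
        if voice_q:
            sent.append(voice_q.popleft())
        elif data_q:
            sent.append(data_q.popleft())
        else:
            break  # nothing left to send
    return sent, list(voice_q), list(data_q)


sent, voice_left, data_left = strict_priority_schedule(
    ["v1", "v2"], ["d1", "d2", "d3"], slots=3
)
print(sent)        # ['v1', 'v2', 'd1']
print(data_left)   # ['d2', 'd3']
```

Voice gets out immediately, and the price is paid by the data packets left waiting, which is the additional delay the paragraph above warns about.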
When it comes to QoS, there are three principal methods for implementation:
In a nutshell, best-effort QoS means essentially “no” QoS. All packets are forwarded in a first-in, first-out (FIFO) fashion. The key factor is that no traffic is provided any preferential treatment. This method of QoS is the default on high-speed interfaces on Cisco IOS devices, and for all practical purposes is very scalable and easy to configure. The Internet is a perfect example of a best-effort network.
Integrated Services (IntServ)
When we discuss Integrated Services QoS we are talking about an end-to-end guarantee of bandwidth. Sometimes this is referred to as “hard” QoS. Integrated Services QoS relies on application signaling to create a fixed allocation or reservation of resources on all devices in the transit path from the source to the destination. If these resources cannot be “reserved” then the request is denied, and the application will not operate. A great example of an IntServ protocol is the Resource Reservation Protocol (RSVP).
IntServ requires the use of some type of admission control protocol, and therefore all devices in the path end-to-end must support that protocol. This method of QoS is not very scalable, both because only a finite amount of resources can be reserved and because of the overhead added to every device in the data path to maintain “stateful” traffic flow information.
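The all-or-nothing nature of an IntServ reservation can be sketched as a toy admission check. This is not RSVP and does not model its signaling; the router names, the dictionary of free bandwidth, and the function itself are assumptions made purely for illustration.

```python
def reserve_path(available_kbps: dict, path: list, request_kbps: int) -> bool:
    """Reserve `request_kbps` on every hop in `path`, or on none of them.

    `available_kbps` maps each hop to its unreserved bandwidth and is
    updated in place only when the whole path can honor the request.
    """
    if any(available_kbps[hop] < request_kbps for hop in path):
        return False  # one hop cannot commit, so the request is denied
    for hop in path:
        available_kbps[hop] -= request_kbps  # per-hop state for this flow
    return True


links = {"R1": 128, "R2": 64, "R3": 128}   # hypothetical free bandwidth per hop
path = ["R1", "R2", "R3"]

first_call = reserve_path(links, path, 64)
second_call = reserve_path(links, path, 64)
print(first_call, second_call, links)
```

The second request is refused because R2 has nothing left to give, even though R1 and R3 still do. Every hop also has to remember each admitted flow, which is the scaling problem described above.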
Differentiated Services (DiffServ)
At last we have a scalable QoS solution. DiffServ removes the need to rely on applications to signal their requirement for resource reservation, and replaces it with the notion of classification and marking. Now, as administrators, we have the capability to select traffic of interest (classification) and signal to downstream devices (marking) how this traffic should be treated. How this traffic is treated is specified by the creation of policies that are applied on a hop-by-hop basis throughout the network. This per-hop approach allows us to assure a specific level of service to each of the traffic flows we have specified. The important thing to keep in mind regarding this form of QoS is that for all its scalability, it is considered a “soft QoS,” meaning that it does not absolutely guarantee resources like IntServ does. This means DiffServ QoS may be designed end-to-end, but it is implemented on a hop-by-hop basis.
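Classification, marking, and per-hop treatment can be sketched as two small lookups. The DSCP code points below (EF = 46, AF41 = 34, default = 0) are the standard values, but matching on an "app" label and the queue names are hypothetical choices for this example; a real policy would classify on addresses, ports, or protocol fields.

```python
# Standard DSCP code points.
DSCP_EF = 46       # Expedited Forwarding, commonly used for voice
DSCP_AF41 = 34     # Assured Forwarding class 4, commonly used for video
DSCP_DEFAULT = 0   # best effort

# Hypothetical classification rule, applied once at the edge of the network.
MARKING_POLICY = {"voice": DSCP_EF, "video": DSCP_AF41}

# Hypothetical per-hop policy that every downstream device applies on its own.
PER_HOP_QUEUE = {DSCP_EF: "priority", DSCP_AF41: "guaranteed-bandwidth"}


def classify_and_mark(packet: dict) -> dict:
    """Return a copy of the packet with a DSCP value chosen by the marking policy."""
    marked = dict(packet)
    marked["dscp"] = MARKING_POLICY.get(packet.get("app"), DSCP_DEFAULT)
    return marked


def select_queue(packet: dict) -> str:
    """Pick the local queue for a packet based only on its DSCP marking."""
    return PER_HOP_QUEUE.get(packet.get("dscp", DSCP_DEFAULT), "best-effort")


for app in ("voice", "video", "email"):
    marked = classify_and_mark({"app": app})
    print(app, marked["dscp"], select_queue(marked))
```

Notice that `select_queue` never looks at the application, only at the marking. That is what keeps DiffServ scalable: downstream devices hold no per-flow state, but nothing is reserved for the flow either.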
There is a whole world of scenarios where QoS can help us deliver business-critical or health-critical information in a reliable fashion. The important thing to remember is that no matter how creatively we apply QoS, we cannot provide more resources than we have available. What we can do is offer some types of traffic better access to resources than others. The deployment of QoS may actually be defined as choosing a number of lesser problems in order to overcome a greater one. More often than not, QoS will be a temporary measure in environments where budget constraints or service provider capabilities will not stretch to meet bandwidth needs. The key to mastering QoS is as much about understanding what it cannot do as it is about knowing what it can do.