QoS (Quality of Service) Features–Part I

What Is QoS?


QoS stands for Quality of Service. Cisco uses the term to refer to IP-based features that allow specification and delivery of service levels, much like the Quality of Service features in ATM.

When you get right down to it, there isn’t all that much a router can do to control traffic, since it is not the originator of most of the traffic. The router can drop traffic — although we’d prefer it didn’t do so. It can put some queued frames out an interface before others. It can be selective about accepting traffic — another form of dropped traffic. And, with TCP, it can selectively drop the occasional packet as an indirect signal to slow down. With cooperative hosts, the router can try to accept reservations and hold bandwidth for applications that need it.

Acronyms, features or topics that fall under QoS include: Priority Queuing (PQ), Custom Queuing (CQ), Fair and Weighted Fair Queuing (WFQ), Random Early Detection (RED) and its Distributed, Weighted variant (DWRED), Resource Reservation Protocol (RSVP), Traffic Shaping, Committed Access Rate (CAR), Policy Routing, QoS Policy Propagation via BGP (QPPB), NetFlow, and Cisco Express Forwarding (CEF).

The first set of functions relate to queuing, to managing congestion. They are sometimes referred to as “Fancy Queuing”. These include Priority Queuing (PQ), Custom Queuing (CQ), and Weighted Fair Queuing (WFQ). These features allow the router to control which frames are sent first on an interface. If there are too many frames (congestion), then we are, in effect, also selecting which frames get dropped.

These functions, Priority Queuing (PQ), Custom Queuing (CQ), and Weighted Fair Queuing (WFQ), are the subject of this article. They are also discussed as one small part of the Cisco certified ACRC course.

The next feature on the list, Weighted Random Early Detection, is intended to prevent or reduce congestion — trying to reduce problems, rather than mitigating the consequences once the problem has already occurred.

RSVP allows for applications to reserve bandwidth, primarily WAN bandwidth. It is designed to work with WFQ or Traffic Shaping on the outbound interface.

Traffic Shaping and Committed Access Rate (CAR) control traffic. It seems like a better acronym could have been chosen: CAR controls traffic? Anyway, CAR controls the rate of inbound traffic, allowing specification of what to do with traffic that is coming in faster than policy. Traffic Shaping paces outbound traffic, controlling use of bandwidth. Traffic Shaping also allows matching the speed of the output access link across a WAN cloud, so that a faster central hub access circuit doesn’t cause carrier or remote link congestion.

CAR, Policy Routing, and QPPB can also set the IP precedence bits (TOS bits), which are used by some of the above mechanisms to favor some traffic over other traffic.

Finally, NetFlow and CEF are switching techniques used in high-performance routers. They assist in providing QoS by delivering packets efficiently and by gathering traffic statistics that can be used to manage traffic flow, size trunks, and guide network design.

Priority Queuing


About Priority Queuing

Priority Queuing is the oldest of the queuing techniques. Traffic is prioritized with a priority-list, applied to an interface with a priority-group command. The traffic goes into one of four queues: high, medium, normal, or low priority. When the router is ready to transmit a packet, it searches the high queue for a packet. If there is one, it gets sent. If not, the medium queue is checked. If there is a packet, it is sent. If not, the normal, and finally the low priority queues are checked. For the next packet, the process repeats. If there is enough traffic in the high queue, the other queues may get starved: they never get serviced.

You can regard Priority Queuing as being drastic. It says that the high priority traffic must go out the interface at all costs, and any other traffic can be dropped. It is generally intended for use on low bandwidth links.


Configuring Priority Queuing


To assign traffic meeting certain characteristics to a queue (high, medium, normal, or low), use one of the following commands:
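A sketch of the two commands (italicized placeholders in the Cisco documentation are shown here as words; square brackets mark optional arguments):

```
priority-list list-number protocol protocol-name {high | medium | normal | low}
              [queue-keyword keyword-value]

priority-list list-number interface interface-type interface-number
              {high | medium | normal | low}
```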

The first of these takes a protocol, like ip, ipx, appletalk, rsrb, dlsw, etc., to classify traffic. The queue-keyword can be one of: fragments, gt, lt, list, tcp, and udp. The keyword-value specifies the port for tcp or udp, or the size for gt (greater than) and lt (less than). The word list allows you to specify an access list characterizing the traffic. And fragments means just that, IP fragments (which should probably get expedited handling, so as to not have to retransmit all the fragments again if one is lost).

The second command above is similar, but classifies traffic based on the interface it arrived on.

The list-number is any number in the range 1-16. All statements in one policy use the same number.

To change the default queue for all other traffic:
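```
priority-list list-number default {high | medium | normal | low}
```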

To change the queue sizes from the defaults 20, 40, 60, 80 (don't go overboard on this when you see output drops; bigger queues may make things worse):
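```
priority-list list-number queue-limit high-limit medium-limit normal-limit low-limit
```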

To apply the priority queueing policy for outbound packets on an interface:
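The priority-group command is an interface subcommand:

```
interface serial 0
 priority-group list-number
```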

Relevant EXEC Commands
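The ones you're most likely to use (use the debug command with care on a busy router):

```
show queueing priority
show interfaces serial 0
debug priority
```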

Sample Configuration

The following configuration sets up a priority list where DLSw traffic goes into the high queue, as does Telnet traffic. The remaining IP traffic that matches access list 101 goes to the medium queue, and anything else goes in the low queue. (Standard joke: you've planned to send your boss's traffic into the low queue, to make sure the congestion gets noticed.) You've mildly upped the default queue sizes. And this policy is in effect for packets being sent out serial 0.
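A sketch of such a configuration (the queue-limit values and the contents of access list 101 are illustrative):

```
! DLSw and Telnet (TCP port 23) into the high queue
priority-list 1 protocol dlsw high
priority-list 1 protocol ip high tcp 23
! remaining IP matching access list 101 into the medium queue
priority-list 1 protocol ip medium list 101
priority-list 1 default low
! mildly upped from the 20/40/60/80 defaults (values assumed)
priority-list 1 queue-limit 30 50 70 90
!
interface Serial0
 priority-group 1
```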

Custom Queuing

About Custom Queuing

Custom Queuing uses 17 queues to divide up bandwidth on an interface. Queue 0, the system queue, is always serviced first. It is used for keepalives and other critical interface traffic. The remaining traffic can be assigned to queues 1 through 16. These queues are serviced in round-robin fashion.

Here’s how it works. Packets are sent from each queue in turn. As each packet is sent, a byte counter is incremented. When the byte counter exceeds the default or configured threshold for the queue, transmission moves on to the next queue. The byte count total for the queue that just finished has the threshold value subtracted from it, so that it starts its next turn penalized by the number of bytes that it went over its quota. This provides additional fairness to the mechanism.

If you think about it, you can’t send half of a packet. That’s why this mechanism might well exceed quota on any given round of transmission from a queue. But on the next round, the queue is penalized for taking more than its fair share, so in the long run it averages out.

Custom Queuing is aimed at fair division of bandwidth. For instance, you might set it up to allow IP roughly 50% of a link, DLSw 25%, and IPX 25%. When congestion is taking place, the limits are enforced. If there is unused bandwidth, say from IPX, it is divided equally among any excess traffic from the other classes of traffic, IP and DLSw. To implement this, you would tweak the thresholds for the relevant queues, say making them 3000, 1500, and 1500 bytes respectively. Some fine tuning to average packet MTU size can make this more precise.
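For example, the 50/25/25 split just described might look like the following (queue numbers assumed; queues 2 and 3 keep the default 1500-byte threshold):

```
queue-list 2 protocol ip 1
queue-list 2 protocol dlsw 2
queue-list 2 protocol ipx 3
! double the default byte count so IP gets roughly half the link
queue-list 2 queue 1 byte-count 3000
```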

Configuring Custom Queuing

The commands for CQ are very similar to those for PQ. The difference is that you put the traffic into queues numbered 1-16, rather than named high, medium, normal, low. Hence we build our CQ policy with:
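A sketch of the two classification commands:

```
queue-list list-number protocol protocol-name queue-number
           [queue-keyword keyword-value]

queue-list list-number interface interface-type interface-number queue-number
```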

You can specify the default queue, the one that receives any unmatched traffic, with the command:
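```
queue-list list-number default queue-number
```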

(The default is queue 1.)

You can specify the number of packets allowed in any queue with the command:
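```
queue-list list-number queue queue-number limit limit-number
```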

The threshold for a queue can be changed with the following command:
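```
queue-list list-number queue queue-number byte-count byte-count-number
```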

The default threshold for the queues is 1500 bytes.

And the CQ policy is applied to outbound frames on an interface with:
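```
interface serial 0
 custom-queue-list list-number
```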

Relevant EXEC Commands
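As with PQ, the useful ones are:

```
show queueing custom
show interfaces serial 0
debug custom-queue
```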



Sample Configuration

The following configuration is similar to that for PQ, except that we’re not making DLSw and Telnet traffic top priority any more. Instead, we’re using four (4) queues (since default traffic goes to queue 10). The thresholds are 1500, 1500, 3000, and 1500, so Telnet in queue 3 gets 3000/7500 = 40% of the bandwidth, and the other queues get 20% each.
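A sketch consistent with that description (the queue assignments are assumed; queues 1, 2, and 10 keep the default 1500-byte threshold):

```
queue-list 1 protocol dlsw 1
queue-list 1 protocol ip 2 list 101
! Telnet into queue 3, which gets the doubled byte count
queue-list 1 protocol ip 3 tcp 23
queue-list 1 default 10
queue-list 1 queue 3 byte-count 3000
!
interface Serial0
 custom-queue-list 1
```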


Weighted Fair Queuing (WFQ)

About WFQ

Weighted fair queueing automatically sorts among individual traffic streams without requiring that you first define access lists. It can manage one-way or two-way streams of data: traffic between pairs of applications, or voice and video. It automatically smooths out bursts to reduce average latency.

In WFQ, packets are sorted in weighted order of arrival of the last bit, to determine transmission order. Using order of arrival of last bit emulates the behavior of Time Division Multiplexing (TDM), hence “fair”. In Frame Relay, FECN, BECN, and DE bits will cause the weights to be automatically adjusted, slowing flows if needed.

From one point of view, the effect of this is that WFQ classifies sessions as high- or low-bandwidth. Low-bandwidth traffic gets priority, with high-bandwidth traffic sharing what’s left over. If the traffic is bursting ahead of the rate at which the interface can transmit, new high-bandwidth traffic gets discarded once the configured or default congestive-discard threshold has been reached. However, low-bandwidth conversations, which include control-message conversations, continue to enqueue data.

Weighted fair queuing uses some parts of the protocol header to determine flow identity. For IP, WFQ uses the Type of Service (TOS) bits, the IP protocol code, the source and destination IP addresses (if not a fragment), and the source and destination TCP or UDP ports.

Distributed WFQ is available in IOS 12.0 on high-end interfaces and router models.


Configuring Fair Queuing (FQ)
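Fair queuing is configured with a single interface subcommand (a sketch; each argument is optional, but positional, so each requires the ones before it):

```
fair-queue [congestive-discard-threshold [dynamic-queues [reservable-queues]]]
```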

congestive-discard-threshold: Number of messages allowed in each queue in the range 1 to 4096, default 64.

dynamic-queues: Number of dynamic queues used for best-effort conversations. Values are 16, 32, 64, 128, 256,
512, 1024, 2048, and 4096. The default is 256.

reservable-queues: Number of reservable queues used for reserved (RSVP) conversations, range 0 to 1000. The default is 0. If RSVP is enabled on a WFQ interface with reservable-queues set to 0, the reservable queue size is automatically set to bandwidth divided by 32 Kbps. Specify a reservable-queue size other than 0 if you wish different behavior.

Fair queuing is enabled by default for physical interfaces whose bandwidth is less than or equal to 2.048 Mbps, except for Link Access Procedure, Balanced (LAPB), X.25, or Synchronous Data Link Control (SDLC) encapsulations. Enabling custom queuing or priority queuing on an interface disables fair queueing. Fair queuing is automatically disabled if you enable autonomous or SSE switching on a 7000 model. Fair queueing is now enabled automatically on multilink PPP interfaces. WFQ is not supported on tunnels.

Configuring Weighted Fair Queuing (WFQ)

When congestion occurs, the weight for a class or group specifies the percentage of the output bandwidth allocated to that group. A weight of 60 gives 60% of the bandwidth during congestion periods.

Start by specifying what type of fair queuing is in effect on an interface:
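A sketch of the command:

```
fair-queue [tos | qos-group]
```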

If you omit tos and qos-group, you get flow-based WFQ. Otherwise you get TOS (precedence)-based or QoS-group based WFQ on the interface. You then set the total number of buffered packets on the interface. Below this limit, packets will not be dropped. Default is based on bandwidth and memory space available.
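```
fair-queue aggregate-limit aggregate-packets
```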

You also specify the limit for each queue. Default is half the aggregate limit.
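```
fair-queue individual-limit individual-packets
```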

The documentation suggests you not alter the queue limits without a good reason. To specify the depth of queue for a class of traffic:
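```
fair-queue {qos-group group-number | tos precedence} limit class-packets
```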

Finally, to specify weight (percentage of the link) for a class of traffic:
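```
fair-queue {qos-group group-number | tos precedence} weight weight
```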

The percentages on an interface must add up to no more than 99 (percent).

Relevant EXEC Commands
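For fair queuing, the useful ones are:

```
show queueing fair
show interfaces serial 0
```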

Sample Configuration

Fair Queuing

This restores the defaults on a T1 serial link.
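Since 64 messages, 256 dynamic queues, and 0 reservable queues are the defaults, this amounts to:

```
interface Serial0
 fair-queue 64 256 0
```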

Weighted Fair Queuing – QoS Group based

The following configuration sets up two QoS groups, 2 and 6, corresponding to precedences 2 and 6. It then specifies WFQ in terms of those two QoS groups.
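One way to sketch this uses CAR to map precedence into QoS groups (the interface, rates, and weights are all illustrative assumptions; QPPB could set the QoS groups instead):

```
! rate-limit access lists 25 and 26 match precedences 2 and 6
access-list rate-limit 25 2
access-list rate-limit 26 6
!
interface Hssi0/0/0
 ! mark arriving traffic into QoS groups 2 and 6 by precedence
 rate-limit input access-group rate-limit 25 2000000 8000 8000 conform-action set-qos-transmit 2 exceed-action set-qos-transmit 2
 rate-limit input access-group rate-limit 26 2000000 8000 8000 conform-action set-qos-transmit 6 exceed-action set-qos-transmit 6
 ! QoS-group based WFQ, weighted toward group 6
 fair-queue qos-group
 fair-queue qos-group 2 weight 30
 fair-queue qos-group 6 weight 60
```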

Weighted Fair Queuing – Precedence (TOS) based

The following configuration directly specifies WFQ based on precedences 1, 2, and 3:
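A sketch (the interface and weights are illustrative; remember the weights must total no more than 99):

```
interface Hssi0/0/0
 fair-queue tos
 fair-queue tos 1 weight 20
 fair-queue tos 2 weight 30
 fair-queue tos 3 weight 40
```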