Campus QoS part 3- Egress Queuing, Dropping and Scheduling

By Vik Malhi on August 25th, 2011

Introduction

This is the third and final part of the series of blog posts covering all aspects of QoS on the Catalyst 3750. (Part 1 covered classification and marking, and Part 2 covered ingress queuing.)

In this article we will focus on egress queuing, dropping and scheduling. Whereas on the ingress side traffic is placed onto the backplane of the switch (the ring), on the egress side traffic is sent to the destination, or egress, port.

Congestion can occur when multiple devices (ingress ports) are sending data to a single device (egress port). Congestion on the egress side is much more of an issue than ingress congestion.

Congestion management and avoidance on the egress side has three distinct phases: (1) queuing, (2) dropping and (3) scheduling/dequeuing.

Egress Queuing

The first step on the egress side of the switch is to map each frame to one of four egress queues and assign a threshold value that is used by Weighted Tail Drop (WTD). We’ll talk about WTD in a little while, but for now just consider that each frame is placed in Q1 through Q4 based on DSCP or CoS, depending on the QoS label.

What is the “QoS label”? At the ingress switchport we can either trust the layer 2 or layer 3 markings using “mls qos trust cos” or “mls qos trust dscp”, or alternatively set them with a service-policy. This marking is then copied across internally within the switch and forms the QoS label, which is used for all future QoS-related actions on the frame whilst it is inside the switch. This includes deciding which queue and threshold the frame is placed into on both the ingress and egress side.

To find out which queue each type of traffic is utilizing, use one of the following two commands: “show mls qos maps cos-output-q” or “show mls qos maps dscp-output-q”.

As mentioned earlier, there are four queues on the egress side. The sizes of the queues (the percentage of the per-port buffer space allocated to each queue) are defined within the queue set. An interface can belong to either queue set 1 or queue set 2; the default is queue set 1. Queue sets give you a finer level of granularity than a single global configuration would: an egress port to another switch or router may require a different set of properties from, say, an egress port connected to a phone or computer. Let’s take a look at how the buffer space is allocated once AutoQoS has been configured on one of the interfaces on the switch.

By default all interfaces belong to queue set 1. If we wanted an egress port to have the buffer space for Q1-4 allocated 16%:6%:17%:61%, we would assign the port to queue set 2 by configuring “queue-set 2” on the interface.

You can modify the buffer sizes of the queues by changing the parameters in the queue set. For example, “mls qos queue-set output 1 buffers 25 25 25 25” modifies queue set 1 so that all of the queues are of equal size.

Weighted Tail Drop on the Egress side

Like the buffer space allocation, the WTD percentages are defined within the queue set and an interface belongs to either queue set 1 or 2. WTD provides a way to begin dropping lower priority traffic ahead of higher priority traffic in the event that a particular queue is becoming increasingly utilized.

The idea of WTD is to reduce the chance of a specific queue becoming congested. Let’s take an example in which both DSCP 0 and DSCP 8 are being placed into egress queue 4. Our ordinary (best effort) traffic is marked with DSCP 0 and our scavenger class traffic is marked with DSCP 8. Rather than wait for Q4 to become completely full (no more buffers available for Q4) and then indiscriminately drop both traffic types, we can begin dropping frames that have QoS label DSCP 8 much sooner than we begin dropping frames with QoS label DSCP 0. This helps because our least important traffic cannot be the cause of congestion in that queue.

All traffic is mapped to Queue x and Threshold y, where x = 1, 2, 3 or 4 and y = 1, 2 or 3. Take a look at the example below (these are the actual tables after running AutoQoS):

You can see that DSCP 0 is being mapped to Q4T3 and DSCP 8 is being mapped to Q4T1.

Let’s take a look at the parameters of Q4 that are applied to queue set 1.

First of all, Q4 has 54% of the port buffer space allocated to it. Of that 54%, only 67% has been reserved, so in reality we are reserving only two thirds of the buffers that we could have reserved for Q4. The other third is our tax: our contribution to the “common pool”, which the switch can assign to any other queue on any other port if needed. Queue 4 will be considered full when 67% of the allocated queue memory is being used (the terms buffer and memory are interchangeable). To summarize: allocated memory = 54% of the interface memory, and we are reserving only 67% of this for Q4.

Q4T1 is configured at 20%, which means that when utilization of Q4 has reached 20% of the allocated queue memory (the allocation being 54% of the total memory for this port) we will begin dropping packets that are mapped to Q4T1.

Q4T2 is configured at 50%: when Q4 utilization has reached 50% of the allocated queue memory we will begin dropping packets that are mapped to Q4T2.

When do we start dropping packets that are mapped to Q4T3? We are reserving 67% of the memory that has been allocated for Q4, but we are saying that the maximum memory this queue can have is 400%. Above and beyond our reserved 67% we would need to ask the taxman for some temporary funds to stave off the threat of a packet being discarded. In other words, when Q4 has used 100% of its reserved memory (67% of 54% of the total), Q4 will try to grab unallocated memory from the common pool (up to 400% of the allocated memory for the queue). If there is enough memory in the common pool then all is well and the packet is not discarded. If, however, there is no spare memory in the common pool then the packet will be discarded.
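
As a rough worked example of the arithmetic above, the Python sketch below converts the Q4 percentages quoted in the text into percentages of the total port buffer. The helper name pct_of_port is made up for illustration, and the sketch simply follows the interpretation given in this article (every value is a percentage of the queue's allocation). It is not a model of the switch's internal buffer accounting.

```python
# Q4 values for queue set 1 as quoted in the text (after AutoQoS).
ALLOCATED_PCT = 54    # share of the port buffer allocated to Q4
THRESHOLD1_PCT = 20   # Q4T1, as a percentage of the Q4 allocation
THRESHOLD2_PCT = 50   # Q4T2, as a percentage of the Q4 allocation
RESERVED_PCT = 67     # portion of the allocation that is actually reserved
MAXIMUM_PCT = 400     # ceiling when borrowing from the common pool


def pct_of_port(pct_of_allocation, allocated_pct=ALLOCATED_PCT):
    """Convert a percentage of the queue allocation into a percentage
    of the whole port buffer (hypothetical helper for illustration)."""
    return pct_of_allocation * allocated_pct / 100


q4 = {
    "T1 drop point": pct_of_port(THRESHOLD1_PCT),
    "T2 drop point": pct_of_port(THRESHOLD2_PCT),
    "reserved": pct_of_port(RESERVED_PCT),
    "maximum (with common pool)": pct_of_port(MAXIMUM_PCT),
}

for name, value in q4.items():
    print(f"Q4 {name}: {value:.2f}% of the port buffer")
```

Seen this way, the reserved portion of Q4 is a little over a third of the port buffer, and anything beyond that depends on the common pool having spare memory.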

Confused? You are not alone. Let’s look at WTD without using the common pool buffer. Look at the table below, where we have made some changes. You are going to feel a whole lot better (I gave you the bad news first).

Q4 has 25% of the buffers allocated to it. We are reserving all of the buffer allocation and paying nothing to the taxman (common pool). Now that I mention it, this would be nice in the real world too! We are setting the maximum memory to 100%, which means that we will not attempt to grab any memory from the common pool once we are in the queue-full state. If all of our reserved/allocated memory is in use then we will drop the packet rather than attempting to grab some of the unreserved (common pool) buffers.

Traffic being mapped to Q4T1 will be placed into Q4 so long as the buffer is not already 50% full.

Traffic being mapped to Q4T2 will be placed into Q4 so long as the buffer is not already 75% full.

Traffic being mapped to Q4T3 will be placed into Q4 so long as the buffer is not already completely full. And by full we mean our share of the total port memory (25%).
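
The three rules above amount to a simple comparison, sketched below in Python. The function name wtd_admit and the idea of passing the queue fill level as a percentage are assumptions made for illustration; the thresholds are the 50/75/100 values from the modified table described in the text.

```python
# Q4 thresholds from the simplified example, as a percentage of the
# buffer allocated to Q4 (no borrowing from the common pool).
Q4_THRESHOLDS = {1: 50, 2: 75, 3: 100}


def wtd_admit(fill_pct, threshold_id, thresholds=Q4_THRESHOLDS):
    """Return True if a frame mapped to the given threshold is queued,
    or False if Weighted Tail Drop would discard it.

    fill_pct is how full the queue already is when the frame arrives.
    This is a hypothetical helper, not a switch API.
    """
    return fill_pct < thresholds[threshold_id]


# With Q4 already 60% full, only frames mapped to T1 are dropped.
for tid in (1, 2, 3):
    action = "queued" if wtd_admit(60, tid) else "dropped"
    print(f"Q4 at 60%: frame mapped to Q4T{tid} is {action}")
```

The same comparison explains the earlier scavenger example: traffic mapped to the lowest threshold is turned away first, leaving the rest of the queue for traffic mapped to the higher thresholds.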

Egress Scheduling: 1P3Q3T or 4Q3T

The scheduler determines how the switch removes traffic from the four queues. The total amount of traffic sent out of the egress port is the aggregate of the traffic removed from each of the four queues. Whereas WTD can be categorized as a congestion avoidance mechanism (we do our best to stop lesser packets from filling up our queues), scheduling can be categorized as managing the congestion within the queues. Once a packet has made it into one of the four queues, the threshold value has no significance.

Unlike the buffer and threshold settings, the properties of the scheduler are defined on each interface (as opposed to in the queue set). Like the ingress scheduler, SRR is used; the difference is that the “S” can stand for shaped as well as shared. In shaped mode, the egress queues are guaranteed a percentage of the bandwidth and they are rate-limited to that amount. Shaped traffic does not use more than the allocated bandwidth even if the link is idle. In shared mode, the queues share the bandwidth among them according to the configured weights. The bandwidth is guaranteed at this level but not limited to it. For example, if a queue is empty and no longer requires a share of the link, the remaining queues can expand into the unused bandwidth and share it among them.

Let’s take a look at a few examples. Once mls qos has been enabled on the switch the output below shows the default egress scheduler implementation for each interface.

First of all, the Priority Queue is disabled. We will talk about this later, but for now the significance of the PQ being disabled is that the SRR values for Q1 come into play. If a queue has a “shaped” value of zero, then the “shared” value applies to that queue. So, looking at the output above, Q1 is shaped to 25 and queues 2-4 are shared at a value of 25. We discount the share value of Q1 since Q1 has been configured with a non-zero shape value.

The meaning of the “25” applied to Q1 is very different from the “25” applied to Q2, Q3 and Q4. In shaped mode the value is absolute: it in effect defines a percentage, and the values for Q2, Q3 and Q4 make no difference to how much bandwidth Q1 is shaped to. Now onto one of the more common mistakes: the “shape 25 0 0 0” means that Q1 is shaped to 4% of the egress port bandwidth (4 Mbps for a 100 Mbps interface), since the 25 is an inverse ratio (1/25).

The “share 25 25 25 25” should be read as “share xx 25 25 25” since the Q1 share value is not used. The 25 assigned to Q2-4 is in this case a weight as opposed to an absolute value. In other words, it is the relative and not the independent value that is meaningful. So each of queues 2, 3 and 4 has a minimum of 25/(25+25+25) of the remaining (96%) bandwidth. In other words, Q2, Q3 and Q4 each have a third of the bandwidth not assigned to Q1. In this situation “share 1 33 33 33” may make the configuration more readable.

A couple of clarifications: if Q1 is idle then the 4% assigned to Q1 can be used by another queue if needed. If Q1 is becoming over-subscribed and the other three queues are empty, Q1 still cannot be allocated bandwidth that has been assigned to one of the other queues, since it is rate-limited (shaped) to 4%. Contrast this with the other queues, which are all operating in shared mode. If Q2 is becoming over-subscribed and the other three queues are empty, then Q2 can be allocated 100% of the egress port bandwidth, since the “share” only specifies a minimum bandwidth guarantee.
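
The shape and share arithmetic from the last few paragraphs can be sketched in Python as follows. This is a deliberately simplified model, not the switch's actual scheduler: it assumes the priority queue is disabled and that every queue listed as active always has traffic waiting. The function name srr_bandwidth and its parameters are made up for illustration.

```python
def srr_bandwidth(link_mbps, shape, share, active):
    """Estimate per-queue bandwidth in Mbps under a simplified SRR model.

    shape, share: the four weights from "shape a b c d" / "share a b c d".
    active: set of queue numbers (1-4) that currently have traffic.
    Assumes the priority queue is disabled and active queues are saturated.
    """
    rates = {}
    # Shaped queues are capped at 1/weight of the link, even if the
    # rest of the link is idle.
    for q in range(1, 5):
        weight = shape[q - 1]
        if weight > 0:
            rates[q] = link_mbps / weight if q in active else 0.0
    remaining = link_mbps - sum(rates.values())
    # Shared queues split whatever is left in proportion to their weights,
    # counting only the queues that actually have traffic.
    shared_active = [q for q in range(1, 5) if shape[q - 1] == 0 and q in active]
    total_weight = sum(share[q - 1] for q in shared_active)
    for q in range(1, 5):
        if shape[q - 1] == 0:
            if q in shared_active:
                rates[q] = remaining * share[q - 1] / total_weight
            else:
                rates[q] = 0.0
    return rates


defaults = dict(link_mbps=100, shape=(25, 0, 0, 0), share=(25, 25, 25, 25))
print(srr_bandwidth(active={1, 2, 3, 4}, **defaults))  # all queues busy
print(srr_bandwidth(active={1}, **defaults))           # only Q1 busy
print(srr_bandwidth(active={2}, **defaults))           # only Q2 busy
```

The three calls correspond to the cases discussed in the text: every queue busy, only the shaped Q1 busy (it stays capped at 4 Mbps), and only the shared Q2 busy (it can expand to the whole link).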

Let’s take a look at the egress scheduler when AutoQoS has been configured on an interface:

Once the Priority Queue has been enabled on an interface then the SRR shape and share values for Q1 are irrelevant. The scheduler will service the PQ until it is empty. Once the PQ has been completely serviced then Q2-4 will be serviced. Q1 is always the PQ on this platform (Q4 for the 3550).

The “shape 25 0 0 0” has no significance since Q1 has the PQ enabled, and hence the “25” allocated to Q1 is not used. This is an important point: there is no rate-limiting of the Priority Queue taking place! If you want to rate-limit Q1 to a specific value then you must disable the priority queue. If you have the PQ enabled then you may want to make your config more readable by setting the shape value of Q1 to 0.

The “share 10 10 60 20” defines the weights assigned to Q2-4. The “10” assigned to Q1 has no meaning since the PQ is enabled on this interface. So Q2 through Q4 have the bandwidth split 10/(10+60+20):60/(10+60+20):20/(10+60+20), or 1/9th:6/9ths:2/9ths for Q2:Q3:Q4 respectively. It might be wise to make the values of Q2-4 add up to 100, since you will then in effect be dealing with percentages, and to make the value for Q1 equal to “1” since the PQ is being used (you cannot set a share value to zero).
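
The split between Q2, Q3 and Q4 can be checked with a few lines of Python. This is only a sketch of the ratio arithmetic described above: the function name shared_fractions is made up for illustration, and it assumes the PQ is enabled, so the Q1 weight is ignored.

```python
from fractions import Fraction


def shared_fractions(share):
    """Return the fraction of non-PQ bandwidth for Q2-Q4.

    share: the four weights from "share a b c d". The first weight is
    ignored because Q1 is assumed to be the enabled priority queue.
    """
    weights = {q: share[q - 1] for q in (2, 3, 4)}
    total = sum(weights.values())
    return {q: Fraction(w, total) for q, w in weights.items()}


for queue, fraction in shared_fractions((10, 10, 60, 20)).items():
    print(f"Q{queue}: {fraction} (about {float(fraction):.1%})")
```

Note that Fraction reduces 6/9 to 2/3 when printing. Feeding in weights that add up to 100 across Q2-4, as suggested above, makes each weight read directly as a percentage.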

Finally, you can also limit the bandwidth of an individual egress port with the “srr-queue bandwidth limit” interface command, for example “srr-queue bandwidth limit 85”.

When you configure this command to 85 percent, the port is idle 15 percent of the time. The line rate drops to 85 percent of the connected speed, which is 85 Mb/s for our 100Mbps interfaces.

Phew, that’s about it regarding the Catalyst 3750 QoS features. Sorry it took me so long to get the series completed, but better late than never. Good luck with your studies!

Vik Malhi

Instructor, IPexpert Inc

3 Responses to “Campus QoS part 3- Egress Queuing, Dropping and Scheduling”

  1. Ron schutte says:

    hi Vik,
    Reading your part 3, good stuff

    But i have two points for clarification
    First:
    …..
    Now onto one of the more common mistakes- the “shape 25 0 0 0” means that Q1 is shaped to 4% of the egress port bandwidth (4Mbps for a 100Mbps interface) since the 25 is an inverse ratio (1/25).
    …..
    It is 1/25 of port speed ! So 2mb of 100mb link. Or am i reading it wrong

    Second:

    A couple of clarifications- if Q1 is idle then the 4% being assigned to Q1 can be used by another queue if needed
    ….
    Shaped queues are not shared, or am I wrong on that too?

    Well i hope you can clarify these points.

    Regards Ron

  2. Vik Malhi says:

    On point 2:”Shaped queues are not shared, or am i wrong on That to?”

    SRR is EITHER SHAPE or SHARE. Never both. If we shape to 4% then that is the maximum bandwidth assigned for that particular queue. If the queue is idle then this bandwidth is not reserved- it’s freed up until it is required again.
