How do you design?
Based on the design of the operational commands, what can be read from the documentation, and so on, I recommend the following strategies.
Strategy.0
If possible, it is better to have the Palo Alto FW do only traffic coloring and leave QoS control to better-suited devices such as core routers.
If this is not possible, the following should be followed.
Strategy.1
The "Egress Maximum" of the interface should not be matched to the physical bandwidth.
In other words, set it sufficiently larger than the physical bandwidth, or do not set it at all.
Strategy.2
The sum of all classes of "Egress Guaranteed" should match the physical bandwidth.
And the "Egress Maximum" for each class should be set no greater than 10% of it.
Strategy.3
Traffic that you do not want to drop should have its priority set to "Real-time".
However, traffic that belongs to Real-time should have a sufficiently low chance of bursting.
If the likelihood of extreme bursts is high, the priority should not be set to "Real-time".
That's my stance.
It seems to me that there is not much material on QoS in the Palo Alto FW. For example, the documentation says that "Real-time" uses its "own separate queue".
https://docs.paloaltonetworks.com/pan-os/8-1/pan-os-web-interface-help/network/network-network-profiles-qos
I have not been able to find any documentation on docs.paloaltonetworks.com that explains what "own separate queue" means in the above document.
However, some speculation about QoS architecture is possible. I have some evidence, but I can't show it here. If you have the same questions as me and have experience with experiments in the lab, I'd be happy to hear some information.
First of all, QoS in the Palo Alto FW appears to be implemented by policing. Unfortunately, shaping is not possible, in my opinion.
I also believe that the "Egress Guaranteed" realization is implemented using a "Dual Token Bucket".
https://www.cisco.com/c/dam/en/us/td/i/000001-100000/60001-65000/60001-61000/60515.ps/_jcr_content/renditions/60515.jpg
In other words, by setting CIR in "Egress Guaranteed" and PIR in "Egress Maximum", it seems that the other parameters (Tp, Tc, etc.) required for the "Dual Token Bucket" are automatically set.
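To make the CIR/PIR idea concrete, here is a minimal sketch of a generic two-rate policer in the style of RFC 2698, written in Python. It is only a model of the general technique, not the PAN-OS implementation, which is not documented. The class name, the burst sizes (cbs, pbs) and the result labels are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class TwoRatePolicer:
    """Generic two-rate policer model (hypothetical, not PAN-OS code)."""
    cir: float  # committed rate in bytes/s (stands in for "Egress Guaranteed")
    pir: float  # peak rate in bytes/s (stands in for "Egress Maximum")
    cbs: float  # committed burst size in bytes (assumed value)
    pbs: float  # peak burst size in bytes (assumed value)
    tc: float = 0.0    # tokens currently in the committed bucket
    tp: float = 0.0    # tokens currently in the peak bucket
    last: float = 0.0  # time of the previous packet in seconds

    def __post_init__(self):
        # Both buckets start full.
        self.tc = self.cbs
        self.tp = self.pbs

    def offer(self, now: float, size: int) -> str:
        """Classify one packet of `size` bytes arriving at time `now`."""
        elapsed = now - self.last
        self.last = now
        # Refill each bucket at its own rate, capped at its burst size.
        self.tc = min(self.cbs, self.tc + self.cir * elapsed)
        self.tp = min(self.pbs, self.tp + self.pir * elapsed)
        if self.tp < size:
            return "violate"  # above PIR: a policer would drop this
        if self.tc < size:
            self.tp -= size
            return "exceed"   # between CIR and PIR
        self.tp -= size
        self.tc -= size
        return "conform"      # within CIR


policer = TwoRatePolicer(cir=1000, pir=2000, cbs=1500, pbs=3000)
results = [policer.offer(0.0, 1500) for _ in range(3)]
later = policer.offer(1.5, 1500)
```

In this model, three back-to-back 1500-byte packets are classified differently even though only the two rates and two burst sizes were configured, which is consistent with the guess that the firewall derives the remaining bucket parameters on its own.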
However, there is another congestion avoidance mechanism at work besides the "Dual Token Bucket", and that is Weighted Random Early Detection (WRED).
https://www.cisco.com/c/dam/en/us/td/i/000001-100000/15001-20000/16501-17000/16759.ps/_jcr_content/renditions/16759.jpg
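For reference, the textbook RED/WRED drop curve can be sketched as follows in Python. This is the generic algorithm only. The thresholds, the maximum drop probability and the averaging weight are hypothetical parameters, and nothing here is confirmed to match what PAN-OS does internally.

```python
def update_average(avg_queue: float, current_queue: float, weight: float) -> float:
    """Exponentially weighted moving average of the queue depth."""
    return (1.0 - weight) * avg_queue + weight * current_queue


def wred_drop_probability(avg_queue: float, min_th: float,
                          max_th: float, max_p: float) -> float:
    """Drop probability for a given average queue depth.

    Below min_th nothing is dropped, at or above max_th everything is
    dropped, and in between the probability rises linearly up to max_p.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Hypothetical thresholds, in packets.
p_low = wred_drop_probability(10, min_th=20, max_th=40, max_p=0.1)
p_mid = wred_drop_probability(30, min_th=20, max_th=40, max_p=0.1)
p_high = wred_drop_probability(40, min_th=20, max_th=40, max_p=0.1)
```

In "weighted" RED, each priority would get its own (min_th, max_th, max_p) profile, so lower priorities start dropping earlier as the queue fills.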
These guesses are probably not far off, since the "WRED drop" and "policing drop" counters are reported separately:
https://live.paloaltonetworks.com/t5/general-topics/what-do-wred-drops-and-policing-drop-on-qos-mean/td-p/28911
So, under what circumstances do "WRED drop" and "policing drop" occur?
I believe the following.
First, in the documentation "QOS PROFILE SETTINGS", you will find the following text.
"When contention occurs, traffic that is assigned a lower priority is dropped. Real-time priority uses its own separate queue."
From the above, we can read that for each interface, there are two queues.
It is also presumed that the WRED algorithm works for one of the queues, but not for traffic that belongs to the real-time priority.
Also, each queue length is considered to be automatically set from the "Egress Maximum" set for the interface.
If these conjectures hold, then for traffic with priorities other than real-time, a drop can occur even when the traffic volume is below "Egress Guaranteed".
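A toy model in Python shows why this is plausible. It does not reproduce PAN-OS behavior. The tick length, drain rate, burst pattern and WRED minimum threshold are all hypothetical numbers chosen for illustration.

```python
def peak_queue_depth(arrivals, drain_per_tick):
    """Peak depth (bytes) of a queue drained at a fixed rate per tick."""
    depth = 0
    peak = 0
    for arriving in arrivals:
        depth += arriving
        peak = max(peak, depth)
        depth = max(0, depth - drain_per_tick)
    return peak


GUARANTEED_PER_TICK = 1000  # hypothetical guaranteed rate, bytes per tick
WRED_MIN_THRESHOLD = 3000   # hypothetical queue depth where WRED starts dropping

# One 8000-byte burst every 10 ticks: an average of 800 bytes per tick,
# which is below the guaranteed rate.
arrivals = [8000 if tick % 10 == 0 else 0 for tick in range(100)]

average_rate = sum(arrivals) / len(arrivals)
peak = peak_queue_depth(arrivals, GUARANTEED_PER_TICK)
wred_can_drop = peak > WRED_MIN_THRESHOLD
```

Here the average offered rate stays under the guaranteed rate, yet each burst pushes the queue well past the assumed WRED threshold, so a WRED-managed queue could start dropping. A real implementation would act on an averaged queue depth rather than the instantaneous peak, so this only illustrates the direction of the effect.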
Hence, I believe it is better for the Palo Alto FW to only color the traffic and leave QoS control to core routers and the like. If that is not possible, either set the interface's "Egress Maximum" sufficiently larger than the physical bandwidth or do not set it at all (it may depend on the quality of the line, but I believe it will perform better if you leave it at about 200% of the physical bandwidth).
nanashin and others, have you used this successfully in production? Any empirical results you can share?
Thanks for this excellent writeup!
I do agree that the QoS implementation in the Palo Alto Firewall is suboptimal. In a highly sensitive network, QoS should be left to better-equipped devices: the implementation lacks finesse, and packets can indeed be dropped while still inside the 'guarantee' range if the class is not set to real-time.