Browsed by
Tag: Edge Routing

CCIE Studies: Performance Routing PfR/OER

CCIE Studies: Performance Routing PfR/OER

Prologue

Hey fellow CCIE’s candidates and networking geeks. Today I want to step deep into the realm of PfR or Performance Routing. First let’s go back in time to the predecessor, Optimized Edge Routing or OER. As crazy as this sounds, OER came out in 2006 with IOS 12.3 . So, technically before all this SDN fanfare, Cisco actually decoupled the control (part of it at least) and data plane with OER/PfR back in the dizay.

DID THAT JUST BLOW YOUR MIND? THAT JUST HAPPENED! <GRIN> 2013-07-23 12.28.34 am

OER/PfR was created to help with a major issue that plagues many mid-market customers even to this day, proper load sharing and/or balancing on the edge of the network. Who wants to have redundant Internet connections, possibly even with diverse providers and have one of those connection sit there idle until something blows up? The short answer, pretty much nobody. Your paying for that circuit, you should be using it. Well, Shaun why not just use BGP? Well that’s a great question! You sure could and advertise part of your networks off one connection and the remaining networks off the other connection. That would achieve a level of load sharing inbound to the enterprise. Traffic egressing out of the enterprise could also be split to share the two connections. Sometimes the issue with BGP peering is the complexity and requirements. When I worked at the SP, a class C (/24) was the longest prefix that you could advertise. I heard it’s now a /23, but that has not been confirmed. Working with ARIN for a direct assignment of two IPv4 /24’s will be an exercise in patience. Remember we are running out of IPv4 space, perhaps you could get some IPv6 block for half price… J/K All that said, it can be a pain in the you know what to make this happen and not all companies have the resources to manage that type of edge peering agreement with the providers.

Well that’s where OER/PfR comes into play. Let’s keep this simple because OER/PfR can be quite a deep subject. Rather than base forwarding decisions on destination and lowest cost metric, why not take a path’s characteristics into consideration such as jitter, delay, utilization, load distribution, packet loss/health, or even MOS score? That’s the power of OER/PfR!!!

This is right from Cisco.com.
http://www.cisco.com/en/US/products/ps8787/products_ios_protocol_option_home.html

“PfR can also improve application availability by dynamically routing around network problems like black holes and brownouts that traditional IP routing may not detect. In addition, the intelligent load balancing capability of PfR can optimize path selection based on link use or circuit pricing.”

So, what did we do without BGP or OER/PfR? Typically, static routes with a floating static route for the redundant link using IP SLA/objecting for state monitoring (far end reachability). Again we are paying for something we can’t use. To quote Brian Dennis from INE. “It’s something we always accepted, like STP. You paying for something you can’t use”. The good news, you don’t need to live in that world any more. We have evolved with technologies like Fabric Path/TRILL, vPC, OER/PfR, SDN. Man, it’s a good time to be into networking!

Let’s think about some use cases: Internet connection load sharing/balancing, application specific traffic steering based on performance (latency), loss/delay sensitive hosted IP telephony traffic, leverage burstable based circuits, etc…

In summary, PfR allows the network to intelligently choose link resources as needed to reduce operational costs. Sounds like a sales pitch right? Well I am a Cisco SE after all, it’s in my DNA plus I found that diddy in one of the PfR FAQs.

OK, now that you have an good background on the origins of OER/PfR, let’s talk about the major difference between OER and PfR. In short, OER was destination prefix based and PfR expanded the capabilities to include route control on a per application basis.

Let’s also get one major thing out of the way first before we drill into the specifics. With a holistic view of the EDGE network your able to accomplish this level of traffic engineering on a per application level. If there is something wrong within the PfR network devices the traffic will FALL BACK to old school forwarding. Got that? No catastrophic failure where the routers are sticking their hands up screaming for help.

Requirements:

OK, let’s talk a little about the components required for a PfR edge network.

***IOS 15.1+ minimum recommended for production network***

Versioning: Major versions must match! If running 12.4(T) the version is 2.x. Is running IOS 15 the version is 3.x. It’s OK to have say a 2.1 and a 2.2, but not a 2.x and a 3.x version, this is NOT supported. 

Border Router (BR): In the data plane of the edge network, monitors prefixes and reports back to MC. 
Master Controller (MC):
 Centralized control plane for central processoring and database for statistics collection. 
1x Internal Interface-
BRs ONLY peer with each other over internal interfaces (directly connected or via tunnel). Also used between BR and MC.
2x External Interfaces- OER/PfR expects traffic to flow between internal and external interfaces.
Route Control: Parent Route REQUIRED! This explanation is right from the Cisco FAQ.

A parent route is a route that is equal to, or less specific than, the destination prefix of the traffic class being optimized by Performance Routing. The parent route should have a route through the Performance Routing external interfaces. All routes for the parent prefix are called parent routes. For Performance Routing to control a traffic class on a Performance Routing external interface, the parent route must exist on the Performance Routing external interface. BGP and Static routes qualify as Performance Routing parent routes. In Cisco IOS Release 12.4(24)T and later releases, any route in RIB, with an equal or less specific mask than the traffic class, will qualify as a parent route.

For any route that PfR modifies or controls (BGP, Static, PIRO, EIGRP, PBR), having a Parent prefix in the routing table eliminates the possibility of a routing loop occurring. This is naturally a good thing to prevent in routed networks.

Now, since I’m an active CCIE candidate I’m gonna say this, IOS 12.4(T) has bugs with PfR. For one, the command operative syntax is still “OER” and certain functionality just seems downright broken. My lab consists of real 3560’s and ISR routers, so it’s not like I’m using emulation/GNS/dynamips and that’s my issue. I cannot stress enough, if doing a POC in a non-PROD environment feel free to use IOS 12.4(T). In a “real world” production environment, never settle for less than 15.1. ASR 1K requires IOS XE 2.6 or higher for PfR support.

Hardware Platform Support: ISR G1(RIP), G2, ASR, 7600, Cat6500, and 7200’s (RIP)
Classic IOS Feature Set Required: SP Services/Advance IP/Enterprise/Advance Enterprise
Universal IOS Image: Data Package required

Configuration:

Pfr faq fig3.jpg

I was going to use a complex CCIE sample config, but there are so many good examples of PfR already on the Cisco PfR Wiki.

http://docwiki.cisco.com/wiki/PfR:Solutions

Instead, let me concentrate on the basic requirements starting with the border router.

BR Config: 

key chain PFR
 key 1
  key-string PFR

oer border
 logging
 local Loopback0
 master 8.8.8.8 key-chain PFR

ip route 0.0.0.0 0.0.0.0 Serial1/2 (PARENT ROUTE)
ip route 0.0.0.0 0.0.0.0 Serial1/1 (PARENT ROUTE)

MC Config: 

oer master
logging
!
border 8.8.8.8 key-chain PFR
interface Serial1/2 external
interface Serial1/1 external
interface Serial1/0 internal
!
learn
throughput
periodic-interval 0
monitor-period 1
mode route control
resolve utilization priority 1 variance 10
no resolve delay
no resolve range

THAT’S IT!!! 

Granted, this is the most basic form of route control, but it will inject a route for the monitored prefix based on interface throughput utilization. I believe the default is 75% utilized.

Here are some useful commands to monitor/troubleshoot PfR.

“show pfr/oer master”

OER state: ENABLED and ACTIVE
Conn Status: SUCCESS, PORT: 3949
Version: 2.2
Number of Border routers: 1
Number of Exits: 2
Number of monitored prefixes: 1 (max 5000)
Max prefixes: total 5000 learn 2500
Prefix count: total 1, learn 1, cfg 0
PBR Requirements met
Nbar Status: Inactive

Border Status UP/DOWN AuthFail Version
8.8.8.8 ACTIVE UP 03:29:17 0 2.2

Global Settings:
max-range-utilization percent 20 recv 0
mode route metric bgp local-pref 5000
mode route metric static tag 5000
trace probe delay 1000
logging
exit holddown time 60 secs, time remaining 0

Default Policy Settings:
backoff 300 3000 300
delay relative 50
holddown 300
periodic 0
probe frequency 56
number of jitter probe packets 100
mode route control
mode monitor both
mode select-exit good
loss relative 10
jitter threshold 20
mos threshold 3.60 percent 30
unreachable relative 50
resolve utilization priority 1 variance 10

Learn Settings:
current state : STARTED
time remaining in current state : 115 seconds
throughput
no delay
no inside bgp
no protocol
monitor-period 1
periodic-interval 0
aggregation-type prefix-length 24
prefixes 100
expire after time 720

“show pfr/oer master border detail” 

Border Status UP/DOWN AuthFail Version8.8.8.8 ACTIVE UP 03:31:46 0 2.2
Se1/2 EXTERNAL UP
Se1/1 EXTERNAL UP
Se1/0 INTERNAL UP

External Capacity Max BW BW Used Load Status Exit Id
Interface (kbps) (kbps) (kbps) (%)
——— ——– —— ——- ——- —— ——
Se1/2 Tx 1544 1158 0 0 UP 2
Rx 1544 0 0
Se1/1 Tx 1544 1158 0 0 UP 1
Rx 1544 0 0

“show ip cache flow”

IP packet size distribution (25713 total packets):
1-32 64 96 128 160 192 224 256 288 320 352 384 416 448 480
.000 .040 .000 .200 .000 .001 .000 .000 .000 .000 .000 .000 .000 .000 .000

512 544 576 1024 1536 2048 2560 3072 3584 4096 4608
.003 .000 .007 .000 .743 .000 .000 .000 .000 .000 .000

IP Flow Switching Cache, 4456704 bytes
2 active, 65534 inactive, 1007 added
16475 ager polls, 0 flow alloc failures
Active flows timeout in 1 minutes
Inactive flows timeout in 15 seconds
IP Sub Flow Cache, 533256 bytes
2 active, 16382 inactive, 1151 added, 1007 added to flow
0 alloc failures, 0 force free
1 chunk, 1 chunk added
last clearing of statistics never
Protocol Total Flows Packets Bytes Packets Active(Sec) Idle(Sec)
——– Flows /Sec /Flow /Pkt /Sec /Flow /Flow
TCP-Telnet 10 0.0 256 144 0.1 19.5 6.9
TCP-other 59 0.0 68 110 0.2 9.0 2.3
ICMP 13 0.0 1470 1500 1.3 52.3 3.5
Total: 82 0.0 313 1146 1.8 17.1 3.1

SrcIf SrcIPaddress DstIf DstIPaddress Pr SrcP DstP Pkts

“show pfr/oer master traffic-class”

OER Prefix Statistics:
Pas – Passive, Act – Active, S – Short term, L – Long term, Dly – Delay (ms),
P – Percentage below threshold, Jit – Jitter (ms),
MOS – Mean Opinion Score
Los – Packet Loss (packets-per-million), Un – Unreachable (flows-per-million),
E – Egress, I – Ingress, Bw – Bandwidth (kbps), N – Not applicable
U – unknown, * – uncontrolled, + – control more specific, @ – active probe all
# – Prefix monitor mode is Special, & – Blackholed Prefix
% – Force Next-Hop, ^ – Prefix is denied

DstPrefix Appl_ID Dscp Prot SrcPort DstPort SrcPrefix
Flags State Time CurrBR CurrI/F Protocol
PasSDly PasLDly PasSUn PasLUn PasSLos PasLLos EBw IBw
ActSDly ActLDly ActSUn ActLUn ActSJit ActPMOS ActSLos ActLLos
——————————————————————————–
7.7.7.0/24 N defa N N N N
INPOLICY 0 8.8.8.8 Se1/2 STATIC
U U 0 0 0 0 0 0
U U 0 0 N N N N

“show oer border routes static”

Flags: C – Controlled by oer, X – Path is excluded from control,
E – The control is exact, N – The control is non-exact

Flags Network Parent Tag
CE 7.7.7.0/24 0.0.0.0/0 5000

Epilogue:

Well folks, that’s all the steam I have left after pouring out my heart on PfR/OER. I hope this post was informative. Please drop me a line if you have any questions or I was not clear on any of my points. I appreciate any and all feedback. In my mind, Cisco gave us a glimpse into the future of networking way back in 2006. With data center technologies evolving on a daily basis, it’s only a matter of time before there is an MC for the enterprise network rather than just the edge. Heck Google is doing that already with 25% of all the Internet traffic TODAY! Until next time, keep those blinky lights flashing.

shaun