
CCIE #40755 (Routing & Switching)


“It’s gonna take time, a whole lot of precious time, it’s going to take patience and time to do it right child.”
“It’s gonna take money, a whole lot of spending money, it’s going to take plenty of money, to do it right”

-George Harrison
Song: Got My Mind Set on You

I’m pretty sure George had the ladies on his mind and NOT the CCIE when he recorded that song. Still, I can tell you no other lyrics resonate as strongly as these when it comes to my personal journey of being inducted into the League of Extraordinary Engineers. Yes my friends, after 5+ LONG years, I’m officially in da club. My number is 40755 and oh boy does it feel AWESOME.

Because this journey was very difficult (I would go as far as to say it’s the most difficult educational challenge I have ever committed myself to), it’s only right that I share my story with other CCIE candidates to instill hope and encouragement. If it was easy, everyone would be a CCIE. Just keep that in mind as you embark on your own journey.

And so the story begins in 2008, when I passed the CCIE R&S written and only had a small window to take the v3 lab. This was sometime in September if I recall correctly. I was naive in thinking this was going to be cake. I mean, how hard could this lab really be? I figured I might need 1-2 attempts, but I should have it done by the end of the year, no problem. Well, my first lab was v3 (lab guide printed on REAL paper in a binder) and I actually did pretty well. My major issues were managing the clock and weakness on certain security-related services. Other than that it was a noble attempt. This gave me confidence, but when I went to reschedule I realized something awful: the blueprint had changed and there were no more seats left for the v3 lab. Now here comes the madness. I was offered a “free” beta lab for the v4, and I accepted the challenge. Let’s just say that after taking the v4 beta, I was humbled in the most extreme way. Then began a series of radical format changes to the lab: open-ended questions, troubleshooting, removal of open-ended questions. I tried very hard to adapt to these changes, but as a poor test taker to begin with, it was very challenging to say the least.

I was working at a small ISP in Central, PA at the time of this endeavor. God opened up a great door of opportunity in August of 2010 and I jumped in feet first… Where did I go??? CISCO!!!

While this major transition was occurring, we were also expecting our third child. I started on August 1st and Leo was born on August 28th. Man, life was crazy, and through all this I was sticking to my studies. I forget the details, but since my CCIE written was first passed in 2008, I had to take the written again before I could schedule another lab. I did this in December of 2010 and would actually wait a full year before taking the v4 exam again. My third attempt was in Nov of 2011, and this is where it gets interesting. I took the lab in San Jose instead of RTP this time. I flew out of Philadelphia airport and my laptop was stolen out of my checked luggage. The TSA agent even left one of those “inspected by TSA” tickets in the bag. It was a surgical strike, as only my laptop and power cable were removed from the bag. All my study notes were on that laptop… Needless to say, this was one heck of a trip. I did not pass, but did OK. The troubleshooting section was VERY tough.

Now pay attention because this is where I made the biggest mistake. I took almost a full year before my next attempt. NEVER DO THIS!!! If you can manage it, keep coming back every 30-60 days, and no more than 90 days apart. Things just got so busy between life and work that I waited yet ANOTHER year before diving back in. By this time RTP had a new proctor (David), and let me tell you all this: he is by far my favorite proctor. David constantly encouraged me and drove me to keep coming back ASAP. With his recommendation and such a strong support system behind me, I was able to pass on my 3rd consecutive attempt. It feels great to have my life back and know I can focus on the most important thing that was neglected… My family. While my wife and children supported me through this endeavor, there is no doubt that it took its toll on all of us. I could not have done this without the support of my family, friends, and colleagues. THANK YOU!!!

Passing lab experience:

September 28th, 2013

I drove down to RTP, NC from Central PA early Friday morning. My stomach was bothering me the night before, probably due to nerves. I get so sick just thinking about the exam that I was miserable every time I went to Building 3. I got to RTP at about 3pm on Friday and ate a bland meal at Chipotle in Morrisville. I went back to the hotel room, practiced INE labs, and reviewed my TS notes. My weak areas are still services because there are so many and being an expert in all of them is impossible (at least for me), but there are some that I take pride in knowing well, like EEM and multicast. Here’s the worst part. I could NOT sleep. I think I may have had 45 minutes to an hour, but that’s it. No matter what I tried I could not fall asleep. In addition, my stomach was a wreck. I drank half a bottle of Pepto in hopes of relief. It did not come… Now, for those of you who know me, I don’t drink or smoke. Heck, eating some spicy food is about as risky a move as I make when it comes to what goes in my body. I had NEVER drunk anything like Red Bull or Monster in my life. Those of you who know me would probably say that I’m wired to begin with. Why the heck would I even need something like that in the first place? Well, this morning I did, and my buddy John told me it helped him get through the lab the week before. So I drove to Sheetz early in the morning and bought a Red Bull and a Starbucks energy drink. I settled on the Starbucks and drank the whole can. It was tasty, but what the heck is 80mg of caffeine going to do to me? I’ll tell you what it did. I became Beavis aka Cornholio. I was so wired within 30 minutes of drinking it that I forgot I was even tired. When I got to Building 3 we all went in and I began right away. Thanks to the power of caffeine, I was typing at like 150 WPM. I hit some major roadblocks in TS, but the energy infusion was too powerful an ally for TS to overcome. Based on my results, I felt good that Starbucks and I had conquered TS.
OK, well perhaps the Holy Spirit and me because there were some miraculous things that happened in the last 15-20 minutes.

I don’t even waste time. I jump right into configuration, and heck, I don’t think I even used the bathroom up to this point. No time for potty breaks. I get my configuration and my smile is ear to ear after reading through it. Let’s just say this, it was a test that jibes with my skills. I felt good about the objectives this config had set before me. I felt like I was running in autopilot mode. My typing is loud and fast and I’m starting to feel bad because none of the other candidates are using ear plugs. I must have sounded like an old school author with his typewriter. By lunch I’m done with all L2/L3 and have started on some of the services. Best time I had yet. Lunch is quick and I get back to it. By 1:30, I’m done with everything I could possibly configure. I take the next 45 minutes for verification, config backups, and reload. I’m pretty sure I ended the lab a little after 2pm. My heart was still racing, but something strange happened to my body. My guess is all the caffeine wore off as well as the adrenaline, and I was crashing. I actually went into the break room and sat in the chair for a quick power nap. David stopped by and we talked a little about the lab. I felt really good about it and told him, “If I don’t pass it this time, you might see a grown man crying”. To which he replied, “that’s nothing new”. Now comes the worst part… WAITING. I grab some food and head back to the hotel room. My intention was to eat and sleep, but again I could not fall asleep. My body and mind are a complete disaster. I’m waiting for this email with the results and it probably won’t be till tomorrow that I find out if I did it. So, I do something that I have not really done in the last 5 years: enjoy life’s simple pleasures. I go to the local movie theater and see Riddick. It was OK, but no Pitch Black. By this time you would think sleep was inevitable, right? WRONG! I can’t sleep one wink. I get in the shower at 3:30am and check out of the hotel by 4am.
I’m on the road heading back to PA. I keep checking my email every chance I get, still nothing. I stop in VA for some rest and decided to check my email. THIS IS IT! I have a message. The anticipation is killing me, do I even want to look at this now… I did and this is what I got!

  •  Your CCIE status is Certified ( CCIE# 40755 )
  • Your next CCIE Recertification due by September 28, 2015

I notify everyone via FB, Twitter, text, IM, calls, you name it. Then I crash in the car only to wake up at like 10am. My excitement level at this point is sky high. I can’t contain myself when talking to people on the phone. I’m thinking about all the things I wanted to do when I passed. Get a custom tag with my number, finally buy the pinball machine I have talked about for years, but the most important thing was this… Reconnect with my wife and family. When I reflected on my attitude, especially when studying for each lab attempt it was like I was a non-existent husband/father. So, it’s with great happiness and peace that I enjoy life again and return back home both physically and mentally.

In closing, I leave you candidates-to-be with the following wisdom.

1) Be prepared to make great sacrifices on this journey

2) Never give up

3) While it’s one of the most challenging journeys you can embark on, it’s also the most rewarding

4) Never give up

5) Always keep in perspective that all your hard work will make you a better engineer regardless of whether you pass or not

6) Never give up

7) If you need a boost, drink some serious caffeine before taking the lab.

8) NEVER GIVE UP!

I want to again thank God, my family, friends, colleagues, INE, for the support and encouragement that was essential for my success. Oh! one more thing…

“And this time I know it’s for real, The feelings that I feel, I know if I put my mind to it, I know that I really can do it”

Man, that song was really made for CCIE candidates.


CCIE Studies: Performance Routing PfR/OER


Prologue

Hey fellow CCIE candidates and networking geeks. Today I want to step deep into the realm of PfR, or Performance Routing. First let’s go back in time to the predecessor, Optimized Edge Routing or OER. As crazy as this sounds, OER came out in 2006 with IOS 12.3. So, technically, before all this SDN fanfare, Cisco actually decoupled the control plane (part of it at least) from the data plane with OER/PfR back in the dizay.

DID THAT JUST BLOW YOUR MIND? THAT JUST HAPPENED! <GRIN>

OER/PfR was created to help with a major issue that plagues many mid-market customers even to this day: proper load sharing and/or balancing on the edge of the network. Who wants to have redundant Internet connections, possibly even with diverse providers, and have one of those connections sit there idle until something blows up? The short answer: pretty much nobody. You’re paying for that circuit, you should be using it. Well, Shaun, why not just use BGP? That’s a great question! You sure could, advertising part of your networks off one connection and the remaining networks off the other connection. That would achieve a level of load sharing inbound to the enterprise. Traffic egressing out of the enterprise could also be split to share the two connections. Sometimes the issue with BGP peering is the complexity and requirements. When I worked at the SP, a class C (/24) was the longest prefix that you could advertise. I heard it’s now a /23, but that has not been confirmed. Working with ARIN for a direct assignment of two IPv4 /24s will be an exercise in patience. Remember we are running out of IPv4 space, perhaps you could get some IPv6 block for half price… J/K. All that said, it can be a pain in the you know what to make this happen, and not all companies have the resources to manage that type of edge peering agreement with the providers.

Well that’s where OER/PfR comes into play. Let’s keep this simple because OER/PfR can be quite a deep subject. Rather than base forwarding decisions on destination and lowest cost metric, why not take a path’s characteristics into consideration such as jitter, delay, utilization, load distribution, packet loss/health, or even MOS score? That’s the power of OER/PfR!!!

This is right from Cisco.com.
http://www.cisco.com/en/US/products/ps8787/products_ios_protocol_option_home.html

“PfR can also improve application availability by dynamically routing around network problems like black holes and brownouts that traditional IP routing may not detect. In addition, the intelligent load balancing capability of PfR can optimize path selection based on link use or circuit pricing.”

So, what did we do without BGP or OER/PfR? Typically, static routes with a floating static route for the redundant link, using IP SLA/object tracking for state monitoring (far end reachability). Again, we are paying for something we can’t use. To quote Brian Dennis from INE: “It’s something we always accepted, like STP. You’re paying for something you can’t use”. The good news: you don’t need to live in that world any more. We have evolved with technologies like FabricPath/TRILL, vPC, OER/PfR, SDN. Man, it’s a good time to be into networking!

Let’s think about some use cases: Internet connection load sharing/balancing, application specific traffic steering based on performance (latency), loss/delay sensitive hosted IP telephony traffic, leverage burstable based circuits, etc…

In summary, PfR allows the network to intelligently choose link resources as needed to reduce operational costs. Sounds like a sales pitch, right? Well, I am a Cisco SE after all, it’s in my DNA, plus I found that ditty in one of the PfR FAQs.

OK, now that you have a good background on the origins of OER/PfR, let’s talk about the major difference between OER and PfR. In short, OER was destination prefix based, and PfR expanded the capabilities to include route control on a per-application basis.

Let’s also get one major thing out of the way first before we drill into the specifics. With a holistic view of the EDGE network, you’re able to accomplish this level of traffic engineering on a per-application level. If there is something wrong within the PfR network devices, the traffic will FALL BACK to old school forwarding. Got that? No catastrophic failure where the routers are sticking their hands up screaming for help.

Requirements:

OK, let’s talk a little about the components required for a PfR edge network.

***IOS 15.1+ minimum recommended for production network***

Versioning: Major versions must match! If running 12.4(T) the version is 2.x. If running IOS 15 the version is 3.x. It’s OK to have, say, a 2.1 and a 2.2, but not a 2.x and a 3.x; this is NOT supported.
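
The versioning rule boils down to comparing the major number on both sides. Here is a minimal Python sketch of that check, assuming the version strings look like the "2.2" shown in the show output; the helper name pfr_versions_compatible is made up for illustration and is not a Cisco tool.

```python
def pfr_versions_compatible(version_a: str, version_b: str) -> bool:
    """Return True when two OER/PfR API versions share the same major number.

    Hypothetical helper: it only encodes the rule stated above
    (2.1 with 2.2 is fine, 2.x with 3.x is not supported).
    """
    major_a = version_a.split(".")[0]
    major_b = version_b.split(".")[0]
    return major_a == major_b


print(pfr_versions_compatible("2.1", "2.2"))  # same major version
print(pfr_versions_compatible("2.2", "3.0"))  # 12.4(T) device paired with an IOS 15 device
```

The first call returns True and the second returns False, which matches the supported and unsupported pairings described above.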

Border Router (BR): In the data plane of the edge network, monitors prefixes and reports back to the MC.
Master Controller (MC): Centralized control plane for central processing and database for statistics collection.
1x Internal Interface: BRs ONLY peer with each other over internal interfaces (directly connected or via tunnel). Also used between BR and MC.
2x External Interfaces: OER/PfR expects traffic to flow between internal and external interfaces.
Route Control: Parent Route REQUIRED! This explanation is right from the Cisco FAQ.

A parent route is a route that is equal to, or less specific than, the destination prefix of the traffic class being optimized by Performance Routing. The parent route should have a route through the Performance Routing external interfaces. All routes for the parent prefix are called parent routes. For Performance Routing to control a traffic class on a Performance Routing external interface, the parent route must exist on the Performance Routing external interface. BGP and Static routes qualify as Performance Routing parent routes. In Cisco IOS Release 12.4(24)T and later releases, any route in RIB, with an equal or less specific mask than the traffic class, will qualify as a parent route.

For any route that PfR modifies or controls (BGP, Static, PIRO, EIGRP, PBR), having a Parent prefix in the routing table eliminates the possibility of a routing loop occurring. This is naturally a good thing to prevent in routed networks.
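
To make the parent route rule concrete, here is a small Python sketch using the standard ipaddress module. It only models the "equal to, or less specific than" test from the FAQ text; the function names and the sample RIB entries are assumptions for illustration, and it does not check that the route actually points out a PfR external interface.

```python
import ipaddress


def is_parent_route(route: str, traffic_class: str) -> bool:
    """True if `route` is equal to, or less specific than, `traffic_class`."""
    route_net = ipaddress.ip_network(route)
    class_net = ipaddress.ip_network(traffic_class)
    # A parent route must cover the whole traffic-class prefix.
    return class_net.subnet_of(route_net)


def find_parent_routes(rib, traffic_class):
    """Return every RIB prefix that qualifies as a parent route."""
    return [prefix for prefix in rib if is_parent_route(prefix, traffic_class)]


# Hypothetical RIB; 7.7.7.0/24 is the prefix monitored in this post.
rib = ["0.0.0.0/0", "7.7.0.0/16", "7.7.7.0/25", "10.0.0.0/8"]
print(find_parent_routes(rib, "7.7.7.0/24"))
```

With this sample RIB, the default route and 7.7.0.0/16 qualify, while 7.7.7.0/25 is more specific than the traffic class and 10.0.0.0/8 does not cover it.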

Now, since I’m an active CCIE candidate I’m gonna say this, IOS 12.4(T) has bugs with PfR. For one, the command operative syntax is still “OER” and certain functionality just seems downright broken. My lab consists of real 3560’s and ISR routers, so it’s not like I’m using emulation/GNS/dynamips and that’s my issue. I cannot stress enough, if doing a POC in a non-PROD environment feel free to use IOS 12.4(T). In a “real world” production environment, never settle for less than 15.1. ASR 1K requires IOS XE 2.6 or higher for PfR support.

Hardware Platform Support: ISR G1(RIP), G2, ASR, 7600, Cat6500, and 7200’s (RIP)
Classic IOS Feature Set Required: SP Services/Advanced IP/Enterprise/Advanced Enterprise
Universal IOS Image: Data Package required

Configuration:

[Figure: Pfr faq fig3.jpg]

I was going to use a complex CCIE sample config, but there are so many good examples of PfR already on the Cisco PfR Wiki.

http://docwiki.cisco.com/wiki/PfR:Solutions

Instead, let me concentrate on the basic requirements starting with the border router.

BR Config: 

key chain PFR
 key 1
  key-string PFR

oer border
 logging
 local Loopback0
 master 8.8.8.8 key-chain PFR

! Parent routes: one default route out of each external interface
ip route 0.0.0.0 0.0.0.0 Serial1/2
ip route 0.0.0.0 0.0.0.0 Serial1/1

MC Config: 

oer master
 logging
 !
 border 8.8.8.8 key-chain PFR
  interface Serial1/2 external
  interface Serial1/1 external
  interface Serial1/0 internal
 !
 learn
  throughput
  periodic-interval 0
  monitor-period 1
 mode route control
 resolve utilization priority 1 variance 10
 no resolve delay
 no resolve range

THAT’S IT!!! 

Granted, this is the most basic form of route control, but it will inject a route for the monitored prefix based on interface throughput utilization. I believe the default is 75% utilized.

Here are some useful commands to monitor/troubleshoot PfR.

“show pfr/oer master”

OER state: ENABLED and ACTIVE
Conn Status: SUCCESS, PORT: 3949
Version: 2.2
Number of Border routers: 1
Number of Exits: 2
Number of monitored prefixes: 1 (max 5000)
Max prefixes: total 5000 learn 2500
Prefix count: total 1, learn 1, cfg 0
PBR Requirements met
Nbar Status: Inactive

Border           Status   UP/DOWN             AuthFail  Version
8.8.8.8          ACTIVE   UP       03:29:17          0  2.2

Global Settings:
max-range-utilization percent 20 recv 0
mode route metric bgp local-pref 5000
mode route metric static tag 5000
trace probe delay 1000
logging
exit holddown time 60 secs, time remaining 0

Default Policy Settings:
backoff 300 3000 300
delay relative 50
holddown 300
periodic 0
probe frequency 56
number of jitter probe packets 100
mode route control
mode monitor both
mode select-exit good
loss relative 10
jitter threshold 20
mos threshold 3.60 percent 30
unreachable relative 50
resolve utilization priority 1 variance 10

Learn Settings:
current state : STARTED
time remaining in current state : 115 seconds
throughput
no delay
no inside bgp
no protocol
monitor-period 1
periodic-interval 0
aggregation-type prefix-length 24
prefixes 100
expire after time 720

“show pfr/oer master border detail” 

Border           Status   UP/DOWN             AuthFail  Version
8.8.8.8          ACTIVE   UP       03:31:46          0  2.2
 Se1/2           EXTERNAL UP
 Se1/1           EXTERNAL UP
 Se1/0           INTERNAL UP

 External        Capacity    Max BW   BW Used   Load Status   Exit Id
 Interface         (kbps)    (kbps)    (kbps)    (%)
 ---------       --------    ------   -------   ---- ------   -------
 Se1/2      Tx       1544      1158         0      0 UP             2
            Rx       1544                   0      0
 Se1/1      Tx       1544      1158         0      0 UP             1
            Rx       1544                   0      0
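
The Max BW figure in that output lines up with the 75% utilization default mentioned earlier: a T1 at 1544 kbps capped at 75% gives 1158 kbps. Here is a quick Python sketch of the arithmetic, assuming the cap is a plain percentage of interface capacity rounded down; the function name is made up for illustration.

```python
def max_bw_kbps(capacity_kbps: int, max_util_percent: int = 75) -> int:
    """Bandwidth ceiling for an exit link, as a percentage of its capacity.

    Assumption: the 75% default is taken from the post, not from a
    Cisco reference, and the result is rounded down to whole kbps.
    """
    return capacity_kbps * max_util_percent // 100


# Serial T1 links from the lab: 1544 kbps capacity.
print(max_bw_kbps(1544))
```

For the two serial exits in this lab the sketch gives 1158 kbps, the same value shown in the Max BW column.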

“show ip cache flow”

IP packet size distribution (25713 total packets):
1-32 64 96 128 160 192 224 256 288 320 352 384 416 448 480
.000 .040 .000 .200 .000 .001 .000 .000 .000 .000 .000 .000 .000 .000 .000

512 544 576 1024 1536 2048 2560 3072 3584 4096 4608
.003 .000 .007 .000 .743 .000 .000 .000 .000 .000 .000

IP Flow Switching Cache, 4456704 bytes
2 active, 65534 inactive, 1007 added
16475 ager polls, 0 flow alloc failures
Active flows timeout in 1 minutes
Inactive flows timeout in 15 seconds
IP Sub Flow Cache, 533256 bytes
2 active, 16382 inactive, 1151 added, 1007 added to flow
0 alloc failures, 0 force free
1 chunk, 1 chunk added
last clearing of statistics never
Protocol         Total    Flows   Packets Bytes  Packets Active(Sec) Idle(Sec)
--------         Flows     /Sec     /Flow  /Pkt     /Sec     /Flow     /Flow
TCP-Telnet          10      0.0       256   144      0.1      19.5       6.9
TCP-other           59      0.0        68   110      0.2       9.0       2.3
ICMP                13      0.0      1470  1500      1.3      52.3       3.5
Total:              82      0.0       313  1146      1.8      17.1       3.1

SrcIf SrcIPaddress DstIf DstIPaddress Pr SrcP DstP Pkts

“show pfr/oer master traffic-class”

OER Prefix Statistics:
Pas – Passive, Act – Active, S – Short term, L – Long term, Dly – Delay (ms),
P – Percentage below threshold, Jit – Jitter (ms),
MOS – Mean Opinion Score
Los – Packet Loss (packets-per-million), Un – Unreachable (flows-per-million),
E – Egress, I – Ingress, Bw – Bandwidth (kbps), N – Not applicable
U – unknown, * – uncontrolled, + – control more specific, @ – active probe all
# – Prefix monitor mode is Special, & – Blackholed Prefix
% – Force Next-Hop, ^ – Prefix is denied

DstPrefix Appl_ID Dscp Prot SrcPort DstPort SrcPrefix
Flags State Time CurrBR CurrI/F Protocol
PasSDly PasLDly PasSUn PasLUn PasSLos PasLLos EBw IBw
ActSDly ActLDly ActSUn ActLUn ActSJit ActPMOS ActSLos ActLLos
--------------------------------------------------------------------------------
7.7.7.0/24 N defa N N N N
INPOLICY 0 8.8.8.8 Se1/2 STATIC
U U 0 0 0 0 0 0
U U 0 0 N N N N

“show oer border routes static”

Flags: C – Controlled by oer, X – Path is excluded from control,
E – The control is exact, N – The control is non-exact

Flags Network Parent Tag
CE 7.7.7.0/24 0.0.0.0/0 5000

Epilogue:

Well folks, that’s all the steam I have left after pouring out my heart on PfR/OER. I hope this post was informative. Please drop me a line if you have any questions or I was not clear on any of my points. I appreciate any and all feedback. In my mind, Cisco gave us a glimpse into the future of networking way back in 2006. With data center technologies evolving on a daily basis, it’s only a matter of time before there is an MC for the enterprise network rather than just the edge. Heck Google is doing that already with 25% of all the Internet traffic TODAY! Until next time, keep those blinky lights flashing.

shaun

Apple Still Got It: Mac Pro (aka R2-Q5)


WOW! Being in the technology business for over 12 years, I admit it’s hard to get excited over new product announcements. Most of the time it’s minor tweaks and updates to the hardware platforms, faster processors, better performance, increased scale, new coat of paint and polish to the OS. You know the norm.

I was thinking this would be the story for Apple’s WWDC today, but boy was I wrong about the Mac Pro.
Apple went ahead and gave firm confirmation they are still in the game to reinvent and revolutionize what we know about industrial design. One look at the new Mac Pro and you can’t help but think: will this change everything we know about the aesthetic and design of a performance workstation? In today’s world, performance workstations are sort of a niche. Mobile devices and laptops are by and large the most prevalent “personal computers”. I, for one, am very excited about the new Mac Pro. I’m so excited that I will pre-order one, and I don’t even have a purpose for it. I just want that futuristic R2-Q5 on my home office desk as technology artwork.

Apple has made me believe once again. Thank you!

Be sure to check out the keynotes for iOS 7, OS X Mavericks, the new MacBook Air, and of course my favorite, the Mac Pro aka R2-Q5.

http://www.apple.com/mac-pro/ 

http://www.apple.com/apple-events/june-2013/

 


Cisco UCS: Virtual Interface Cards & VM-FEX


Hello once again! Today I decided to talk about some Cisco innovations around the UCS platform. I’m going to try my best to keep this post high-level and EASY to understand, as most things “virtual” can get fairly complex.

First up is Virtual Interface Card (VIC). This is Cisco’s answer to 1:1 mapped blade mezzanine cards in blade servers and other “virtual connectivity” mezzanine solutions. Instead of having a single MEZZ/NIC mapped to a specific internal/external switch/interconnect we developed a vNIC optimized for virtualized environments. At the heart of this technology is FCoE and 10GBASE-KR backplane Ethernet. In the case of the VIC 1240, we have 4x 10G connections that connect to the FEX, this connectivity is FCoE until the traffic gets to the fabric interconnect outside the chassis. The internal mapping to the server/blade allows you to dynamically create up to 128 PCIe virtual interfaces. Now here is the best part, you can define the interface type (NIC/HBA) and the identity (MAC/WWN). What does that mean? Easy policy based, stateless, and agile server provisioning. Does one really need 128 interfaces per server??? Perhaps in an ESX host you want the “flexibility and scale”. Oh yea, there is ANOTHER VIC that supports 256 vNICs and has 80Gbps to the backplane!!! That model is the 1280 VIC.

NOTE: 8 interfaces are reserved on both the 128/256 VICs for internal use and the actual number of vNICs presented to the server may be limited by the OS. 

Update: 

Just had a great conversation with a customer today and I want to take a minute to break down the math.

Today we have the 2208 FEX (I/O) module for the 5108 chassis. Each one supports 80G (8×10) of uplinks to the Fabric Interconnect. This gives a total of 160G to each chassis if all uplinks are utilized.

On the back side of each 2208 I/O module are 32 10G ports (downlinks), for a total of 320G to the midplane. We are now at 640G total (A/B side). Take the total number of blades per chassis and multiply that by 80G: 8 (blades) * 80G (eight 10G traces per blade) = 640G. 🙂

Just keep in mind that the eight traces to each blade are 4x10G on the (A) side and 4x10G on the (B) side.
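
If you want to sanity-check the math above, here is a short Python sketch that recomputes it. The constants come straight from the numbers in this post (2208 I/O module, 5108 chassis); the variable names are just for illustration.

```python
LINK_GBPS = 10            # every uplink, downlink, and trace is 10G
IOMS_PER_CHASSIS = 2      # one 2208 on the A side, one on the B side
UPLINKS_PER_IOM = 8       # 2208 uplinks to the Fabric Interconnect
DOWNLINKS_PER_IOM = 32    # 2208 downlinks to the midplane
BLADES_PER_CHASSIS = 8    # half-width blades in a 5108
TRACES_PER_BLADE = 8      # 4x10G on side A plus 4x10G on side B

uplink_per_iom = UPLINKS_PER_IOM * LINK_GBPS
uplink_per_chassis = uplink_per_iom * IOMS_PER_CHASSIS
downlink_per_iom = DOWNLINKS_PER_IOM * LINK_GBPS
downlink_per_chassis = downlink_per_iom * IOMS_PER_CHASSIS
per_blade = TRACES_PER_BLADE * LINK_GBPS
blade_total = BLADES_PER_CHASSIS * per_blade

print("Uplink per I/O module:", uplink_per_iom, "G")
print("Uplink per chassis:", uplink_per_chassis, "G")
print("Midplane per I/O module:", downlink_per_iom, "G")
print("Midplane per chassis:", downlink_per_chassis, "G")
print("Per blade:", per_blade, "G, all blades:", blade_total, "G")

# The midplane capacity and the sum of the blade traces should agree.
assert downlink_per_chassis == blade_total
```

Both ways of counting land on 640G inside the chassis, against 160G of uplink to the Fabric Interconnects.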

OK great, I’ve got all this bandwidth in the chassis, what can I do with it? How about we carve out some vNICs. With the VIC 1240 mezz card you get 128 vNICs and 40Gb to the fabric. Not good enough? How about the VIC 1280 with 256 vNICs and 80Gb to the fabric. Just remember that your vNICs are going to have an active path mapped to either side (A/B) and can fail over to the other side in the event of an issue. All the vNICs active on the (A) side are in a hardware port channel. The same holds true for the (B) side vNICs.

So Shaun, what’s your point with all this math? Choice and flexibility. You want 20Gb to the blade, you got it. You want 40G to the blade, done. 80G to the blade, no problem. 160G to the blade, OK, but it has to be a full-width blade. <GRIN>

Data Center: Nexus vPC Technology


Hi Cisco friends! I had a great question from a customer today regarding failure scenarios and vPC. On the surface, I thought this is an easy one. However, when I really gave it deep thought it really depends on the type of failure. Was the failure on the peer-link, peer keepalive, vPC member port, or the worst case dual active/double failure?

Let’s go through some of the failure examples.

vPC Member Port Failure
If one vPC member port goes down – for instance, if a link from a NIC goes down – the member is removed from the PortChannel without bringing down the vPC entirely. Conversely, the switch on which the remaining port is located will allow frames to be sent from the peer link to the vPC orphan port. The Layer 2 forwarding table for the switch that detected the failure is also updated to point the MAC addresses that were associated with the vPC port to the peer link.

vPC Complete Dual-Active Failure (Double Failure)
If both the peer link and the peer-keepalive link are disconnected, the Cisco Nexus switch does not bring down the vPC, because each Cisco Nexus switch cannot discriminate between a vPC device reload and a combined peer-link and peer-keepalive-link failure.

The main problem with a dual-active scenario is the lack of synchronization between the vPC peers over the peer link. This behavior causes IGMP snooping to malfunction, which in turn causes multicast traffic to drop. As described previously, a vPC topology intrinsically protects against loops in dual-active scenarios. Each vPC peer, upon losing peer-link connectivity, starts forwarding BPDUs on vPC member ports. With the peer-switch feature, both vPC peers send BPDUs with the same bridge ID to help ensure that the downstream device does not detect a spanning-tree misconfiguration. When the peer link and the peer-keepalive link are simultaneously lost, both vPC peers become operational primary.

vPC Peer-Link Failure
To prevent problems caused by dual-active devices, vPC shuts down vPC member ports on the secondary switch when the peer link is lost but the peer keepalive is still present.

When the peer link fails, the vPC peers verify their reachability over the peer-keepalive link, and if they can communicate they take the following actions:

● The operational secondary vPC peer (which may not match the configured secondary because vPC is nonpreemptive) brings down the vPC member ports, including the vPC member ports located on the fabric extenders in the case of a Cisco Nexus 5000 Series design with fabric extenders in straight-through mode.

● The secondary vPC peer brings down the vPC VLAN SVIs: that is, all SVIs for the VLANs that happen to be configured on the vPC peer link, whether or not they are used on a vPC member port.

Note: To keep the SVI interface up when a peer link fails, use the command dual-active exclude interface-vlan.

At the time of this writing, if the peer link is lost first, the vPC secondary shuts down the vPC member ports. If this failure is followed by a vPC peer-keepalive failure, the vPC secondary keeps the interfaces shut down. This behavior may change in the future with the introduction of the autorecovery feature, which will allow the secondary device to bring up the vPC ports as a result of this sequence of events.

vPC Peer-Keepalive Failure

If connectivity of the peer-keepalive link is lost but peer-link connectivity is not changed, nothing happens; both vPC peers continue to synchronize MAC address tables, IGMP entries, and so on. The peer-keepalive link is mostly used when the peer link is lost, and the vPC peers use the peer keepalive to resolve the failure and determine which device should shut down the vPC member ports.
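
The peer-link and peer-keepalive scenarios above can be summarized as a small decision table. Here is a Python sketch of it; the function name and the wording of the outcomes are mine, not NX-OS output. It assumes both links are evaluated at the same moment, so it does not model the sequence described earlier where the peer link is lost first and the keepalive later (in that case the secondary keeps its ports shut), and it does not cover a single member port failure.

```python
def vpc_failure_outcome(peer_link_up: bool, keepalive_up: bool) -> str:
    """Summarize vPC behavior for a peer-link / peer-keepalive state."""
    if peer_link_up and keepalive_up:
        return "normal operation"
    if peer_link_up:
        # Only the peer keepalive is down.
        return "no change: peers keep synchronizing over the peer link"
    if keepalive_up:
        # Only the peer link is down; the keepalive resolves the failure.
        return "operational secondary shuts down its vPC member ports and vPC VLAN SVIs"
    # Both lost together: neither peer can tell a reload from a double failure.
    return "dual-active: vPCs stay up and both peers become operational primary"


for peer_link_up in (True, False):
    for keepalive_up in (True, False):
        outcome = vpc_failure_outcome(peer_link_up, keepalive_up)
        print(f"peer-link up={peer_link_up}, keepalive up={keepalive_up}: {outcome}")
```

Laying it out this way makes it easy to see that the keepalive only matters once the peer link is already gone.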

 Best Practices: 

Define a vPC domain (should match between peers, MUST NOT MATCH BETWEEN 7K and 5K in Double-Sided vPC). This step is required! “(config)# vpc domain <id>”
Define Role Priority: Lower priority wins the primary role; try to match your STP root bridge with the primary role. If using “peer-switch”, the STP root will be the same on both peers. “(config-vpc-domain)# role priority <xxx>”

If roles shift (they are not preemptive), you would need to change the role priority on the operational primary after a failure to a value of 36767 and shut/no shut the peer link to restore the originally configured primary.

If the peer switch is also performing L3 switching, the “peer-gateway” command is recommended.

The “vpc peer-gateway” allows HSRP routers to accept frames destined for their vPC peers.  This feature extends the virtual MAC address functionality to the paired router’s MAC address.  The feature is needed when certain storage/load balancing vendors break RFC rules by ignoring the ARP reply by an HSRP active router and reply directly to the host. Without this enabled packets could traverse the peer link and end up being dropped.

Enable vPC AutoRecovery
“(config-vpc-domain)# auto-recovery”

Beginning with Cisco NX-OS Release 5.2(1), you can configure the Cisco Nexus 7000 Series device to restore vPC services when its peer fails to come online by using the auto-recovery command. You must save this setting in the startup configuration. On reload, if the peer link is down and three consecutive peer-keepalive messages are lost, the secondary device assumes the primary STP role and the primary LACP role. The software reinitializes the vPCs, bringing up its local ports. Because there are no peers, the consistency check is bypassed for the local vPC ports. The device elects itself to be STP primary regardless of its role priority and also acts as the master for LACP port roles.
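The reload condition boils down to a simple check. Here is a minimal Python sketch of it; the function and parameter names are my own, and this only models the decision described above, not the STP/LACP role changes that follow.

```python
KEEPALIVE_LOSS_THRESHOLD = 3  # "three consecutive peer-keepalive messages" per the text above


def auto_recovery_triggers(auto_recovery_enabled, peer_link_up,
                           consecutive_missed_keepalives):
    """After a reload, decide whether the device should bring up its
    local vPC ports on its own."""
    return (auto_recovery_enabled
            and not peer_link_up
            and consecutive_missed_keepalives >= KEEPALIVE_LOSS_THRESHOLD)


print(auto_recovery_triggers(True, False, 3))   # True: peer is gone, recover alone
print(auto_recovery_triggers(True, True, 3))    # False: peer link is still up
print(auto_recovery_triggers(False, False, 3))  # False: feature not configured
```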

ARP SYNC
The ARP table sync feature overcomes the delay involved in ARP table restoration that can be triggered when one of the switches in the vPC domain goes offline and comes back online, and also when there are peer-link port channel flaps. Enabling ARP synchronization on a vPC domain improves convergence times for unicast traffic.

To enable Address Resolution Protocol (ARP) synchronization between the virtual port channel (vPC) peers, use the ip arp synchronize command. To disable ARP synchronization, use the no form of this command.

(config-vpc-domain)# ip arp synchronize

Content Source:
http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/design_guide_c07-625857.pdf
http://www.cisco.com/en/US/docs/switches/datacenter/nexus5000/sw/command/reference/vpc/n5k-vpc_cmds_i.html#wp1316724

Cisco: Algo Boost Nexus 3548 Preview/Unbox

I got something very cool last week. It came overnight from my good friend Frank in NY. What we have here is a very special privilege folks. It’s a prototype of the Nexus 3548 ultra low latency switch using our custom ASIC called Algo Boost/Monticello. Instead of killing you with all the details I decided to create a video of the un-boxing and special features walkthrough. Enjoy!

https://www.youtube.com/user/4g1vn/featured

 

 

Private VLANs (PVLANs)

I recently had one of my customers ask about private VLANs and their benefits/use cases. I thought this was a good opportunity to refresh my knowledge of PVLANs because it was a weak area of mine during my last CCIE lab.

What are Private VLANs?

The main objective of PVLANs is to conserve IP space while still allowing L2 separation for security purposes. Typically, VLAN design calls for a single IP subnet for each VLAN. With PVLANs we can create multiple secondary VLANs for isolation, but conserve space by using a single subnet for all of them.

 

Terminology:
Primary VLAN: This VLAN carries the promiscuous ports, which can send and receive frames with any other port type (P-Port, C-Port, I-Port).
Secondary VLAN or VLANs:

  • Isolated: Any switch ports associated with an Isolated VLAN can reach the primary VLAN, but not any other Secondary VLAN. In addition, hosts associated with the same Isolated VLAN cannot reach each other. Only one Isolated VLAN is allowed in one Private VLAN domain.
  • Community: Any switch ports associated with a common community VLAN can communicate with each other and with the primary VLAN but not with any other secondary VLAN. There can be multiple distinct community VLANs within one Private VLAN domain.

Port Types: Promiscuous, Community, Isolated (easy way to remember this is PCI).
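The PCI rules above fit in a few lines of code. Here is a minimal Python sketch of who can talk to whom at L2 inside one private VLAN domain. The port names and secondary VLAN numbers are hypothetical, and the model only covers the rules listed above (no trunks, no L3).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    name: str
    kind: str           # "promiscuous", "isolated", or "community"
    secondary: int = 0  # secondary VLAN ID; not used for promiscuous ports


def can_talk(a, b):
    """Return True if two ports in the same PVLAN domain may exchange L2 frames."""
    if a.kind == "promiscuous" or b.kind == "promiscuous":
        return True                        # P-Ports reach every port type
    if a.kind == "community" and b.kind == "community":
        return a.secondary == b.secondary  # same community VLAN only
    return False                           # isolated ports reach only P-Ports


uplink = Port("uplink", "promiscuous")
guest1 = Port("guest1", "isolated", 501)
guest2 = Port("guest2", "isolated", 501)
web1 = Port("web1", "community", 502)
web2 = Port("web2", "community", 502)
db1 = Port("db1", "community", 503)

print(can_talk(guest1, uplink))  # True
print(can_talk(guest1, guest2))  # False, even in the same isolated VLAN
print(can_talk(web1, web2))      # True, same community
print(can_talk(web1, db1))       # False, different communities
```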

What is the use case?
Service Providers with multi-tenant buildings: HSRP routers, a switch, and multiple SMB customers sharing the same public IP space.
Hotel Internet Access: It would be undesirable for the guests to see each other.
DMZ: Instead of creating multiple DMZs, you could isolate the hosts within a DMZ from each other.
Cloud/Co-location Facilities: You want to use the same subnet but restrict traffic between customers/services.

Are there any vulnerabilities or exploits?
Here is the deal: by design this should not be possible, but I have found with network/data security to NEVER say NEVER. That being said, I-Ports are not allowed to send frames/IP packets to any destination other than the P-Port (uplink). So, even if the D-MAC address or VLAN ID is changed, the frame will only flow to the P-Port.

KEEP THIS SIMPLE FACT IN MIND. ALL PROTECTION IS AT L2 for PVLANs.
For more information on what I’m talking about, take a look at this link describing “Private VLAN attack”.
*** “Local-Proxy-arp” would have to be enabled on the router for the same subnet, “Proxy-Arp” for different subnets. ***
For the best protection, make sure you leverage ACLs to block unnecessary L3 communications between hosts.
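To make the “all protection is at L2” point concrete, here is a small Python sketch of why an ACL on the router matters. It only models isolated and promiscuous ports, and it assumes the router attached to the P-Port is willing to forward traffic back into the same subnet. The function names are made up for illustration.

```python
def l2_allowed(src_kind, dst_kind):
    """PVLAN L2 rule for isolated/promiscuous ports: a frame is only
    forwarded if one end of the hop is the promiscuous port."""
    return "promiscuous" in (src_kind, dst_kind)


def delivered(src_kind, dst_kind, via_router, router_acl_blocks_same_subnet=False):
    """Does traffic from src reach dst, either directly at L2 or by
    bouncing off the router on the promiscuous port?"""
    if not via_router:
        return l2_allowed(src_kind, dst_kind)
    hop_in = l2_allowed(src_kind, "promiscuous")   # host -> router
    hop_out = l2_allowed("promiscuous", dst_kind)  # router -> other host
    return hop_in and hop_out and not router_acl_blocks_same_subnet


print(delivered("isolated", "isolated", via_router=False))  # False: blocked at L2
print(delivered("isolated", "isolated", via_router=True))   # True: each hop is legal on its own
print(delivered("isolated", "isolated", via_router=True,
                router_acl_blocks_same_subnet=True))        # False: the ACL closes the gap
```

Every individual hop obeys the PVLAN rules, which is exactly why the switch alone cannot stop it.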

Summary:
If an SP were to use a separate VLAN ID for every customer, we would be limited to 4000+ customers, and that would really waste IP space if we assigned a separate subnet to each VLAN like best practice dictates. With PVLANs the broadcast domain of a single VLAN is split into sub-domains (I-Ports and C-Ports). What does all this mean? Well, for one, you block L2 frames (CDP/STP/etc.) from the isolated hosts and are able to scale multi-tenant solutions while conserving IP space.
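Here is some quick back-of-the-napkin math on the IP savings, as a Python sketch. The numbers are hypothetical: 60 customers with one host each, a /30 per customer in the one-subnet-per-VLAN design, and one shared gateway address in the PVLAN design.

```python
def per_customer_total(customers, subnet_size=4):
    """Addresses consumed when every customer gets its own /30
    (network, gateway, host, broadcast = 4 addresses)."""
    return customers * subnet_size


def shared_subnet_size(customers):
    """Smallest single IPv4 subnet that fits one host per customer plus
    one shared gateway, leaving room for network and broadcast addresses."""
    hosts_needed = customers + 1
    for prefix in range(30, 0, -1):
        size = 2 ** (32 - prefix)
        if size - 2 >= hosts_needed:
            return size
    raise ValueError("too many hosts for one IPv4 subnet")


customers = 60
print(per_customer_total(customers))  # 240 addresses across 60 VLANs/subnets
print(shared_subnet_size(customers))  # 64 addresses (a single /26) with PVLANs
```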

“If the private VLAN feature is properly deployed, it can be used at
Layer 2 to segregate individual users or groups of users from each
other: this segregation allows a network designer to more effectively
constrain Layer 2 forwarding so as to, for instance, block or contain
unwanted inter-device communication like port scans or Address
Resolution Protocol (ARP) poisoning attacks.” – RFC5517

My Test Configuration:
Private VLANs can ONLY be configured in VTP TRANSPARENT or VTPv3 mode.

VLAN 500 (Primary VLAN P-Port)
VLAN 501 (Secondary VLAN I-Ports)
VLAN 502 (Secondary VLAN C-Ports)
FA0/1 is the promiscuous port with the router attached. 

sw1#sh vlan private-vlan

Primary Secondary Type              Ports
------- --------- ----------------- ------------------------------------------
500     501       isolated          Fa0/1, Fa0/31, Fa0/32
500     502       community         Fa0/1, Fa0/33, Fa0/34, Fa0/35

sw1(config)#ip routing

vlan 500
private-vlan primary
private-vlan association 501-502
!
vlan 501
private-vlan isolated
!
vlan 502
private-vlan community

interface FastEthernet0/1
switchport private-vlan mapping 500 501-502
switchport mode private-vlan promiscuous

interface FastEthernet0/31
switchport private-vlan host-association 500 501
switchport mode private-vlan host
!
interface FastEthernet0/32
switchport private-vlan host-association 500 501
switchport mode private-vlan host
!
interface FastEthernet0/33
switchport private-vlan host-association 500 502
switchport mode private-vlan host
!
interface FastEthernet0/34
switchport private-vlan host-association 500 502
switchport mode private-vlan host
!
interface FastEthernet0/35
switchport private-vlan host-association 500 502
switchport mode private-vlan host

interface Vlan500
ip address 10.1.1.1 255.255.255.0
private-vlan mapping 501,502 (optional; only needed if the SVI is the gateway)

end 

Verify port mapping with “sh vlan private”
Verify isolation with a ping to 255.255.255.255 from each router/host.

References:
http://www.cisco.com/en/US/products/hw/switches/ps700/products_tech_note09186a008013565f.shtml 
http://en.wikipedia.org/wiki/Private_VLAN
http://www.cisco.com/en/US/docs/switches/lan/catalyst3560/software/release/12.2_52_se/configuration/guide/swpvlan.html
http://tools.ietf.org/html/rfc5517 

 

Cisco: Nexus 2000 (FEX) Configurations

It’s been way too long since I posted on my blog. Well, I have been studying for the CCIE Data Center lab and wanted to pass on some very critical information on fabric extender (FEX) configurations. One of the most common questions that our Cisco friends ask me is “Why do I need to create a port channel for the FEX-Fabric links?”. Well let’s dive into the WHY, and then explore the HOW.

Let’s first start with a foundation on what FEX is and how it works. FEX is a highly scalable, low latency data center access layer solution. What makes it so awesome? The fact that it is managed as a line card vs. a separate ToR/EoR/MoR switch. Your Nexus 7000 (core) or 5000 (agg/access) plays the role of the “PARENT” switch. The FEX 2000 is essentially a remote line card to the parent switch. The FEX supports 1/10G and FCoE for consolidated I/O.

Nexus 2000 Comparison:

2248-TP (E): This is the most common FEX, with 48 ports of 100/1000 BASE-T host interfaces and 4 x 10G (SFP+) fabric uplinks. The (E) variant has an additional shared buffer (32MB) locally vs. using what’s on the parent switch.

2224-TP: Same as above just 24 host ports instead of 48.

2148T: First generation FEX. The host interface can only operate at 1G. Not recommended any longer, go with the 2248-TP instead.

2232PP:  This is our de-facto 10G FEX. It has 32 x 1/10G and FCoE host interfaces (SFP+) and 8 fabric uplinks (SFP+). This switch also supports DCB.

2232TM: There is also a 10G BASE-T variant of the above FEX. THIS ONE DOES NOT SUPPORT FCoE, but does support DCB.

OK, now that that is out of the way, let’s get back to the question at hand. How do I configure the connectivity to the parent switch?

I’ll get straight to the point. Use EtherChannel for the fabric uplink interfaces.

So, what are my options anyways?

Static Pinning and EtherChannel are your two options.

Why do we like EtherChannel fabric interfaces anyways?

Well, the bottom line is that with Static Pinning you do not have automatic failover capability. Sure it’s deterministic, but if one of those fabric links goes down, so do the associated host interfaces. Let’s say you’re using two links and set ‘pinning max-links 2’: half of the FEX host interfaces are mapped to one fabric uplink and the other half are mapped to the other fabric uplink. Let’s look at a visual representation of static pinning.

The major issue with this configuration is that if the fabric uplink goes down, so do the ports associated to that interface. There is NO failover. This is WHY we want to use EtherChannel instead of static pinning.
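Here is a minimal Python sketch of the static pinning problem. It assumes the host ports are split into contiguous, equal-sized groups in port order (an assumption for illustration; check the configuration guide for the exact pinning order on your platform), and the function names are hypothetical.

```python
def static_pinning(host_ports, max_links):
    """Split the host ports into `max_links` contiguous groups, one group
    pinned to each fabric uplink (uplinks are numbered from 0)."""
    per_link = -(-len(host_ports) // max_links)  # ceiling division
    return {
        uplink: host_ports[uplink * per_link:(uplink + 1) * per_link]
        for uplink in range(max_links)
    }


def host_ports_down(pinning, failed_uplink):
    """With static pinning there is no failover: every host port pinned
    to the failed uplink goes down with it."""
    return pinning[failed_uplink]


pinning = static_pinning(list(range(1, 49)), max_links=2)  # 48-port FEX, 2 uplinks
lost = host_ports_down(pinning, failed_uplink=0)
print(len(lost), lost[0], lost[-1])  # 24 1 24
```

One uplink failure takes out half of the FEX, and nothing moves over to the surviving uplink.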

Now let’s talk about the other (preferred) solution. EtherChannel is 110% the way to go. Let’s look at the visual representation.


As you can see from this diagram, we are load balancing the host interfaces (HIF) across all the fabric uplinks. Here is a note from the Cisco Configuration Guide.

Note

A fabric interface that fails in the EtherChannel will not trigger a change to the host interfaces. Traffic is automatically redistributed across the remaining links in the EtherChannel fabric interface.

When you configure the Fabric Extender to use an EtherChannel fabric interface connection to its parent switch, the switch load balances the traffic from the hosts that are connected to the host interface ports by using the following load-balancing criteria to select the link:

  • For a Layer 2 frame, the switch uses the source and destination MAC addresses.
  • For a Layer 3 frame, the switch uses the source and destination MAC addresses and the source and destination IP addresses.
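A rough Python sketch of the idea behind that load balancing, and why a failed fabric link does not take any host interfaces down. The CRC32 hash is just a stand-in I picked for illustration; the real hardware hash is not described here, and the MAC/IP values and uplink names are made up.

```python
import zlib


def pick_uplink(active_uplinks, src_mac, dst_mac, src_ip=None, dst_ip=None):
    """Pick a fabric uplink for a flow. L2 frames hash on the MAC pair;
    L3 frames hash on the MAC pair plus the IP pair."""
    fields = [src_mac, dst_mac]
    if src_ip is not None and dst_ip is not None:
        fields += [src_ip, dst_ip]
    key = "|".join(fields).encode()
    # Stand-in hash (assumption): any deterministic hash shows the idea.
    return active_uplinks[zlib.crc32(key) % len(active_uplinks)]


flow = ("0000.0c11.1111", "0000.0c22.2222", "10.1.1.10", "10.1.1.20")

uplinks = ["e1/1", "e1/2"]
print(pick_uplink(uplinks, *flow))  # one of the two links, same answer every time

# e1/1 fails: the same flow is simply hashed across whatever is left.
print(pick_uplink(["e1/2"], *flow))  # e1/2
```

Compare that with static pinning: the host interface never cares which fabric link is carrying its traffic.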

And finally the configuration (HOW) for all this awesomeness.

1) ENABLE FEX GLOBALLY:
N5K(config)# feature fex
2) Configure the member interfaces for FEX connectivity:
N5K(config)# interface e1/1, e1/2
N5K(config-if)# switchport mode fex-fabric
N5K(config-if)# fex associate 100
N5K(config-if)# channel-group 100
3) Create the EtherChannel Interface: This is probably done already based on the previous command, but it doesn’t hurt to make sure.
N5K(config)# interface po100
N5K(config-if)# switchport mode fex-fabric
N5K(config-if)# fex associate 100
4) Create the FEX:
N5K(config)# fex 100
N5K(config-fex)# description FEX_100<<PO100>>
N5K(config-fex)# pinning max-links 1 (This must be 1 for an EtherChannel configuration. If you change this to any other number, you cannot use EC and you’re using static pinning instead.)
FOUR EASY STEPS!!!!

Here are some good troubleshooting commands.
“sh fex”
“sh fex detail” – This command will show the HIF to Fabric Uplink Mappings and the version of code running on the FEX.
“sh interface fex-fabric” – This command will display all the FEX units attached to the parent switch.
“sh inventory fex xxx” – Display the inventory information about a specific FEX.
“show diagnostic result fex 100” – Display diagnostic test results for a specific FEX.

NOTE: Nexus 7000 (Parent) FEX

If you’re trying to use FEX on the N7K, be certain to issue the following commands or FEX WILL NOT BE ENABLED.

In the DEFAULT VDC issue this command: “install feature-set fex”
Now switch to the VDC that you want to enable FEX on and issue this command: “feature-set fex”

You’re all set now.

Cisco.com Configuration Guide for Nexus 2000:
http://tinyurl.com/36uojrv