Hello everybody! Welcome to another tech-torial blog here at IPexpert!!! Today we are going to be diving head first into one of the newest, most exciting, and in some ways most feared technologies on the new v4.0 R&S blueprint – MPLS L3 VPN. Putting together this blog, I can see why Cisco has added this particular technology into the mix. To get something like this working end to end you really HAVE to be an expert in many different technologies. MPLS L3 VPN is not a single thing you have to master. You must master all the little pieces individually, and then put it all together. First, we’ll start with the quintessential network diagram of what we will be playing with today.
Whoa. There is a LOT going on there!!! Before you do something you will likely regret, let’s break this down into little pieces. Remember, in the CCIE lab breaking things down into their basic pieces is a big part of our success! Before we even begin to think about what’s going from the CE routers to the PE routers let’s get some basic facts straight and cover what is going on in the service provider cloud here.
Part 1: Fundamental MPLS terms and functions
Let’s first focus on the frame-relay cloud here, as it will serve as our “service provider.” The first thing to note is that we are going to be running OSPF area 0 as well as MPLS inside our frame-relay cloud. OK. Well, we should be masters of OSPF by now, but what do I mean when I say we are running MPLS? MPLS in its most basic form is simply a different paradigm for forwarding packets. We forward packets based on something called a label instead of based on the destination IP address. For instance, when R2 wants to talk to R4 over the frame cloud, we would normally do so by looking in our routing table for the destination address and finding the 22.214.171.124/32 host route (we are running OSPF point-to-multipoint here, which treats each connection as an individual point-to-point link and injects /32 host routes), seeing that it is available via OSPF with a next-hop of R6, and forwarding the packet out the S0/1/0.2456 interface.
With MPLS, instead of looking up a prefix in the routing table and forwarding a plain IPv4 packet out an interface, we look up the forwarding entry in our MPLS label table (the LFIB) and send the packet out an interface with a label attached to it. With MPLS, we generate a label for each prefix in our routing table that we learned from an IGP. So in this example, R2 would receive an OSPF route for 126.96.36.199/32. It would then add it to the routing table, and also generate a local MPLS label for it. Additionally, it would receive a label for this prefix from R6. It would put the prefix into the LFIB, basically saying: “for the 188.8.131.52/32 prefix, if I need to send to it, attach the label that I learned from R6 and send it towards R6.”
We also need a mechanism for label distribution. In this case, we will be running LDP (Label Distribution Protocol) to advertise the labels to our other MPLS routers. For instance, R4 will generate a label for the 184.108.40.206/32 route and send it over to R6 via LDP. R6 will then advertise its own label for the 220.127.116.11/32 prefix over to R2. Now, if R2 needs to send a packet to 18.104.22.168/32, it simply looks up the prefix, labels the packet with the label it learned from R6, and sends it out to R6. R6 sees the MPLS label, consults its LFIB, and sees that it simply needs to swap the incoming label for the label advertised to it by R4. Essentially, inside the SP cloud we are simply label switching.
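The label-swapping idea above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how IOS implements anything: the router names come from the article, but the prefix name `R4-frame/32` and the label values 17 and 30 are made up for illustration, and the sketch ignores penultimate hop popping, which is covered further down.

```python
# Toy label-switched path R2 -> R6 -> R4.
# The prefix name and both label values are hypothetical placeholders.

# R2's view: for this prefix, impose the label R6 advertised and send to R6.
r2_imposition = {"R4-frame/32": {"push": 17, "next_hop": "R6"}}

# R6's LFIB: incoming label -> outgoing label (learned from R4) and next hop.
r6_lfib = {17: {"swap_to": 30, "next_hop": "R4"}}


def send_from_r2(prefix):
    """Return (link, label) pairs for a packet R2 sends toward `prefix`."""
    entry = r2_imposition[prefix]
    trace = [("R2->" + entry["next_hop"], entry["push"])]
    # R6 forwards on the label alone; it never inspects the IP header.
    hop = r6_lfib[entry["push"]]
    trace.append(("R6->" + hop["next_hop"], hop["swap_to"]))
    return trace


print(send_from_r2("R4-frame/32"))
```

The point of the sketch is that R6's decision is keyed on the incoming label only, which is exactly why the core does not need to know anything about the customer routes we add later.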
Before we get too far into the MPLS side of things, we need to get basic OSPF working on the frame-relay. Because this article is going to be very long as it is, and because OSPF is not really the major focus here, I have preconfigured this. Each router on the frame-relay is running OSPF area 0 on its frame-relay interface, with the network type point-to-multipoint. Again, setting up MPLS assumes you are already an expert in OSPF and frame-relay in this case. Now, let’s configure our actual MPLS on the frame-relay network. To do this, we need to ensure CEF is enabled (it is by default), turn on MPLS with the “mpls ip” global command, and finally enable MPLS on the interface with the “mpls ip” interface command. Additionally, we need to make sure our label distribution protocol is up and running. To ensure you are using LDP you need to run the “mpls label protocol ldp” global config command. You will also notice the OSPF command “mpls ldp autoconfig”. The LDP autoconfig command in the OSPF process tells the router to go ahead and enable LDP on all the OSPF-speaking interfaces.
R2, R5, R6:
mpls label protocol ldp
mpls ip
interface s0/1/0.2456
 mpls ip
!
router ospf 1
 mpls ldp autoconfig
!
R4:
mpls label protocol ldp
mpls ip
interface s0/0/0.2456
 mpls ip
!
router ospf 1
 mpls ldp autoconfig
Just like that we are actually running MPLS! Check it out below…we can see that R6 (our frame hub) is peered up to all three spokes via LDP.
R6#show mpls ldp neigh
    Peer LDP Ident: 22.214.171.124:0; Local LDP Ident 126.96.36.199:0
        TCP connection: 188.8.131.52.646 - 184.108.40.206.27176
        State: Oper; Msgs sent/rcvd: 40/40; Downstream
        Up time: 00:25:26
        LDP discovery sources:
          Serial0/1/0.2456, Src IP addr: 220.127.116.11
        Addresses bound to peer LDP Ident:
          18.104.22.168 22.214.171.124
    Peer LDP Ident: 126.96.36.199:0; Local LDP Ident 188.8.131.52:0
        TCP connection: 184.108.40.206.646 - 220.127.116.11.51798
        State: Oper; Msgs sent/rcvd: 40/39; Downstream
        Up time: 00:25:22
        LDP discovery sources:
          Serial0/1/0.2456, Src IP addr: 18.104.22.168
        Addresses bound to peer LDP Ident:
          22.214.171.124 126.96.36.199
    Peer LDP Ident: 188.8.131.52:0; Local LDP Ident 184.108.40.206:0
        TCP connection: 220.127.116.11.646 - 18.104.22.168.60000
        State: Oper; Msgs sent/rcvd: 39/39; Downstream
        Up time: 00:25:22
        LDP discovery sources:
          Serial0/1/0.2456, Src IP addr: 22.214.171.124
        Addresses bound to peer LDP Ident:
          126.96.36.199 188.8.131.52
Using this LDP peering, our routers are going to exchange labels for prefixes. Now we will check out part of the LFIB on R2 to see an example…
R2#sh mpls forwarding-table | i 150.100.100.
18     Pop Label  184.108.40.206/32   0   Se0/1/0.2456   220.127.116.11
19     16         18.104.22.168/32   0   Se0/1/0.2456   22.214.171.124
20     17         126.96.36.199/32   0   Se0/1/0.2456   188.8.131.52
21     Pop Label  184.108.40.206/32   0   Se0/1/0.2456   220.127.116.11
22     19         18.104.22.168/32   0   Se0/1/0.2456   22.214.171.124
23     20         126.96.36.199/32   0   Se0/1/0.2456   188.8.131.52
The columns here are the local label, the outgoing label, the prefix, the bytes switched, the outgoing interface, and the next hop. What we see is that if R2 receives a packet with a label of 19 (headed to 184.108.40.206/32) it will swap that label for a label of 16 and send it out s0/1/0.2456 towards R6. Similarly, if it has to reach R4’s frame interface it sends the packet towards R6 with a label of 17. Finally, if it has to talk to R6’s frame interface it will not attach a label at all, and just sends the packet to R6 (the “Pop Label” entries). This is called PHP (penultimate hop popping) and basically means that when R6 advertises the prefix to directly connected LDP neighbors, it does so with a special label called “implicit null”, meaning “don’t attach a label to these prefixes because I have them directly connected.” This saves the advertising router from doing an extra, unnecessary label lookup later. What we have at this point from an MPLS perspective is a label for each frame interface, and a label for each frame router’s loopback (which we will use for BGP peering).
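To make the swap versus pop behaviour concrete, here is a small Python sketch that replays the six LFIB rows from the R2 output above. It is an illustration only: the local and outgoing label values are taken from that output, while the `switch` helper and the inner label 99 used in the last call are hypothetical.

```python
# R2's LFIB rows from the output above: local label -> outgoing action.
# "pop" models the "Pop Label" entries that result from implicit null (PHP).
R2_LFIB = {
    18: "pop",
    19: 16,
    20: 17,
    21: "pop",
    22: 19,
    23: 20,
}


def switch(label_stack):
    """Apply R2's LFIB to the top label and return the stack sent toward R6."""
    top, rest = label_stack[0], label_stack[1:]
    action = R2_LFIB[top]
    if action == "pop":
        return rest          # PHP: top label removed, nothing pushed back
    return [action] + rest   # ordinary swap of the top label


print(switch([19]))      # swap 19 -> 16
print(switch([18]))      # pop, packet leaves unlabeled
print(switch([21, 99]))  # pop the top label, hypothetical inner label 99 survives
```

The last call hints at why PHP matters later in the article: when there is a second label underneath, popping the top one exposes it to the next router.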
Part 2: MP-BGP Configuration
OK, now that we have MPLS running in the core, we need to set up BGP. Because we have MPLS labels for each of the frame routers’ loopback addresses, behind the scenes things are actually going to be using MPLS label switching to set up the BGP TCP connection. Cool, huh?!!! You should also notice in the diagram that we don’t have an iBGP full mesh. We will get around this by configuring R2 as a route-reflector. Again, we have to be masters at all of this stuff already. For the sake of brevity, we will only show the R2 BGP configuration here. We are using BGP here to exchange our customer routes between PE devices. These routes are a special type of route called a VPNv4 route. You will notice that in our BGP configuration we have to add that as an address family. When we set up a VRF on a router, we configure an RD (route distinguisher), which uniquely identifies our routes in BGP. This way, multiple customers can use the same prefix and their routes still remain unique. For instance, customer A and customer B could both be using the 10.0.0.0/8 range. In the SP cloud, our BGP will distinguish between the two using the RD. The RD is essentially prepended to the regular IPv4 route to give you a 96-bit VPNv4 route (64-bit RD + 32-bit IPv4 route). Note here that we must send extended communities to each BGP neighbor, because the RT travels as a BGP extended community in each of our updates. The usual command is “send-community extended”; the configuration below uses “send-community both”, which sends standard and extended communities.
router bgp 65001
 no synchronization
 bgp log-neighbor-changes
 neighbor 220.127.116.11 remote-as 65001
 neighbor 18.104.22.168 update-source Loopback0
 neighbor 22.214.171.124 remote-as 65001
 neighbor 126.96.36.199 update-source Loopback0
 no auto-summary
 !
 address-family vpnv4
  neighbor 188.8.131.52 activate
  neighbor 184.108.40.206 send-community both
  neighbor 220.127.116.11 route-reflector-client
  neighbor 18.104.22.168 activate
  neighbor 22.214.171.124 send-community both
  neighbor 126.96.36.199 route-reflector-client
!
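The “64-bit RD + 32-bit IPv4 = 96-bit VPNv4” arithmetic described above can be checked with a short Python sketch. It assumes the common type-0 RD layout (2-byte type, 2-byte AS number, 4-byte assigned number), and it deliberately ignores the prefix length and the MPLS label that also travel in a real VPNv4 update. The helper names `rd_type0` and `vpnv4_address` are made up for this example; the RD values 65001:1 and 65001:2 are the ones used for the two VRFs later in the article.

```python
import socket
import struct


def rd_type0(asn, number):
    """Encode a type-0 RD: 2-byte type (0), 2-byte ASN, 4-byte assigned number."""
    return struct.pack("!HHI", 0, asn, number)


def vpnv4_address(rd, ipv4):
    """Prepend the 8-byte RD to the 4-byte IPv4 address."""
    return rd + socket.inet_aton(ipv4)


# Both customers use the same IPv4 prefix but different RDs.
customer_a = vpnv4_address(rd_type0(65001, 1), "10.0.0.0")
customer_b = vpnv4_address(rd_type0(65001, 2), "10.0.0.0")

print(len(customer_a) * 8)       # total bits in the VPNv4 address
print(customer_a != customer_b)  # overlapping prefixes stay distinct
```

Because the RD is part of the address BGP compares, the two customers’ 10.0.0.0 routes never collide in the provider’s BGP table.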
Part 3: CE-PE Configuration & VRFs
OK! Wow, we’ve already done a lot of work, and haven’t even got to the PE/CE routing yet!!! Now that we have OSPF working in the SP cloud, MPLS set up to do label switching, and MP-iBGP between our PE routers, we can begin to focus on what is happening between the PE and the CE. Let’s start with R2 because we have a customer A and a customer B site hanging off of it. The whole idea here is that on R2 we will run a separate “virtual router” between R2 and customer A and one between R2 and customer B. This is called a VRF (virtual routing and forwarding). Each individual VRF routing table is completely separated from other VRFs. We globally create VRFs on the router, then tell the router which interfaces we wish to be a part of each VRF. We will then run two separate RIP instances: one towards customer A and one towards customer B. In RIP (and likewise in EIGRP) this is done with an address-family per VRF. On the customer side, we are just configuring normal plain old vanilla RIP. No VRF, no MPLS, nothing magical. OK, let’s get to work on that.
First, we have to define our two VRFs. We give each one a name, and tell it what RD we wish to use. Additionally, we tell the VRF which routes it will actually accept later down the road using a route target (RT). The route-target export says “When I export routes from this VRF into BGP, tag them with this RT.” The route-target import says “Only allow routes tagged with this RT into my VRF.”
ip vrf VPNA
 rd 65001:1
 route-target export 65001:1
 route-target import 65001:1
!
ip vrf VPNB
 rd 65001:2
 route-target export 65001:2
 route-target import 65001:2
!
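The import/export behaviour of the two VRFs defined above can be modelled with a small Python sketch. It mirrors the RD and RT values from that configuration; the function names and the dictionary layout are invented for illustration, and the sample prefix 10.0.1.100/32 is one of the customer A routes that shows up later in the article.

```python
# VRF definitions mirroring the configuration above.
VRFS = {
    "VPNA": {"rd": "65001:1", "export": {"65001:1"}, "import": {"65001:1"}},
    "VPNB": {"rd": "65001:2", "export": {"65001:2"}, "import": {"65001:2"}},
}


def export_route(vrf_name, prefix):
    """Build the VPNv4 route a PE would advertise for a prefix in a VRF."""
    vrf = VRFS[vrf_name]
    return {
        "vpnv4": f"{vrf['rd']}:{prefix}",  # RD prepended to the prefix
        "prefix": prefix,
        "rts": set(vrf["export"]),         # export RTs ride along as communities
    }


def importing_vrfs(route):
    """Return the VRFs on a receiving PE whose import RTs match the route."""
    return sorted(
        name for name, vrf in VRFS.items() if vrf["import"] & route["rts"]
    )


route = export_route("VPNA", "10.0.1.100/32")
print(route["vpnv4"])
print(importing_vrfs(route))
```

Note the division of labour in the sketch: the RD only makes the route unique, while the RT alone decides which VRFs accept it. They happen to share the same values here, but they do not have to.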
Next, we set up RIP using address-families. Under the single “router rip” process, we associate one address-family with VPNA and another one with VPNB, which gives us a separate RIP instance per VRF.
router rip
 address-family ipv4 vrf VPNA
  network 10.0.0.0
  no auto-summary
  version 2
 !
 address-family ipv4 vrf VPNB
  network 10.0.0.0
  no auto-summary
  version 2
!
On R1 and R9, we are simply running regular RIP and advertising the 10.0.0.0 network and our respective loopbacks. We would do very similar configurations over on R5 for the VLAN 58 segment, and over on R4 for the VLAN 47 segment. Let’s verify we are seeing things.
R2#sh ip route rip
R2#sh ip route
Gateway of last resort is not set
     188.8.131.52/32 is subnetted, 1 subnets
C       184.108.40.206 is directly connected, Loopback0
     220.127.116.11/32 is subnetted, 1 subnets
O       18.104.22.168 [110/129] via 22.214.171.124, 00:53:52, Serial0/1/0.2456
     126.96.36.199/32 is subnetted, 1 subnets
O       188.8.131.52 [110/129] via 184.108.40.206, 00:53:52, Serial0/1/0.2456
     220.127.116.11/32 is subnetted, 1 subnets
O       18.104.22.168 [110/65] via 22.214.171.124, 00:53:52, Serial0/1/0.2456
     126.96.36.199/16 is variably subnetted, 4 subnets, 2 masks
O       188.8.131.52/32 [110/64] via 184.108.40.206, 00:53:53, Serial0/1/0.2456
O       220.127.116.11/32 [110/128] via 18.104.22.168, 00:53:53, Serial0/1/0.2456
O       22.214.171.124/32 [110/128] via 126.96.36.199, 00:53:55, Serial0/1/0.2456
C       188.8.131.52/24 is directly connected, Serial0/1/0.2456
Hmmm… Well, we don’t see any RIP routes. Furthermore, the only things we see in our entire routing table are our directly connected routes and the addresses of our other frame routers that we learned via OSPF. The reason is this: because we are running multiple “virtual routers” we have multiple virtual routing tables!!! To see the individual routing tables we need to use the “show ip route vrf” command followed by the VRF name. The same goes for other commands like ping, traceroute, and telnet. You always need to tell the router what VRF you are working with. What we see above is the “global” routing table. Anything not in a particular VRF gets put into the global routing table. Well then, moving on…
R2#show ip route vrf VPNA rip
     184.108.40.206/32 is subnetted, 1 subnets
R       220.127.116.11 [120/1] via 10.0.12.1, 00:00:14, FastEthernet1/0
     10.0.0.0/8 is variably subnetted, 8 subnets, 2 masks
R       10.0.1.105/32 [120/1] via 10.0.12.1, 00:00:14, FastEthernet1/0
R       10.0.1.104/32 [120/1] via 10.0.12.1, 00:00:14, FastEthernet1/0
R       10.0.1.103/32 [120/1] via 10.0.12.1, 00:00:14, FastEthernet1/0
R       10.0.1.102/32 [120/1] via 10.0.12.1, 00:00:14, FastEthernet1/0
R       10.0.1.101/32 [120/1] via 10.0.12.1, 00:00:14, FastEthernet1/0
R       10.0.1.100/32 [120/1] via 10.0.12.1, 00:00:14, FastEthernet1/0
R2#show ip route vrf VPNB rip
     18.104.22.168/32 is subnetted, 1 subnets
R       22.214.171.124 [120/1] via 10.0.29.9, 00:00:13, FastEthernet1/1
     10.0.0.0/8 is variably subnetted, 14 subnets, 2 masks
R       10.0.9.103/32 [120/1] via 10.0.29.9, 00:00:13, FastEthernet1/1
R       10.0.9.102/32 [120/1] via 10.0.29.9, 00:00:13, FastEthernet1/1
R       10.0.9.101/32 [120/1] via 10.0.29.9, 00:00:13, FastEthernet1/1
R       10.0.9.100/32 [120/1] via 10.0.29.9, 00:00:13, FastEthernet1/1
R       10.0.9.105/32 [120/1] via 10.0.29.9, 00:00:13, FastEthernet1/1
R       10.0.9.104/32 [120/1] via 10.0.29.9, 00:00:13, FastEthernet1/1
Sweeeeeet!!! We can see that in our VPNA VRF we have all the RIP routes being advertised to us from R1. In our VPNB VRF we have all the RIP routes being advertised to us from R9. We should see similar results over on R4 and R5, but we simply do not have room to include everything here!
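The “one routing table per VRF” idea behind the outputs above can be mimicked with a dictionary of tables in Python. This is a rough sketch, not a model of IOS internals: each table holds just one sample route taken from the VPNA and VPNB outputs, the global table is left empty for simplicity, and the `lookup` helper is hypothetical.

```python
import ipaddress

# One routing table per VRF plus the global table.
# Entries are abbreviated from the show output above (prefix -> next hop).
TABLES = {
    "global": {},
    "VPNA": {"10.0.1.100/32": "10.0.12.1"},
    "VPNB": {"10.0.9.100/32": "10.0.29.9"},
}


def lookup(vrf, destination):
    """Longest-prefix match inside a single table; None if there is no route."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in TABLES[vrf].items():
        net = ipaddress.ip_network(prefix)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


print(lookup("VPNA", "10.0.1.100"))    # found in customer A's table
print(lookup("global", "10.0.1.100"))  # the global table knows nothing about it
print(lookup("VPNB", "10.0.1.100"))    # neither does customer B's table
```

This is the same reason a plain “show ip route” came up empty earlier: the lookup was simply being done against the wrong table.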
Part 4: Redistribution & Putting It All Together
OK. OSPF in our core SP network – CHECK. MPLS configured in our SP core – CHECK. MP-iBGP running in our core between PE routers – CHECK. PE/CE IGP configured and working properly – CHECK.
Now, the final piece of the puzzle. We have learned all these routes from customer A and customer B on routers R2, R4 and R5. We have put those routes into their own VRFs, one for each customer on each of R2, R4 and R5. We have a BGP connection between R2, R4 and R5. So, how do we get the RIP routes we learned on each PE router to the other PE routers? THIS is where we will utilize our MP-iBGP!!! When we learn the routes, they get put into a VRF, right? We told our routers what RD to use in each VRF, and we told each VRF how to export and import routes based on the RT. Now what we will do is redistribution. Again, yet ANOTHER topic we already have to be masters in to get this to work.
So the basic idea is that on the PE routers we will do mutual redistribution between RIP and BGP. When we learn a route via RIP from the CE and redistribute it into BGP, the route gets prepended with the RD we specified for the VRF and thus becomes a 96-bit VPNv4 route. It also gets tagged with the export RT community value we specified and is sent out over BGP. When a BGP neighbor receives the route, it looks at the RT value we tagged the route with and checks whether the route is allowed into any VRF by comparing it against the RT import values configured on that end. Once the route has been imported into a VRF, it is redistributed from BGP into that VRF’s RIP instance.
For example, R2 will receive the 126.96.36.199/32 route from R1 via RIP. It will receive this RIP route on an interface and RIP instance that we have allocated to the VPNA VRF. We have told VPNA we wish to use the RD 65001:1 as well as RT import and RT export 65001:1. When R2 redistributes from RIP into BGP and sends the route to R4 and R5 over BGP, it will send it as a VPNv4 route (RD+prefix) using the VPNv4 address-family. It will also send it with the extended BGP RT community value 65001:1. When the route arrives at R4, for instance, R4 will import it into any VRF that allows the 65001:1 RT via the RT import function, and then redistribute it into RIP for that VRF.
Additionally, notice that for our RIP to BGP redistribution, we need to do this under the per-VRF ipv4 address-family in BGP (“address-family ipv4 vrf” followed by the VRF name), NOT the vpnv4 address-family. Again, for the sake of brevity we will only show the configuration here on R2. We are doing very similar configurations on R4 and R5.
router rip
 address-family ipv4 vrf VPNB
  redistribute bgp 65001 metric 3
 !
 address-family ipv4 vrf VPNA
  redistribute bgp 65001 metric 3
!
router bgp 65001
 address-family ipv4 vrf VPNB
  redistribute connected
  redistribute rip
  no synchronization
 !
 address-family ipv4 vrf VPNA
  redistribute connected
  redistribute rip
  no synchronization
!
Alright…we are done!!! I know this is a LOT to take in. Let’s do some verification now to get the big picture, and then step through an end-to-end ping packet flow.
First, we will examine the routing table on R1. R1 is a customer A CE router, so it should only be seeing customer A routes. These should include the 188.8.131.52/32 and the 10.0.7.100-105 routes from R7.
R1#sh ip route rip
     184.108.40.206/32 is subnetted, 1 subnets
R       220.127.116.11 [120/3] via 10.0.12.2, 00:00:03, FastEthernet0/0
     10.0.0.0/8 is variably subnetted, 14 subnets, 2 masks
R       10.0.47.0/24 [120/3] via 10.0.12.2, 00:00:03, FastEthernet0/0
R       10.0.7.105/32 [120/3] via 10.0.12.2, 00:00:03, FastEthernet0/0
R       10.0.7.104/32 [120/3] via 10.0.12.2, 00:00:03, FastEthernet0/0
R       10.0.7.101/32 [120/3] via 10.0.12.2, 00:00:03, FastEthernet0/0
R       10.0.7.100/32 [120/3] via 10.0.12.2, 00:00:03, FastEthernet0/0
R       10.0.7.103/32 [120/3] via 10.0.12.2, 00:00:03, FastEthernet0/0
R       10.0.7.102/32 [120/3] via 10.0.12.2, 00:00:03, FastEthernet0/0
R1#ping 18.104.22.168 so lo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 22.214.171.124, timeout is 2 seconds:
Packet sent with a source address of 126.96.36.199
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 92/92/96 ms
Victory is mine!!! (Stewie Griffin voice). We can see that on R1 we only have received routes in VPNA. Additionally we can ping end-to-end. Let’s take a look at a customer B CE.
R8#sh ip route rip
     188.8.131.52/32 is subnetted, 1 subnets
R       184.108.40.206 [120/3] via 10.0.58.5, 00:00:21, FastEthernet0/0
     10.0.0.0/8 is variably subnetted, 14 subnets, 2 masks
R       10.0.29.0/24 [120/3] via 10.0.58.5, 00:00:21, FastEthernet0/0
R       10.0.9.103/32 [120/3] via 10.0.58.5, 00:00:21, FastEthernet0/0
R       10.0.9.102/32 [120/3] via 10.0.58.5, 00:00:21, FastEthernet0/0
R       10.0.9.101/32 [120/3] via 10.0.58.5, 00:00:21, FastEthernet0/0
R       10.0.9.100/32 [120/3] via 10.0.58.5, 00:00:21, FastEthernet0/0
R       10.0.9.105/32 [120/3] via 10.0.58.5, 00:00:21, FastEthernet0/0
R       10.0.9.104/32 [120/3] via 10.0.58.5, 00:00:21, FastEthernet0/0
R8#ping 220.127.116.11 so lo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 18.104.22.168, timeout is 2 seconds:
Packet sent with a source address of 22.214.171.124
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 80/81/84 ms
Awesome, we have reachability here! Let’s recap what happens here when we ping from R8 to 126.96.36.199 sourced from our loopback like we did above.
- R8 issues the command: ping 188.8.131.52 source lo0. It looks up 184.108.40.206 in the routing table, and sees a route for 220.127.116.11/32 learned via RIP from R5. It sends the packet to R5.
- R5 receives the packet on an interface designated for the VPNB VRF. R5 examines the IPv4 header and sees it is destined for 18.104.22.168. R5 does a lookup of 22.214.171.124 in the VPNB routing table, and sees that it has learned this route via BGP from R2. R5 then looks up the BGP next-hop address for R2 and sees it has a label for that next hop (126.96.36.199). R5 sends the packet with 2 MPLS labels. The “inner” label is the VPN label, which R2 advertised along with the VPNv4 route and which tells R2 where the packet belongs in the VPNB VRF. The “outer” label is the label describing how to get to the next hop of R2. R5 sends this out the frame interface towards R2.
- To get to R2 from R5 we have to pass through our frame-hub, R6. R6 receives the frame on its frame interface and sees the outside MPLS label. Notice that R6 is not running ANY BGP whatsoever – AKA a BGP-free core. Additionally, R2 would have been advertising the implicit-null label to R6 for this particular prefix since it has it directly connected. Therefore, R6 will pop off the outside label and send it over to R2 with just the VPN label remaining.
- R2 finally gets the frame from the frame-relay hub, and sees the VPN label. It consults its LFIB and sees that it needs to pop this VPN label and forward the packet towards R9 as a regular IPv4 packet.
- R9 receives the packet, and replies. Now, the same thing happens going back the other way.
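The packet walk above can be condensed into a short Python trace of the label stack on each link. The hops follow the steps just described; the `walk_packet` function and the two label values (25 for the VPN label, 18 for the outer transport label) are hypothetical and chosen only for illustration.

```python
def walk_packet(vpn_label, transport_label):
    """Trace the label stack on each link for the R8 -> R9 ping described above."""
    trace = []

    stack = []                             # R8 is a plain CE: no labels at all
    trace.append(("R8->R5", list(stack)))

    stack = [transport_label, vpn_label]   # R5 pushes VPN label, then outer label on top
    trace.append(("R5->R6", list(stack)))

    stack = stack[1:]                      # R6 pops the outer label (PHP toward R2)
    trace.append(("R6->R2", list(stack)))

    stack = stack[1:]                      # R2 pops the VPN label, forwards IPv4 in VPNB
    trace.append(("R2->R9", list(stack)))

    return trace


for link, labels in walk_packet(vpn_label=25, transport_label=18):
    print(link, labels)
```

Reading the trace top to bottom, R6 only ever acts on the outer label, which is what lets it stay completely free of BGP and customer routes.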
As you can no doubt see, we have a LOT going on here. At a minimum we need to be masters of the following technologies to get this going properly. No wonder Cisco has added this to the new v4.0 exam!!!
- OSPF over frame-relay
- BGP (including route-reflectors in this case, as well as multi-protocol configurations)
- MPLS (including LDP setup)
- All of our various CE-PE protocols we could be running (RIP, OSPF, EIGRP, BGP)
- Mutual redistribution between BGP and all these IGPs
- The actual VPN setup itself (VRFs, RD, RT, etc…)
Hopefully, this has been a useful article for you guys and I hope MPLS L3 VPN is a concept that you better understand now! Like anything else, it takes repetition, repetition, repetition to master! Good luck in your studies!!!
Joe Astorino – CCIE #24347