Next Hop in MPLS VPNs

By Marko Milivojevic on May 31st, 2010

Over the last couple of weeks I have seen several discussions on mailing lists about the old question: “Why are we using Loopback interfaces for iBGP peering in MPLS?” Do we really have to?

To better understand the reason for using Loopback interfaces, we’ll set up a quick network as depicted in the diagram below.

Diagram

Scenario Introduction

In this network, we have a fairly basic MPLS setup with OSPF as the IGP of choice in the core. Customer sites are in a VRF aptly named “CE” and their routing protocol of choice is BGP. There was no particular reason to choose BGP, other than that I was too lazy to redistribute.

The real catch is in the VPNv4 BGP sessions between PE1 and PE2. Unlike the usual approach, we’re peering between the Ethernet0/0 interfaces on both ends. Those are the interfaces that connect PE1 to P1 and PE2 to P2 respectively. Let’s examine how that works.

PE1:

PE1#show ip bgp vpnv4 all summary | begin ^Neighbor
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.1.1.11       4            1      62      62        4    0    0 00:54:24        1
192.168.2.2     4           12      62      62        4    0    0 00:54:01        1

PE2:

PE2#show ip bgp vpnv4 all summary | begin ^Neighbor
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.2.2.22       4            2      63      61        4    0    0 00:54:24        1
192.168.1.1     4           12      62      62        4    0    0 00:54:01        1

The iBGP neighbors in AS 12 above (192.168.2.2 on PE1 and 192.168.1.1 on PE2) are the VPNv4 sessions. If we examine the routing tables on the CE and PE routers, we should have end-to-end connectivity.

CE1:

CE1#show ip route bgp | begin ^Gateway
Gateway of last resort is not set

      10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
B        10.0.0.2/32 [20/0] via 10.1.1.1, 00:58:49

PE1:

PE1#show ip route vrf CE bgp | begin ^Gateway
Gateway of last resort is not set

      10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
B        10.0.0.1/32 [20/0] via 10.1.1.11, 01:00:05
B        10.0.0.2/32 [200/0] via 192.168.2.2, 01:00:05

PE2:

PE2#show ip route vrf CE bgp | begin ^Gateway
Gateway of last resort is not set

      10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
B        10.0.0.1/32 [200/0] via 192.168.1.1, 00:20:04
B        10.0.0.2/32 [20/0] via 10.2.2.22, 00:20:53

CE2:

CE2#show ip route bgp | begin ^Gateway
Gateway of last resort is not set

      10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
B        10.0.0.1/32 [20/0] via 10.2.2.2, 00:58:50

The Problem

Let’s watch those pings fly!

CE1#ping 10.0.0.2 source Loopback0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.2, timeout is 2 seconds:
Packet sent with a source address of 10.0.0.1
.....
Success rate is 0 percent (0/5)

This… is clearly a problem. We have the routes, but we cannot ping between endpoints. Let’s troubleshoot this a little bit. First of all, what could be wrong here?

Many things can be wrong. Let’s list some of the likely candidates.

  • Access lists – There could be an access-list somewhere that blocks pings
  • Wrong routes – We have routes, but are they correct?
  • MPLS label issues – Do we have the correct labels end-to-end?

Let’s examine some of the unlikely candidates and why they are unlikely.

  • Missing ARP or some other layer 2 problem – Remember, we have BGP running. That’s a TCP application. If a TCP application can work, so should ping.
  • Missing routes – We can clearly see the routes in routing table. They are not missing, but they could be wrong!
  • Solar flares – While they can generally influence complex networks in unusual ways, they are a pretty unlikely cause here. Trust me, I know – I broke this network personally. It’s not that!

Troubleshooting

Now that we have some likely and some unlikely candidates, let’s troubleshoot further. Access lists can be a problem, but I’m inclined to look into routes and labels first. On CE1, the route to CE2 has its next hop pointing to PE1. That’s OK. What about the same route on PE1?

PE1#show ip route vrf CE 10.0.0.2

Routing Table: CE
Routing entry for 10.0.0.2/32
  Known via "bgp 12", distance 200, metric 0
  Tag 2, type internal
  Last update from 192.168.2.2 00:23:33 ago
  Routing Descriptor Blocks:
  * 192.168.2.2 (default), from 192.168.2.2, 00:23:33 ago
      Route metric is 0, traffic share count is 1
      AS Hops 1
      Route tag 2
      MPLS label: 22
      MPLS Flags: MPLS Required

The next hop for the route is in the global routing table, which is perfectly normal and expected. Take a look at the diagram now. The next hop is PE2’s interface on the link between P2 and PE2. What happens with labels?

PE1#show mpls forwarding-table 192.168.2.0
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop
Label      Label      or Tunnel Id     Switched      interface
21         21         192.168.2.0/24   0             Et0/0      192.168.1.11

We have the label. Let’s move on and verify the same thing on P1.

P1#show mpls forwarding-table 192.168.2.0
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop
Label      Label      or Tunnel Id     Switched      interface
21         Pop Label  192.168.2.0/24   0             Et0/1      192.168.12.22

P1 is popping the label: it’s performing penultimate hop popping. It does so because 192.168.2.0/24 is directly connected to P2, so from P1’s point of view P2 is the last hop for that prefix. So, when we send a packet from CE1 to CE2, PE1 imposes the VPN label and then the label to reach PE2’s interface, but P1 removes that outer label and sends the packet to P2 with only the VPN label. Since P2 has no idea what to do with that label, it drops the packet. Not very nice of P2. How do we solve this?
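
The failure can be replayed with a toy model of the label-switched path. This is only a sketch: labels 21 and 22 and the “Pop Label” action on P1 come from the outputs above, while the table layout, the function name and the assumption that P2 holds no entry at all for label 22 are illustrative.

```python
# Toy model of the label-switched path PE1 -> P1 -> P2 -> PE2.
# Labels 21 (transport) and 22 (VPN) and the "Pop Label" action on P1 are taken
# from the outputs above; the table layout and names are illustrative only.

# Per-router LFIB: top label -> (operation, argument, next router)
LFIB = {
    "P1": {21: ("pop", None, "P2")},      # P1 pops: P2 is connected to 192.168.2.0/24
    "P2": {},                             # assumption: P2 has no entry for label 22
    "PE2": {22: ("vrf", "CE", None)},     # PE2 allocated VPN label 22
}


def forward(stack, router):
    """Follow a labeled packet hop by hop and report where it ends up."""
    stack = list(stack)                   # stack[0] is the top (outer) label
    path = [router]
    while stack:
        entry = LFIB[router].get(stack[0])
        if entry is None:
            return path, f"dropped at {router}: no LFIB entry for label {stack[0]}"
        operation, argument, next_router = entry
        if operation == "vrf":
            return path, f"delivered into VRF {argument} at {router}"
        if operation == "pop":
            stack.pop(0)
        elif operation == "swap":
            stack[0] = argument
        router = next_router
        path.append(router)
    return path, f"unlabeled packet at {router}"


# PE1 imposes [21, 22] and hands the packet to P1.
path, outcome = forward([21, 22], "P1")
print(" -> ".join(path), "|", outcome)
```

In this model the packet never reaches PE2: the walk ends at P2 with the VPN label exposed, which is the drop described above.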

Solution 1 – Change Peering to use Loopbacks

This is by far the easiest solution. We can change the peering to use Loopback0 interfaces on PE1 and PE2 and the problem should be solved. Let’s do that and see it work!

PE1:

PE1(config)#router bgp 12
PE1(config-router)#no neighbor 192.168.2.2
PE1(config-router)#neighbor 192.168.0.2 remote-as 12
PE1(config-router)#neighbor 192.168.0.2 update-source Loopback0
PE1(config-router)#address-family vpnv4
PE1(config-router-af)#neighbor 192.168.0.2 activate
PE1(config-router-af)#neighbor 192.168.0.2 send-community both

PE2:

PE2(config)#router bgp 12
PE2(config-router)#no neighbor 192.168.1.1
PE2(config-router)#neighbor 192.168.0.1 remote-as 12
PE2(config-router)#neighbor 192.168.0.1 update-source Loopback0
PE2(config-router)#address-family vpnv4
PE2(config-router-af)#neighbor 192.168.0.1 activate
PE2(config-router-af)#neighbor 192.168.0.1 send-community both

To verify, let’s ping from CE1 and CE2.

CE1:

CE1#ping 10.0.0.2 source Loopback0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.2, timeout is 2 seconds:
Packet sent with a source address of 10.0.0.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/20/32 ms

CE2:

CE2#ping 10.0.0.1 source Loopback0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
Packet sent with a source address of 10.0.0.2
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/20/32 ms

Joy and happiness!

Solution 2 – Change Peering to use Another Ethernet Interface

That said, this is not the only solution. Remember, the problem was that P1 was popping the label, not that we didn’t use Loopback0 for peering. Using Loopback interfaces is the recommended practice, but it’s not a requirement! Let’s prove that. In the second solution, we’ll change the peering to use Ethernet0/1 interfaces on PE1 and PE2.

PE1:

PE1(config)#router bgp 12
PE1(config-router)#no neighbor 192.168.0.2
PE1(config-router)#neighbor 192.168.222.2 remote-as 12
PE1(config-router)#neighbor 192.168.222.2 update-source Ethernet0/1
PE1(config-router)#address-family vpnv4
PE1(config-router-af)#neighbor 192.168.222.2 activate
PE1(config-router-af)#neighbor 192.168.222.2 send-community both

PE2:

PE2(config)#router bgp 12
PE2(config-router)#no neighbor 192.168.0.1
PE2(config-router)#neighbor 192.168.111.1 remote-as 12
PE2(config-router)#neighbor 192.168.111.1 update-source Ethernet0/1
PE2(config-router)#address-family vpnv4
PE2(config-router-af)#neighbor 192.168.111.1 activate
PE2(config-router-af)#neighbor 192.168.111.1 send-community both

Let’s make sure the peering sessions are up and that we have routes in place on PE routers.

PE1:

PE1#show ip bgp all summary | begin ^Neighbor
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.1.1.11       4            1      54      58       10    0    0 00:47:19        1
192.168.222.2   4           12       6       6       10    0    0 00:01:36        1
PE1#show ip route vrf CE | begin ^Gateway
Gateway of last resort is not set

      10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
B        10.0.0.1/32 [20/0] via 10.1.1.11, 00:45:21
B        10.0.0.2/32 [200/0] via 192.168.222.2, 00:01:38
C        10.1.1.0/24 is directly connected, Serial1/0
L        10.1.1.1/32 is directly connected, Serial1/0

PE2:

PE2#show ip bgp all summary | begin ^Neighbor
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.2.2.22       4            2      56      57       10    0    0 00:48:12        1
192.168.111.1   4           12       7       7       10    0    0 00:02:28        1
PE2#show ip route vrf CE | begin ^Gateway
Gateway of last resort is not set

      10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
B        10.0.0.1/32 [200/0] via 192.168.111.1, 00:02:37
B        10.0.0.2/32 [20/0] via 10.2.2.22, 00:47:10
C        10.2.2.0/24 is directly connected, Serial1/0
L        10.2.2.2/32 is directly connected, Serial1/0

Now, let’s make sure ping works between CE devices.

CE1:

CE1#ping 10.0.0.2 source Loopback0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.2, timeout is 2 seconds:
Packet sent with a source address of 10.0.0.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/17/24 ms

CE2:

CE2#ping 10.0.0.1 source Loopback0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
Packet sent with a source address of 10.0.0.2
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/19/28 ms

Fantastic! We have solved the problem again and in the process proved that we don’t need to have MP-BGP between Loopback interfaces. We need to ensure that correctly labeled packets arrive at the PE routers. But that’s not all…
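
One way to state the rule behind the first two solutions is as a small check: a BGP next hop is safe when the egress PE is the only router connected to the subnet that covers it. The sketch below is a simplification under stated assumptions. In particular, the Ethernet0/1 row assumes that no P router sits on that link, which the outputs above do not show, and the function names are made up for illustration.

```python
# Routers directly connected to the subnet that covers each candidate BGP next
# hop on PE2. The Ethernet0/1 row is an ASSUMPTION about the lab topology.
ATTACHED = {
    "192.168.2.2": {"P2", "PE2"},     # Ethernet0/0: link shared with P2
    "192.168.0.2": {"PE2"},           # Loopback0: exists only on PE2
    "192.168.222.2": {"PE2"},         # Ethernet0/1: assumed not shared with a P router
}


def early_poppers(next_hop, egress_pe="PE2"):
    """Routers other than the egress PE that own the next hop's subnet.

    Each of them advertises the subnet as connected, so its upstream
    neighbor pops the transport label before the packet reaches the egress PE.
    """
    return sorted(ATTACHED[next_hop] - {egress_pe})


def next_hop_is_safe(next_hop, egress_pe="PE2"):
    """True when only the egress PE is connected to the next hop's subnet."""
    return not early_poppers(next_hop, egress_pe)


for hop in ATTACHED:
    verdict = "safe" if next_hop_is_safe(hop) else f"label popped early towards {early_poppers(hop)}"
    print(hop, verdict)
```

Only 192.168.2.2 fails the check, because P2 also owns that subnet.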

Solution 3 – No Peering Changes

What would happen if we were not allowed to change peering interfaces? How would we solve the problem then? Let’s roll back to the original setup and see what we can do about it. I will skip the rollback configuration and provide only the solution below.

We’re now back to the original problem. When we examine the VRF routing table on PE1, we can see the “wrong” next-hop being used.

PE1:

PE1#show ip route vrf CE bgp | begin ^Gateway
Gateway of last resort is not set

      10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
B        10.0.0.1/32 [20/0] via 10.1.1.11, 00:52:43
B        10.0.0.2/32 [200/0] via 192.168.2.2, 00:00:51

What happens if we force a different next-hop using a route-map? Let’s try it out.

PE1:

PE1(config)#route-map FIX
PE1(config-route-map)#set ip next-hop 192.168.0.2
PE1(config-route-map)#router bgp 12
PE1(config-router)#address-family vpnv4
PE1(config-router-af)#neighbor 192.168.2.2 route-map FIX in

PE2:

PE2(config)#route-map FIX
PE2(config-route-map)#set ip next-hop 192.168.0.1
PE2(config-route-map)#router bgp 12
PE2(config-router)#address-family vpnv4
PE2(config-router-af)#neighbor 192.168.1.1 route-map FIX in
PE2(config-router-af)#end
PE2#clear ip bgp * all
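
What the inbound route-map does to a received VPNv4 route can be sketched as follows. This is a simplified model, not the router’s implementation: the class and function names are hypothetical, and the VPN label value 22 is taken from the earlier output and may differ after the session reset.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Vpnv4Route:
    """Hypothetical, stripped-down view of a received VPNv4 route."""
    prefix: str
    next_hop: str
    vpn_label: int


def route_map_fix(route, new_next_hop):
    """Model of 'set ip next-hop': only the next hop changes, the VPN label stays."""
    return replace(route, next_hop=new_next_hop)


# Route as PE1 receives it from PE2 over the Ethernet0/0 session.
received = Vpnv4Route(prefix="10.0.0.2/32", next_hop="192.168.2.2", vpn_label=22)

# Route as PE1 installs it after the inbound route-map FIX.
installed = route_map_fix(received, "192.168.0.2")
print(installed)
```

The session still runs between the Ethernet0/0 addresses; only the address that PE1 resolves a transport label for has moved to PE2’s Loopback0.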

Quick look at routing tables on PE1 and PE2:

PE1:

PE1#show ip route vrf CE bgp | begin ^Gateway
Gateway of last resort is not set

      10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
B        10.0.0.1/32 [20/0] via 10.1.1.11, 00:00:56
B        10.0.0.2/32 [200/0] via 192.168.0.2, 00:00:56

PE2:

PE2#show ip route vrf CE bgp | begin ^Gateway
Gateway of last resort is not set

      10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
B        10.0.0.1/32 [200/0] via 192.168.0.1, 00:01:37
B        10.0.0.2/32 [20/0] via 10.2.2.22, 01:00:42

Ping should now work flawlessly between CE routers.

CE1:

CE1#ping 10.0.0.2 source Loopback0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.2, timeout is 2 seconds:
Packet sent with a source address of 10.0.0.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/16/20 ms

CE2:

CE2#ping 10.0.0.1 source Loopback0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
Packet sent with a source address of 10.0.0.2
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/20/32 ms

And it does!

I hope you enjoyed this brief troubleshooting session on MPLS L3VPNs. See you next time!


Marko Milivojevic – CCIE #18427
Senior Technical Instructor – IPexpert

3 Responses to “Next Hop in MPLS VPNs”

  1. Prakash says:

    This is great basic fundas

  2. Tim says:

    Thanks for the article; very informative!

    Question: what is the specific reason that P1 was performing PHP on the label when the BGP session between PE1 and P1 was using the Et0/0 link as its source as opposed to Et0/1 or the Lo0 address? Thanks!

    • The reason is that the link is advertised as a connected route by P2 and not by PE2, which is what we need for the MPLS VPN between PE1 and PE2 to work.

      Take note that the exact same thing happens in the other direction. For traffic from CE2 to CE1, P2 will perform PHP for PE1’s Et0/0 link, which is advertised by P1 as a connected route.


      Marko Milivojevic – CCIE #18427
      Senior Technical Instructor – IPexpert