BGP resiliency and received prefixes

4 minute read
Updated: October 27, 2023
May 6, 2021

BGP is best known as the routing protocol of the internet, but that is not its only use. Many companies also run BGP in their internal networks. One of the main reasons: it allows great control and flexibility over route exchange.

As with any routing protocol and any critical connectivity, we want to be resilient to failure, and we want to know that the resiliency is actually operational. With BGP, that means taking a closer look at the received prefixes.

Why should you care about the prefixes you are receiving via BGP?

Let's take a classic example, where you would have two devices connected to an external party. This could be your ISP, MPLS provider…

Site L47 contains 2 routers with 1 eBGP connection each to the MPLS Cloud. Both are receiving prefixes.
BGP - Resilient situation

In this example, site L47, on the left, is connected to site L21, which is your MPLS cloud.

You have your two eBGP sessions established with the MPLS cloud, that's great! Now you want to ensure that you are resilient by checking the routes received from the provider on both links. These routes are how you reach any of your sites connected to the MPLS cloud.

In this situation, if one link fails, you have full resiliency. You will not lose connectivity to your MPLS network and all the services hosted outside site L47.
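The resilient case can be expressed as a simple set comparison: every prefix you rely on should be received on both links. The sketch below assumes you have already collected the received prefixes for each edge router. The router names come from the example, while the prefixes and the `single_homed_prefixes` helper are purely illustrative.

```python
from collections import defaultdict


def single_homed_prefixes(received):
    """Return {prefix: [routers]} for prefixes learned on fewer than two routers.

    `received` maps a router name to the set of prefixes it receives over
    its eBGP session. An empty result means every prefix has a backup path.
    """
    seen_on = defaultdict(list)
    for router, prefixes in received.items():
        for prefix in prefixes:
            seen_on[prefix].append(router)
    return {p: sorted(r) for p, r in seen_on.items() if len(r) < 2}


# Illustrative data only: both routers receive the same two prefixes.
resilient = {
    "L47EXR1": {"10.21.0.0/16", "10.66.0.0/16"},
    "L47EXR2": {"10.21.0.0/16", "10.66.0.0/16"},
}
print(single_homed_prefixes(resilient))  # {}
```

If one router received an empty set instead, every prefix would be reported with a single router next to it, which is exactly the loss of resiliency this article is about.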

What happens if you are no longer receiving routes on one of the two links?

Site L47 contains 2 routers with 1 eBGP connection each to the MPLS Cloud. This time, one of the links receives zero prefixes.
BGP - Resiliency affected due to no received prefixes

What can cause this situation?

  • Misconfiguration: the configuration has changed on L47EXR2 or within the provider site L21, which has affected the number of received prefixes.
  • Provider issue: the provider is not advertising any routes, for example because of an issue within their core network, yet the BGP session stays up.

What is the problem?

In this situation, if you lose L47EXR1 or the link between this router and site L21, your secondary link will not be able to forward any traffic, because L47EXR2 is not receiving any routes.

This is what you would see on your device:

Output of "show ip bgp summ": the neighbor appears in the list, but State/PfxRcd shows 0.
BGP session is up...

The fact that State/PfxRcd shows "0" tells you the BGP session is established; otherwise, you would see the current state (Idle, Active...). Any other number would be good. Unfortunately, in this scenario, you are not receiving any routes.
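To make the State/PfxRcd rule concrete, here is a minimal sketch that reads that column from IOS-style summary text. It assumes the common ten-column layout with State/PfxRcd last. The `parse_bgp_summary` helper, the sample addresses and the AS number are made up for illustration, and real output varies by platform and software version.

```python
import ipaddress


def parse_bgp_summary(output):
    """Parse neighbor rows from IOS-style "show ip bgp summary" text.

    Returns {neighbor: (state, prefixes_received)}; prefixes_received is
    None when the session is not established.
    """
    neighbors = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue
        try:
            ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # header or other non-neighbor line
        last = " ".join(fields[9:])  # State/PfxRcd column
        if last.isdigit():
            # A number means the session is established.
            neighbors[fields[0]] = ("Established", int(last))
        else:
            neighbors[fields[0]] = (last, None)
    return neighbors


# Made-up sample output, for illustration only.
SAMPLE = """\
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.0.2.1       4 65021    1200    1198       15    0    0 18:12:01        0
192.0.2.5       4 65021       0       0        1    0    0 never    Idle
"""

for neighbor, (state, count) in parse_bgp_summary(SAMPLE).items():
    if state == "Established" and count == 0:
        print(f"{neighbor}: session up, but no prefixes received")
```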

Output of "show ip bgp neighbor x.x.x.x received-routes" which confirms we are not receiving any prefix.
... but you are not receiving any route

You are now in a situation where you do not have any resiliency, and your monitoring system is not alerting you to the issue.

Hold on, why is my monitoring system not alerting me?

That’s a very good question! In a situation like this one, the BGP session is not affected, so the router does not generate any error. The device will not send Syslog messages or SNMP traps to inform your monitoring system that you are not receiving any routes.
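The gap is easy to see when the two kinds of check sit side by side. This is only a sketch under assumptions: the `BgpPeer` record, both helper functions and the prefix count of 42 are hypothetical, and your monitoring system may collect this data in a very different way.

```python
from dataclasses import dataclass


@dataclass
class BgpPeer:
    router: str
    state: str
    prefixes_received: int


def state_only_alerts(peers):
    """What a session-state monitor sees: only peers that are not Established."""
    return [p.router for p in peers if p.state != "Established"]


def prefix_aware_alerts(peers, min_prefixes=1):
    """Also flag Established peers receiving fewer than `min_prefixes` routes."""
    return [
        p.router
        for p in peers
        if p.state != "Established" or p.prefixes_received < min_prefixes
    ]


peers = [
    BgpPeer("L47EXR1", "Established", 42),  # prefix count is made up
    BgpPeer("L47EXR2", "Established", 0),
]
print(state_only_alerts(peers))    # []
print(prefix_aware_alerts(peers))  # ['L47EXR2']
```

The state-only check stays silent because both sessions are up. Only the check that also looks at the received prefix count reports L47EXR2.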

How can I spot the issue to ensure it gets fixed?

That's where IP Fabric can play a crucial role in identifying the issue quickly, before it causes any damage. There are several ways to see the problem:

1. Dashboard

The IP Fabric dashboard provides an overview of the network analysis results, including issues and links to the detailed reports.

IP Fabric Dashboard showing a "red flag" matching the established BGP session with no received prefixes.
IP Fabric Dashboard - BGP routing verification

After seeing this table, you will want to check further details regarding the issue. Just click on the number you are interested in, and you will be redirected to the appropriate technology page and the intent verifications.

More detail about the BGP neighbor not receiving any prefix.
IP Fabric Technology & intent verification - Routing/BGP

2. Diagram

As we have seen at the beginning, you will be able to check directly on the diagram the number of received prefixes for a specific neighbor.

It's also possible to display on the diagram the verification information from all the supported technologies. In this example, we will add the BGP information on the diagram for site L47:

GIF showing how to add the intent verification information on a diagram. The device not receiving the BGP prefix becomes red.
IP Fabric Diagram - Intent Verification

The moment we select the verification to add to the diagram, we can see L47EXR2 turn red. If we click on the device, we will see the explanation of why it's showing red:

Further details on why this device becomes red when applying the intent verification

3. End-to-end path

In addition, we can spot this issue by looking at the end-to-end path. Let's take an example where we are looking at the path from a source in site L47 to a destination in site L66, which is connected to the MPLS cloud.

In a normal situation, we would expect to see the traffic being able to use both links to reach the MPLS cloud:

End-to-end path between a source in site L47, going through the MPLS cloud to reach site L66.
We can see we have two links usable to leave site L47 to the MPLS cloud. This is the normal situation: resiliency is in place.
IP Fabric End to End path - Normal situation, resiliency is operational

If we now compare with the snapshot where we have the issue, we can clearly see that we have lost our resiliency:

IP Fabric End to End path - Comparison between snapshots, resiliency is affected
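The same before/after idea can be sketched outside any particular tool: take the received-prefix count per peer from two points in time and report the peers that went from some prefixes to none. The `lost_prefix_peers` helper and the counts below are hypothetical and only illustrate the comparison. This is not how IP Fabric itself implements it.

```python
def lost_prefix_peers(before, after):
    """Return peers whose received-prefix count dropped to zero between snapshots.

    `before` and `after` map a peer (router name) to its received-prefix
    count. A peer missing from `after` is treated as receiving nothing.
    """
    return sorted(
        peer
        for peer, count in before.items()
        if count > 0 and after.get(peer, 0) == 0
    )


# Made-up counts for the two edge routers of site L47.
snapshot_ok = {"L47EXR1": 42, "L47EXR2": 42}
snapshot_issue = {"L47EXR1": 42, "L47EXR2": 0}
print(lost_prefix_peers(snapshot_ok, snapshot_issue))  # ['L47EXR2']
```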

Conclusion

You can't ensure BGP resiliency if you do not look at the received prefixes. The main issue with this situation is that your monitoring system will not be able to inform you of the loss of received prefixes. The last thing you want is to face a massive outage because a single link failed and your secondary link was not operational.

I have been in that exact situation previously, and let's just say I did not enjoy writing the postmortem explaining why we did not know our redundant path wasn't working. It's not a situation you want to be in...

If you would like to find out more about IP Fabric and how it can help improve your existing infrastructure by detecting issues you are not aware of, please contact us through www.ipfabric.io! You can also follow our company’s LinkedIn or Blog, where more content will be published.
