Spotting Single Points of Failure in a graph
Enterprise networks should never have a Single Point of Failure (SPOF) in their daily operation. A SPOF is any element of a system whose failure stops the whole system.
The failure, maintenance, or misconfiguration of any single network device or link should never bring the network down or require manual intervention. Network architects know this rule and design networks accordingly, placing redundant or backup devices and links that can take over all functions if the primary device or link fails.
Reality is not always so pleasant, and today’s operational networks can contain SPOFs without anyone being aware of them. Even if the original design eliminated every SPOF, the network may have evolved since then, and new infrastructure may have been connected to it. For example:
- An application that was not important at the beginning (and was therefore connected to the network through a single link and a single switch) may have become highly significant to the business today. Dependencies among applications can be complex, and if just one application in a critical chain depends on a SPOF, the whole chain fails.
- A combination of two redundant network infrastructures, for example after a company acquisition, may not be as redundant as its original components. The original designs may not have taken into account that other connected networks would need to be redundant as well.
- A single network failure may have gone unnoticed and put part of the network out of operation, creating a set of SPOFs. The network continues to work (because the redundant design absorbed the first failure) but is not ready for a second or subsequent failure.
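The last case is easy to reproduce on a toy topology. The sketch below is only an illustration and not how IP Fabric works internally: the switch names, the ring layout, and the helper functions `is_connected` and `critical_links` are all assumptions made for this example. It removes each link in turn and checks whether the network stays connected.

```python
from collections import deque

def is_connected(nodes, links):
    """Return True if every node can reach every other node over the links."""
    nodes = set(nodes)
    if not nodes:
        return True
    neighbors = {n: set() for n in nodes}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search from an arbitrary node
        current = queue.popleft()
        for nxt in neighbors[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == nodes

def critical_links(nodes, links):
    """Brute force: links whose individual loss splits the network."""
    return [
        link for link in links
        if not is_connected(nodes, [other for other in links if other != link])
    ]

# Hypothetical four-switch ring: each switch has two paths to every other one.
switches = ["sw1", "sw2", "sw3", "sw4"]
ring = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw4"), ("sw4", "sw1")]
print(critical_links(switches, ring))      # [] (no single link is a SPOF)

# One link fails unnoticed. The network still works...
degraded = [link for link in ring if link != ("sw4", "sw1")]
print(is_connected(switches, degraded))    # True
# ...but every remaining link has become a SPOF.
print(critical_links(switches, degraded))
```

The brute-force check is fine for a handful of devices. It treats links as unique pairs, so parallel links between the same two devices would need distinct identifiers.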
Network operations should take into account that SPOFs may appear unexpectedly. Verifying that the network is still resilient and highly available requires advanced skills and a lot of time spent going through routine settings and command outputs. It also requires expensive disaster-recovery exercises to be performed regularly.
IP Fabric helps find non-redundant links and devices in the network in its diagrams section.
It does not matter whether the SPOF is a Layer 2 switch or a Layer 3 router: both are in the critical chain of application uptime. Devices and links that form a SPOF are highlighted.
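In graph terms, a device that is a SPOF is an articulation point and a link that is a SPOF is a bridge. The sketch below shows one common way to find both with a single depth-first search. It is a minimal illustration under stated assumptions, not IP Fabric's implementation: the function `find_spofs` and the device names are hypothetical, and the topology is treated as a simple undirected graph in which parallel links collapse into one.

```python
def find_spofs(links):
    """Return (devices, links) whose single failure disconnects the graph.

    Devices are articulation points and links are bridges, both found in one
    depth-first search using low-link values.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    discovered = {}       # node -> order in which the DFS first reached it
    low = {}              # node -> earliest discovered node reachable from its subtree
    spof_devices = set()
    spof_links = set()

    def visit(node, parent):
        discovered[node] = low[node] = len(discovered)
        children = 0
        for nxt in neighbors[node]:
            if nxt == parent:
                continue
            if nxt in discovered:
                # Back edge: an alternative path to an earlier node.
                low[node] = min(low[node], discovered[nxt])
                continue
            children += 1
            visit(nxt, node)
            low[node] = min(low[node], low[nxt])
            # The subtree under nxt cannot reach above node without node.
            if parent is not None and low[nxt] >= discovered[node]:
                spof_devices.add(node)
            # The subtree under nxt cannot even reach node by another path.
            if low[nxt] > discovered[node]:
                spof_links.add(frozenset((node, nxt)))
        # The DFS root is a SPOF only if it joins two or more subtrees.
        if parent is None and children > 1:
            spof_devices.add(node)

    for node in neighbors:
        if node not in discovered:
            visit(node, None)
    return spof_devices, spof_links

# Hypothetical topology: a redundant core and distribution layer,
# with one access switch and one server hanging off a single uplink.
topology = [
    ("core1", "core2"),
    ("core1", "dist1"), ("core2", "dist1"),
    ("core1", "dist2"), ("core2", "dist2"),
    ("dist1", "acc1"),
    ("acc1", "srv1"),
]
devices, links = find_spofs(topology)
print(sorted(devices))                    # ['acc1', 'dist1']
print(sorted(sorted(l) for l in links))   # [['acc1', 'srv1'], ['acc1', 'dist1']]
```

The recursive search keeps the sketch short. A very large topology would need an iterative version to stay within Python's recursion limit.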
Additionally, for networks with many small sites, IP Fabric automatically groups the sites into redundant and non-redundant groups. The groups make it easy to spot sites with non-redundant transit connectivity.
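To make the idea of grouping concrete, here is a small sketch. The classification rule is an assumption chosen for illustration (a site counts as redundant only if its uplinks use at least two edge devices and at least two transit devices), and the function `group_sites`, the site names, and the device names are hypothetical. The criteria IP Fabric actually applies may differ.

```python
from collections import defaultdict

def group_sites(uplinks):
    """Split sites into redundant and non-redundant groups.

    `uplinks` is a list of (site, edge_device, transit_device) tuples.
    Assumed rule for this sketch: a site is redundant only if its uplinks
    use at least two edge devices and at least two transit devices.
    """
    edge_devices = defaultdict(set)
    transit_devices = defaultdict(set)
    for site, edge, transit in uplinks:
        edge_devices[site].add(edge)
        transit_devices[site].add(transit)

    groups = {"redundant": [], "non_redundant": []}
    for site in sorted(edge_devices):
        redundant = len(edge_devices[site]) >= 2 and len(transit_devices[site]) >= 2
        groups["redundant" if redundant else "non_redundant"].append(site)
    return groups

# Hypothetical inventory of site uplinks.
uplinks = [
    ("branch-a", "a-rtr1", "pe1"), ("branch-a", "a-rtr2", "pe2"),  # fully dual-homed
    ("branch-b", "b-rtr1", "pe1"), ("branch-b", "b-rtr1", "pe2"),  # single edge router
    ("branch-c", "c-rtr1", "pe1"), ("branch-c", "c-rtr2", "pe1"),  # single transit device
]
print(group_sites(uplinks))
# {'redundant': ['branch-a'], 'non_redundant': ['branch-b', 'branch-c']}
```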
If you’re interested in learning more about how IP Fabric’s platform can help you with analytics or intended network behavior reporting, contact us through our website, request a demo, follow this blog or sign up for our webinars.