Total Network Operations: Building a Network Digital Twin for Automation and AI


Scott Robohn: Welcome to Total Network Operations, the podcast dedicated to all the hardworking network operators like you who deliver, secure, manage, analyze, and replay all the packets. We talk about great ideas in NetOps so you can make informed decisions, and to spur constructive dialogue between vendors, NetOps teams, and the entire NetOps community. I’m your friendly neighborhood podcast host, Scott Robohn. Part of what we do on Total Network Operations is to proactively look at new tools and tech that are useful in taking a holistic view of NetOps and your whole tech stack, and how it should support key business and mission goals. One of the evolving technologies we need to address is the digital twin. It’s not a brand new concept, and many of us have a vague idea of what a digital twin is and what it does, but we’re here today to wipe out that imprecision and our lack of certainty around digital twins. To help us accomplish this today, we have Daren Fulwell, the Field CTO of IP Fabric, who’s come from 10BASE2 and Novell NetWare to what can be done through digital twins and how they’re connected to AI for Ops and beyond. We’re going to get into some details and definitions of digital twin, automation and autonomy, the power of abstraction layers, and what you need to take from this for your career and the future of the networks you’ll work on. So, Daren, welcome to the podcast. Please tell us about yourself, what led you to your roles in tech and networking, and how you got into NetOps.

Daren Fulwell: Well, thanks for inviting me, Scott. It’s always good to talk to you, so good to finally do it on a recording, so good to be here. Yeah, I’m Daren Fulwell. Gosh, I’ve been in networking an awful long time, or it feels like an awful long time. I was that 1980s micro kid who grew up with the home computers, with the basic programming, with the assembly language, with the tinkering with electronics to build speech synthesizers, all of those kinds of things.

Scott R.: That was you. That was you. I can see it.

Daren F.: That was me. It was so many of us, but I was just steeped in that. I was never, ever going to do anything but be in this incredible IT space. But networking came to me a little late, but not that much later. I was, as you’ve rightly pointed out, 10BASE2, gosh, yeah, crawling around under desks, cabling together PCs in order to get them to be able to share files, disk space, and that kind of thing. That was very much where networking started for me. We didn’t have networked applications. We didn’t have all of those things. We barely had email at that stage. We were in the corporate environments. It was all about what resources can we share in order to just make computers able to work together in some shape or form. Obviously, live through the whole worldwide web explosion, and as a result, well, here we are today having completely changed everything, including that network becoming the absolute foundation of everything digital. I think that, for me, has been an absolute joy of a ride really, because I’ve come from tinkering with bits of cable to planning, designing, architecting massive scale enterprise networks, and really coming to understand how foundationally our businesses are built on so much of this IT, which in turn depends on the availability of the very thing that we put together for it.

Scott R.: Yeah, love it. That ramp up to where you are today, let’s do a little preview of what IP Fabric does, who you are, and we’ll preview the digital twin concept. We’ll get into it more as we go, but tell us a little bit upfront so we know what we’re referring to, so our thoughts have mental furniture to sit in as we go, so to speak.

Daren F.: No, it’s a good one, because IP Fabric comes very much from that same network engineering background. Our co-founders were network consultants, working for some of the biggest consultants in Europe, dealing with some of the biggest networks. They’d often walk into a new customer, sit down with the documentation they were presented with, and literally say, what the heck is this? You know the challenges, everyone listening to this will know the challenges of being able to lay out an understanding of their own network in a form that they can come back to time after time, and make sure that people understand what’s going on. It’s just so difficult to do and maintain. As consultants, Pavel and Roman, and the other guys would come in and say, this makes no sense. They’d have to start again from scratch, effectively mapping that network out for themselves. It’s as simple, the foundation of the company is as simple as, how can we do this better? What can we do to do that? Ultimately, they came up with a means of automatically discovering customer networks, to mapping out the dependencies through those, so that they were able to provide the consultancy services. That’s where it comes from.

Scott R.: Was that a pivot from very consulting focused initially, to turning into a product company? Because that’s the sense I get for you guys today.

Daren F.: Absolutely. It was very much a question of, once they’ve realized and appreciated the advantage that that brought to them as consultants, they could see, actually, this is something that would be incredibly useful for all of the people who are our customers now. There’s that realization that actually, manual documentation, manual processes to maintain all of this, manual processes to pass on understanding of the network from one person to another, is not big, not clever, and subject to all kinds of problems with consistency, and with accuracy, and with timeliness. Because of course, you’re relying on these things being kept up to date whenever there’s change. If you and I know one thing, it’s that the change is inevitable, both planned and unplanned, it just happens. You can’t keep up with it.

Scott R.: For sure. Well, let’s touch on, okay, we know each other through our network automation circles. I think we met early on when the first AutoCon event was being planned in 2023. We’re seeing increasing maturity, customers moving forward in their network automation activities, but you and I have had some interesting conversation about automation versus autonomy. Tell me what you’re thinking on that front, and what’s the big picture? How does automation support autonomy, and the other things that we need to wrap around it?

Daren F.: Yeah. It was one of the early discussions at AutoCon 0, I think, where we started looking back at what it would take to create self-driving networks. I know it’s an old concept that keeps coming back around, but it’s this idea that our users don’t want to have to know about all the mechanics of how networks work. How can we build networks that are consumable to them, that change and adjust with their needs, but that don’t cause us problems when we’re trying to maintain them, and so on? It’s all about the self-driving element. Find the things that are wrong, react to the things that are wrong, and fix them automatically in some shape or form. You’ve got to have that closed-loop approach. Now, we talk a lot about closed-loop automation, but it’s very limited in its scope, more often than not, because it’s all about saying, right, well, this thing’s wrong. I know how to fix that thing. Go fix that thing. Test that that thing is now fixed, and go again, essentially working that through. Automation is the enabler for that. Having listened to how things have changed with AutoCon, where we started from and the direction some of the sessions in the last conference went in, I think we understand what it takes to automate tasks in the network. I think we’re in a pretty good place there. We understand that we need to have an intent. We understand how we go about enforcing that intent, inflicting it, which is the best word for automation for me, inflicting that change on the network to support that intent, validating that that intent is actually being delivered, and if it’s not, understanding why and looking at that whole drift from intent. That closes the loop in a small way, but the challenge is that you can do that in one place, but the network’s distributed. 
You’ve always got that concern of the unpredictability of the distributed system of what impact that will have elsewhere. The automation of individual tasks, absolutely key to this, but the big picture is what actually is going on across the network. Thankfully, the network is deterministic. In theory, we should be able to make a change and know and understand the impact of that change. The thing is though, our networks we build now are so blooming complicated. They’ve got so many different moving parts and they all interact in lots of ways and some more deeply than others, but ultimately that interaction is the difficult bit to understand and it’s far too much for our human brains. We need to do something about that. We need some way of being able to look across all of the technologies that are in that network, understand how those technologies are going to interact and what impact that’s going to have on applications and services that are laid across the top of that infrastructure. I think that’s where autonomy is a great idea and definitely a direction we need to be driving in because it allows us to dream, to have a target, but we’re nowhere near it yet because we don’t have the full picture and we don’t have that well understood yet. I think this is a nice introduction point really to our topic today.
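
The small closed loop Daren describes (state an intent, enforce it, validate it, look for drift, go again) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `find_drift` and `close_loop` are made up for this sketch, and the "device" is a plain dictionary standing in for a real configuration push.

```python
from typing import Dict, Optional, Tuple


def find_drift(intent: Dict[str, str],
               observed: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {setting: (wanted, actual)} for every setting that differs from intent."""
    return {
        key: (want, observed.get(key))
        for key, want in intent.items()
        if observed.get(key) != want
    }


def close_loop(intent: Dict[str, str],
               device_state: Dict[str, str],
               max_rounds: int = 3) -> bool:
    """Detect drift, 'fix' it, then re-validate; give up after max_rounds."""
    for _ in range(max_rounds):
        drift = find_drift(intent, device_state)
        if not drift:
            return True  # intent is being delivered
        for key, (want, _actual) in drift.items():
            # Hypothetical stand-in for pushing a change to a real device.
            device_state[key] = want
    return not find_drift(intent, device_state)


intent = {"ntp": "10.0.0.1", "snmp": "v3"}
state = {"ntp": "10.0.0.1", "snmp": "v2c"}
drift_before = find_drift(intent, state)
converged = close_loop(intent, state)
```

Note that this loop only validates the one device it touched, which is exactly the limitation raised above: it says nothing about the impact elsewhere in a distributed network.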

Scott R.: Let me take you there. As we’ve exchanged some ideas, here’s something that came to me on a flaming pie. You’ll be Beatles with an A. A big barrier to getting to network autonomy is trust. If I understand this correctly, building your digital twin allows you to do so much what-if analysis and assess impact without inflicting it on the production network to test, validate, verify. I see digital twin technologies as fundamental to something that can get us to autonomy. Agree? Disagree? How do you think about that?

Daren F.: I agree with you. Absolutely. I think when you look at the general use of digital twin in the broader sense, there’s a well understood idea of using a data model that basically simulates the behavior of a closed system, and using that data model both to look at the nature of that closed system as it is and to predict what will happen to it if certain things are changed. To be able to create that same view of a network is extraordinarily useful and probably fundamental in reality to the idea of autonomy, because once you have that, all of the different elements that you might have actually making those changes can use that single source of, and I’m not going to say single source of truth, but that single oversight of everything that’s going on in the network to ensure and validate that what’s going on is what you expect it to be. Absolutely key. Now, that said, we are not there yet in terms of building that level of understanding of a network, and there’s a few reasons for that. But let’s start where we are now. The network digital twin, as we term it, and as Gartner and whoever else talks about these things terms it, is about having that data model, having that data source that will behave in the same way as your real network from a forwarding perspective. You’re able, for example, to build all of the topology, all of the configuration, all of the state information that you’ve gathered from the network. You can drop a virtual packet in at one end and you can see what that virtual packet looks like when it comes out at the other end, having, I don’t know, been switched, routed, load balanced, firewalled, NATed, whatever. You go through the full forwarding plane to understand how it gets from one end to the other. What that’s going to allow you to do is test the behavior of your network without actually testing the behavior of your network. 
You don’t have to put artificial probes in. You don’t have to actually break things in the network to pull information out. You’re testing a data model. But the beauty of it is it will simulate all of the different technologies. So you might have an SD-WAN environment. You might have a software-defined data center running VXLAN. You might have a campus environment running wireless. The digital twin will understand what all of those things do. It will understand where there’s encapsulation. It will understand all of the impacts of those elements and give you a full understanding of what’s going on from one end to the other. What it won’t do, and here’s the gap, what it won’t do right now is simulate the control plane for all of those things. Because if you think about it, you’ve got to worry about everything from things as trivial as MAC address learning and spanning tree, through routing protocols, all the way to centralized control planes such as a controller for your SD-WAN, for cloud infrastructure, and so on. These all behave completely differently. The only way that you could realistically do that right now isn’t really to simulate it at all. It’s to emulate it. No one has enough compute power and memory and storage to emulate an entire enterprise network end-to-end right now. That’s not to say that it won’t happen. We’ve seen so many advances recently, especially when you look at AI, as to what’s possible. I’m not predicting for one second that in five years, we’re not going to be able to do that. You’ve got to start somewhere. The place to start is very much with that data model in the first place. That’s where we sit.
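
To make the "virtual packet" idea concrete, here is a minimal sketch of walking a packet through a data model of the forwarding plane instead of the live network. It is not IP Fabric’s implementation: the device names, the `FIB` structure, and the `lookup` and `trace` functions are hypothetical, and the model only does longest-prefix-match routing, with no switching, NAT, firewalling, or encapsulation.

```python
import ipaddress

# Hypothetical snapshot of per-device forwarding tables: (prefix, next-hop device).
# A next hop of None means the destination is directly attached (delivered).
FIB = {
    "access-sw": [("0.0.0.0/0", "core-rtr")],
    "core-rtr": [("10.20.0.0/16", "dc-leaf"), ("0.0.0.0/0", "wan-edge")],
    "dc-leaf": [("10.20.30.0/24", None)],
    "wan-edge": [],
}


def lookup(device, dst):
    """Longest-prefix match of dst in one device's table; None if no route."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in FIB[device]:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best


def trace(src_device, dst, max_hops=16):
    """Follow a virtual packet hop by hop through the model."""
    path = [src_device]
    device = src_device
    for _ in range(max_hops):
        hit = lookup(device, dst)
        if hit is None:
            return path, "dropped: no route"
        next_hop = hit[1]
        if next_hop is None:
            return path, "delivered"
        path.append(next_hop)
        device = next_hop
    return path, "dropped: hop limit (possible loop)"


inside = trace("access-sw", "10.20.30.5")
outside = trace("access-sw", "8.8.8.8")
```

Because the tables are just data gathered from the network, a what-if test is a matter of editing a copy of `FIB` and tracing again. What the sketch cannot tell you is how the control plane would reconverge after a change, which is the gap described above.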

Scott R.: Let me make sure I’m understanding. This has been a super clarifying conversation leading up to actually having this episode. Your digital twin is really capturing the forwarding plane, the state of interfaces, not BGP sessions, not OSPF adjacencies.

Daren F.: Now, very specifically, IP Fabric as a product will do all of that as well. We do gather all of that and we model all of that. Think of it as a multi-layer topology diagram. Where we can, we’ll gather all the physical connections between devices. Then we’ll look at layer two and understand how that works, where all the MAC addresses are, where things like spanning tree run. Then we’ll go to layer three and understand where all the IP addresses and subnets are, and whether that’s v4 or v6, we don’t care. Then we’ll gather the information about all that control plane activity because it’s relevant. You still want to be able to analyze that. You still want to look at that. You still want to be able to say, right, actually, is this autonomous system connected to that one? What’s the routing policy that sits between them? What it allows you to do is to make predictions yourself, but it’s more of a probability game than anything else. It’s like saying, this will probably impact that, but it’s down to you as a networking professional to understand that and be able to work with that. If you go back down to the actual forwarding plane, the simulations are there and built in. We’ll take into account things like security policy and stuff like that. Those things can be tested and changed. When it comes to dropping an interface and seeing what impact that has, there are so many links in the chain beyond that, up to the point where all of the control plane reconverges, that you’ve got to go through all that process first. That’s where the fun comes.

Scott R.: There certainly is complexity in the control plane. Given even if you just had one vendor and one version of operating system, I appreciate the goal of having deterministic behavior of the network. That’s why trouble tickets and network outages are so impactful because we’re violating our assumptions of deterministic behavior.

Daren F.: Well, you say that. The simple fact is that most trouble tickets are caused by people making changes and not understanding the outcomes. Actually, you can argue that it is deterministic completely and it’s the people who don’t appreciate and understand it. I hear where you’re coming from.

Scott R.: No, that’s a super helpful clarification. I guess I was overly focused on what happens when a NIC starts to go bad or a port on a switch or a circuit from your WAN provider is dirty and you’re taking errors. Yes, there are plenty of stupid things done where the impact can be determined deterministically, the shut command, for instance.

Daren F.: Very true. I think that’s something we’ve seen change over time as well, isn’t it? Those kinds of issues that are less under your control happen less and less than they used to. Generally speaking, and again, I’m generalizing, the quality of the hardware is better and it fails less often. Software, I think that’s a vendor by vendor thing. 

Scott R.: I agree. 

Daren F.: Certainly, I think we’re less impacted by those kinds of environmental and failure scenarios than perhaps we were previously.

Scott R.: I’d say that’s true in the West. I think in other developing parts of the world, it’s probably more like what it was for us in the late 80s into the 90s, for sure. Yeah, there is an up and to the right. My whole introduction, coming out of an academic background in industrial engineering and operations research and really focused on system simulation, my first 5, 10 years in the business, I did not understand why modeling and simulation was not more widely used. I see the differences between what a digital twin is and does versus a stochastic injection of events into a simulated system. They’re not exactly the same, but everyone wanted to test it in the lab. Nobody was going to trust what a report from a piece of software said. They wanted to see it work.

Daren F.: Yeah, I think this is an important point. We’ve reached this stage where, as a network engineering function or whatever, we need this support now. The complexity is such that you can’t have it all in one head. What you end up with is silos. You end up with a specialist team over here who look after one technology and then another team looking after another and so on and so forth. The challenge is that all of those technologies need to be deployed correctly in order to provide a service. Here’s the thing: the customer, your user, does not care about all of the different technologies and how they’re deployed. They don’t care. There are different people looking after different parts. They have different levels of understanding of what everyone else does. The point is that you’re there to support a service and that service is meant to be available. That’s what we’re all here for. Because of that, we need to build a better understanding across all of those functions, across all of those technologies. This is why it’s time for the digital twin now, really, because if it wasn’t five years ago, it is now. We need to have that understanding so that we can help that poor network engineer, sat there trying to make head or tail of what’s going on to the left and to the right, by showing it to them in a way that they can relate to, so they’re able to trace those application paths through it all and identify where the challenges might be if they have them.

Scott R.: I do think the time series element to this is really helpful as well. Putting my tech engineer hat on, I can go back and see what events led up to a problem, at least after the fact. What use does that get from your users?

Daren F.: I think that’s a really good point. Again, we’ve spoken about this before, this idea that change is the only constant in any network, certainly of any size, and that could be anything from having to swap kit out because it’s reached end of life to things just going wrong or people making change and there being unintended consequences. All of these things are happening all the time. The supporting systems need to be kept up to date in order to be sure that when those things do occur, the people who are those level one analysts on the NOC are able to work out, who do I need to speak to? Where do I take those incidents? If they haven’t got the right visibility, they’re in trouble. What you need is for that digital twin to be constantly updated, to be able to track that change history, to be able to pin down a root cause, because this thing that was like this yesterday, when everything seemed to work okay, now is not like that and everything’s broken. We can highlight that straight away because we have it in a data model, so it’s easy to spot. We can bring that information out and pass it through. What you can then start to do is bring that into an observability platform, which is where this goes. Observability is a separate thing from the digital twin, but it’s related, because once you’ve identified there’s a problem and where it is, it’s about having the right information about that point in the network or that particular device or the application as it’s flowing through a part of the network, to be able to zoom in on that and go fix the problem and get to that next level of detail. But it’s much more near real time. The digital twin often takes a while to build, but it’s going to be much more up to date than any documentation you had, right? So it’s always the baseline of, right, what is the structure of the problem? That comes from the digital twin. 
Now what’s the telemetry that relates to that problem from the second, right? I’ll get that from my observability platform and you put the two things together.
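
The "like this yesterday, not like that today" comparison is straightforward once both states live in a data model. A minimal sketch, assuming each snapshot is flattened into a dictionary keyed by (device, attribute); the `diff_snapshots` name and the sample keys are illustrative, not any product’s schema:

```python
def diff_snapshots(before, after):
    """Return (key, old_value, new_value) for everything added, removed, or changed."""
    changes = []
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes.append((key, old, new))
    return changes


# Hypothetical snapshots of network state on two consecutive days.
yesterday = {
    ("core-rtr", "Gi0/1.state"): "up",
    ("core-rtr", "bgp.65001"): "established",
}
today = {
    ("core-rtr", "Gi0/1.state"): "down",
    ("core-rtr", "bgp.65001"): "established",
    ("dc-leaf", "vlan30"): "active",
}

changes = diff_snapshots(yesterday, today)
```

The list of changes is the structural baseline for a root-cause search. The near-real-time detail for each changed element would then come from the observability platform.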

Scott R.: So let’s talk about connections to other pieces of the ecosystem. One thing I want to just tease at a little bit is, you mentioned there are limitations on how much stuff you can store. So you’re making some smart decisions on what to collect and what not to. And I want you to comment on the following. There’s this tension: with streaming telemetry, I can create my own big data problem, right? And I can’t collect everything about everything; that’s going to be hurtful and harmful. There’s a US comedian named Steven Wright who’s really good with one-liners. And he says, you can’t have everything, where would you put it? Right. So there you go, right there. I don’t have unlimited storage, but I do have more, especially with cloud storage services. I’ve got to be careful, you know, or I’ll keep ginning up some incredible bills. But there’s an enabler there. In the last three years or five years, how have you seen that emerge?

Daren F.: Well, I think this comes back to that observability piece, right? You can have all that data, and the data is great. And there’s kind of two parts to that then, because one is you’ve got to go looking for the needles in the haystack, right? What are the things that actually matter in the scenario that I’m looking at? And, you know, as a human, you’re never going to be able to do that. If we have a problem understanding how the network hangs together, how the heck are we ever going to find that one piece of data that tells me? So then we start to look at correlation and those AI Ops engines that can go through machine learning, pick out correlation, and have all the information. But again, if you’re not gathering information from the right places, or you don’t know where to look for the problem, what you’re doing is almost hoping that the correlation can piece together enough information to do that. And again, this is where having that deterministic view of the network and its structure and its topology sitting underneath allows you to guide your AI and ML to look in the right places, because you can turn around and say, well, look, this application for a user who’s in this location is going to go through this wireless infrastructure here, this AP, this switch, this router into this WAN across this link into the DC, then to a workload that lives on this particular leaf switch, let’s say. So you’ve mapped that out and you understand where all of the elements of that application flow are. Now that’s really narrowed down the data that you have to go searching through in order to find out what’s causing those problems. So it’s just a really useful context set, I suppose, for finding that needle in the haystack.
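
The narrowing step described here can be shown as a simple filter: take the application path from the digital twin and keep only the telemetry from devices on that path. A minimal sketch, assuming events are dictionaries with a `device` field; the `narrow_to_path` name, the event layout, and the device names are all assumptions for illustration:

```python
def narrow_to_path(events, path_devices):
    """Keep only telemetry events emitted by devices on the application path."""
    on_path = set(path_devices)
    return [event for event in events if event["device"] in on_path]


# Path for one application flow, as a digital twin might report it (hypothetical).
app_path = ["ap-12", "access-sw-3", "site-rtr", "wan-edge", "dc-leaf-7"]

# A mixed bag of telemetry from across the whole network (hypothetical).
events = [
    {"device": "access-sw-3", "metric": "crc_errors", "value": 412},
    {"device": "printer-sw-9", "metric": "cpu", "value": 97},
    {"device": "dc-leaf-7", "metric": "queue_drops", "value": 15},
    {"device": "lab-rtr", "metric": "bgp_flaps", "value": 3},
]

relevant = narrow_to_path(events, app_path)
```

Only the filtered list needs to go to a correlation or ML engine, which is the "context set" idea: the topology says where to look before any statistics are run.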

Scott R.: And it’s not a needle in the haystack, it’s a needle in a stack of needles.

Daren F.: That’s exactly what it is, right? Ultimately, yeah. Because otherwise, that’s right, you’ll find the observability platform will find all the needles, but you need to find out which needle is the one you’re looking for, right?

Scott R.: Have you developed either recommendations or best practices, or just have a sense of what’s good to collect and put in the digital twin, and what really doesn’t provide a lot of value?

Daren F.: Yeah, I mean, the approach that IP Fabric takes is that our data model is vendor neutral, right? So what we do is we gather the information that’s relevant across all of our vendors. And what that does, kind of naturally, is give you a selection criterion as to what data to collect and what not to. It’s all about what you need to validate a path from one end of the network to the other. So it’s all the control plane data that allows you to create the state for the forwarding plane, and also all the inventory information and so on about the devices. What we’re not going to do is gather every single vendor-specific configuration item into the model. We capture that information and stash it to one side, but we don’t bring that into the data model, because it’s really about understanding what that flow is going to look like. So it’s literally all the information that you need to get a packet from one end to the other. That’s the way that we do it, because if you or I were sat in front of a network and we had to sit down and work that through, that’s the information that we would need. So all of the things that we would need to do our jobs, that’s what goes into IP Fabric’s version of the network digital twin. What that does then allow you to do is start to do some interesting stuff around compliance checking and that kind of thing, right? Okay, can I validate that all of my devices, regardless of which vendor they are, are configured for this type of management approach, I don’t know, this SNMP, this logging, this NTP or whatever? And I don’t care whether it’s a Cisco or an Aruba or an Arista or Juniper, it doesn’t matter to me, just do that for me. It gives you a whole bunch of insight automatically that you just don’t have to go searching for. And I think that’s very much where we’ve drawn the line in our approach. 
What are the things that a network engineer actually is interested in pulling from the network?
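
Once device data has been normalized into one vendor-neutral shape, the compliance check described above becomes a single loop that never looks at the vendor. A minimal sketch under that assumption; the field names (`ntp_servers`, `snmp_version`), the `noncompliant` function, and the sample devices are hypothetical:

```python
# Hypothetical management standard every device should meet.
REQUIRED = {
    "ntp_servers": {"10.0.0.1", "10.0.0.2"},
    "snmp_version": "v3",
}

# Hypothetical records, already normalized from vendor-specific configuration.
devices = [
    {"hostname": "edge-1", "vendor": "Cisco",
     "ntp_servers": {"10.0.0.1", "10.0.0.2"}, "snmp_version": "v3"},
    {"hostname": "leaf-4", "vendor": "Arista",
     "ntp_servers": {"10.0.0.1"}, "snmp_version": "v3"},
    {"hostname": "wlc-2", "vendor": "Aruba",
     "ntp_servers": {"10.0.0.1", "10.0.0.2"}, "snmp_version": "v2c"},
]


def noncompliant(devices, required):
    """Map hostname -> list of settings that do not match the standard."""
    failures = {}
    for device in devices:
        missed = [key for key, want in required.items() if device.get(key) != want]
        if missed:
            failures[device["hostname"]] = missed
    return failures


report = noncompliant(devices, REQUIRED)
```

The `vendor` field is carried along but never consulted, which is the point: the hard work is in the normalization, and the check itself stays trivial.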

Scott R.: And I can see the roots of, hey, we’re just trying to figure out how to automatically discover a client’s network infrastructure to showing you, okay, what’s useful and what can we ignore or maybe not place as high a priority on and then bring the right bits together. Yeah, absolutely.

Daren F.: I mean, the things like when you’re looking at a topology, we take it as given when we build networks that we have redundancy and resilience and those sorts of things, but are they there or do they have the impact we expect them to? So one of the things we can do, because we have the data and it’s relatively straightforward once you have the data model, look for single points of failure, look for loops, look for multiple paths that allow you to fail over from one to the other. So long as you can set the parameters for those things and understand what it means to do that, it’s a graph database problem at that point, but it’s not just about having physical links between two places. It’s making sure that the physical links are there, that they’re supported at layer two, that they’re supported at layer three, that they’re supported in the routing protocols as well, all the way up the stack. We have to go as far as is possible to make sure that the network engineer can turn around with confidence to their users and say, look, the network is going to support your application from top to bottom. And that’s what we’re there for.
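
As a rough illustration of the graph problem mentioned here, the sketch below finds single points of failure by removing each device in turn and checking whether the remaining devices can still reach each other. It is a brute-force approach chosen for readability, not a claim about how any product does it; the topology and the function names are made up, and it covers one layer only, whereas the point above is that the same check has to hold at the physical layer, layer two, layer three, and in the routing protocols.

```python
from collections import deque


def build_adjacency(links):
    """Turn a list of (a, b) links into an undirected adjacency map."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def reachable(adj, start, removed):
    """Breadth-first search from start, pretending 'removed' has failed."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def single_points_of_failure(adj):
    """Devices whose failure splits the remaining devices into islands."""
    spofs = []
    for candidate in sorted(adj):
        others = [node for node in adj if node != candidate]
        if others and len(reachable(adj, others[0], candidate)) < len(others):
            spofs.append(candidate)
    return spofs


# Hypothetical topology: dual distribution switches, but a single core and firewall.
links = [
    ("access-1", "dist-1"), ("access-1", "dist-2"),
    ("dist-1", "core"), ("dist-2", "core"),
    ("core", "fw"), ("fw", "wan"),
]
spofs = single_points_of_failure(build_adjacency(links))
```

In this toy topology the redundant distribution pair passes the check, while the core and the firewall do not.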

Scott R.: So speaking of top to bottom, you’ve touched on quite a few specific, these are ways I use a digital twin. I wanted to give you an opportunity to comment on any other specifics on how Ops teams do interact with the tool, how they use it, but also get us to the topic of, hey, remember, we’re supporting business services and mission. We have to elevate above that. I know that’s maybe a contradiction, but I’d love for you to speak to both of those things in order. Are there any other specific ways that you can tell the network operations audience here, here’s other specific things that we see our customers doing using our digital twin instantiation?

Daren F.: Well, there’s a few things. I mean, the key to it all is to have a complete picture of the network. In a few places, when you rock up there for the first time, and this is the problem our co-founders had, the question is: do you know for sure that you know everything about your network, top to bottom, every switch, every router? Do you have all of the information, all the configuration stored and so on? So for that initial discovery, it’s useful to be able to say, go find the boundaries of my network and tell me what’s there. Our process, and other people working in the same space have a similar process, is all about making sure that you’re not just relying on discovery protocols and those sorts of things, but that you’re looking at all of those layers to determine what’s out there. For example, when we discover a network, we don’t stop where we run out of credentials or whatever. What we’ll do is we’ll look at the edge of the network. We’ll understand, right, does it have a CDP neighbor relationship? Yes, no. If it does, great, but if it doesn’t, we’ll look at layer two and see if there are things out there that we didn’t know about. We’ll look at layer three and look for next hops that are outside our scope. We’ll look for BGP relationships with things that we can’t control. And you’ll get a full picture of the complete scope of your network infrastructure, because you’re going to find things that you weren’t expecting to be there. Right. We all know about the hubs that are plugged into switch ports under people’s desks and those sorts of things. Those things were real and are still real, right? People do those things, and that obviously opens up a whole bunch of security issues.

Scott R.: So there’s still AppleTalk out there. There’s still NetWare out there somewhere. Yeah. That’s right.

Daren F.: IPX, IPX. I think we recently talked about this one. IPX is still there. And we’ll come back to that later, because there’s a story about one of my biggest gaffes with IPX, but we’ll talk about that in a bit. But that’s exactly the point: there’s always going to be things that you don’t know about. And the challenges that brings span far and wide, because if you don’t have a full picture of what your network is, then your procurement people are going to want to know where things are and whatever, because they need to deal with your support renewals. Your monitoring system won’t tell you what you think it will tell you because you don’t have a full inventory in there. Your automations are going to get you so far, but not as far as you need them to, because you don’t have a full picture. So straight away, just that bare minimum of the inventory allows you to reach out to different teams around you in order to make sure that they’re seeing everything that you are. And then of course, from a documentation perspective and those sorts of good things, hell, you don’t need to draw diagrams anymore. Right? That alone, for me, was an absolute boon when I saw that for the first time. And as a result, you’re not just enriching your understanding of the network, you are ensuring that all the other platforms have a solid foundation to be built on. I think that’s where this becomes really visible, because all of a sudden people can start to rely on those tools. We’ve spent years manually configuring these things, monitoring platforms where, when you make a change, you go, oh yeah, that interface doesn’t connect to that thing anymore. It connects to something over here. So I’d better change the priority of that. 
So when it appears in my list of red alerts, it comes a little bit higher than it did before, or whatever. It was a manual process to go through that every time you’d make a change, because it wasn’t just documentation that needed to be updated. It was the tooling as well. Well, how about you automate that process so that you don’t have to do that anymore? You know, the opportunity for service improvement is massive as a result, particularly because you can now map the applications across that infrastructure and start to understand the dependencies between the things that the customer cares about and the things that you’re responsible for.
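
The manual step Daren describes, re-prioritising an interface alert whenever the thing on the other end of the link changes, can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Link` record, the role names, and the priority numbers are hypothetical and do not come from IP Fabric or any monitoring product. The point is that the priority is derived from the discovered topology rather than typed in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One discovered adjacency (hypothetical shape for a topology snapshot row)."""
    local_device: str
    local_interface: str
    remote_device: str


# Assumed mapping from the neighbour's role to an alert priority (1 = most urgent).
ROLE_PRIORITY = {"core": 1, "distribution": 2, "access": 3}
DEFAULT_PRIORITY = 4  # neighbour with an unknown role


def alert_priorities(links, device_roles):
    """Derive an alert priority for every interface from what it connects to."""
    priorities = {}
    for link in links:
        role = device_roles.get(link.remote_device)
        key = (link.local_device, link.local_interface)
        priorities[key] = ROLE_PRIORITY.get(role, DEFAULT_PRIORITY)
    return priorities


if __name__ == "__main__":
    roles = {"core1": "core", "prn7": "access"}
    before = [Link("sw1", "Gi0/1", "core1")]
    after = [Link("sw1", "Gi0/1", "prn7")]  # the cable was moved
    print(alert_priorities(before, roles))
    print(alert_priorities(after, roles))
```

Re-running the function against each new snapshot replaces the "oh yeah, that interface doesn't connect to that thing anymore" step with a lookup.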

Scott R.: Here’s one reaction I would have to all of this. If I was trying to do all this work and intentionally document my processes and workflows and figure out how to automate them without a digital twin, and then I had a digital twin to help me go deeper and have a more comprehensive view, I think I would never want to go back to not having a digital twin.

Daren F.: It was the very thing I said in my very first interview with Pavel, the CEO, the co-founder: why have I not seen this before? And why have I not done this before? Because you do instantly, you just realize the difference that it could have made to the way that you’ve done things over time. I’ve spent years consulting into customers where you have to sit down and work this stuff through. I wrote Python scripts for days to go gathering this information and that’s fine, but it’s only going to gather the bits of information you want. It’s not going to give you all of this extra insight about, actually, the interconnectedness of it all, and the way that those things interact with other parts of the network. There was a great example when I was a consultant, before this role. I was in with a hospital trust in the UK. We have our NHS, and they were arranged in these trusts that essentially are separate organizations looking after multiple hospitals. This one trust had four hospital locations, one of which was brand new. So they had great documentation. They’d got all the information from their systems integrator with all the documentation. Everything was fantastic. They had another one that was a hundred years old that had been grown organically over time. It was all kinds of horrible. No one really had a great picture of what it was like. Another one they’d acquired from another trust, so they didn’t really have much of an idea of what was there at all. It was okay, but it was pretty grubby. And the other one, no one really knew what was there. It was just something that had appeared one day. And all of these four were interconnected. All of them were running different technology, different generations of Cisco kit and so on. And what they’d asked us to do was come along and plug in cloud into this, just come and plug a bit of cloud in.
It’ll be fine. We’ll move everything to the cloud. We can get rid of a data center. Great. Brilliant. The problem was, as soon as we plugged the cloud in, as you do, you could only see it from two of the four locations and no one understood why. And I kid you not, we spent three months getting to the bottom of this, a team of three consultants we had working on this one, because they had to sit down and re-document the whole thing top to bottom in order to understand why it was. And the fact was they had four different instances of OSPF. They had two different BGP ASes. They had a range of different policies for redistribution between all these. It was just an absolute car crash. And it was a miracle that anything would route from one end to the other at all. So there’s just no way that you could have got to the bottom of that without going through that process. And it’s literally the first time I sat in front of the product that I realized that one snapshot from IP Fabric could have solved 80% of that in one hit. And so when you see that happen, you instantly think, why would I do this any other way?
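
The failure mode in this story, a new route visible from only two of four sites because of how routes are redistributed between several OSPF instances and BGP ASes, can be approximated as a graph question. The sketch below is a deliberate simplification under assumptions: the domain names, the site-to-domain mapping, and the redistribution edges are invented for illustration (they are not the hospital trust's real design), and it ignores route filters, metrics, and summarisation. It only asks whether a route originated in one routing domain can be passed, hop by hop, into the domain a site lives in.

```python
from collections import deque


def domains_learning_route(origin, redistribution):
    """Breadth-first walk over directed redistribution edges from the origin domain."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for nxt in redistribution.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def sites_missing_route(origin, redistribution, site_domain):
    """Return the sites whose routing domain never learns the origin's routes."""
    reachable = domains_learning_route(origin, redistribution)
    return sorted(site for site, domain in site_domain.items() if domain not in reachable)


if __name__ == "__main__":
    # Hypothetical data: an edge "a -> b" means routes from a are redistributed into b.
    redistribution = {
        "cloud-bgp": ["ospf-1"],
        "ospf-1": ["ospf-2"],
        "ospf-3": ["ospf-4"],  # these two never receive anything from ospf-1 or ospf-2
    }
    site_domain = {
        "site-a": "ospf-1",
        "site-b": "ospf-2",
        "site-c": "ospf-3",
        "site-d": "ospf-4",
    }
    print(sites_missing_route("cloud-bgp", redistribution, site_domain))
```

With the redistribution relationships already captured as data, the question that took months of re-documentation becomes a single query, which is the contrast Daren is drawing.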

Scott R.: Yeah, no, I can, I can see that for sure. I, I haven’t had a chance to play with this in a real production network, but like I can, I can see the value.

Daren F.: Yeah. And I mean, obviously we’re talking about IP Fabric specifically, but the digital twin itself, in whatever form, lends itself to this because it’s all about having a data source that’s trustworthy. And I think that’s the key to it.

Scott R.: And, you know, another point that, uh, that has made me think about, you know, as, as we came up with, okay, what discussion points do we want to talk about? You know, what are the connections between digital twins and AI? I’m like, well, you’re, you’re presenting the data that needs to be crunched. And I’ve been able to crunch that data in certain ways, you know, using it to populate a graph and using graph theory. There’s probably other statistical techniques I can use. And now I have a whole other set of techniques with LLMs and machine learning to process that same data. How, how is IP Fabric thinking about that?

Daren F.: You’ve got a couple of things there. I mean, again, we’ve talked a little bit about prediction and that kind of thing. And here’s the perfect opportunity to perhaps start looking at those techniques to use that data for that purpose, right? We can train the models in the right way. We can use them to at least start giving us a probability, a level of predictability, based on rule of thumb rather than by brute force running all of the control plane protocols and technology. So that’s absolutely one area we’re looking at. The other comes all the way back to autonomy that we were talking about at the beginning, right? The idea of having these autonomous AI agents, this army of agents, as I heard someone call it the other day, who are essentially operating as coworkers, right? Looking after very small fragments of the intent that’s needed for that network, creating their own small closed loops. But the way that they can understand everything else that’s going on in the network is to refer to that digital twin. So if you have an agent that does a thing and another agent does a thing that overlaps with that thing, so long as the digital twin’s up to date, you can see the impact one has on the other. And I think that’s somewhere where we’re going to get a lot of traction ultimately, because how else could you do that other than telling every agent that they have to all go and look at the network and find out what’s going on in order to get that picture end to end? So it’s almost like a proxy for the understanding that you would get by inserting a human into the process. And it’s all about that. It’s about saying, look, for every process that you’re going through, every workflow, you don’t need to inject a human with the full understanding of the network into that process.
You can use, essentially, a data source, and the manual processes can use that data source as well. So you can gain trust through interacting with that model. But once you’ve gained that trust, you can then use the API and ultimately, say, an MCP server or similar to dip into that data and use it to guide your AI agents. And I think this is where we reach that point of inevitability. So, you know, our CEO always talks about this as being an inevitable technology: that there will come a time when, as you said, I’ve used a digital twin, why would I ever go back? There will be a time when everyone has a digital twin of their network simply because they need it and they see that they need it. And there’s no reason for them to go back to try and do things manually afterwards. And we believe that this idea of autonomy, this idea of the AI agent, is the time where this becomes inevitable.
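
One way to picture agents using the twin to see the impact one has on the other is a shared dependency map that every proposed change is checked against. The sketch below is an assumption-laden illustration, not a description of IP Fabric's API or of any MCP server: the object names, the `depends_on_me` map, and the agent names are all hypothetical. Each agent declares what it intends to touch, the twin expands that to everything downstream, and any two agents whose expanded sets intersect are flagged.

```python
def impact_set(touched, depends_on_me):
    """Return the touched objects plus everything that transitively depends on them."""
    impacted = set(touched)
    stack = list(touched)
    while stack:
        current = stack.pop()
        for dependent in depends_on_me.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                stack.append(dependent)
    return impacted


def overlapping_changes(proposals, depends_on_me):
    """Map each pair of agents with overlapping impact to the objects they share."""
    impacts = {agent: impact_set(touched, depends_on_me) for agent, touched in proposals.items()}
    agents = sorted(impacts)
    overlaps = {}
    for i, first in enumerate(agents):
        for second in agents[i + 1:]:
            shared = impacts[first] & impacts[second]
            if shared:
                overlaps[(first, second)] = shared
    return overlaps


if __name__ == "__main__":
    # Hypothetical twin data: key -> objects that depend on the key.
    twin = {
        "core1:Eth1": ["vlan10"],
        "vlan10": ["app-billing"],
        "edge2:Eth4": ["vlan20"],
    }
    proposals = {
        "qos-agent": ["core1:Eth1"],
        "vlan-agent": ["vlan10"],
        "wan-agent": ["edge2:Eth4"],
    }
    print(overlapping_changes(proposals, twin))
```

Neither agent here has to crawl the network itself; both rely on the same model, so the check is only as good as the twin is current.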

Scott R.: With that being said, let me at the risk of maybe being a little too science fictiony.

Daren F.: Yeah.

Scott R.: Um, what do you see about the possibilities for federation where, you know, if I’m an enterprise and I’m using a carrier or set of carriers for certain WAN circuits and other transport services, and maybe I’m dealing with two different providers who also have their own digital twins, you know, we’re seeing this really interesting evolution with MCP being, being this reference point for, um, organizations to share information and let agents go at it. I’m not asking roadmap questions for IP Fabric, I’m not putting in a request, but how, how far off could something like that be?

Daren F.: I think genuinely this is where it gets really, really interesting, because we have customers at the moment who come to us and say, look, how do I know that my service provider is giving me the service I expect of them? The way that we work at the moment is we get them to make an agreement with their service provider that we can extract data from their network and bring it into our model. There’s no reason why you couldn’t extend that to some sort of federation, as you say, and MCP, what a great way of doing it, you know, because ultimately trying to keep on top of everyone’s API specs and so on is painful enough as it is.

Scott R.: Oh no, that’s been easy. The consistency is so high and security, I don’t, I don’t understand what you’re talking about Daren.

Daren F.: Yeah, yeah, yeah, exactly. Exactly. I mean, certainly we’re seeing the likes of MCP and A2A and these sorts of things as being opportunities, while we care about the structure of data and so on, to be able to exchange that information and then be able to translate it back to the thing that we need. That’s something that we’re investigating quite heavily, but yeah, it’s top of a lot of people’s minds right now. It’s really interesting that literally in the last year people have really started focusing on, actually, how do I make this data available in a broader sense to the people who need it, without resorting to coding or whatever through APIs. MCP, even if it’s not the technology that we use in the long term, has opened so many people’s eyes to what the possibilities are.

Scott R.: So yeah, here’s a connector. Here’s one way of connecting, and maybe even in natural language. What a great way to do it. That’s a killer app. Like, I think natural language is the ultimate expression of intent. So we’re now entering the age of intent-based networking because now I can just say, Hey, can you tell me what’s going on with this site? You know, why am I seeing these log messages? And I say it in a somewhat snarky way, but I do think we’ve crossed the threshold with that.

Daren F.: So we’re definitely on a path. I mean, the ambiguity of the English language, amongst others, will probably give us cause to not necessarily jump in with both feet, but I think absolutely there’s a path there for sure.

Scott R.: We’ll crank through that and, and look, you know, non-determinism from LLMs is in a sense, no different from non-determinism from other humans.

Daren F.: Uh, just look at us separated by a common language. Right. Exactly. It’s a fact.

Scott R.: So I, I love, um, one of the things that I always catch between British English, I’m sorry, real English and American English is what, what you make plurals of and singular of versus what we do. And, uh, it’s, uh, always been a funny thing to me.

Daren F.: No, no, no. For me, it’s, um, it’s spellings that, that, that get me every time because, because I, I now try and write in, in American English just because that’s marketing. Right. Um, and, and I, now I, my spelling check always tells me I’ve spelt things wrong, even though I’m using the American spelling.

Scott R.: So, yeah. I do. My short understanding of that was Noah Webster, trying to make the first American English dictionaries, intentionally tried to simplify. So getting rid of lots of U’s in neighbour and colour, et cetera, was like a defining feature. So I’m sorry that your spell checker still wants the U’s there. I get it. I get it.

Daren F.: It’s, it’s the Zeds as well. Right. Yeah. The Z is actually more, more old English than the S is. So that’s us, us being influenced by the, um, by the Europeans apparently.

Scott R.: So I, uh, as, as a person of Norwegian heritage, I also know the Vikings brought a little bit, um, to the British Isles as well.

Daren F.: So just, just a little bit, just a little bit. Yeah.

Scott R.: Well, look, as we try to wrap this up, the last topic for you to comment on here would be, there’s lots coming at this generation of network engineers. You know, some things have changed, but there’s so much more tooling and capability. What do you think are the requisites for the network engineer to survive and thrive in the environment that is already here and is to come?

Daren F.: Yeah. I mean, it’s the same as it’s always been fundamentally, I think. From your interview with Vint Cerf, he used that term relentless curiosity. And I think that for me, head and shoulders above everything else, is the key. I mean, that’s the one thing that I’ve always prided myself in, in that I’ve always been learning, always been looking to the next technology and so on. And I don’t think that’s any different now. If anything, it’s more important now than ever, because things are changing so quickly. So that importance of understanding the broader context, rather than focusing in necessarily on one specific group of technologies, is always going to be key. Because who knows what happens next? I mean, we’ve talked before about the problems of automation, you know, people worrying about what it does for their jobs and that kind of thing. Well, ultimately, if you’re looking forward and you’re always thinking about other things, you’ve always got options, you’ve always got places to go. And, you know, going back to the Willy Wonka movie back in the day, where the dad was in the factory screwing the tops on the toothpaste tubes. And then he was the guy who became the one who looked after the machines to screw the tops on the toothpaste tubes. That, for me, sums it up completely.

Scott R.: It’s all about that. I’m going to reappropriate that example.

Daren F.: Honestly, it’s brilliant. But I think that’s the point: always look for the next thing. And so as a result, you know, we’ve had so much emphasis on network engineering people being educated in programming and whatever through the DevNet program, great program from Cisco. They did a brilliant job with that to start us down the path, but as we’ve seen, and as we’ve talked about today, it’s much broader than network people becoming programmers. So be aware of everything else that’s going on, be aware of what’s around you and of the AI techniques and the things we’ve talked about, MCP and those sorts of things. Follow the right people who are doing the experiments, understand where those are going to take you, but form your own opinions of how they make things better and follow your gut, really.

Scott R.: For sure. I have to ask you the final question, and I’ll give you the digital twin version of the question, as is appropriate for this particular episode. So you, by your own admission, go back to 10BASE2, you’ve seen a lot, and you’ve obviously been turned on to the benefit of having all this information in the form of a digital twin. Is there an outage scenario that you saw, or maybe even caused, that may have been prevented with having a digital twin? Or if you don’t want to be that bold and admit to taking down, you know, some trust’s healthcare network, have you seen interesting examples? You don’t have to take personal responsibility.

Daren F.: No, there’s a couple. The first one, I don’t know if a digital twin would ever have saved me from this one, just because it was so basic. My very first day in my very first networking job, I was sat in front of the management console for a managed hub. Most people won’t even understand what these things are. It had a backplane that was separated into two networks. I inserted a blade into this thing, connected it to the wrong network. And all of a sudden I hear the beeping going on in the corner. Now, again, for those of you who don’t know NetWare, you won’t know what that beeping was. For those of you who do, you know exactly what I’m talking about. I basically crossed the streams. I got two networks bridged that shouldn’t have been, and a room full of about 40 NetWare servers all lit up and beeping. The digital twin wouldn’t have helped me there. I mean, that was just one of those things. That was a whole university’s compute resource there, telling me I’d made the mistake. So, yeah, the phone lines soon lit up afterwards. But from the digital twin perspective, actually, when I was consulting, I looked after a particular customer and we were deploying a technology called OTV, which is a Cisco technology.

Scott R.: Yeah.

Daren F.: Yeah. It’s basically a tunneling technology that allows you to extend layer two between two data centers. And when it worked, it was fine and it did its thing and whatever. As I say, it was about tunneling packets from a VLAN in one location to a VLAN in another and allowing you to effectively bridge over this tunnel. All good. But when we were doing some tests, we pulled one particular link and we failed over to the second link and everything seemed fine. Everything seemed fine, but I was unable to authenticate a new login to one of the switches. It was a bit strange. Turned out that what I’d found was a path through the network where there was an insufficient MTU to carry that tunnel traffic. Unfortunately, I configured that link. So I got a lot of grief for that. But of course, if I’d had the data about my MTU configurations throughout that environment, I could have checked that beforehand and that probably would have saved me a lot of grief. Every time I spoke to that customer thereafter, MTU was all they could talk about. So, unfortunately. But yeah, it’s a great example of having the right understanding of the network, being able to make sure that you’re not going to make those kinds of changes.
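
The pre-change check Daren wishes he had run is simple once per-hop MTU values are available as data: every hop on every candidate path, including the failover path, must carry the original frame plus the tunnel's encapsulation overhead. The sketch below is a minimal illustration under assumptions: the hop names and MTU values are invented, and the default of 42 bytes is the overhead figure commonly quoted for OTV encapsulation, which should be confirmed against the platform documentation before relying on it.

```python
def undersized_hops(path_mtus, payload_mtu=1500, tunnel_overhead=42):
    """Return the (hop, mtu) pairs that cannot carry payload plus tunnel overhead.

    path_mtus: ordered (hop_name, interface_mtu) pairs along one path.
    tunnel_overhead: encapsulation bytes added by the tunnel (assumed 42 for OTV).
    """
    required = payload_mtu + tunnel_overhead
    return [(hop, mtu) for hop, mtu in path_mtus if mtu < required]


def check_paths(paths, payload_mtu=1500, tunnel_overhead=42):
    """Run the check on every named path and keep only the ones with problems."""
    problems = {}
    for name, path_mtus in paths.items():
        bad = undersized_hops(path_mtus, payload_mtu, tunnel_overhead)
        if bad:
            problems[name] = bad
    return problems


if __name__ == "__main__":
    # Hypothetical inventory of the primary and failover paths between two data centers.
    paths = {
        "primary": [("dc1-edge", 9216), ("wan-a", 9216), ("dc2-edge", 9216)],
        "failover": [("dc1-edge", 9216), ("wan-b", 1500), ("dc2-edge", 9216)],
    }
    print(check_paths(paths))
```

In this invented data set only the failover path is flagged, which mirrors the story: everything looked fine until traffic moved to the second link.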

Scott R.: Very, very well played on both responses there. Thank you. Thank you so much for that. Is there a place you want to point people to for any writing, blogging, or other content you produce, or IP Fabric, or both?

Daren F.: I was going to say my stuff is primarily on LinkedIn. I put pretty much everything on there. I’ve kind of gone off the other social platforms because of the noise and whatever. LinkedIn seems to have, for me at least, become the de facto place. I do occasionally do guest blogs for various other folk. I’ve just done one for the Packt folks, so I can give you a link to that one, which actually is on this topic, so it fits in quite nicely. But otherwise, IP Fabric’s blog and website are the best places, certainly, to find out more about the product. And again, ipfabric.io is the place to look for that. So, excellent.

Scott R.: Daren with one N, Fulwell with three L’s, first one and then two, of course. Super to have you here today. Thank you so much for coming on Total Network Operations.

Daren F.: It’s always good to talk Scott. So thanks for inviting me.

Scott R.: Absolutely. I get to, I don’t ever have to be the smartest guy in the room again. Not that I ever was. ‘Cause I get to have all the smartest people, um, on this conversation. You talk about learning and lifelong learning. I learned something from every one of these conversations.

Daren F.: As you know, I listen to every episode and I’m always learning new stuff as well. So I appreciate that. Keep it going, my friend.

Scott R.: Yeah. Thank you. Your commentary is always very kind, perhaps too kind, but always appreciated. And to the listeners, hey, you have the door open for feedback as well. If you’ve got somebody you want us to have on the show, or you have something to say about NetOps and where things are going, send me a DM on LinkedIn or hit me up at packetpushers.net/followup. Would love to talk to you and see what you want to talk about. Thanks again for tuning into Total Network Operations. Enjoy the rest of your day. Bye.

Listen to the full podcast episode here. Note that this episode was transcribed by AI and lightly formatted.
