Welcome, everybody, to this recording of episode 2 of For the Journey, which is IP Fabric's podcast around, well, all things network automation, I suppose, as we go on this journey towards a self-driving network. And I'm joined by some really super special guests today. Hopefully, these guys will need no introduction, but if you do need to find out who they are, go look them up afterwards. Dave Donahue is here, Cloud Architect with BlueCat. Hi, Dave.
Good to see you. Hey, Darren. Great to be here. We've got Tim Schreyack, a solution architect with Network to Code. Hi, Tim.
And the other Tim, Tim Fowler, also with Network to Code. Good to see you, Tim. And today we're gonna be talking about the place that we store the intent for an intent-based network: the much-fabled source of truth. A lot's been made of that need for a source of truth in the journey to build meaningful network automation.
So what we're gonna do is dig in to the why, the how, and the what, with some folk who really know their stuff rather than just me. Gents, I guess the obvious place to start here is why. Why do we need a source of truth? What benefit does it bring us? And I know, Tim S., when we were talking before about this, you had some thoughts on this, so you might wanna start us off.
What do you think? Sure. Absolutely. So yeah. I mean, when you think about automation in general, here on the network side, we feel like we're kind of a little bit behind and we're catching up.
Our partners over on the other side, in compute and storage, they've been doing this for a long time. But their data requirements are far less than ours. We have a really unique problem on the network side with the amount of data that we have to track, store, and input into any form of automation. And traditionally, we've done this by hand, you know, spreadsheets and whatever. And that worked fine when we were managing via the CLI, and even as we moved into UIs and GUIs, it all still worked that way.
We still always had problems with this. We'd end up with overlapping IPs because we handed out things twice. We'd forget that VLAN A was being used over in this other data center, and we always had problems with this data tracking. But when we get to automation, it now becomes a hard requirement. We cannot proceed at scale with automation if we don't have some form of source of truth.
So our problem is unique. And I think that's why we haven't seen a solution really prior to the network getting into automation, for source of truth because it just wasn't that much of a requirement for the server side. Now on the network side, we're starting to see really, you know, this this growing area of understanding and realizing source of truth and its importance. Guys, thoughts? Yeah.
I completely agree. Not only do you get it from the system side, but you also get it from the security side as well. That requirement to be able to say, hey, this is the exact device that's on this IP address, is really critical when it comes down to it from a security posture perspective. Like, if you don't know what it is, you certainly can't go intervene on something going on on that device without knowing exactly who it is, where it is, what it is, and how it is. So, I mean, that's certainly highly critical in that space as well.
And Go ahead. The point I'd like to cover here, yeah, real quick. Tim Schreyack mentioned the sense of scale. It's impossible to scale without it, and that's very true.
And now if you step back a moment to, let's say, the engineer who's just starting down the automation path, you know, in your typical day, to provision a VLAN, for instance, you might pull something from a spreadsheet, pull something from a database, synthesize that data into a config, and go in and paste that into the device, and that's fine. Now when you start with a simple kind of abstraction, let's say a simple Python script or a simple Ansible playbook to do that same thing, you can still use those same data sources. Oh, let me just go here and get this. Let me go here and get this. And you can just put those as simple arguments into that script or playbook, and it'll still work okay.
Where it becomes more important is when you start getting into the orchestration piece. It's further down the spectrum on the automation journey, where now step 2 depends on the output of step 1, step 3 depends on something else, and you're starting to deal with multiple network elements. At that point, you need a single source of data for your automation infrastructure, or else your infrastructure is just gonna be used to find data instead of doing true automation. So it becomes a requirement when you start down the orchestration path.
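As a minimal sketch of that point, here is a script that takes no hand-gathered arguments and instead builds its VLAN configuration purely from a single data source. The `SOURCE_OF_TRUTH` dictionary, the device and VLAN names, and the `render_vlan_config` function are all hypothetical and stand in for whatever database or API actually holds the data.

```python
# Hypothetical in-memory "source of truth". In practice this would be a
# database or an API, not a dictionary inside the script.
SOURCE_OF_TRUTH = {
    "vlans": {
        "users-floor1": {"vid": 110, "name": "USERS-FLOOR1"},
        "voice-floor1": {"vid": 120, "name": "VOICE-FLOOR1"},
    },
    "devices": {
        "sw-access-01": {"vlans": ["users-floor1", "voice-floor1"]},
    },
}


def render_vlan_config(device: str, sot: dict = SOURCE_OF_TRUTH) -> str:
    """Build VLAN config lines for a device purely from source-of-truth data."""
    lines = []
    for key in sot["devices"][device]["vlans"]:
        vlan = sot["vlans"][key]
        lines.append(f"vlan {vlan['vid']}")
        lines.append(f" name {vlan['name']}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_vlan_config("sw-access-01"))
```

Because every later orchestration step can read from the same structure, nothing has to be looked up by hand between step 1 and step 2.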
It's interesting. I guess we're all suitably of an age here in terms of network engineering, right? I remember there was a time, before network automation really had the focus it has now, where people would talk about the self-documenting network. The idea was that your documentation wasn't stored.
Right? You would go to the network to find out what the state was, but there's all kinds of problems that I can see with that, given the sort of thing you're describing there, Tim. Oh, look at all those nodding heads. Tim? Yeah. No.
This is the confusion of actual state versus intended state. These are completely different things. And absolutely, we need to monitor and understand the actual state, but that is not how we should be driving configuration into the environment. We should be driving it from our intended state, and these are not the same. And in fact, this is something to point out really too.
Right? That our source of truth should not be reflective of actual state. Actual state should be reflective of intended state. It is exactly the opposite. And this is something I think people going down this journey really struggle with at first.
They're like, yeah, but the actual network, that's the authoritative source of information. And 100%, no, it should not be. But that's a challenge to wrap your head around because you think, yeah, but it's in production. This is how I keep my job, because I have my network up and running. But when we're thinking of it from a philosophical perspective, our intended state is actually where we're driving from, not actual state.
Yeah. And Absolutely. Go ahead. It's the consistency piece. No.
Go, Tim. Okay. Sure. And, you know, to put it a slightly different way, as a network engineer, if you're gonna engineer your network, you'll come up with an architecture document and figure out, you know, we need this many routers here, this many edge routers in each site and whatnot. And then if the network doesn't match that architecture, the network is wrong. It's the same way with the source of truth, which holds data on the intended state of the network.
Truth kind of means intent, really. In that, if your network state doesn't match what's reflected in the data in your source of truth, the network is wrong. The network is your source of reality. It's not your source of truth, though. Your intent is stored in your source of truth data.
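One way to picture "if the network doesn't match, the network is wrong" is a comparison that always treats the intended data as the reference and reports where the observed network deviates from it. This is only a sketch; `find_drift` and the interface and VLAN values are made up for illustration.

```python
def find_drift(intended: dict, observed: dict) -> dict:
    """Compare observed state against intended state.

    The intended state is the reference: anything that differs is reported
    as a problem with the network, not with the source of truth.
    """
    missing = {k: v for k, v in intended.items() if k not in observed}
    unexpected = {k: v for k, v in observed.items() if k not in intended}
    mismatched = {
        k: {"intended": intended[k], "observed": observed[k]}
        for k in intended
        if k in observed and intended[k] != observed[k]
    }
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


if __name__ == "__main__":
    # Hypothetical example data: access VLAN per interface.
    intended = {"Gi0/1": {"vlan": 110}, "Gi0/2": {"vlan": 120}}
    observed = {"Gi0/1": {"vlan": 110}, "Gi0/2": {"vlan": 999}, "Gi0/3": {"vlan": 1}}
    print(find_drift(intended, observed))
```

Note that the function never writes observed values back into `intended`; remediation means changing the network, or deliberately changing the intent.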
Yeah. We tend to use the term observed truth rather than intended truth. Right? And again, sort of minor plug, but the thing that IP Fabric does is go and fetch that observed truth from the network. So that's where that comes in.
And I guess, Dave, you've seen this from a very specific angle, but you'll see this regularly, and not just in on-prem networks. Right? This is a broader problem than just the simple networks we've deployed over the years. Right? Yeah.
I mean, one of the big challenges is, well, I've got all my stuff on premises, and that's great. And I have all my existing stuff that backs that up. But where it becomes difficult is when you have multiple different groups operating everything. So I've got my cloud ops group that doesn't talk to my network group, and now they've gone and created overlapping IP space in, like, GCP. And now they're like, hey.
We need to make these two things talk. And now you're sitting there being the poor network guy who's like, wait. Now I have to double NAT this to make this work. Right? So now you've created a problem in the network because you didn't plan ahead.
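The overlapping address space problem is easy to check for once every team's allocations sit in one place. A small sketch using Python's standard `ipaddress` module follows; the team names and prefixes are invented for illustration.

```python
import ipaddress
from itertools import combinations


def find_overlaps(allocations: dict) -> list:
    """Return pairs of owners whose allocated prefixes overlap."""
    nets = {owner: ipaddress.ip_network(cidr) for owner, cidr in allocations.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    # Hypothetical allocations recorded by different teams.
    allocations = {
        "dc1": "10.10.0.0/16",
        "gcp-prod": "10.10.4.0/22",  # carved out without asking the network team
        "aws-dev": "10.20.0.0/16",
    }
    print(find_overlaps(allocations))
```

Running the same check before a new cloud range is approved is the "good decision up front" that avoids the double NAT later.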
Right? And for me, I came from the network world, and I spent a lot of days fixing those types of problems. That's near and dear to my heart, because I hate having to put in Band-Aids, or bad Band-Aids, to fix problems that could have been solved by just making a good decision up front. So I'm always very, very vocal about, like, hey, look, if we're gonna be talking about moving to the cloud or anything like that, we need to ensure that we've got, one, the network team, and two, the DDI team, which is the DNS, DHCP, IPAM team. And then we also need the whole cloud team involved.
It can't just be the cloud team. Yeah. And for me, again, it is that breaking down of silos thing, isn't it? And having that central source of reference. As the guys have said already, it is vitally important that we use the same reference regardless of who we are and regardless of the resource, you know, addressing, VLANs, connections, whatever it is. Right?
So, I mean, what sort of other data? I mean, these are the obvious things. Right, Tim S.? What else would we put in that source of truth, do you think? Yeah.
So, I mean, this is an area where, on the network side, we are on both sides of the fence: the physical side of the world and the logical side of the world. So we have to store all the information about the hardware and the equipment and where it's installed. And often our database is also storing all the other things. So all the servers, all the storage, all of that data is in our database essentially, if you wanna call it that, if it belongs to somebody.
And then we've also got all the logical side of it. So everything that we're doing, we really want to be tracking, whether it's, you know, VLANs and IPs, ASNs, and all of our circuits, power management. And the list just goes on and on of the data that we need to store on our side, essentially because we're the underpinnings of the entire infrastructure. Everyone else is relying on the services that we deliver. So that's where this notion starts to come in.
And then particularly as you start to move into, you know, kind of more modern design, you've got the underlay and the overlay. So now we have essentially 2 different networks that we're operating. So now we've just doubled the amount of data that we need to store. So we have so much of this data that we're trying to track, maintain, and use. Right?
Because if we don't have it accurate and up to date, then we can't use it. So what's the point? Right? So it needs to be, you know, really how we initiate all of our change. From a practical perspective though, I totally understand, you know, for many folks starting down this journey, there's going to be some necessary requirements, especially in the early stages, of being able to ingest data into the source of truth.
So how do we start from there? And this is an area, as you said earlier, just, you know, small plug, that at Network to Code we put a lot of effort into. You know, with IP Fabric, we have our plugins now that allow you to ingest some actual state into the intended state as a way to kind of start down the path. Because, you know, while we can say, in our ivory tower, that, hey, you should only be coming from intended state, there is the real practical world.
Right? Like, it's a journey. You gotta get there. You know, and of course, we're talking about the ideal end state, you know, there's a path along the way. No network is a green field, basically.
Yeah. Yeah. Yep. Understood. And, Tim S., that's a great example.
Like, for BlueCat, we natively will grab everything DHCP on the network. All that ends up in the IPAM, and then anything that's got a DNS entry, we'll typically have an entry for. Beyond that, we also extend that same kind of ability up to the cloud. So we slurp in all the data from, say, AWS, GCP, or Azure, and we put all that information into kind of, as much as I hate saying it, a single source of truth, because I've been in the network world long enough that that's one of those words. It's like, it kinda is, but there's other parts missing, and I do have that realization.
But, I mean, it's being able to provide exactly that information without some of the difficulties associated with it. Obviously, hard-coded and static IP stuff has to be put in manually, that's just the nature of the beast. But with all the DHCP stuff, I know exactly what's on my LANs, or here's all my VoIP devices. So I can easily start pushing out, like, here's the options I need for all my DHCP, or for all my phones, for instance. Right? And you can do that automated, obviously. I guess the other thing is what we're talking about here.
I mean, we're talking about pulling data from loads of different places and bringing those together. It's normalization, isn't it? It's being able to almost abstract out the fact that you're deploying to different networks, different vendors, different domains under the hood. But ultimately what you need, and this comes back to your point from before, Tim, is that the intent for the network has to be expressed in an abstract, normalized way.
Otherwise, it doesn't make any sense. Tim, you're nodding there. Yep. Yeah. Absolutely. The source of truth, you can think of it as an aggregation layer across all your authoritative data sources, and we should probably loop back around and discuss what an authoritative source is in a moment here.
But, yeah, your your automation infrastructure should spend time automating stuff on the network, not searching for data. Let the source of truth aggregate all your data from all your different sources and present that upwards. You know, additionally, having all this data in one place also has a lot of other benefits. So one of those is it exposes metadata that might not be apparent, when the data is scattered around. And, you know, as an example of that, you know, if you have, your IP address, your IPAM information, your circuit information, your interface information, those are all different kinds of data.
But when you put them in the same data source, you start to be able to do things like move from circuit to interface, to related IP address, to a cable trace. So all this kind of extra metadata that's not apparent right away becomes apparent, and that alone is, you know, another benefit of the source of truth, because it exposes other relationships that, like I say, might not be obvious. Yeah. I was just gonna use the word relationship. Exactly that, isn't it?
It's because the model isn't just a flat, you know, copy of anything that you would have had on paper or on an Excel sheet. You've got relationships between those pieces of data, which actually have a meaning. Certainly when it comes to a troubleshooting scenario, I suppose, when you're trying to work out where a circuit appears, and which interface and which cable, like you say, it's gonna use, you've got all that data to hand, which is gonna be hugely helpful. And I guess some of this data is gonna be stuff that you have to manually put in there and some of it's gonna be automatable.
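To make the "relationships, not a flat sheet" idea concrete, here is a small sketch in which a circuit points at an interface, which in turn carries an IP address and a cable. The class names, fields, and example values are hypothetical and far simpler than any real DCIM or IPAM model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cable:
    label: str
    far_end: str  # free-text description of where the cable lands


@dataclass
class Interface:
    device: str
    name: str
    ip_address: Optional[str] = None
    cable: Optional[Cable] = None


@dataclass
class Circuit:
    circuit_id: str
    provider: str
    termination: Interface


def trace(circuit: Circuit) -> dict:
    """Walk from a circuit to its interface, IP address, and cable."""
    intf = circuit.termination
    return {
        "circuit": circuit.circuit_id,
        "device": intf.device,
        "interface": intf.name,
        "ip_address": intf.ip_address,
        "cable": intf.cable.label if intf.cable else None,
        "far_end": intf.cable.far_end if intf.cable else None,
    }


if __name__ == "__main__":
    # Hypothetical example records.
    cable = Cable(label="CBL-0042", far_end="patch panel PP-3 port 12")
    intf = Interface("edge-rtr-01", "Gi0/0/1", "203.0.113.2/30", cable)
    print(trace(Circuit("CKT-1001", "ExampleTelco", intf)))
```

In a spreadsheet these four facts would live on four tabs; linked together, a single lookup answers the troubleshooting question of where a circuit lands.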
And Tim, you mentioned there the aggregation piece and being able to bring data from other data sources. In that sense, then, a source of truth would be almost a proxy for data from other locations. Right? Yeah. It has to do with the authoritative source.
And by authoritative source, we mean a final source of record that communicates intent. So, you know, for instance, you might have an IPAM that's holding your IP addressing information. That's great. Other systems might need that information. So that information gets spread out amongst other systems for them to process it as they need.
But when it comes down to it, what should this be? You look to an authoritative source to get the specific final answer. That is an authoritative source. So, you know, with a source of truth, you can keep your authoritative sources and you can pull the data into a source of truth. You can also do the opposite.
You can also hold authoritative data in your source of truth, but push that information out to other systems to let them do what they need to do with the data. But there can only be one authoritative source for each type of data. You can have multiple authoritative sources as long as they cover different types of data: IPAM versus DCIM versus, you know, circuit information.
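The "one authoritative source per type of data" rule can be sketched as a small registry that refuses a second claim on the same data type. `AuthorityRegistry` and the system names below are hypothetical and only illustrate the rule.

```python
class AuthorityRegistry:
    """Tracks which system is authoritative for each type of data."""

    def __init__(self) -> None:
        self._owners = {}

    def register(self, data_type: str, system: str) -> None:
        current = self._owners.get(data_type)
        if current is not None and current != system:
            # Two systems claiming the same data type is exactly the
            # situation where copies drift out of sync.
            raise ValueError(
                f"{data_type!r} is already owned by {current!r}; "
                f"{system!r} cannot also be authoritative"
            )
        self._owners[data_type] = system

    def authority_for(self, data_type: str) -> str:
        return self._owners[data_type]


if __name__ == "__main__":
    registry = AuthorityRegistry()
    registry.register("ip_addressing", "ipam")       # hypothetical systems
    registry.register("racks_and_devices", "dcim")
    registry.register("circuits", "circuit-db")
    print(registry.authority_for("ip_addressing"))
```

Different data types can have different owners, but a second owner for the same type is rejected outright.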
Those are all different authoritative sources. And I suppose, yeah, of course, the issue that you don't want to have is multiple things claiming to be the authoritative source, because that's the point at which you're gonna get, yeah, copies. They're gonna go out of sync, and all of a sudden you've got issues.
Right? Yeah. You want your network to run in the engineered state, the engineered, optimized state. You put that information in one place, not multiple places. Which comes back to Tim's point, I suppose, again, about the expression of an intent effectively representing your design for the network.
Right? And then what you're doing is you're just using that data from that authoritative source to say this is how I want the network to be. Now I guess then you end up with a drift from that, potentially. If you've got a network that's being operated by loads of different people, ultimately, somewhere down the line, someone's gonna put something in that they've not gone to the authoritative source for. They're gonna go, oh, I'll just put this in, that will fix this problem.
And and again, I can start to see you all smiling at this one because that's that's never happened ever, apart from I don't know what you're talking about. Yeah. In every moment. So so again, I guess then what we need to worry about is how we keep that up to date and how we go about that. And I guess, Dave, again, you're nodding.
BlueCat has a particular approach to doing that. What way do you go about that? Yeah. So BlueCat actually does that mostly natively. So like DHCP, as something gets a new address, it automatically updates the actual server itself. So much to the intent.
Right? So as DHCP is grabbing addresses, it's gonna run through the normal DORA process: discover, offer, acknowledge, sorry, request, acknowledge. So at some point. Yeah. It's a little early.
Sorry. DORA is not there just yet. But, as DHCP happens, it's going to basically say, hey. Here's the address that I got. Here's my MAC address.
So now I know exactly what physical device is there as well as the IP address. Right? So we do all that natively. And then with the cloud, as things change in the cloud, it will actually automatically update itself. Right?
Because let's be honest, an IPAM's worthless if it's not taking care of itself. Sure. Because cloud stuff changes faster than we can keep up with, period. Like, if you have I guess for that one, David, it's a balance, isn't it? Because you've got certain aspects of the intended state that are dynamic. Mhmm.
And certain that are static. And so I'm thinking back to what the Tims have said about that intended state. It's the static stuff you don't want to just learn from the network. You want it to be as you intended it to be. But from a dynamic perspective, you've designed in the fact that it's going to get allocated out.
So at least by retrieving that state, you've got that to update your records. Right. So it says go on. Sorry. Sorry.
My bad. I mean, in reality, you can't go tell AWS, oh, you're gonna give me this IP. They're never gonna do that for you ever. Right? So you have to observe what their actual state is and then update your authoritative source, saying, this is what that is for the time being.
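The split between dynamic and static data could be sketched like this: dynamically assigned addresses are accepted from observation and written into the record, while a static record that disagrees with the network is flagged as drift and left unchanged. The record layout (`ip`, `dynamic`) and the `reconcile` function are assumptions made for illustration.

```python
def reconcile(records: dict, observed: dict) -> list:
    """Update dynamic records from observation; report drift on static ones.

    records:  name -> {"ip": str or None, "dynamic": bool}   (hypothetical layout)
    observed: name -> IP address currently seen on the network or in the cloud
    Returns the names of static records whose observed IP differs from intent.
    """
    drift = []
    for name, record in records.items():
        seen_ip = observed.get(name)
        if seen_ip is None:
            continue  # nothing observed for this record; leave it alone
        if record["dynamic"]:
            record["ip"] = seen_ip  # the provider decides; we just record it
        elif record["ip"] != seen_ip:
            drift.append(name)  # the intent stands; the network is wrong
    return drift


if __name__ == "__main__":
    records = {
        "web-1": {"ip": None, "dynamic": True},           # cloud-assigned
        "core-rtr": {"ip": "10.0.0.1", "dynamic": False},  # engineered
    }
    observed = {"web-1": "172.31.5.20", "core-rtr": "10.0.0.9"}
    print(reconcile(records, observed), records)
```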
And really, doing that at scale is difficult. Right? I've got 400 accounts. How do I keep everything in sync? Right?
That becomes really difficult really fast. So the dynamic parts to that point, yeah. Absolutely. You kinda have to observe and be like, alright. Cool.
This is the intended state. Now the static stuff, absolutely, you need to say this is the intended state, and these boxes should be at this point. I mean, every box I've ever hard coded by hand, it's, alright. Here here's the IP we're gonna give it, then you walk over the box and you put it on the box. Right?
You're not like, hey. It's just gonna magically work. Right? That yeah. It's alright.
We've been saying for years not to hard code IPs on servers. So every time someone uses the word hard code, it makes me shiver. But, yeah, I know what you mean. I was a slurp at it for a long time. So, Tim.
I did it a lot. Yeah. Sorry, Tim. No worries. I I think this underscores, you know, one of the challenges that we have faced as we move from the traditional method.
As network engineers, traditionally we always wanted to have our hands in every little piece of the network. We wanted it super handcrafted and bespoke and artisanal, and we wanted to look at every packet as it goes by. But that doesn't scale, first of all. And secondly, there's no need. So why are we worrying about the details of a /32 that Amazon is giving us?
Who cares? But yet, traditionally, we did. Like, we cared about, like, what VLAN is it? You know, we use 100 for something very specific and only for that thing, but why? It's just an identifier.
Right? So let's only care about the framework. Let's care about the prefixes. Let's not care about the /32s in this example. Right?
Like in IPAM or in VLANs. You know, like, let's care about ranges and big broad strokes. Let's not care about details. And I think that's kind of part of the transition as you think in your head, you know. Yeah. From an engineering perspective.
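Caring about prefixes rather than individual /32s might look like the following sketch: the intent only lists the ranges, and any host address the provider hands out is acceptable as long as it falls inside one of them. The prefixes and addresses are illustrative; the check uses Python's standard `ipaddress` module.

```python
import ipaddress


def outside_intended(prefixes: list, hosts: list) -> list:
    """Return the host addresses that fall outside every intended prefix."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [
        h for h in hosts
        if not any(ipaddress.ip_address(h) in net for net in nets)
    ]


if __name__ == "__main__":
    # Intent is expressed as ranges only (hypothetical values).
    intended_prefixes = ["10.50.0.0/16", "10.60.8.0/21"]
    # Individual addresses are whatever the provider handed out.
    assigned = ["10.50.3.17", "10.60.9.200", "192.168.1.5"]
    print(outside_intended(intended_prefixes, assigned))
```

Only the address that escaped the framework needs attention; the exact values of the others are just identifiers.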
In support of the network engineer there, because I was one myself once: we had to. You know, I think this is part of the problem. People will talk about complexity that was introduced into the network unnecessarily, but often it was necessary because, for whatever reason, with the tools that we had available to us, we had to have firewall rules with IP addresses, or we had to know which port things were plugged into, because that's what we had to do. We had to be able to span subnets across data centers and those sorts of things.
So we had to have those kinds of tools available to us, but we don't need to do that anymore. And I think this, for me, is the biggest shift: as cloud has come along and taught us that we can build applications differently, it means that we don't have to be as possessive about the minutiae. So we can normalize things and we can abstract things and we can create these incredible automation platforms that will use that data that you guys are talking about to actually do a lot of the grunt work that we used to have to do manually ourselves. Is that fair? I think, yes.
I would definitely agree. And it also is just the nature of it, that our roles have really expanded. You know, large networks used to be, by today's standards, not very large. And we've really grown. And so what used to be manageable by hand no longer is in many environments.
And, of course, you've got not just one network to manage, but a network of networks, haven't you? Because you've got your campus network, you've got your wireless network, you've got your WAN, you've got your SD WAN because it sits over the top, You've got your cloud, you've got your etcetera, etcetera, etcetera. And so you have to understand how all of those things hang together. And I guess this is where where the source of truth really comes in because it gives you that one place to go to to see everything and to to know how everything hangs together. That's really important.
So we've talked obviously a lot about the networking side of things, but there are other areas that might have benefits from us maintaining this data as well. Right? So I I often think of how we can push our network data into other operational areas perhaps who can use that. Any thoughts on that? Yeah.
This is always kind of one of those sort of fun things that happens, I think, from the network side because we are the underpinning of a lot of the infrastructure. As we start to build our source of truth, other teams use it, sometimes in unexpected ways. Something you didn't even realize they might think of or the problem they're facing. As you allow them access to the information, they can start to do really creative and interesting things. In the early days of source of truth when, in an environment I was working in, we had built a network source of truth and, exposed it to the rest of the organization.
And, some folks over on I think it was the, sort of like the application side, that were managing DNS. They went off and built their own little, app that handled, DNS registrations. So when you added a device into the source of truth, it would auto create any record for you. I was like, oh, didn't even think of that. That's cool.
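That story could be sketched as a subscriber pattern: the source of truth announces each new device, and another team's small app reacts by creating a DNS record. Everything here is hypothetical, including the `SourceOfTruth` class, the `example.net` zone, and the in-memory `dns_records` dictionary standing in for a real DNS system.

```python
class SourceOfTruth:
    """Toy source of truth that notifies subscribers when a device is added."""

    def __init__(self) -> None:
        self.devices = {}
        self._subscribers = []

    def subscribe(self, callback) -> None:
        self._subscribers.append(callback)

    def add_device(self, name: str, ip: str) -> None:
        self.devices[name] = ip
        for callback in self._subscribers:
            callback(name, ip)


# Stand-in for the DNS team's system: fully qualified name -> address.
dns_records = {}


def create_dns_record(name: str, ip: str, zone: str = "example.net") -> None:
    """Hypothetical hook another team could write without network-team help."""
    dns_records[f"{name}.{zone}"] = ip


if __name__ == "__main__":
    sot = SourceOfTruth()
    sot.subscribe(create_dns_record)
    sot.add_device("sw-access-01", "10.50.3.17")
    print(dns_records)
```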
So, yeah. Sometimes, you know, these sorts of things just happen naturally, organically. Guys, have you seen anything else that you can think of? I haven't specifically, but I like Tim's example there because it tells you what happens when you're not focused on finding the right data, when you're no longer wondering what data's out there, when it's all right here. This is what we have.
This is all our consolidated data. Even if it's not your data, you can look at it holistically and then go, okay, there's a lot we can do, because now we know what's there. We can get it from, you know, a single, well-understood interface. I can see how that would open up doors. So I really like Tim S.'s example there.
I think the thing for me is, I always think of when I was in a network team, people would come to you asking questions that you thought were a bit oblique really, you know: how do I know all these circuits are up? Or, I've got a bill for all of these support contracts for all this kit. How do I know that these are all the devices that are on the network? And so you can see how, just by having a simple inventory of everything that's in the network, all of a sudden you've got access to commercially useful data as well, that you might want to farm out to your support people or your contracts people, whoever, in order to validate the amounts they're being charged by suppliers.
I mean, this is the kind of thing that it really helps with, for me. We call it data democratization. Right? All of a sudden, you've got this big boatload of really interesting data that you can do new things with that you've never chosen to do before, and it's useful to lots of different people. You know?
So I think it's Right. I think it's incredibly useful. Yeah. We assume that you understood. Sorry.
Go ahead, Thomas. Go ahead. Go ahead, Dave. I was gonna say we've seen customers take the single source of truth and then go wrap it into ServiceNow. So, like, people are requesting things via tickets, and it's like, hey.
I need a ticket for a server, here's the server, and, hey, everything's already right. Right? It removes some of the guesswork. So all I have to do is go find the name and it just works.
So we've seen a lot of that type of stuff where they're extending it beyond just a standard IT scope and now putting in the hands of people who don't typically have access to that information. But that information is incredibly valuable for making sure that we as ops guys have the right information to do what they're asking us to do for the business. Right? Because end of the day, if it doesn't tie back, it's worthless. Right.
The joy of APIs. Right? You can integrate all this stuff directly now, which is incredible. As Tim was alluding to earlier, like, we've done a lot of work in this area as well as far as being able to use a tool like Nautobot as an aggregation layer to do syncing between these various sources of truth. Yeah.
To exactly Dave's point of, you know, that we wanna share the data. We wanna get it out there. So when someone opens a ticket in ServiceNow or whatever the ticket system is, they have accurate things to choose from, you know, in those drop downs. Those are actually reflective of what really is out there. Great example.
Yeah. That's incredible. And even from just like the contract management side, like, how do I handle all my maintenance contracts? How do I know that I'm paying for stuff I should be paying for, you know, versus I'm paying Cisco a bunch of money for stuff I don't even have anymore? It's a common example.
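The maintenance contract question reduces to comparing two sets: the serial numbers in the inventory and the serial numbers on the contract. A minimal sketch follows, with invented serial numbers and a hypothetical `audit_contracts` function.

```python
def audit_contracts(inventory_serials: set, contract_serials: set) -> dict:
    """Compare what is on the network with what is under contract."""
    return {
        # On the bill, but no longer in the inventory.
        "paying_for_missing": sorted(contract_serials - inventory_serials),
        # In the inventory, but not covered by any contract.
        "uncovered": sorted(inventory_serials - contract_serials),
    }


if __name__ == "__main__":
    inventory = {"FOC1234A", "FOC5678B", "FOC9012C"}   # from the source of truth
    contracts = {"FOC1234A", "FOC0000Z", "FOC5678B"}   # from the supplier's invoice
    print(audit_contracts(inventory, contracts))
```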
I I I couldn't possibly comment about it being common, but I know exactly what you mean. Yeah. No. Absolutely. Chaps, I've just realized we're already half an hour in.
And I'm thinking, are there any other nuggets that you'd like to share before we wind things up? I think I would say from my perspective, the most common challenge we hear is like, that's great. You guys talk about this, you know, perfect end state. You have all of these tools integrated and working and everything, and I just can't do that. It's okay.
Start somewhere small. Get a source of truth operating in your environment and pick 1 IP prefix to start working with or start somewhere. And you don't have to take the whole thing all at once. Take it as a journey. Take it piece by piece.
And where needed, reach out for help. That's why we all exist. We exist here to help you. So sometimes it's just a matter of convincing your management or whatever, throw some dollars at it, get some help. Tim F., when we were talking before, you talked about that journey in a slightly different way.
Right? The idea that you can start network automation without one, but there comes a point where you really hit a real cutoff. Right. And don't let the lack of a source of truth currently prevent you from starting, because it's gotta start somewhere. And you don't need everything to start.
Just start, and you'll realize great benefits from just managing scripts that manage hundreds of devices. But, eventually, you'll you'll see the need to get there and you'll probably realize once you're there that you need it. And then that's what we're here for again. Perfect. And, Dave, any thoughts before we close?
I don't have any. These two gentlemen have kinda hit the nail fairly on the head. Yeah. BlueCat kind of enables the automation side of it, just because we do behave as that single source of truth natively. So for us, it's pretty easy to kinda help walk customers towards automation. What I talked about earlier, that's cloud discovery and visibility.
That is exactly an automation we built ourselves, to go out and fix those cloud portions of the business where we have disconnects. I think you've highlighted something, though: people don't have to do everything themselves. You know, there are capabilities in all of the platforms we've talked about today that people can lean on to take a lot of the pain out of some of those requirements. Right? Which is hard to do.
A lot of this network automation stuff that people are constantly sold is all about programming Python to do this, to do that. You don't have to learn how to do it all yourself. That's the point here, isn't it? There is a community around. There are people to ask, and there are tools that allow you to move along down that path.
Gentlemen, I'm gonna wrap it up there. Thank you very much for joining us. What we'll do is we'll share in the notes all your contact details if everybody wants to be contacted. Very good. And, yeah.
If you've got any pointers, any blog posts or anything like that that might be worth people following up with, let me know and we'll just pop it in the notes afterwards. But thank you very much for joining us. I appreciate how early it is for all you folks. So, yeah, much appreciated. Good to see you all.
Good to see you too. Thank you.