
Put Network Observability Into Practice With The Right Tools.

Join Daren Fulwell and Simon Bell for an in-depth discussion on network observability from the tooling perspective.

Transcript

Hello, and welcome to another episode of the Community Fabric Podcast, where we bring the network community to the table to talk about the things that matter to them in their day-to-day. I'm Daren Fulwell, your host for today's episode, where we're going to dig a little bit more into the idea of observability and what it means from a tooling perspective. Now, I'm joined today by a very special guest and good friend of mine. Simon, would you like to introduce yourself? Sure thing, Daren.

Thanks very much. My name's Simon Bell. I'm part of the Paessler Alliances team. We're alliance partners with IP Fabric. And, unfortunately, I think I might have the wrong end of the stick here, Daren.

You're talking about network observability. Yes. Okay. I've been doing a lot of research. I thought we were here to talk about quantum physics, where observables manifest as linear operators on a Hilbert space that represents the state space of a quantum state.

So I think I've missed the point a little bit there, mate. But never mind. Let's see if we can conjure up something about network observability in the meantime, shall we?

Sure, sure. Thanks for that. Now, when we were chatting about what we wanted to do on this recording (and yes, we do actually do preparation for these things, so no more surprises, please), it struck me that while people talk about network observability (apologies, air quotes there; you won't be able to see those, of course), they might not fully appreciate why we need to do something a bit different.

I mean, is observability a thing? What is it? Why is it that we need to do something different?

Any thoughts on that, Simon? It's an interesting idea, Daren. The term observability actually comes from engineering control theory, and it's to do with automating the control of a dynamic system. Think about the flow of water through a pipe, or controlling the speed of a vehicle going up and down hills. And it's based on feedback from that system.

It's really a method for learning about what's happening inside a system from being able to observe what's happening outside of it. So you make inferences about what's happening internally when you don't necessarily have all the data to hand. So it's about, I guess, tying the outcome of using a system to the elements within that system that are actually creating that outcome. It's cause and effect really, I suppose, isn't it? It's measuring cause and effect. Absolutely.
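As an aside on the control-theory meaning mentioned above, here is a minimal sketch of the classic observability test for a tiny linear system. Everything in it is an illustrative assumption rather than something from the conversation: a hypothetical two-state model (position and velocity), a single measured output, and the helper names `observability_matrix_2x2` and `is_observable`. A system is observable when the stacked matrix of C and C·A has full rank, which for a 2x2 case means a non-zero determinant.

```python
# Hypothetical discrete-time model: x[k+1] = A x[k], y[k] = C x[k]
# State x = [position, velocity]; C selects which part we can measure.

def observability_matrix_2x2(A, C):
    """Stack C and C*A for a two-state, single-output system."""
    CA = [
        C[0] * A[0][0] + C[1] * A[1][0],
        C[0] * A[0][1] + C[1] * A[1][1],
    ]
    return [list(C), CA]


def is_observable(A, C):
    """Full rank (non-zero determinant) means the hidden state can be inferred from outputs."""
    O = observability_matrix_2x2(A, C)
    det = O[0][0] * O[1][1] - O[0][1] * O[1][0]
    return abs(det) > 1e-12


dt = 1.0  # assumed sample interval
A = [[1.0, dt], [0.0, 1.0]]

position_only = [1.0, 0.0]  # we can only see position
velocity_only = [0.0, 1.0]  # we can only see velocity

# Watching position over time lets us infer velocity; the reverse does not hold.
print(is_observable(A, position_only))
print(is_observable(A, velocity_only))
```

The point of the sketch is the asymmetry: some outputs let you reconstruct what is happening inside, and others never will, however long you watch them.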

I mean, if you think about pre-cloud IT installations, where everything was maybe in your own data center, or maybe you had a bit of colo, you had pretty good visibility and an understanding of the components that make up that system. That was what traditional monitoring was all about. You know, that's what PRTG is all about. It's about monitoring. But now take hybrid environments, where you've got some stuff on-prem, you've got public cloud, you've got private cloud.

And in a lot of cases, you don't have full visibility into what's happening in the service provider's environment. This observability piece allows you to infer the state and the performance of things when you don't necessarily have access to pure metrics to actually measure them. Yeah. I mean, this is really interesting, because obviously this ties in quite neatly, and you mentioned process automation already. It kinda ties back neatly into the automation question, I suppose.

In order to ensure that your automation processes, your network automation processes, are delivering the outcome you expect, you need the data to validate it. Right? And if you've not got that level of understanding of all the aspects of your network, you need something to correlate them and bring all that back together to give you the insight, I guess. And so this is quite neatly where it sits alongside that network automation point of view, I suppose.

Yep, absolutely. You know, it's a cliche, but knowledge is power. Yeah. And the more information you can glean about the state of your services, whether it's through monitoring, actually collecting data points, or whether it's through the observability piece, which allows you to infer how things are performing, the more control you have to be able to automate them and, you know, keep things running smoothly.

I guess with networks being so much more complex now, and I mean you've hinted at this already, there's the idea of these different mechanisms for controlling and managing different parts of them, and different people responsible for them. There are different levels of information that you're likely to be able to get from anything from an SDN controller, to the command line of certain devices, to SNMP querying of MIBs, right the way out to APIs into a cloud instance or something. You've got a whole bunch of different ways of getting to the information. At the moment, some poor network engineer has to sit and log in to all of those things and gather it all together in order to make head or tail of it. So do we think that this is where observability becomes a thing, then?

Yeah. Absolutely. I mean, it's a very hot topic in the DevOps area. You know, if you think about things like microservices and serverless and X-as-a-service, where the guys keeping things running don't necessarily have the access. You know, you mentioned SNMP.

You don't necessarily have the ability to throw an SNMP stack on something to see what's happening. But you can look at the outputs and draw conclusions about what's happening internally. It's an interesting idea. It's a bit of a two-edged sword. You know, there is undoubtedly something in it.

It does have credibility. But I think in some cases, marketing departments are just jumping on the bandwagon. And anything that used to be a monitoring tool is now being rebranded as observability. There's those quotes again. Yeah.

Yeah. Yeah. No, and I think that's a really important point. Right?

What I was trying to understand and establish, really, was where the value comes from and how you go about assembling it. I mean, obviously, we've mentioned lots of different data sources already. But have you had those conversations? Are you talking to people who are trying to assemble those platforms?

I mean, that's what people talk about. Right? Observability platforms. What are the bits that actually give us that data to give us that understanding? If you have any thoughts on that.

Yeah. I mean, there's a couple of definitions. Some people talk about the pillars of observability. Others talk about the golden triangle of observability. But what they all agree on is that there's really three main parts to it, and that's metrics, logs, and traces.

So something like PRTG gives you access to the metrics, the things that you have access to. You can monitor them. You can pull data from them. You can assign thresholds and trigger alerts when those thresholds are exceeded. But that's very reactive, really. Yeah.

Similarly with the logs, you know, you can use tools to dig down into logs, spot anomalous activity, spot patterns and trends. And similarly with traces, you can actually trace the paths of data through the network, which is obviously where a tool like IP Fabric comes in. That's what you guys are all about. And it's really combining those three strands, the metrics, the logs, and the traces, that allows you to analyze that data as a whole and infer what's happening with your service.
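The "assign thresholds, trigger alerts" pattern described above can be sketched in a few lines. This is a generic illustration under stated assumptions, not PRTG's API or data model: the `Threshold` class, the `check_thresholds` function, and the metric names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str   # hypothetical metric name, e.g. "cpu_percent"
    limit: float  # alert when the sampled value goes above this


def check_thresholds(samples, thresholds):
    """Return one alert message for each sampled metric above its limit."""
    limits = {t.metric: t.limit for t in thresholds}
    alerts = []
    for metric, value in samples.items():
        limit = limits.get(metric)
        if limit is not None and value > limit:
            alerts.append(f"{metric}={value} exceeds {limit}")
    return alerts


# Assumed sample data for illustration only.
samples = {"cpu_percent": 93.0, "if_errors": 2.0}
thresholds = [Threshold("cpu_percent", 90.0), Threshold("if_errors", 10.0)]

alerts = check_thresholds(samples, thresholds)
print(alerts)  # ['cpu_percent=93.0 exceeds 90.0']
```

As the speakers note, this is reactive: it only reports on values you can already collect, and only after a limit has been crossed.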

Yeah. I know. It's a good point that, from a network perspective, those are the tools that you've got to hand. Right? As you've already said, you know.

There's the idea of gathering all of those data points in real time from some kind of monitoring tooling, through whichever mechanism. I mean, we talked about SNMP, but it's not just SNMP. There's a whole bunch of other ways that we can gather that data, through REST APIs or whatever else. Right? But it's all those real-time granular metrics relating to the behavior of a device and an interface.

Right? That kind of stuff is super important because it gives you the real-time understanding of what's going on. The logging piece, I guess, we used to use something like a SIEM for that kind of purpose. Right?

The idea of gathering logs together and correlating them from different points in the network, and being able to understand the reaction of a device to a particular flow or a particular event occurring, and being able to zoom in on it and work out what's going on with it. And you can imagine being able to bring those two things together and say, right, we've got a volume of a certain metric here relating to this instance of an event over here that we've seen through the logs. I guess then having the context, as you've said, about understanding not just how all those things interact, but also what an application needs in order to be delivered across the top of that network.

You know that it's using this infrastructure. This infrastructure is behaving like this. And, you know, once you've correlated all the logs, you've got that understanding. It feels to me like it's all about trying to get that insight into the behavior of the applications, the very applications we're using to deliver our business processes.
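Bringing "a volume of a certain metric here" together with "an event over here that we've seen through the logs" is, at its simplest, a join on device and time. The sketch below assumes hypothetical record shapes (plain dictionaries with `device`, `time`, and `message` keys) and a hypothetical `correlate` helper; real platforms use far richer models.

```python
from datetime import datetime, timedelta


def correlate(metric_spikes, log_events, window_seconds=60):
    """Pair each metric spike with log events from the same device within the time window."""
    window = timedelta(seconds=window_seconds)
    matches = []
    for spike in metric_spikes:
        related = [
            event
            for event in log_events
            if event["device"] == spike["device"]
            and abs(event["time"] - spike["time"]) <= window
        ]
        matches.append((spike, related))
    return matches


# Assumed sample data: device names, metrics, and messages are made up.
spikes = [
    {"device": "edge-rtr-1", "metric": "if_errors", "time": datetime(2024, 1, 1, 12, 0, 0)},
]
logs = [
    {"device": "edge-rtr-1", "message": "link flap", "time": datetime(2024, 1, 1, 12, 0, 30)},
    {"device": "edge-rtr-1", "message": "config saved", "time": datetime(2024, 1, 1, 9, 0, 0)},
    {"device": "core-sw-2", "message": "link flap", "time": datetime(2024, 1, 1, 12, 0, 10)},
]

for spike, related in correlate(spikes, logs):
    print(spike["device"], spike["metric"], [e["message"] for e in related])
```

Only the log entry from the same device and within a minute of the spike is kept; the earlier entry and the entry from the other device are filtered out.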

And I think that feels like what we're trying to aim for here. Is that fair? Yep. Absolutely. That's summed it up.

As I've mentioned, traditionally the network engineer had complete control and visibility over exactly what was happening. But now you start throwing stuff up into the cloud and something magical happens. You don't have the visibility. You know, the cloud guys expose some information about performance, but obviously it's kind of their secret sauce to not let you have full visibility of performance and so on. Yeah.

You can set thresholds to say, okay, you're running out of compute resource, but it's fairly superficial. They don't let you have that deep-dive visibility that you had if you were running your own infrastructure. And this is where observability comes in, because it allows you to infer what's happening from the things that you can measure. Yeah. I think you've touched on something there as well.

Because what we're talking about here is not just the physical infrastructure. It's not just the underlying networking. It's everything. Right? You've mentioned cloud there, but with Kubernetes and the container environment, the microservices architecture, whatever you're using, there may not be those granular data points that you can pull out as metrics from those kinds of instances.

Right? So I guess what we're saying is: gather all the data, not just left and right, the end to end, if you like, of what's going on, but also from the different levels in the stack that's actually supplying the data that needs to be consumed for the application. Right? Sure.

Absolutely. But I think take it with a pinch of salt. God bless them, the marketing people are all over observability because, you know, it's observability and AI. I think those are the big buzzwords.

And to be honest, quite often they're combined. I was just gonna say, AI in observability. Yeah. Absolutely. Yeah.

Yeah. Yeah. No. I think that's a good point. I'm gonna show my old age a bit here. Right?

Because back in the day, right, and I know we talked about this earlier, what this would have been would have been going to the people who know. Right? It would have been your network guy, who knows and understands the network, as you said. Yes.

And you've got your server guy, who would know and understand the server, or the storage, or whatever. And almost never the twain would meet. Right? There was always that finger-pointing exercise of, well, it's the network.

Well, it's the server. Yeah. Yeah. I've been in those. We've all been there in those war rooms, you know, around the whiteboard going, yeah, now look, it can't possibly be the network because... My network's fine.

It's your server. Exactly. No. It's your application. No.

It's DNS. Everyone just gets... Yeah. Always DNS. But I suppose the whole point here is that what we're trying to do is something different, and change the approach to be more joined up. To bridge those silos and to say, actually, you've got the data from the infrastructure here.

You've got the data from the network here. You've got some context data that you can wrap around those. And then, obviously, use the rules of thumb that we know and have the experience of, having been through that. But also then to teach and train the machine learning algorithms, right, to go and spot these things and basically do the pattern matching, which is all it is, to bring those things out of the data. Yep.

And I mean, this is where it comes back to the engineering discipline of control theory. It's treating the IT environment as a complete system, not a bit of storage and a bit of compute and some software. It's treating it as a complete system and measuring what you can measure from that to draw the best conclusions you can about what's happening inside. So it's that inference thing.

And then, as we said before about automation, I suppose that's exactly what you need from, again, an end-to-end automation perspective. So not diving straight into the network and solely focusing on the network, but looking at the orchestrated workflow, I suppose, that's needed to deliver a service. Mhmm. You would use the observability, I suppose, to, a, measure the success of that workflow, but, b, to trigger a workflow should things fall out of compliance. Absolutely. And I was talking to a colleague about this actually this morning. It was a different topic, but we got into the subject of this.
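The two uses just described, validating that a workflow achieved its outcome and triggering a workflow when things drift out of compliance, can be sketched as a comparison of intended state against observed state. The names `evaluate`, `reconcile`, and `remediate`, and the state keys, are hypothetical; how the observed state is gathered is deliberately left out.

```python
def evaluate(intended, observed):
    """Return the names of checks whose observed value differs from the intended value."""
    return sorted(key for key, value in intended.items() if observed.get(key) != value)


def reconcile(intended, observed, remediate):
    """Call the remediation workflow only when something has drifted; return the drift."""
    drift = evaluate(intended, observed)
    if drift:
        remediate(drift)
    return drift


# Assumed intended and observed state for illustration only.
intended = {"vlan10_present": True, "bgp_peers_up": 4}
observed = {"vlan10_present": True, "bgp_peers_up": 3}

triggered = []  # stands in for a real remediation workflow
drift = reconcile(intended, observed, triggered.append)
print(drift)  # ['bgp_peers_up']
```

An empty result from `evaluate` serves as the success measurement for a change; a non-empty one is the trigger.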

And I think one of the key things, if you're investigating observability, is to clearly define the objectives of your system and then look for the tooling to match. Don't think, oh, we need observability, let's get some software in. It's having a very clear definition of what it is that you need to check and control, and then finding the tool that matches that. That's an interesting point.

So really kind of grow it with your requirement, I suppose, is what you're saying. You start small. Start in the area that you're interested in knowing and understanding more about. That might be, I suppose, again, tied to an automation initiative, if you've got a particular requirement to automate part of the infrastructure.

Right? You need the observability to sit alongside that in order to give you the insight for the triggers and the validation. So build out from there. That makes a bunch of sense. And I guess the more tooling you throw at it (and this sounds terrible), giving you different aspects of insight, the more it's gonna enrich the process. Right?

So you don't necessarily want things to overlap, but if you can put in a monitoring platform, great. That's gonna give you that kind of data. If you put in some sort of event correlation, it's gonna give you a different kind of data.

You put in a modeling infrastructure like an IP Fabric, it's gonna give you, again, a different set of context that you can bring. The more data you put in, and the more varied data you put in, I suppose, the greater the potential insight. It sounds a bit ironic really, but before you can choose your tooling to give you the insight into the environment, you have to kind of understand that environment to choose the tooling. That is a good point.

So it's a bit of a circular argument. Yeah. Yeah. Yeah. No.

So what you really need then is a piece of software to discover your environment and understand it. No, I'm not gonna go into the sales pitch for IP Fabric. But I think that's the point here, isn't it? And I guess this is why marketing people like the idea so much, because if you can apply a coat of observability paint to your product, it's another aspect of observability that people can add in, another layer of data.

And I suppose the danger is trying to take on too much and listening to everybody on that one. Like you say, you need to focus in on what data you're looking for in order to achieve the outcome you're looking to get to. Same as everything. Yep. And then focusing on the tooling that's gonna give you the most value, that doesn't necessarily overlap with what you find elsewhere. Yep. Absolutely.

This is a great conversation. And I just think there's so many doors opening up here that we could stroll off down, but I'm gonna rein it in because we haven't got all day, and certainly the people listening won't have either.

Any final insights before we close? I think the last thing we touched on is the important one. It's: don't fall for all the marketing hype. You know? Pretty much every tool out there at the moment mentions observability somewhere.

Because, you know, Gartner and the other analysts have said it's the hot topic. So everybody's pushing observability. Just understand your requirements, understand what you want to deliver, and then choose the tools that can help with that. Don't just be blinded by the marketing hype. That's perfect.

Listen, Simon, great as always to talk. Really appreciate your insight on that one. And it's clearly something that we all need to understand because as our networks increase in complexity, we all need that help. Right? So, no.

I really, really appreciate your time. Thank you very much, and thanks everybody for listening. Tune in next month for another Community Fabric episode. Thank you.

Podcast notes

Episode Title:

Put Network Observability Into Practice With The Right Tools.

Hosts:

Daren Fulwell and Simon Bell

Topics:

  • Network Observability
  • Network Assurance

Our hosts