Make sure you don't miss out on what is sure to be a fascinating discussion between two networking virtuosos, when Daren Fulwell welcomes Phillip Gervasi to talk about network observability!
Transcript
Hello, and welcome to another episode of the Community Fabric Podcast, where we bring the networking community to the table to talk about the things that matter most to them in their day to day. I'm Daren Fulwell, your host for today's recording, where we're going to dig into a topic that's really become a talking point over the last year or so. And I wanna find out if it's just a marketing term or if there's something real behind the words. I'm talking about network observability, and I'm joined by someone who's been a leading light in the network community for some time. I remember reading your blog posts about the trials and tribulations of being a practicing network engineer many moons ago, Phil.
Would you like to introduce yourself? Sure. Hi, Daren. Thank you for inviting me on today. And, yeah, the trials and tribulations of being a field engineer.
You know, we say trials and tribulations, but I have to say, I miss those days a lot. Oh, yeah. I really do miss turning a wrench and being in the hot aisle and then the cold aisle and then the hot aisle again. So I think it's with rose-colored glasses that we look back in hindsight, because at the same time, I'm very grateful to be working what they call bankers' hours, 9 to 5, and not in a data center at 2 AM. But you are right.
I have an extensive background installing, designing, fixing, and building networks as a very traditional network engineer for 12, 13, 14 years. And then I got into solutions engineering, or presales as some call it, which I enjoyed thoroughly. I really did. And then after 3 years of that, I transitioned to technical marketing, which is what I do now.
Now, I did start my career as a professional, as a grown-up, prior to being in tech, as a high school English teacher. I taught literature, writing, that sort of thing for 5 years in private and public schools here in New York State, where I live. So that's actually my foray into adulthood, and then I transitioned in my mid to late twenties into tech, and here I am today. Well, I guess that serves you nicely for the role you have these days, right, of evangelizing technology to the folks who wanna listen. Right?
Yeah. Well, I mean, the word evangelize means to bring the good news to the world. That's the definition, the denotation of the word. And so what I do is bring the news, and hopefully it's good news, of network observability and what my company, Kentik, does to the world, from an engineer-to-engineer perspective. So there's certainly a technical requirement where you have the experience and know not just the language and the jargon, but really have the, I guess, the gravitas, the credibility of having been in the trenches, but also the ability to explain complex things simply and to be able to speak in engineer's language.
Yeah. Because what I do is not sales, and what I do is not traditional marketing either. And so I do enjoy it. And I have to say, Daren, to be very transparent with you and your audience, sometimes I regret being a teacher first and not getting into tech sooner. Right?
I look back and say, well, why didn't I get a computer science degree instead of an English or literature degree? But then, looking back, that's what's allowed me to do many of the things that I do now. So, you know, I know some folks say, well, I have no regrets. I have so many regrets.
I have a list of regrets as long as my arm. However, you know, trying to balance that with the reality that the decisions in your life bring you to where you are today, that's hard. That's real, you know. It's trying to balance the ideal with the pragmatic. And so as a technical evangelist, I'm still required to be very deep in technology, specifically how my company approaches it and what we do in that space.
And I think it's important to understand that network observability, with that prefix network, is somewhat different from observability, which I think is an older term, a little bit more understood. And no, I don't think, I know: it's generally more understood in terms of APM, application performance monitoring, not necessarily the network. So this is a little bit of a different thing. No.
And I think you've introduced that really nicely there, because from my perspective I see something similar to you: people have a trajectory, right? You start in a place, you build experience in different areas, and you get the breadth of experience as well as the direction, and it brings you out to where you are. In my situation, I started in application support, built up through server support, and then moved into networking that way, so I have a broad view from a technology perspective that I bring to my day to day. But ultimately it's about using all of those aspects in order to move on to the next stage, right? And I guess what I take from what you've said so far is that you've got that idea of understanding how to operate a network, how to build a network, how to design a network, and now what you're talking about is bringing all of those aspects together to look at the bits that are missing and then start to fill those gaps, right, with some of the technologies that you talk about now?
Pretty much. Pretty much. Let me get more specific. Okay. Network observability is a network centric form of observability.
That's why we use the term network observability. And observability comes from this idea of looking at the outputs of a system, without manipulating the system, in order to determine its state, its health, however you wanna define it. And the idea of observability is generally tied to technology, though industry, like large-scale manufacturing, uses some of the same methods and workflows to determine how efficient its processes are and to reduce error on the factory line, that sort of thing. And the reason that we are moving forward with the network-centric version of observability is because today, and we're looking at 2023 here, most of the applications that we use are delivered over the Internet, the public Internet, and then also whatever local network you happen to be on that day. Sure.
So if you think about it logically, right, that means that there is a tremendous amount of application performance telemetry information embedded in the network. Yeah. That does not negate the value of traditional APM, of course. That's still separate.
You know, Kentik is not an APM company. So code-level review, looking at overall database architecture and what resources are given to a given application, how you're designing your containerized services. Yeah. All of that stuff is still very important. However, it's all delivered over the network.
And so an end user's experience, and I know that a lot of people like to use the term digital experience, but at the end of the day, the experience that a person sitting behind a keyboard has clicking away through an application is largely determined, yes, by all those other things, but also by the quality of the network and its ability to deliver that application in such a way that it performs really well. So I look at this new era of APM as having this network-centric approach, again not discounting code-level visibility and, you know, using browser plug-ins to see what an end user is seeing. Yeah. Yeah.
And looking at compute resources and what's going on in AWS and with Kubernetes. It's all important, but if we do not look at the network component, we're missing out. And isn't that true? I mean, if you think about it, you may have a SaaS application that has sufficient compute resources and is really well designed. It's great, and the network is a piece of garbage.
I'm sure. The end user is gonna say, hey, my application's broken. Yeah. Well, as you start to investigate why, then, you know, you start to discover, oh, alright.
Well, there's a network issue here. Or how about this? The network has so much telemetry in it about the application that you can actually infer that it's not the network, and you can see, oh, there's something wrong with the server. It's taking too long to respond.
See, it's not a network issue there. Now, I'm not trying to say, oh, therefore, mean time to innocence. I don't care about that. I was gonna say, there we go. Right?
I've been on enough calls. And when I was a field engineer, it wasn't Zoom. It was Webex. We were on our Webex calls, and we were all trying to figure out the problem together. We didn't have this idea where the network person was like, wait.
It's not us, and then they leave the meeting. No. We were all working together. And so if we found a clue here that, oh, you know, the round trip time in your network is extremely good. When your server responds, we get that response here at the user side instantly.
It's very, very good, and vice versa. The user's request is received at the server side almost immediately. But it's taking this many milliseconds for the server to actually respond. There's a problem there with the server itself. What's going on?
And then you start to find out, you found that information in the network. But I suppose the beauty there is, if it's an onward connection problem from that server to some other element over the network, then you've got more visibility again to look further on into what's going on. Correct. So it's all about bringing the network aspect, I suppose, into the broader understanding of how an application's behaving. That's right.
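The reasoning in that exchange, fast round trips on the wire but a slow answer from the server, can be sketched in a few lines of Python. This is only an illustration and not how any particular product works: the `Transaction` fields, the function names, and the two budget thresholds are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """One request/response pair as seen in network telemetry (hypothetical shape)."""
    network_rtt_ms: float         # round trip time on the wire, e.g. from the TCP handshake
    time_to_first_byte_ms: float  # request sent until the first response byte arrives


def server_time_ms(t: Transaction) -> float:
    """Time the server spent before answering, with the network's share removed."""
    return max(t.time_to_first_byte_ms - t.network_rtt_ms, 0.0)


def likely_culprit(t: Transaction,
                   rtt_budget_ms: float = 50.0,       # assumed threshold
                   server_budget_ms: float = 200.0    # assumed threshold
                   ) -> str:
    """Point the investigation at the network, the server, both, or neither."""
    slow_network = t.network_rtt_ms > rtt_budget_ms
    slow_server = server_time_ms(t) > server_budget_ms
    if slow_network and slow_server:
        return "both"
    if slow_network:
        return "network"
    if slow_server:
        return "server"
    return "neither"


# Fast network, slow server: the case described in the conversation.
print(likely_culprit(Transaction(network_rtt_ms=5.0, time_to_first_byte_ms=805.0)))
```

The point of the sketch is that both numbers come from the network side, yet together they are enough to say where to look next.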
And I think it's necessary. It's not like a cool new thing that we do to sell more products. It's absolutely necessary. It's a requirement of how we do service delivery today because of the nature of service delivery, application delivery. It is predominantly network driven.
Not always. I have a couple of applications on my computer locally, like, let's see, Adobe Premiere that I use for video editing, Camtasia when I'm creating, like, lessons. Those are local. That's fine. And so if the application is not working right, it probably has nothing to do with any network.
Yeah. Yeah. Yeah. But that's not most of my applications. Most of what I use for my personal life and for work is delivered from the public Internet.
And then who knows where those live? They live in public cloud, my company's private data centers, or both. And then, of course, there's also the component of container architectures that we're using now. So, you know, what's going on in the networking between Kubernetes pods and, you know, between one cloud and another cloud? The complexity requires that we gather more telemetry.
So in math, as you know, and I'm sure your audience knows, or as you say, maths with an s. We do. We do. Yeah. Okay. The more data that you have going in, the more accurate your conclusion, or your prediction if you're doing predictions.
You need more data. And then, of course, that presupposes that the data is of high quality, very accurate, very timely. And so with the growing complexity of service delivery, application delivery as a whole. Yeah. So not just the growing complexity of networking, but also of our application architectures.
It requires that we collect more data, more telemetry of divergent types, in order to accurately assess the health of our system and then also to be able to predict. So we wanna do some more advanced stuff, which I'll get to in a moment. Sure. That means that we are collecting very diverse information. We're looking at code.
We're looking at configuration logs. We're looking at server logs. We're looking at the mundane network stuff like loss, latency, jitter. Of course. Server round trip times. We're looking at things that you get in, you know, traditional visibility, like CPU utilization and memory.
All of it's important. Every single thing. But also the, I'm gonna say, more qualitative things that are still important but that are really kinda hard to inject into that database, like a threat feed. Should we be considering security incidents, happening maybe even at a global scale, in the performance of our application? The answer is yes.
Because in my experience, many, not all, but many security breaches manifest themselves in performance problems. Mhmm. Not always, but very often. So all of this is tied together. Your geolocation: where are you located?
So if you think about it, we are talking about this concept of network observability that collects all this divergent information, incredibly divergent, and it's very different. So as useful as this may all sound, and as cool as it sounds, and as relevant as it sounds because of how we do application delivery today, it's very difficult. Yeah. And that's when we get into the buzzwords and potentially the hype, where people say, ah, it's just hype. The second that I mention the words machine learning or AI or anything like that, you get cringes and eye rolls and all that.
But we could get into that very deeply. Yeah. I think there's a really good story to tell there. Yeah. That's a really interesting one.
I mean, you've touched on so many things there that if you go back even 5 years to what a network engineer would know and understand, you're outside the realms of their understanding immediately. Right? Because a network engineer, what were we doing? You know, when I started doing this thing, it was literally getting packets from one place to another, and you wouldn't even consider the fact that you've got these complex applications where you were having to ensure the delivery of the service, right, rather than the delivery of the packet. And I think with a lot of the things you've touched on there, you're pushing a network engineer's understanding into a whole new realm of really appreciating the architecture of applications, the behavior of those applications across the network infrastructure, and understanding the impact of the infrastructure on the end user experience.
And these are all almost intangibles when you look at something like the traditional network monitoring platforms and stuff that we're so used to using. Right? The telemetry was, we were pulling metrics off of interfaces. We were pulling ups and downs off of devices. Those things are meaningless when you're actually looking at the application delivery, because the application delivery is so much more complex.
So I think you're taking the whole thing into a different generation here, aren't you? Yes. I wouldn't say they're meaningless. Those metrics are still... They're not. They're just not enough.
Yes. Of course. It's not sufficient to know that I'm dropping packets on interface Gig 0/1. That's helpful, but it's part of a larger dataset. It's one metric among many.
Yeah. Now the question is, how do we correlate those things so it's meaningful? So you mentioned, you know, what a network engineer has to do today. And I think you were probably alluding to both the idea of machine learning and then just the idea of the mind shift from just getting packets from A to B to how the entire system of application delivery works. It's important to understand that network observability, observability, all of this stuff that we're talking about, really is a service layer.
At Kentik, we didn't invent a new routing protocol. We didn't figure out a way to make BGP converge faster. But what we did, and what others are doing, is add a service layer on top of what's going on, including your environment in AWS and GCP, your private data center, your network, your firewalls, your switches, all of it. So a service layer over the whole thing to help you figure out what's going on faster, more efficiently, and do some really cool advanced stuff figuring out what might happen. Like, if you could whittle this wire, what could happen to my application?
And so it's augmenting an engineer. We're not talking about replacing an engineer. It's augmenting an engineer so they can do their job better. And it's sort of necessary because of the growth in complexity. So do we wanna ensure a great application experience?
Yeah. Okay. Well, how do we do that if everything is more complex? Well, you can hire 200 data scientists with PhDs from Stanford. Done.
Barring that, you know, assuming you don't have the budget for that, you need a tool to do that. And that's what network observability, and ultimately the broader concept of observability, is all about: augmenting an engineer. And what we get into then is this trend to use, so I'm gonna say machine learning, but keep in mind that when I talk to data scientists, very often they will readily admit that that is a little bit marketing, because they don't start with ML. They start with the easy stuff. How can we organize these datasets?
What kind of database architecture can we use to make it as efficient as possible to query? They start with simple stuff. That's step 1. Is there any basic statistical analysis that we can use, prior to getting into something more complex, that is more efficient, faster to run? You know, why do we have to run this complex algorithm that's inherently more difficult and takes more compute when we can do something simpler?
Yep. The idea isn't let's all run to ML and AI. The idea is, what technologies are available to us now, considering the vast amount of compute and storage we have available to us, whether that's privately or in the cloud? Never before have we had this. How can we use that to solve this problem of complexity in application delivery? And that's where we start to talk about statistical analysis and the Holt-Winters algorithm and ML and clustering and classification, all that cool stuff, because they're tools.
Yeah. So it's important to remember that. When you start to hear from vendors out there, we use the most advanced ML, ask, oh, yeah? What?
What do you do? Which algorithms? What models do you use? And I think this is the point, isn't it? As you say, the fact is that you need this vast quantity of data.
If it were all structured data organized the same way and whatever else. Right. Then the statistical analysis is enough. You can run the algorithms and it's job done. As soon as there's, I guess, a variety of data and that correlation piece that needs to be done, and the fact that some of it's unstructured and those kinds of things, that's when you're starting to have to dig into the likes of the AIs and the MLs. Right?
That's exactly right, because if you think about it, here you have dataset 1. It is flow data, and you're looking at a percentage. 62% of my network traffic is HTTPS. Okay. You have a percentage, okay, out of 100.
It's a decimal maybe on a computer. And over here, you have millions of packets per second. It's a completely different data type. They both relate to what's going on, but they're completely different types of data, different formats. Over here, you have a threat feed.
It's written in maybe English or whatever language. Right? And then over here, you have yet another completely, you know, SNMP traps, whatever it happens to be. Yep. You have all this divergent data.
How do you correlate it? Now you said correlation. How do you find out how things are related to each other? Let's put it that way and not use the word correlation for a bit. Yeah.
Yeah. Well, that's when you start to get into some of the concepts related to ML, not quite ML, but things like scaling and normalization. How do we put millions of packets per second and then 62% on the same scale from 0 to 1? Now we're talking about machine learning. Yeah. Yeah.
Or at least, you know, touching on that. I think a purist would not call that machine learning necessarily. But, nevertheless, you're talking about using math to put that on a scale. So now you can compare those two values, and you can see how they're related. But when we say correlation, it's not hard to find correlation, because so much of it's spurious. So it's very important to keep in mind that when we look for correlation in all of this data, all of this activity, we're looking for strong correlation and causal relationships.
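As a rough illustration of the scaling idea mentioned above, here is a minimal Python sketch of min-max normalization, one common way to put very different units onto the same 0 to 1 scale. The sample values are invented for the example, and nothing here is taken from any specific product.

```python
def min_max_scale(values):
    """Rescale a list of numbers onto the range 0 to 1 (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Every sample is identical, so there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Invented samples: packets per second and the HTTPS share of traffic, in percent.
packets_per_second = [1_200_000, 2_500_000, 4_000_000]
https_percent = [40.0, 62.0, 95.0]

# After scaling, both series live on the same 0 to 1 scale and can be compared.
print(min_max_scale(packets_per_second))
print(min_max_scale(https_percent))
```

Scaling only makes the series comparable. Whether a relationship between them is strong, causal, or meaningful is a separate question, which is where the conversation goes next.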
But even then, that's still not the end. Even then, we wanna find correlation, strong correlation, that is meaningful to a human engineer trying to solve a problem. Let me give you an example. I used this at a Tech Field Day back in September. I'm sorry.
Networking Field Day. I think it was Networking Field Day 29. Go watch it. Go watch it. Yep.
Imagine having a 400 gigabit per second interface in your data center. 400 gigs, which is not unheard of today. Right? Yeah. Yeah.
When I was finishing up my career in the field, we were just getting into 100 gig. So let's say you've got a 400 gig interface, and it's a standby interface, which in a design would be stupid because you wanna utilize all your interfaces. But let's say it's a standby, backup, whatever, and it's plugging away at whatever control traffic at, like, 1 meg per day, statistically 0. Yeah. That 1 meg, and you're tracking it in your monitoring system.
And then one day, it goes from 1 meg to 100 megabits per second. That's a gigantic, statistically large increase. Yeah. Do we send out the fire brigade? Do we send out all the alerts?
No. It is still irrelevant in the sense that it does not affect application performance or service delivery in the slightest. So how do you incorporate the subjective component into finding correlation, into finding relationships among the data? Whether you're using columnar databases or graph databases or whatever it happens to be that your organization is using. My company happens to use columnar databases for a specific reason, but I know others that use graph and things. How do you add the subjective component?
The meaningfulness, the significance. And so that is a struggle right now. How do you represent that in math is ultimately what we're trying to do. It's tough. Yeah.
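The 400 gig example can be put into numbers with a small Python sketch. It treats the baseline as 1 Mbps, and it uses link utilization as a stand-in for "does this matter", which is only one crude way to encode significance. The function names and both thresholds are assumptions chosen for illustration.

```python
def relative_change(before_bps: float, after_bps: float) -> float:
    """How many times larger the new rate is than the old one."""
    if before_bps <= 0:
        return float("inf")
    return after_bps / before_bps


def utilization(rate_bps: float, capacity_bps: float) -> float:
    """Fraction of the link's capacity in use, from 0 to 1."""
    return rate_bps / capacity_bps


def is_significant(before_bps: float, after_bps: float, capacity_bps: float,
                   min_ratio: float = 10.0,        # assumed: a "statistically large" jump
                   min_utilization: float = 0.5    # assumed: enough load to affect delivery
                   ) -> bool:
    """Alert only when the jump is large AND the link is meaningfully loaded."""
    big_jump = relative_change(before_bps, after_bps) >= min_ratio
    loaded = utilization(after_bps, capacity_bps) >= min_utilization
    return big_jump and loaded


LINK_400G = 400e9  # 400 gigabits per second

print(relative_change(1e6, 100e6))            # a very large relative jump
print(utilization(100e6, LINK_400G))          # still a tiny fraction of the link
print(is_significant(1e6, 100e6, LINK_400G))  # so no fire brigade
```

A purely statistical detector would fire on the first number alone. Adding the second condition is one simple way to represent relevance, though it does not capture everything a human engineer would consider.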
I think this is a fascinating point, that what you're really doing here is, not replacing, but augmenting people so that they don't have to go through and do all this correlation, this understanding, this normalization themselves. Mhmm. Basically, you're advancing their position and saying, right, look, we do all this for you, because this is a whole lot more complex than it ever used to be.
We've moved things on, and we're now putting you at a point where you have the data. We can help you make those decisions by giving you something that you can relate to, right? And so then it's about, I guess, bringing in other data sources as much as possible to enrich that as far as possible, to make the decision ultimately as simple as possible for the human who's at the end of the chain, right? And ultimately, it's a human trying to fix an issue. Right? Right.
Right. Why is this thing not working right? You know? And and so it really feels a little bit mundane in its conclusion. Nevertheless, it's still very important.
Because it's so exciting. Exactly. We're collecting flow data. So we're talking about networking now. We're collecting flow data.
We're collecting routing tables, which is not, you know, just a straightforward dataset. We're collecting streaming telemetry. We're collecting SNMP traps. Maybe we're collecting packets. Folks, we're using eBPF to monitor container networking now and, you know, look at Linux kernel activity.
Maybe you're collecting threat feeds and looking at geolocation, where we collect an IP address and correlate it to something and say, this is where you are. So the amount of data, but also the quality of data and the divergence of data, all matter. But, ultimately, like you said, it's to augment the engineer and help them to figure out, oh, it's that router over there. When I make this change to DNS here, why does that router all of a sudden go to 99% CPU utilization? And the only reason you knew that is because, you know, there was an application slowdown or something.
So being able to expedite and make more efficient that process of discovering clues and how things are related to each other is really what we're talking about here. Yeah. Yeah. For me, it feels like it's filling in gaps. Right?
The fact is that you would have to make those logical jumps yourself as a network engineer. You'd have to go gather that data yourself. You'd have to go find and understand how an application behaves and all of those things, bringing in people, looking at documentation, going through all of that process in order to make those leaps yourself. And what you're doing here is basically pulling data from the source, so you know it's gonna be accurate because it comes from the actual network itself or from the application infrastructure itself, and being able to then make those jumps way quicker and put people much further along the path of resolving any issues. Right?
Yeah. Isn't that what we do though as engineers, Daren? Well, absolutely. You and I have both been engineers. We've done that manually for years.
That's what we did in our careers. So I'm not trying to say that this is a magical thing that people can't do. It's just that the complexity and the nature of all the different types of telemetry mean that doing it by hand takes an inordinate amount of time or people, which makes it very difficult or next to impossible. And so, yeah, you can do this with a bunch of people. It's just gonna take a long time to pore over all of the information in the logs, and you need a room full of unicorns that understand all of the different types of data.
This is the other part. Right? You've got all that knowledge that you've had to have accumulated to get to that stage. And at least with this, you're helping people along with that as well. But, you know, the thing is it presupposes that when the system spits out that conclusion, we can trust the system.
So one problem that we're seeing with observability, network observability, the application of data science to networking, is that there is still a high number of false positives, though it's a reducing number. For example, anomaly detection is something that the industry as a whole still struggles with to an extent. I've seen it improve dramatically over the past 3 years that I've been in this space in particular, this specific sliver of networking. But it still does presuppose that you can trust that conclusion, that output. And that requires a constant iterative process of giving the system good data, human beings looking at the conclusion, tweaking the algorithms, tweaking the models that are being used, getting feedback from the community, from customers, from network engineers, asking, was this significant?
We gave you this insight. We gave you this little bit of feedback in the system, like, we threw this alert. Did that alert really help you? No. You don't even care about it?
Well, maybe we don't need to do that anymore. And I guess enriching with new data sources and other things, bringing that in as well, right, and seeing where other data, contextual information, or whatever would also improve the outcome. Right. Exactly.
Right. Now what we've talked about is kind of real time state. I'm troubleshooting a problem. But what about predictive stuff? You see, when you're using machine learning, you can do, you know, linear regression.
And you can start to do things like baselining or looking at seasonality or trend analysis. Those are all different things. A lot of people like to, or they accidentally, conflate trend with baselining. Those are different. But you could start doing those things.
Some of those things you can do with basic, you know, statistical analysis, like we said. But let's say you wanna look at seasonality and sort of predict that when this change occurs over here in this part of my network, I should expect this sort of performance degradation in this part of my network over here, in this part of the world. And so you have predictive analysis, which again is based on probability. Yeah. And then also seasonality, trends, and looking at things over time.
But it's not just networking. You know, what is the trend of my monthly cost of AWS? Isn't that part of what we do as engineers? Absolutely. What is the trend in help desk tickets, or I don't know what? But, you know, you can look at all these other components and incorporate them into your dataset as well.
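To make the distinction between a baseline and a trend concrete, here is a small Python sketch using an ordinary least-squares line, the simplest form of the linear regression mentioned above. The monthly cost figures are invented, and a real system would also need to handle seasonality, which this deliberately ignores.

```python
def baseline(values):
    """Baseline: the typical level of the metric, here simply the mean."""
    return sum(values) / len(values)


def linear_trend(values):
    """Trend: slope and intercept of a least-squares line through evenly spaced samples."""
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def forecast(values, steps_ahead=1):
    """Extend the fitted line past the last sample."""
    slope, intercept = linear_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)


# Invented monthly cloud bill, in dollars.
monthly_cost = [100.0, 110.0, 120.0, 130.0]

print(baseline(monthly_cost))         # the typical level so far
print(linear_trend(monthly_cost)[0])  # how fast it is changing per month
print(forecast(monthly_cost))         # where the line points next month
```

The baseline says what normal looks like, while the trend says which direction normal is moving. The same functions apply equally to help desk ticket counts or interface utilization.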
This is it. It's those KPIs. It's the metrics, the things that are used to measure our performance as operational teams as well as the actual performance of the technical systems, isn't it? Yes. And I guess this is where it gets really interesting, because then you can start to look at automated remediation, and, you know, the magic of the API comes into play.
Right? Because then you can start to take those results and go do some really cool stuff with them. Yeah. Yeah. I do believe that that is still a ways off in its practical application, simply because of the reality of a system that has to push configuration autonomously and be able to avoid any particular bug that's out there in a version of code in these 3 switches you have in this closet, when you have different code in these 3 other switches.
You know, the reality of networks out there is that they are not perfectly pure and clean. So having a service layer that can push config autonomously is, I think, still a ways off, because that's just the reality. I mean, literally, just starting with bugs alone. I mean, there are other reasons it's a little bit difficult. In theory, we could do all of it right now.
Yep. It's not hard to push config to a box, you know that. And to do it autonomously is just, hey, if-else statements, right? Yeah, yeah, yeah, yeah. First, push this config.
If this triggers, go do that. Yeah. It's not hard. And you know what? We do that to an extent in the network security realm much more.
If this occurs, shut down the interface, throw an alert. So we do some automation of configuration stuff, or throttle bandwidth, whatever we wanna do. But to scale that to, you know, entire swaths of the network or a data center or cloud footprints, all that kind of thing, I think is still a little bit difficult because of the nature of, well, the complexity of the devices. Not the complexity of the devices, the diversity of devices. And the complexity of the interconnections between the devices, I guess.
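The "if this occurs, shut down the interface, throw an alert" style of automation really is just conditional logic, as this hypothetical Python sketch shows. The event kinds, field names, and action strings are invented for illustration, and the code only returns a plan: it does not talk to any device or use any vendor API.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """A trigger coming from the monitoring system (hypothetical shape)."""
    kind: str        # e.g. "security_violation" or "bandwidth_spike"
    interface: str   # the interface the event was seen on


def plan_actions(event: Event) -> list[str]:
    """Map a trigger to the actions an operator has pre-approved."""
    if event.kind == "security_violation":
        return [f"shutdown interface {event.interface}", "send alert"]
    elif event.kind == "bandwidth_spike":
        return [f"throttle bandwidth on {event.interface}", "send alert"]
    else:
        # Unknown trigger: do nothing automatically and leave it to a human.
        return []


print(plan_actions(Event(kind="security_violation", interface="Gig0/1")))
```

The hard part discussed in the conversation is not this logic. It is trusting the trigger and knowing that the same action behaves the same way across diverse devices and code versions.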
Right? Yeah. And, you know, we have those folks out there that are vendor agnostic, that can push code to anything, and they can discover a network. So, certainly, there are advances going on there. And then how is that system going to make those decisions?
Well, that's gonna be based on the network visibility and observability that we've been talking about for the past half hour. So this is gonna be the underpinning for that next evolution of networking, whether it's completely automated remediation or some version of that that we're comfortable with as network engineers. Yeah. It's that nirvana of intent based networking. Right?
Yeah. Isn't that what we wanted? Intent based networking was gonna just do all the networking. I always like to say it's like Geordi La Forge, Commander Geordi La Forge on the Enterprise, if you like Star Trek. You just say computer, you know, or in our case, network.
Configure, you know, VXLAN between data center A and data center B. Maybe now we can do that with ChatGPT. Yeah. I was gonna say, yeah.
Chat. Yeah. Yeah. That's it. That's it.
But isn't that interesting? Intent based networking went from this idea of automated remediation and just telling the system your business intent, and then the system would go and do all the technical components. It went from that to analytics. Because in order to do that, you had to gather an incredible amount of telemetry from the network, you know, to be able to... Absolutely. Yeah.
And then a lot of the intent based networking companies sort of abandoned the networking part and became analytics companies, I've seen. Now, not thoroughly and totally abandoned, but in their marketing literature and in their videos and stuff, they focus on the analytics now. So I wonder if there was a clue there, like, you know, we're not selling this box, and analytics is so important, so maybe we should go in that direction. I think absolutely. You know, speaking from IP Fabric's perspective, that whole assurance piece, that whole thing about being able to understand what's going on in the network, is the key to so much.
I mean, obviously, not just that automated approach, but to fill in those gaps that you've described as we've gone along there. These are the things that are causing people pain whether they're automating or not. Right? I mean, a lot of people are still just trying to keep their networks up and running. And as we just add layers and layers of complexity, it just becomes... That's right.
Impossible to track. So to be able to fill in those gaps with an observability approach as you've described, or, you know, just increasing the visibility and understanding that people have got of their networks, can only help. Yeah. In fact, you're using the term over and over right now, fill in the gaps, and that's a great way to look at it. Because another component of network observability that we really haven't discussed yet is the ability to infer visibility.
Mhmm. And that's part of the ML thing, you know. Well, not necessarily ML, but let me give you an example. You have resources on your local network, your own private data center. You have resources in cloud A. You have some resources in cloud B.
You have a SaaS application where you have no resources. You're just consuming it. You have, you know, workers at home. You have workers wherever. Unless you're also getting incredible telemetry from all the different service providers around the world all at once, you really don't see all of it.
You have these kinds of pockets of visibility. And then within those pockets, you have pockets within pockets. So maybe in AWS, you can see very clearly what's going on with your resources, but you don't really know what's going on with some underlying infrastructure within AWS or Azure. So you're missing some visibility. Using network observability, you can start to see different things.
And like I was saying earlier, you can see, like, the round trip time of a simple packet. And then from that, along with other metrics, sometimes much more complex metrics, you can infer the problem is with AWS's underlying infrastructure, and it's likely this. So you are now able to infer visibility where there really wasn't any. What's that mean? Again, that's a matter of probability as well. So always, can you trust the system?
And the data scientists and product managers and engineers behind the scenes are always trying to improve that probability, make it higher and higher and more trustworthy. And, yeah, it comes all the way back again because, ultimately, once upon a time, that would have been the trusted, hardened network engineer with decades of experience who you would be relying on to do that thing. Mhmm. So, yes, it just goes to show how we've moved on. Yeah.
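The idea of inferring visibility as "a matter of probability" can be illustrated with a small Bayes-rule sketch in Python. It is not how any particular observability product works. The prior, the likelihood values, the 2x baseline threshold, and the function names are all hypothetical assumptions chosen only to show how a round trip time signal plus one other signal can raise confidence that the fault sits in a provider's underlying infrastructure.

```python
from statistics import median
from typing import List


def bayes_update(prior: float, p_if_true: float, p_if_false: float) -> float:
    """Posterior P(H | evidence) from a prior P(H) and two likelihoods."""
    numerator = p_if_true * prior
    denominator = numerator + p_if_false * (1.0 - prior)
    return numerator / denominator


def rtt_is_elevated(samples_ms: List[float], baseline_ms: float,
                    factor: float = 2.0) -> bool:
    """Treat RTT as elevated when the median exceeds factor x baseline (assumed rule)."""
    return median(samples_ms) > factor * baseline_ms


def infer_provider_fault(samples_ms: List[float], baseline_ms: float,
                         own_resources_healthy: bool,
                         prior: float = 0.10) -> float:
    """Rough belief that the provider's underlying infrastructure is at fault."""
    belief = prior
    if rtt_is_elevated(samples_ms, baseline_ms):
        # Hypothetical likelihoods: elevated RTT is common when the provider
        # has a problem, less common otherwise.
        belief = bayes_update(belief, p_if_true=0.9, p_if_false=0.2)
    if own_resources_healthy:
        # Hypothetical likelihoods: our own resources looking fine makes a
        # provider-side cause more plausible.
        belief = bayes_update(belief, p_if_true=0.8, p_if_false=0.3)
    return belief


if __name__ == "__main__":
    belief = infer_provider_fault([80, 95, 90], baseline_ms=30,
                                  own_resources_healthy=True)
    print(f"belief that provider infrastructure is at fault: {belief:.2f}")
```

For simplicity the sketch only updates when a signal is present and treats the signals as independent. A real system would need calibrated likelihoods, which is the "improve that probability" work mentioned above.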
Yeah. It's interesting stuff. I think incredibly valuable, and more than valuable, necessary. Yeah. A requirement, simply because of how we consume applications today.
Yeah. Phil, really appreciate your insight. It's been a fascinating conversation, and one that, as you rightly pointed out before we started this recording, we could go on and on for hours with. Right? But, clearly, a space that's growing and evolving and much needed to help people with the increasing complexity that they're faced with, I suppose.
Can people reach out to you to discuss this further if they're interested? Yeah. Absolutely. You can find me on Twitter at network_phil.
I am still very active there. Excellent. You can search my name, Phillip Gervasi, with 2 l's in my name, on LinkedIn. I'm also very active there.
And then my blog is networkphil.com, where I'm not very active, but you can find more contact information there. Awesome. Listen, Phil, thank you ever so much for your time. Much appreciated, and thanks to everybody for listening. Tune in next month for another Community Fabric episode.