Charity Majors is the founder and CTO of Honeycomb, provider of observability tooling for modern engineering teams to build resilient production software that delights customers and reduces toil. Charity tells Beyang about how Honeycomb derives its definition of observability for software systems from its original definition in control theory, and how observability differs from monitoring and logging. She shares war stories from her time keeping systems online at Facebook and Parse, gives her predictions about how the landscape of observability and monitoring tools will evolve, and discusses how developer tools can make programming more accessible to everyone.
Charity Majors, CTO and founder of Honeycomb: https://twitter.com/mipsytipsy
Christine Yen, CEO and founder of Honeycomb: https://twitter.com/cyen
Linden Lab: https://en.wikipedia.org/wiki/Linden_Lab
Observability in control theory, as the mathematical dual of controllability: https://en.wikipedia.org/wiki/Control_theory#Controllability_and_observability
Testing in Production: https://www.honeycomb.io/blog/testing-in-production/
The Four Tendencies, by Gretchen Rubin: https://www.amazon.com/Four-Tendencies-Indispensable-Personality-Profiles/dp/1524760919
Other companies founded by ex-Facebookers, partly inspired by tools at Facebook:
DORA report and DORA metrics: https://cloud.google.com/devops/state-of-devops/, https://thenewstack.io/dora-2019-devops-efforts-improving-but-not-done/
10x engineer (trope): http://svdictionary.com/words/10x-engineer
This transcript was generated using auto-transcription software.
Beyang: Alright, I'm here with Charity Majors, former engineer at Facebook and Parse, now co-founder and CTO of Honeycomb, and, I believe, the person who first coined the term observability as it relates to DevOps and modern application development. Charity, welcome to the show.
Charity: Thanks for having me. I think that I was the first person to give observability a specific technical meaning. People have been using observability to refer to generic telemetry, and of course there's the long and storied history of observability as the mathematical dual of controllability. But I think that we were the first people to sit down and go: okay, this is about the unknown unknowns, and if you accept that definition, then what else proceeds from there? So we'll accept that part.
Beyang: Thank you for the clarification. I think we have a lot to talk about today. I want to dive into observability and Honeycomb and your thoughts on a lot of different things. But before we get into that, we like to kick things off just by asking people to tell us a bit about their backstories. How did you get into programming and engineering, and what's been your brief life story as an engineer that's brought you to this point?
Charity: Oh, I have a weird and wandering road. I was homeschooled in the backwoods of Idaho, and I went to college on a performance piano scholarship when I was 15. It was really just, get me out of here. Oh my God, I haven't seen another human being in six months, I'm going insane. But I got to college and discovered computers there, because I noticed pretty quickly that all of the music majors were poor, and I was very tired of being poor. So I swapped keyboards, and I've just kind of been tinkering ever since. And you know, I love it. I really love it. There are so many people with nontraditional backgrounds in engineering, and it makes me a little sad that I feel like the doors are closing a little bit, you know? There are these certifications and these qualifications and stuff now, whereas when I got started, everyone was so desperate that if you could write code, if you knew how to launch Emacs, you were hired. So, you know, professionalization is a good thing, but there haven't been many growth industries in the entire world through my lifetime, and computers has been that. It's been a real source of opportunity for weirdos.
Beyang: Yeah, definitely. And at least in my experience, I've found a fairly weak correlation between official certifications, even degrees, and skills. So,
Charity: Yeah, no, it's not about being able to do the work. It's about covering your ass as a hiring manager. Totally.
Beyang: Yeah, well, you know, I certainly hope that, uh,
Charity: I feel like we could rant about this for a long time.
Beyang: This is like a whole other podcast. Yeah, definitely. I'm actually just a little bit curious: you said your original, I guess, academic pursuit was piano performance. Are there any favorite pieces or songs that you
Charity: I love Rachmaninoff. I really loved the Romantic composers, you know. So I did music, and then I did Latin and Greek, and then I did electrical engineering, and then I got a job as assistant admin for the math-stat department. This is back when they were still giving kids root on campus, which was a terrible idea. What were they thinking?
Beyang: So much power.
Charity: Oh my God. You know, and then I had root for the entire campus, and I kind of slowly stopped doing schoolwork and started working more and more. And then I got offered a job in Silicon Valley for like $75,000. I had never heard of anyone making that much money, and I was just done. I will go do that. And I've been down here ever since. But I feel like I've been really lucky in my career. I've worked with some amazing people. It's a small, small Valley, and there are some people I've been working with over and over, off and on, for like 15-odd years now. And I've had some shitty jobs, but they were mostly pretty brief, and they weren't emotionally traumatizing. They were just boring. I worked at Linden Lab for five years, straight up, my first real job, and that set the bar for me so high. Christine and I, once in a while, when we're feeling like we're failing at everything, will just look at each other and go: if we can just create a job for the people who work here that sets their bar high, so that they don't settle for bullshit, then it'll all have been worth it. Because there's really no excuse for accepting shitty jobs; our skills are in far too much demand.
Beyang: Definitely. I mean, the part that I think gets everyone into programming is the creative part of it. You're creating these beautiful abstractions and algorithms and data structures, and that's what I think we should always be striving toward.
Charity: Or the impact. I feel like DevOps is this noble attempt to tear down a wall that never should have existed, right? It's the wall that you throw your code over, this idea that you're done with your job once you have merged your changes to master. That's terrible. Not only is it bad for the code and the systems and reliability, it's bad for you, because it decouples you from the outcome and the impact of what you've built. And I feel like all we're trying to do is stitch back this feedback loop, so that the people who are experiencing the pain are the ones who are empowered to fix it, and who have all the context fresh in their heads. Just getting that feedback loop going. Ops has a well-deserved reputation for masochism, and the point here is not to invite everyone to be masochists. The point is that this actually makes things better, that it shouldn't have to be painful to support these systems. And I definitely believe that.
Beyang: Yeah, I think the way that you describe DevOps is interesting, because when you talk about it, it's like: in the beginning, dev and ops were one, and then there was this schism, and now we're trying to put the pieces back together and rebuild that feedback loop, as you said. But for most of the people I've spoken with, it's kind of a new term, right? DevOps is this new thing where we're merging these two separate and distinct things.
Charity: not good at all. And it's just getting back to the way it once was. Right when we were all happily like editing source code files, live in production as root, you know, the good old days. Yeah, no, like, I mean, you can understand why specialization emerged because of complexity and because it just got too, it just became impossible. Right. And, and so the trick, I think that the nail that we have to thread moving forward is to, is to, you know, allow for specialization, but not to lose sight of that feedback loop that really critical, just like heartbeat of shipping code to your users and understanding what you've built.
Beyang: Yeah, and I think that's a good segue into observability. So, first of all, what is observability, and how does it help glue back together those two sides of software development?
Charity: Yeah, totally. At a high level, observability is being able to ask any question of your systems, to understand any state that the system has gotten itself into, without having prior knowledge of it, without having seen it break before, and without shipping any new custom code to handle the question you're trying to ask, because that would imply you could have predicted what question you were going to need to ask, right? Think back to the telemetry you've had to date: metrics and logs. With logs, you can always find what you know to look for. But if you didn't know to log it, and you don't know to look for it, well, you're screwed. And with metrics, they're these little counters that fire as your code executes. Your code might fire off a couple hundred metrics while one request is executing, but none of them are connected. They're not tied together, because there's no connective tissue. That means you can't ask all these new questions, like: okay, this metric spiked; well, what else did the things that spiked have in common? Because you didn't collect the data in the right way, you can't ask those questions. You can't understand these complex states. So observability: the term comes from control theory, which has its roots in mechanical engineering. When we were building Honeycomb (and non-trivially, we had to build our storage engine and query planner and everything from the ground up to support these data structures), far more difficult than building the technology was figuring out how to talk about it, because every term in data is so overloaded. Every data tool's demo looks exactly the same.
Six months in, I was still trying to figure out what we were. I knew we weren't a monitoring tool, because monitoring is reactive, right? You define these arbitrary thresholds, and then you just monitor the thing, checking over and over: is it up, is it up, is it up? We really weren't that. And it wasn't until six months into that first year that I saw the term observability, I think it was from the Twitter team, and I looked up the definition, and I just had light bulbs going off in my brain: oh my God, this is what we're trying to do. So, to back up just a little bit: Christine, my co-founder, and I both come from Parse, the mobile backend-as-a-service. At Parse, we had about 60,000 mobile apps on our backend when we got acquired by Facebook. And that was around the time that I was coming to the horrified conclusion that we had built a system that was basically undebuggable, by some of the best engineers I have ever known, doing all the quote-unquote right things. A few times a day, it'd be: Disney says their app is down. And I'd be like, well, behold my wall full of dashboards. Everything's fine, it's all green, right? Because maybe Disney's doing four requests per second and I'm doing a hundred thousand requests per second; it's never going to show up in my time-series aggregates. So it could be any app complaining they're down, and I'd have to go figure out what was going on in a very brute-force, manual-labor process, because you've got your top-10 lists, and you've got the questions that you defined in advance to monitor. And if those weren't the problem, you're looking for a needle in a haystack. So, okay: Disney thinks their app is down.
Well, it could be something that they did, something that we did, some combination of the two. Or, because we're using these big shared pools of unicorn workers, shared databases, it could be that any one of those other 60,000 mobile apps is doing something that caused a starvation of resources on any one of those pools. It's just literally impossible to figure out what the fuck is going on. And I tried every tool out there. The first glimmer of hope we had was when we started feeding some datasets into this tool at Facebook called Scuba, which is an aggressively hostile tool, like, it's not fun to use, but it does one thing really well: it lets you break down by dimensions of high cardinality. So if you've got 60,000 mobile apps, it'll let you break down by that app ID, and then by whatever else you want. I didn't really get why at the time, but this is a core pillar of what it means to have observability: high cardinality. Because if you're looking for a needle in a haystack, what is going to be the most identifying information? It's going to be any unique ID, right? And with everything out there that's built on top of metrics, you can't have high-cardinality dimensions and tags. You could have maybe a hundred, and then you're just cut off: you'd explode the key space, and you can no longer tag the metrics with that data.
Beyang: Yeah, so high cardinality. Help me understand,
Charity: It's the number of unique items in a set. So imagine you have a collection of a hundred million users. Your social security number is the highest possible cardinality. Last name and first name are very high cardinality. Gender is very low cardinality. So if you're searching for somebody and you're searching by gender, that's going to not be super useful to you
Beyang: to get a lot of results and you're
Charity: most datasets. Exactly. Exactly. And so this tool that Facebook, that like, let us break down by these high cardinality dimensions suddenly we could. And if you think about like all of the questions out there that you use a software engineer want to answer, they're often by chaining together, many of these high cardinality. So it's high. Cardinal and high dimensionality. So it's like this bug is only tripped when it's on a user using this version of iOS using this version of the firmware or using this version of the app, using this region, using this language pack using, you know, like every single one of these is high cardinality. And that's the only way to like, Zero and track it down. But once you can do that, it becomes really easy, like really dead simple. And this is what I realized was that, you know, instead of like, The way that we could put debug right now is like we form these castles in our minds and we guess what the answer is. And then we go look for evidence of that answer. Well, instead of if we had a tool that just let us take one foot after the other and follow the train of trail of breadcrumbs so that it, you don't have to know what the answer is. You don't have to know where you're going to end up. You can just start looking. So, so for example, you know, I see a spike, What's wrong. Well, I don't know what's wrong, but I can go and break down by endpoint. Which ones are slow. Oh, it looks like all the right end points are slope. Is it all of them? no. It's just the ones that talk to this particular backend. Is it all of them? No, it's just the ones to that shard or those two shards. What do they have in common? Well, the primary, you know, so it's just like, you're just, it's, it's almost like bringing science back into computer science. And this is why, like, I feel very strongly about things like testing in production. 
All right, I don't know how much you want me to expand this rant, but testing in production is inevitable. It is something that all of us do; we have to do it. It's called reality, you know? I feel like TDD was great, the most successful software movement of my lifetime, but it got predictability and repeatability at the sacrifice of everything interesting. Everything interesting is now a mock; tests end at the edge of your laptop. And everybody should know better than to think that their staging environment is going to resemble production, right? Instead, if, rather than writing code until your tests pass, you write code while instrumenting with an eye towards how you're going to understand whether it's working or not in production; and if you have everything automated so that when you merge to master it gets out to production in a matter of minutes; and if you have the muscle memory to just go and look (is it doing what I expected it to do? does anything else look weird?), closing that loop right there, you will catch 80 or 90% of all problems before your users ever even catch a whiff of them. It is weird that we aren't doing this. That's how we used to write code, right? We would write code right there on production, hit save, reload the browser, see what happened. That virtuous feedback loop has gotten all broken up and tossed to the four winds, and we've lost that really tight, really grounded-in-reality form of testing.
Beyang: yeah, totally. And a lot of what you're saying really resonates with me, cause you know, just to be clear to our audience, Sourcegraph is a Honeycomb customer and we use it for, One instance of Sourcegraph, which is the most important instance, sourcegraph.com. but we, we can't use it for a lot of our on prem, you know, customer instances, because, you know, they don't want to send their, the data over to, the cloud, and the kind of difference in the experience of debugging an issue on sourcegraph.com versus one of those on-prem instances, is quite different. You know, when, when we don't have Honeycomb available, it's like, okay, there's a, Prometheus's alert. That's firing. It indicates that, you know, some, an end point is, you know, taking longer than we expect. Okay. Then the question is like, okay, let me kind of pin down that to a specific issue. And that's a fairly big jump to have to make. Cause then you have to like dig into the logs. You might like try to reproduce it in the UI. You open up Jaeger to try to, you know, capture a trace as it happens. And it's just kind of this. Song and dance. Whereas, I feel like with Sourcegraph.com, we can answer a lot of those questions just within Honeycomb for us, because we're like you said, it's, it's, it's providing, that high cardinality data set to you and you can just kind of go in and explore the data, instead of having to jump to different tools
Charity: This is an experience that very few engineers have ever had, and this is why we think computers are hard: fundamentally, because we have so little visibility and insight into how they work, and we don't have the tools to even ask basic questions. You know the experience of, after you've figured it out, it's the most obvious thing in the world, right? Those categories of problems should never have been hard. It's just that you had to figure them out almost from first principles, by reasoning about it in your head, and that's insane. We can't do that. We're not good at that. It's so much easier when you just bring it out into the open, where you can just watch it. You can ask simple questions, you can see what's happening, and then you don't have to guess. You don't have to model the entire caching ecosystem in your brain just to make a reasonable guess about what's going to happen next.
Beyang: Yeah. Now, let me take a devil's advocate position here a little bit. Earlier you were talking about the importance of high cardinality in observability tools. What would you say to the skeptics who say: high cardinality is great, but you're never going to have infinite cardinality, and this whole promise of giving someone access to a dataset that represents everything happening in production is never going to be the case? There's always going to be something that you miss, because you have to instrument your app to track that particular event beforehand. And so you're always going to go back and forth between
Charity: So it's not really about always being able to find exactly what's wrong. It's much more about being able to find where exactly the problem lives, right? From the perspective of your Honeycomb dataset: if there's a firmware bug that's causing errors on, you know, 10% of hosts, Honeycomb is not going to tell you that, but it's going to tell you exactly which subset is erroring and what they have in common, right? The hardest problem in distributed systems is not debugging the code; it's figuring out where the code that you need to debug lives. And this is where it's so interesting, because with distributed systems recently, we've taken what used to all live inside the application monolith and we've just blown it up. Now we're hopping the network all over, so you can no longer attach a debugger and just step through your code, right? If you want to follow your code logic, you actually have to do all this operational stuff and hop from machine to machine. So I feel like hardware is not yet cheap enough that we can pump the output of, like, a debugger run of all of our processes into something that's tractable right now. But what we can do is say: these are the conditions under which the problem happens. That's enough for you, operationally, to get to a known good state and be stable, right? Resiliency is not about making it so that things never fail. It's about making it so that lots of things can fail and your users don't notice.
Beyang: Yeah. How prescriptive are you about what sort of things your users and customers should track? Obviously, I could in theory send anything over in the metadata that accompanies an event, and
Charity: Well, I would say about a third of the magic and power of observability is in gathering the data and storing it in the right way. With metrics, like I said, there might be 200 different pieces of data there, but they're all blown up and separate from each other. You can't work backwards: you can't go back from 200 metrics to an event, but you can go from an event to 200 metrics. So it's actually very fundamental to observability that the source of truth is these arbitrarily wide structured data blobs, from which you can derive logs or metrics or traces or whatever. And it's important that they be arbitrarily wide, because anything with a schema or indexes is, again, locking you in. You're saying: I am only ever going to want to know these facts about my system, which is kind of anathema, right? You want to be able to toss in more detail whenever it occurs to you: oh, this might be interesting. And you want to gather it up basically as one blob per request, per hop. So your request enters the system and maybe hits the API server: that's one blob. Maybe it bounces off to hit the payment service: that's another blob. When the request enters a service, we initialize an empty Honeycomb blob, and then we pre-populate it with everything that we know or can infer about the system: the language internals, the parameters that were passed in, all the basics. And then, while it's executing in your service, you can effectively do a printf, you know, just print any details that you want into it. What you want to capture is any unique IDs, anything where you're like: oh yeah, somebody's going to file a bug someday, and I'm going to want to be able to find it by this. Shopping cart ID, et cetera. Just stuff them all in.
And then, when the request is ready to exit the service, it ships that off as one very wide structured blob. We find that a maturely instrumented service will usually have 300 to 400 dimensions; it'll be 300 to 400 columns wide, so to speak. That just seems to be where they stabilize, and that's way more than you can keep track of in your head. But that's fine, right? Because, for example, with BubbleUp, if you see a spike and you're like, ah, what's going on here, you just draw a little bubble around it, and then we precompute all of the dimensions both inside the circle and outside the circle, and then we diff them, and then we sort them, so that the ones that are different come to the top. So if these requests are different in these five ways, well, you can just see that at a glance. You don't have to keep a dictionary in your head and refer to it. There should be very little friction to tossing data in. And people can do this. I recently found out, actually, that Amazon follows an internal logging spec that is almost exactly like this, and has for like a decade. It was really difficult for us to figure that out. I wish they would have just open-sourced that, so we could have leapfrogged that year. But no, it's really important that you gather things up that way, because that allows you to ask all of these novel questions later on, questions that you may never have predicted you'd want to ask, or associations that might not have been obvious. Because it's all about making it visible and easy for the human user to figure out what's important to them, which means it's about aligning you with your users, right? And aggregating it around the request means that you'll be able to see exactly what kind of experience your user is having.
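[Editor's note: the shape described here, one wide blob per request per hop, plus the inside-versus-outside diff behind BubbleUp, can be sketched roughly as follows. This is not Honeycomb's actual implementation; the field names and the simple frequency diff are illustrative assumptions.]

```python
from collections import Counter

# One wide event per request: pre-populate with what's known at entry,
# then, printf-style, toss in any ID someone might one day search by.
event = {"service": "api", "endpoint": "/checkout", "region": "us-east-1"}
event["user_id"] = "u_12345"   # hypothetical IDs, for illustration
event["cart_id"] = "c_98765"
# ...on exit, ship `event` as a single wide structured blob.

def bubble_up(selected, baseline, dimension):
    """Diff a dimension's value frequencies inside vs. outside a selection.
    Values much more common inside the bubble float to the top."""
    inside = Counter(e.get(dimension) for e in selected)
    outside = Counter(e.get(dimension) for e in baseline)
    def freq(counter, value):
        total = sum(counter.values())
        return counter[value] / total if total else 0.0
    values = set(inside) | set(outside)
    return sorted(
        ((v, freq(inside, v) - freq(outside, v)) for v in values),
        key=lambda pair: -abs(pair[1]),
    )

# Requests inside a latency spike skew toward one app version:
spike  = [{"app_version": "2.1"}, {"app_version": "2.1"}, {"app_version": "2.0"}]
normal = [{"app_version": "2.0"}, {"app_version": "1.9"}, {"app_version": "2.0"}]
top_value, _ = bubble_up(spike, normal, "app_version")[0]
print(top_value)  # "2.1" surfaces as what the anomalous requests have in common
```

The point of the sort-by-difference step is that the human never has to remember all 300+ dimensions; the tool surfaces the ones that distinguish the anomaly.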
That's another core pillar of observability, honestly: shifting the emphasis away from the systems and the infrastructure, to the user's experience.
Beyang: Hm. Because in order to gauge what the user experiences, you really have to pull in a variety of data, right? It's not just a single log line, it's not just a single
Charity: Yeah, it's all about the perspective of that request. We don't actually care if, you know, the cache server was down or something was down. What we care about is the ability of each request to exit successfully or not, right? Whereas if you're doing traditional monitoring of your infrastructure, all you have is aggregates. You can't ever actually trace that back and figure out what anybody's actual performance was like.
Beyang: Yeah, that makes sense. Kind of taking a step back here for a moment: one of the things that I've struggled with, and I imagine a lot of listeners have also dealt with in the past, is that there are just a lot of different tools in the space of observability, monitoring, log aggregation, that sort of thing. You have APM tools like Datadog and New Relic, you've got distributed tracers like Lightstep, Zipkin, Jaeger, there are application-level monitoring tools like Sentry, and then there's the
Charity: And it's in a really confusing stage right now. The reason is that they're all in the process of converging. I think over the next three, four, maybe five years, you're going to see the collapse of APM, monitoring, metrics, and log aggregation into one thing, except the security use case, which I think might hang on as its own thing for a while. You can see this already starting to happen in the acquisitions, like Splunk acquiring SignalFx and Omnition, right? And it's partly an artifact of this three-pillars myth. There's no such thing as pillars in observability, because those are just data structures, just data types. This is something that all of the big players like to say, that there are three pillars of observability: metrics, logs, and traces. They just conveniently happen to have a metrics product to sell you, a logs product to sell you, and a tracing product to sell you. And not only is this wasteful, it is worse than that, because you should only have to pay to store that data once, not three times. And if you're paying to store it three times, in these prematurely optimized formats, then you have to have a human in the middle who's just sitting there copy-pasting from one to the other. This is where you've got people who sit and look at their dashboards: there's a spike, I want to know what's going on. So then they have to turn over to their logging tool, a completely different dataset, to dive in and figure out what's going on. And if they want to trace it, they have to copy-paste an ID over to their tracing thing. That is broken in so many ways. Tracing should just be a visualization, viewing by time, and that's it. It should not be a separate product.
Beyang: Got it. And so, given the way Honeycomb looks at these different facets of observability, how do you integrate all that into one application? Do you think it's all going to be one interface, or do you think there are going to be plugins
Charity: Yeah. I mean, the important thing is that you need to be able to come in at a very high level, like your dashboard view, right? There's a dip, there's a spike, there's something. And you need to be able to slice and dice your way all the way down to the raw rows, so that you can figure out exactly what is going on. Observability absolutely depends on having access to raw rows, because if you don't have access to the raw rows, then you can only ask the questions that you happened to aggregate on when you were ingesting the data. But if you have those raw rows, you have the ability to slice and dice and cut the data up in various ways. And you should be able to flip back and forth between that and tracing, which is just a waterfall view: you're just viewing the events by time instead of by count. They should be two sides of the same coin. It should feel absolutely seamless to move back and forth.
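[Editor's note: the "two sides of the same coin" idea, aggregate views and trace waterfalls derived from the same raw rows, can be sketched like this; the span fields and values are invented for illustration.]

```python
# Raw rows: one span-like blob per hop, sharing a trace_id per request.
rows = [
    {"trace_id": "t1", "name": "api",     "start_ms": 0, "duration_ms": 30},
    {"trace_id": "t1", "name": "payment", "start_ms": 5, "duration_ms": 20},
    {"trace_id": "t2", "name": "api",     "start_ms": 0, "duration_ms": 12},
]

def count_by(rows, dimension):
    """Aggregate view: count events per value of a dimension (view by count)."""
    counts = {}
    for r in rows:
        counts[r[dimension]] = counts.get(r[dimension], 0) + 1
    return counts

def waterfall(rows, trace_id):
    """Trace view: the same rows, filtered to one request, ordered by time."""
    spans = [r for r in rows if r["trace_id"] == trace_id]
    return sorted(spans, key=lambda s: s["start_ms"])

print(count_by(rows, "name"))         # the dashboard-style aggregate
for span in waterfall(rows, "t1"):    # the waterfall for one request
    print(span["name"], span["start_ms"], span["duration_ms"])
```

Because both views read the same rows, moving from a spike in the aggregate to the exact trace behind it needs no copy-pasting of IDs between separate systems.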
Beyang: It's like data science, in a way. It's just another data set that you're trying to explore, and you have your database and then various, I guess, visualizations.
Charity: Right, so you might be slicing and dicing, and then, oh, you find an example of the error. So you want to trace it and visualize where exactly that time is going. Oh, and then you see the time is going there. Then you might want to zoom back out and see who else is impacted by this. You're just going in and out of the views like that. So, you were saying, yes, the space is very congested. And in fact, when Christine and I were starting this company four and a half years ago, we had so many people tell us very condescendingly that there was no room for anything new, that it's a solved problem, there's nothing left to be done. Which is kind of true for metrics. I think that metrics have reached the end of the road. I do not think there will ever be another, better, shinier metrics product built than Datadog and SignalFx. That horse has been driven into the ground. But I think that what you're seeing right now is companies in those three or four different markets all trying to get technically to where Honeycomb sits right now, with our arbitrarily wide structured data blobs, faster than we can get to where they sit on their business side.
Beyang: Yeah, that makes sense. And it's definitely far from a solved problem. I mean, the gap between dev and ops is still a difficult one to traverse. And revisiting that for a bit, I think one of the challenges is that for a lot of developers who don't come from an ops background, ops can seem a bit intimidating.
Charity: Oh yeah. We've worked very hard to make people think it was scary, too.
Beyang: Yeah. And so I was going to ask, for a tool like Honeycomb: say a customer buys Honeycomb, and they want all their developers, or as many developers as possible, to know that it exists and to use it and understand it, because they want to bridge that gap. I've talked to a lot of other developer tools creators who sometimes have trouble spreading the word. You know, the people who brought you in are gung ho, they love your product, but then the rest of the team is like, ah, that seems like not my job, or not my thing, or that's yet another tool I have to learn. Can you talk a little bit about how you grow awareness inside companies about this amazing tool?
Charity: Yeah, this hasn't actually been something that we have struggled with too much. And I think part of that is because, from the very beginning, we have always seen ourselves not as building for individuals, but as building for teams. So you'll notice there are these Slack buttons where, if you have a graph and you're like, ah, cool thing, I want to share this with my team, you just push the button and it goes to Slack. And then they see the preview of the graph, and they can also click it, which jumps them into Honeycomb, and then they can see not only your graph, but your history. And I feel like this is an area where the entire industry has fallen down. We talk about people and culture so much, but we don't really bake that into our products. Christine and I used to talk a lot about how debugging is like following a pathway. Sometimes you go down the wrong fork, and you need to be able to go back to the last point where you knew you were on the right path. And similarly, each of us is working on our own little corner of this giant distributed system. We know our own little plot intimately, for a while, but we're responsible for the whole thing, and we don't know jack shit about anyone else's corner. Right? And so you need to wear grooves in the system as you use it, so that people who come along after you, or you, who comes along after you a few months from now, when you've forgotten everything that you ever knew about what you were doing, can look at your history: what were you doing, what kinds of questions did you ask, how did you actually solve a problem? So, one use case that people use a lot is...
You know, say I get paged about MySQL at 2:00 AM, and I don't know fuck all about MySQL, but I know that the experts at our company are, like, Ben and Emily. And I remember the last time this happened, I think Ben was on call, and it was like 2:00 AM on a Wednesday or Thursday. Right? So I can search back. I can just go back and look: what did Ben ask that involved MySQL? What questions did he ask? What did he post to Slack? What did he think was meaningful? What did he rerun a bunch of times? It's like a better version of everyone's bash history file, right? You just want access to a little snippet of their brain, so you don't have to call them and wake them up. And that collaboration aspect, collaborating, is not just for other people's benefit. Collaborating is for past you collaborating with future you. It's the same mechanic there. And for all that we talk about collaboration, it astonishes me that even basic history stuff isn't baked into most tools. Yeah. I don't remember, what was the question?
Beyang: No, I mean, I want to dig into that a little bit more, because at Sourcegraph we used to have this thing called the ops log, and we still do. Basically, the idea is, anytime you're on call and you have to resolve...
Charity: Yeah, but the stuff you remember to write down is never going to be the actual stuff that you needed.
Beyang: Exactly. Exactly. So, it helps to a certain extent, because if you take the time to write stuff down, and if that stuff happens to be the relevant stuff later, it's helpful. But one of the shortcomings we've seen is that that's not always the case. Can you talk about how Honeycomb helps address that?
Charity: Yeah. Yeah. I mean, you should not have to consciously decide that this is going to be important, because it's always the stuff that you do when you're not thinking that turns out to be important. Or you're just panicked, you're in a rush, you're not thinking about doing things with a record for the future. It really has to be something that is just ambiently captured, the way that you work with the system, independent of you deciding. Right? The way that you describe your behavior is never the same as your actual behavior. And I feel like I also just want to reward curiosity and exploration, and people who are having fun with their systems. I feel like one of the most unfortunate characteristics of most systems is that the person who knows them best is the person who's been there the longest, full stop. And we've just decided that this is normal, that this is just how it is. But it's not; it's a sign that your tooling sucks, because it's a sign that you're not actually using your tools to understand your systems. You're mostly relying on your memory of past outages and your scar tissue. Right? And if you were actually relying on your tools, then it should be the case that the best debugger is the person who spends the most time debugging, or the person who's the most curious. Every team has a couple of people who are like this, right? Who just follow their nose. And the unfortunate thing is that so many of us get this beaten out of us, because you pick up the rock and then, whoa, that was a mistake. Right? But what if we could not punish people for following their curiosity, but reward it? If they could swiftly and simply just get answers to their questions, and then understand their systems a little bit better than they did before?
That's just a better world. And you know, I have now worked on two teams where that was the case, where it wasn't the people who'd been there the longest who were the best debuggers, it was the people who enjoy debugging the most: Parse at Facebook, and here at Honeycomb. And this is one of the many, many ways... I feel like the biggest hurdle that we face in computers is our low standards for ourselves. Our expectation that the world is just this crappy, and this is as good as we get to expect. Like on call: I believe that every engineer who builds 24/7 systems should participate in on call in some form or another. And I believe that that commitment should be met by management committing to give you the time to actually fix things, so that it isn't hell. If you're getting woken up more than two or three times a year, it's too much, and we should treat that like a heart attack, not like diabetes. And the reason that these systems are so flaky and so fragile is because we've never understood them. We keep shipping new code every day that we do not understand, onto these coughed-up hairball systems that we've never understood, and we're just closing our eyes, hitting deploy, crossing our fingers, and hoping for the best. And the outcome's not going to be that great. But that is a choice, right? Yes, that is the most visibility that you could expect from monitoring tools. But with observability tools, where you can swiftly pinpoint exactly which requests are failing and what's different about them, and correlate it to a change in the system, there's no excuse for this. Your system should be comprehensible.
People should be in the habit of spending time looking through the stuff they just pushed to prod, understanding it, and expecting it to be understandable.
Beyang: Yeah, it's kind of the idea that, like, I have some SQL terminal... and maybe SQL is not the right analogy here, because that's structured, it assumes a schema. But some sort of query language over a database that contains every single log line, every single request latency with metadata, every single distributed trace. And that is the way I think about debugging a production issue: hopping into that and exploring. Is that the, like, holy grail here?
Charity: I think I'm following. Yeah, I think so. Like, yes. So the holy grail here is to push back that moment of figuring out that there is a problem to way earlier, right? Instead of getting a Jira task about something that's been broken for months and that everybody's forgotten about, it should be, you know...
Beyang: Okay. So it kind of starts with alerting then, like, being alerted that there's an issue?
Charity: So, I think that with most alerting systems, you have to set the threshold somewhere that won't kill you with alerts, and therefore you're going to need to go and look at your systems through the lens of your instrumentation and see: is it doing what I expected it to? Does anything else look weird? And you, the human interacting with your data, are going to pull out so many more subtle bugs and problems than would ever rise to the level of paging someone. Which is why we really have to make it a production practice, and an expectation, that everyone who's writing code spends time every week with their eyes on production, on their code. Because otherwise you're going to accumulate all these little shitty bugs that are not quite catastrophic enough to wake someone up, but are still bad, you know? So with Honeycomb alerts, the way we've built them, we're thinking less of the high-level, you're-on-call, your-customers-are-in-pain stuff. That's a job for SLOs. But if you're an engineer who's developing on an endpoint, you might want to put a trigger on there, because you want to know if anything out of the norm is happening while you're shipping. So over a two or three week period, while you're making changes to a particular endpoint, you might just put some triggers there to shoot you a Slack message during the daytime and let you know if... I don't know, just make up some things that you think might be a sign of something weird or bad or odd. Right? It's like bringing your systems into this constant conversation with you. You're in conversation with your code and your users, right now.
Because you're going to write some code, you're going to ship it, and it's going to have some unintended consequences. But while you're working on it, you're going to be there, your eyes are going to be on it, and you're going to notice some of these things.
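The kind of short-lived, daytime-only endpoint trigger Charity describes could look roughly like this. This is a generic sketch, not Honeycomb's trigger API; the thresholds, field names, and the Slack call are all invented for illustration:

```python
from datetime import datetime

def check_endpoint(events, endpoint, error_rate_threshold=0.01, slow_ms=500):
    """Return a warning string if recent traffic to `endpoint` looks odd."""
    rows = [e for e in events if e["endpoint"] == endpoint]
    if not rows:
        return None
    errors = sum(1 for e in rows if e["status"] >= 500)
    error_rate = errors / len(rows)
    slow = sum(1 for e in rows if e["duration_ms"] > slow_ms)
    if error_rate > error_rate_threshold:
        return f"{endpoint}: error rate {error_rate:.1%} over threshold"
    if slow / len(rows) > 0.05:
        return f"{endpoint}: {slow} requests slower than {slow_ms}ms"
    return None

def maybe_notify(message, now=None):
    """Only nudge Slack during the daytime; this is a nudge, not a page."""
    now = now or datetime.now()
    if message and 9 <= now.hour < 18:
        # post_to_slack(message)  # hypothetical webhook call
        return message
    return None
```

The design point is that this is a conversational nudge rather than a page: it fires into Slack during working hours while you're actively changing the endpoint, and you delete it when you're done.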
Beyang: Yeah, that makes total sense. It's like, I guess part of it would be, you don't want to wait until the patient is in the emergency room before you get that phone call.
Charity: Exactly. You want to exercise and go for walks and eat right and stuff. You don't want to wait to have a heart attack before you see the doctor.
Beyang: You want, like, Apple Watch indicators.
Charity: Exactly. The Apple Watch of systems. Exactly. It's perfect.
Beyang: I like that. Kind of going back to the beginnings of Honeycomb, I'd love to hear the story of how you met your co-founder, Christine, and what the point was at which you both decided, hey, "let's go build a company called Honeycomb".
Charity: Yeah. And first, my brain was just kind of wandering off in the direction of what you were saying earlier about how we as engineers love to build things. We live to see the impact of what we've built. And I feel like people who resist being on call for systems, people who don't like the sound of this, are people who've been burned, people who've been burned out. It is so deeply satisfying as an engineer to just watch what you've built work. Sometimes you have to push people over the hump to get them to try it, but it's just so much better than driving in the dark, you know? Alright, so. Christine is amazing. Christine was at Parse with me. I was the infrastructure tech lead, and Christine single-handedly built the Parse analytics product. She had built this product for our users, a time series thing built on top of Cassandra, and she started encountering all of these frustrations where our users wanted to ask questions of their analytics and couldn't, because they had been locked into the questions that they had decided to capture the data for upfront. They couldn't ask new questions. And so Christine would get frustrated, and she would fall back to Scuba. Christine left Facebook a while before I did; she went to the East Coast, and we didn't really know each other all that well. But when she was coming back, she asked me, do you know of anything interesting going on? And I was like, well, starting a company is totally gonna fail, but it might be fun, you know? And bless her heart, she was all in.
I was so sure we were going to fail, and I was fine with it, because the reason I was doing this was that I couldn't imagine being an engineer without it. The idea of having to live without the tooling that we had built... my ego couldn't take it. I would have been such a less powerful engineer. And so I was like, okay, there are some people pursuing us with some funding; we'll take it, we'll build it, we'll fail, and then we'll open source it, and I won't have to live without it. That was really the grand plan from the beginning. And then we just kept not failing, by accident. And this is the first year where I'm kind of like, this might be a real thing, which probably means we're now doomed. That's probably just superstition talking. But Christine is amazing. So in the beginning it was three of us, and our third co-founder didn't work out, pretty quickly, and that's how I got pushed into being CEO, which I did not want to do. I had nightmares about being unemployable for the entire three and a half years that I was CEO. Christine and I swapped places a little over a year ago, and she is now CEO. And this is much better. CEO is the worst job in the world.
Beyang: Talk to me about some of the pain points of being a dev tools startup CEO.
Charity: Oh. Some of the pain points for me were... so, I've never been one of those kids who's like, I'm going to start a company when I grow up, because I really despise those people. I really don't like the whole Silicon Valley cult of the founder; I find it very off-putting. And when it comes to the CEO gig, I don't know, there's just so much grandiosity about it. I alienated more than a few venture capitalists, which isn't ideal when you're in my position. Alright, here's a pro tip. If you're into people or management or whatever, there's a book called The Four Tendencies by... oh, what's her name? She did The Happiness Project. Gretchen Rubin, yes. And I'm pretty skeptical about almost all personality stuff, but this one's very straightforward. It's basically: what motivates you, living up to internal expectations that you have of yourself, or external expectations that people have of you? And there are basically four possibilities, right? You're motivated by both; you're motivated by neither; you're motivated by living up to other people's expectations; or you really only care about your own. And I am the rebel type: as soon as there is an expectation of me, I do not want to fulfill it. I'm kind of a little shit. And that personality type she even calls out in the book. She's laughing, like, I don't see how any rebel could be a successful CEO. Ha. I was just like, fuck you, lady. But there's some real truth to it. And Christine is the upholder type, where she really gets a lot of joy and meaning out of living up to expectations, both ones that she has of herself and ones that others have of her. And I feel very grateful and blessed to have a co-founder who does.
Beyang: Yeah. I mean, what you're saying, I think I have a little bit of that as well, you know.
Charity: It's a really illuminating book. I got so much out of it for my personal relationships, my work relationships, my understanding of people. Highly recommended.
Beyang: That's cool. I'll check it out.
Charity: Which type are you? I think most people just know off the bat. Well, are you motivated by other people's expectations of you, or...? So the types are Upholder, where you uphold both; Rebel, where you reject both; Obliger, where it's external expectations; or Questioner, where it's internal. Yeah, most people are Obligers. And that's great. That's why the world works, because most people want to please other people. And that is fantastic.
Beyang: Yeah. I mean, I honestly think the rebel persona is probably what I identify most with. I was a very difficult child, I would say. I was one of those kids where, if you told me I couldn't do something, I would go and try to do it. And I think part of me never really grew out of that.
Beyang: So, like, any sort of rule that gets imposed, even like, uh...
Charity: You're automatically like, fuck, you know, this is the last thing I'm...
Beyang: Yeah. It's like, why do I need to listen to you?
Charity: Yeah. You asked how I got into computers. Literally, it was because I walked past the computer lab, saw zero women in there, and was like, that's where I belong.
Beyang: That's awesome.
Charity: Yes and no.
Beyang: Yeah, not the zero-women-in-the-computer-lab part. So, one of the things that strikes me about the story of Honeycomb is you mentioned this tool inside Facebook called Scuba that at least partly inspired what Honeycomb is doing. And Sourcegraph is a little bit the same: there was this internal tool at Google that I had the chance to use, called Code Search, that partly inspired me to want to build something like Sourcegraph. And I would love to get your take on this. People working inside fantastic developer organizations like that are going to see tools that they find useful that are probably found nowhere else in the world. But at the same time, there's a set of unique challenges that go along with that, because Facebook and Google are Facebook and Google for a reason; there's really no other place like them. And if you're trying to build a tool that's inspired by stuff inside those organizations, you kind of have to adapt it to the broader world.
Charity: Yeah. Hit or miss. Yeah.
Beyang: Talk about that.
Charity: Yeah, it is interesting. I think Google is a pretty services-oriented organization internally, from what I understand, and Facebook is not; they do not have services at all. They, like, talk to each other and go to meetings to decide that they're going to do things. It's super weird. And you can think of so many examples: Quip came out of Facebook tools, Asana came out of Facebook tools, there are a lot of them. And the thing that often trips people up is they come out thinking that the same problems are going to be hard out here as were hard in there. And it is not like that. You get so used to just turning on the spigot and, whoosh, people show up, because they have no choice, because they're going to use your tool, or they're interested in it. And then you get out here in a little startup, and you turn on the spigot, and nothing comes, right? You really have to work much harder to understand users and court them, and look for ways to surprise them with how helpful you can be. And scale is almost never the hard part. Never. Like I said about Scuba, it's an aggressively user-hostile tool, and they can get away with that. They can just be like, fuck you, we're a big ads company, we're not a developer tools company, and so our developer tools are gonna suck. And they do, and people use them anyway. What are you going to do? So from day one, we knew that we had to pay a lot more attention to the user interface. Yeah, I dunno, it's hit or miss. What I do love is that now, just in the last few years, there are so many of us that have sprung up that you can plumb together a full CI/CD pipeline that is...
Facebook quality or Google quality, and you get everything from feature flags and observability and progressive deployments and all this stuff that, forever, either you didn't have or each shop had to homebrew, custom to their environment. Now you can actually just pay a hundred bucks a month to a handful of tools and get some really nice things. So it's interesting. If you look at the DORA report, the yearly DevOps research report that gets put out, they break teams up into low performers, high performers, et cetera. And if you look year over year for the last couple of years, the bottom 50% is losing ground, and the top 50%, and especially the elite 20% or so, is getting better faster. There's this total split down the middle. And I think this is because, frankly, in tech, if you're standing still, you're moving backwards, because complexity is always conspiring to overtake you. There are always more exceptions. You have to be actively fighting against that just to keep up. But the tools really have begun to make a real difference. And it's the kind of thing where, if you get one, then you want another, and then you want another. You get feature flags and you want observability, and you start getting all these things, and with the free time that you buy for yourself, you buy more free time for yourself. It's like getting on a treadmill of escalating awesomeness once you start. But if you don't start, your life is getting worse and worse. And the thing is that people often think, oh, well, this is only for great engineers, or, I'm not that good, this is only for ex-Googlers.
It has nothing to do with how good of an engineer you are. Nothing to do with it. These are sociotechnical systems, right? The people, the code, the tools you use for deploying and managing that code. And observability is an important step, just so that you can see what the fuck is going on. But it's about the effectiveness of the team, how high-performing the team is. Because I have seen engineers leave high-performing teams and join low-performing teams, and they don't drag the team up to their level; they go down to the team's level. And similarly, I've seen people leave low-performing teams and join a high-performing team, and within three to six months they're holding their own. I feel like 80 or 90 percent or more of your velocity, of your ability to ship code with confidence, has nothing to do with your personal skills. It has everything to do with your team.
Beyang: Yeah, I mean, that's a fantastic point, and I've never really heard it described in that fashion, but it really clicks with me. I mean, there's that old trope about the 10x engineer in Silicon Valley.
Charity: A 10x engineer goes and joins a low-performing team, and they're going to perform right down with them. It's not about the person. And this is why I feel so strongly about the manager pendulum, about engineers becoming managers and going back every couple few years. Because if you want to be a technical leader who has the skill set it requires to tend to a team and help it raise its level of performance, you can't just focus on one corner. You can't just focus on the people, you can't just focus on the tech, you can't just focus on the tools. It takes the ability to reason about the full system, which means you can't bury your head in either side. It really takes everything.
Beyang: What are some of the things that Honeycomb does internally to foster this culture, this level...?
Charity: Oh God, you know, I almost feel embarrassed when I talk about it. Our metrics are an order of magnitude better than the most elite teams captured in the DORA report. The four metrics are: how often do you deploy? How much time passes between when you merge and when the code goes live? How long until recovery from outages? And I don't know what the fourth one is, something like that. And the reason ours are so good is because our system has always been well understood. We just have the expectation of building the instrumentation in, looking at the instrumentation when it's live, and being able to swiftly pinpoint and fix problems. So it never becomes this hairball, and even when things do go down, it's comprehensible; people can get in there and fix it. It's almost like cheating. It's so much easier to work on a system that's never been terrible than it is to dig yourself out of a pit. I don't want to underestimate the amount of work that that can be, but it's not that hard. And we very intentionally did not just go hire all the ex-Google and ex-Facebook people. We knew we were building a product for everyone, and we wanted, you know, normal people. But the thing is that you become a better engineer so much faster when you're working on a team that ships way more often and gets that feedback loop to you quickly, that doesn't bury the feedback in ops but gets it right to you. I was joking the other day that we're going to have to keep hiring new junior engineers, because they become seniors too quickly. Which is a good problem to have.
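Two of the DORA-style metrics Charity mentions, deploy frequency and merge-to-live lead time, are straightforward to compute if you keep deploy records. A rough sketch, with a made-up record shape:

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical deploy records: when each change merged and went live.
deploys = [
    {"merged_at": datetime(2020, 6, 1, 9, 0),  "live_at": datetime(2020, 6, 1, 9, 12)},
    {"merged_at": datetime(2020, 6, 1, 14, 0), "live_at": datetime(2020, 6, 1, 14, 9)},
    {"merged_at": datetime(2020, 6, 2, 11, 0), "live_at": datetime(2020, 6, 2, 11, 15)},
]

def deploys_per_day(deploys):
    # Deploy frequency: total deploys over distinct active days.
    days = {d["live_at"].date() for d in deploys}
    return len(deploys) / len(days)

def median_lead_time(deploys):
    # Lead time: merge-to-live duration, taken as a median.
    return median(d["live_at"] - d["merged_at"] for d in deploys)

print(deploys_per_day(deploys))   # 1.5
print(median_lead_time(deploys))  # 0:12:00
```

The remaining DORA metrics (time to restore and the fourth she can't recall) would need incident records rather than deploy records, so they're left out of this sketch.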
Beyang: Yeah, that's a great problem to have. I want to revisit something you touched on earlier, which is gatekeeping in the technology industry. One of the things that I really care about is making technology, and software especially, accessible to a wider range of people, so people don't have to have that experience of, oh, there are no women in that computer science lab. Can you talk a little bit about how Honeycomb can help realize that? Do you think that using tools like Honeycomb can actually bridge the gap?
Charity: Oh, I really do. Because think about how we learn from each other. You want to look over the shoulder of the senior engineer and just see how they do what they do. How does their brain even work? What questions do they ask? Which is something that we baked into the product from day one: if you're too bashful to approach someone, you should be able to go look at how they interact with their systems, just capturing a snippet of their brain. I fundamentally believe that there are so many systems that are broken, that give people the wrong idea about their own abilities. It's not your fault; it's the shitty systems. It is so hard right now to get that first job. My little sister graduated with an engineering degree a couple of years ago, and I got to see up close and personal just how hard it is to get that first job. And once you've gotten that first job, the sky's the limit: you've got recruiters, everybody's pounding on the door. But nobody wants to take the leap. And what does it take, three or four months to get someone to begin to be productive? It's not that much, and you get loyalty, they're so grateful, and there's just the energy of the fresh eyes. I feel like this industry really needs to figure out how to mentor junior developers remotely, because with the push towards distributed teams, nobody's figured out how to do this, nobody's figured out how to teach. And I'm nervous about it too, but we can't keep expecting someone else to pull the cart for us if we don't share the load.
Beyang: You know, I think that one of the bottlenecks to getting more people, especially in entry-level positions, into software jobs is also the time of senior engineers. If you have a limited number of senior engineers, mentorship takes time; it's an active job. And oftentimes you can get into the position where, if you hire too many junior folks too quickly, the senior engineers spend all their time mentoring. So I guess where I'm going with all this is: do you think that tooling can help with that? With better tools, do they not only help individual engineers be more productive, but can they also facilitate more scalable mentorship?
Charity: That's an interesting question, and I'm not sure. What I know tools can do is a much better job of rewarding individual effort with results, and scaling that out. Like, somebody who's motivated to go poke around and see how Ben solved that MySQL problem, capturing that in a way so that people have access to how his brain was working when he did that. I have confidence that tools can help with that. But there's a human, interpersonal element too. People have different learning styles. Some people really rely on the interpersonal, emotional connection, and for some people I think it's actually almost an obstacle. Like, I actually find it much easier to sit down and learn by myself, and I find it really confusing when people are...
Beyang: For me it's almost like stressful. Like I have to be...
Charity: Stressful, yeah. Yeah, I don't do my best thinking while someone else is watching or paying attention, for sure. I don't know. I feel like where tooling can help in general is with making things asynchronous, and anything that can be made asynchronous can be made scalable. But when it comes to things like giving someone real fine-grained, personal feedback on their code, there's only so much you can do there.
Beyang: Yeah, yeah. You know, I am kind of the personality type where I like to learn by myself too. And I feel like there are all these, how to put it, these rabbit holes in software. Not exactly rabbit holes, but things that are extremely useful that you don't really acquire unless you stumble into them. An example of this is bash scripting, right? There's no class on that. You come out thinking, "Oh, I know how to write quicksort, I know all the data structures," and then you get out into the world and it's like, "What the heck is this? I need to write this in order to make some change on the server." For that sort of thing, I feel like there could be a lot better resources, either in the form of tools or tutorials, that help people teach themselves, so that it's not such a black box and they don't view it as an impediment.
Charity: I mean, I definitely feel like ops, like operational skills in general, have been a real blind spot in how we're educated. And it's not really about the operations, it's about the ownership. We can't protect people from the consequences of their code; we've got to help them be exposed to it and understand what happens to their code after they've merged it, right? Everyone should know what happens between when they merge their code and when the users are using it. And that's just something where, in a classroom setting, you need to have reality to go debug before you can learn those skills, I think, because there are some things you can't really make into a lesson plan. They rely on the inherent unpredictability of reality.
Beyang: So if someone's listening to this and, you know, they want to get started with Honeycomb, what should they do?
Charity: Go to Honeycomb.io. We have a blog that I think has probably the best round-up of observability resources in the industry. And there's my personal blog at charity.wtf. Follow us on Twitter, Honeycomb.io or mipsytipsy.
Beyang: My guest today has been Charity Majors, Charity, thanks for being on the show.
Charity: Thanks so much for having me. This was really fun.