Episode 2: Ryan Djurovich, DevOps and DevTools manager at Xero and Cloudflare

Ryan Djurovich is a DevOps manager at Xero and former manager of the DevTools team at Cloudflare. He shares with Quinn how he has seen the landscape of Continuous Integration and Delivery (CI/CD) tools change over the years, the three waves of CI/CD, and where he thinks testing and build tools are headed in the future.

Show Notes

Ryan Djurovich on Twitter: https://twitter.com/ryan0x44

Xero: https://www.xero.com

Cloudflare: https://www.cloudflare.com/

WebsiteBaker CMS: https://websitebaker.org/pages/en/home.php

Jenkins: https://www.jenkins.io/

Capistrano: https://capistranorb.com/

CFSSL: https://github.com/cloudflare/cfssl

OpenMake Software: https://www.openmakesoftware.com/

Samson: https://github.com/zendesk/samson

Cruise Control: https://github.com/linkedin/cruise-control

Buildbot: https://github.com/buildbot/buildbot

Team Foundation Server: https://azure.microsoft.com/en-us/services/devops/server/

TeamCity: https://www.jetbrains.com/teamcity/

Docker: https://www.docker.com/

Linux containers: https://en.wikipedia.org/wiki/LXC

Pipelines in CI: https://docs.gitlab.com/ee/ci/pipelines, https://docs.microsoft.com/en-us/learn/modules/create-a-build-pipeline, https://www.jenkins.io/doc/book/pipeline/, https://buildkite.com/docs/pipelines, https://bitbucket.org/product/features/pipelines

Oracle Worker: https://docs.oracle.com/en/cloud/paas/app-container-cloud/dvcjv/worker-applications.html

GitHub Actions: https://github.com/features/actions

Bitbucket Pipelines: https://bitbucket.org/product/features/pipelines

Buildkite: https://buildkite.com

Pipelines as Code: https://www.gocd.org/2017/05/02/what-does-pipelines-as-code-really-mean/

Swagger / OpenAPI specification: https://swagger.io/specification/

Bazel (from Google): https://bazel.build

Buck (from Facebook): https://buck.build

Circle CI: https://circleci.com

Travis CI: https://travis-ci.org

Sourcegraph's dynamic CI pipeline generation script: https://sourcegraph.com/github.com/sourcegraph/sourcegraph@4e9fddd6fd0d99576532b04f54f284b072bd2139/-/blob/enterprise/dev/ci/gen-pipeline.go

Ember JS test performance: https://guides.emberjs.com/release/testing/test-types/

TodoMVC: http://todomvc.com/

PHP The Right Way: https://phptherightway.com/

Transcript

This transcript was generated using auto-transcription software.

Quinn: I'm here with Ryan Djurovich at Xero and formerly of Cloudflare, and we're going to talk about developer tools and his journey to understanding how important and valuable they are. So, Ryan, welcome.

Ryan: Thank you.

Quinn: Thanks for joining us and talking us through your journey of figuring out how awesome dev tools are. So, just to start out and set some background, what was the first code that you wrote in your life?

Ryan: Wow, that's a deep question that goes way, way back. I feel like it was probably, embarrassingly, some PHP 4, when I was still a child, hacking away, learning about the future of databases and building applications that use databases.

Quinn: Do you remember what editor you used? What kind of OS did you use back then?

Ryan: Yeah, I can say Red Hat. I think it was Red Hat 7, and whatever the text editor of choice was back then.

Quinn: Very cool. Well, that is a good starting point for this journey then. So I would love to hear how you originally got into a professional development role and how you started to understand how valuable tooling can be for that development process.

Ryan: Yeah. I kind of walked in backwards, I think, compared to the way most people went about their careers. I started very much self-taught. I had a friend come over when Google was still pretty new, and they said, basically, you can teach yourself anything if you learn how to use Google properly. I ended up building some things on my own: some projects, some open source stuff. I built a CMS called WebsiteBaker when I was quite young, and it did pretty well online. There weren't too many CMSs to compete with at the time. Eventually I ended up working at a web development shop that was pretty impressive for the size of the team and what they were capable of. After spending a bit of time there, I got a greater appreciation for some of the best practices and theoretical stuff that people who had done CS degrees had, and I wanted to learn and expand on that. So I ended up going and doing a CS degree after that.

Quinn: So what was impressive about the firm you were at? How many people? Paint the picture for us.

Ryan: So it's the kind of web design and development agency model that a lot of companies have. There were probably 20 people, maybe 15 to 25 people, at the company, and of those maybe six or seven were developers. One of those developers was particularly passionate about CI/CD systems and set up Jenkins. I guess Jenkins was relatively new back then. It was really impressive what he had set up, and I took to it and learned a lot from him, and started to really adopt it in my day-to-day and in how I went about shipping code.

Quinn: So for a web development shop running Jenkins, what would the jobs be? What were you testing, and what's the alternative? How would you have been held back in terms of productivity or quality without something like that?

Ryan: Yeah. So essentially, we were using PHP, and PHPUnit was obviously a pretty common way to write your unit tests, integration tests, and that kind of thing. I think we started getting into automating deployments with Capistrano, which is more Ruby-world tooling but had a lot of templated deployment scripts. By setting up Jenkins, instead of running these things just on our laptops or computers, we were able to have a server that would run them for us. By doing that, we could make it so that when we pushed up a pull request, we would automatically run these tests, and we could ensure that our code quality was there before we merged to master. So we made sure the tests passed. We could run some integration tests, static analysis tooling, what have you. That really helped us improve the quality of our code and track information about what percentage of tests were passing and failing on each of these branches, and how we were doing on code coverage over time. A lot of those things are seen today as best practice and are pretty common, but this was a few years ago now.

Quinn: Yeah. I remember those days, back when it would be so rare to have automated tests running, especially in a setting like this. So just to reiterate that for listeners who might have grown up in a world that always assumed testing is a good thing: how common do you think having CI at this kind of web development shop was back then? Were you the only ones doing it? And when people joined and saw this whole setup, what was their reaction?

Ryan: Yeah. That's a good question too, to be honest. I think one of the things that is really interesting about our industry is that, because it's moving and developing at such a rapid pace, we don't have that much visibility into where each company is at in its engineering maturity level, or service maturity level, for each of the things it's working on. At the time, this seemed very normal to me. It's like, yeah, this makes a ton of sense. It's a little bit of effort to set up, but the output that you get is so much better. You can move so much more quickly. And I just assumed that this is what everyone was doing, every single one of these things, because this person who had joined our company was very experienced, and he'd come from another company where they had something similar. I just figured, okay, all the other companies out there must be doing this. It wasn't until I met many, many people, I think after moving to San Francisco and talking to such a concentration of web development and product development shops that are all doing similar things, that I started to get a picture of how much variety there is in the engineering maturity levels, I guess, of teams. I'm not sure maturity is the most complimentary term, but it's one that people use in the vernacular, and it's probably the easiest way to describe adoption of these practices and tools.

Quinn: Yeah. Were there any other forward-thinking things that you were doing at this web development shop that either have caught on now or are still a ways away?

Ryan: Yeah. Again, it's difficult to know, because unless you've worked at every company, you can't possibly know what everyone's doing. But there's one thing that I saw at the shop that was incredibly effective in terms of impacting the culture there, raising the bar, and getting people collectively excited about raising the bar on code quality and development velocity. Gamification was a big term at the time, and this was about trying to gamify the CI system. If you introduced a regression, so a bug comes back and your tests start failing, you would lose points. Probably a better example is code coverage: if you reduced your code coverage, because you wrote a bunch of code but didn't write tests, for example, you would lose points. There was this points-based system. On the opposite side, if you were to write a bunch of tests and improve your code coverage, you would gain points. And then there was this leaderboard where, I think at the end of each week, we'd get an email report, or there was a dashboard or something, and people would look at it and think: how did I do? Did I make it to the top of the leaderboard? And also: am I at the bottom of the leaderboard? I might not want to be at the top, but I definitely don't want to be at the bottom. The effect on the culture around test coverage and code quality was, I would say, profound. It really changed the game for the whole team. I still haven't really seen it done in such a way. I know companies are doing those kinds of things, but not in a way that was so connected with the culture. I think it was incredible.
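
The points scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Jenkins plugin the team used: the function names, the weighting of 10 points per percentage point of coverage, and the input format are all hypothetical.

```python
from collections import defaultdict


def score_change(coverage_before: float, coverage_after: float,
                 points_per_percent: int = 10) -> int:
    """Return positive points for raising coverage, negative for lowering it.

    The 10-points-per-percent weighting is an assumed value for illustration.
    """
    return round((coverage_after - coverage_before) * points_per_percent)


def leaderboard(changes):
    """Build a weekly leaderboard from (author, coverage_before, coverage_after).

    Returns a list of (author, total_points) sorted from most to fewest points.
    """
    totals = defaultdict(int)
    for author, before, after in changes:
        totals[author] += score_change(before, after)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Calling `leaderboard` with one tuple per merged change would give the kind of ranking that could go into a weekly email or dashboard. As the conversation goes on to note, the numbers only mean something when the team shares an understanding of what counts as gaming them.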

Quinn: That's really cool. So I have to ask, because in theory it sounds great to measure things like this and to gamify development, but the developer in me, and I think in everyone, thinks a little bit: well, what unexpected consequences are there from this measurement? Are people starting to game the metrics? Did you see any of that, or was the culture just really, really positive and it seemed to work out?

Ryan: Yeah, so, I mean, absolutely. With any system that has points, people are going to try to game it. But that was kind of part of...

Quinn: Developers are good at gaming.

Ryan: Yeah, absolutely. But I think that was actually part of the culture of it that made it fun and exciting: people would work out ways to get a bunch of points, and you could see that. People looked at it holistically. They didn't just look at who's number one and who's number two. It wasn't like, you're number one and you somehow got 300% more points than anybody else, you're this outlier, therefore you're doing a great job. It was like, oh, how did you do that? What did you do to cheat to get there? People would call it out, and there was, I think, a shared understanding of what that leaderboard meant and what it meant to cheat it versus actually improving the quality of code, improving test coverage, and things like that. So I think it's one of those things, and the reason why we don't see it broadly adopted is that you do need to be very careful with these things. If you institute it in a way where, if you're in the bottom 10%, look out, you're in trouble, obviously that's going to have a negative impact on culture. But it was done in a way where there was a shared culture and shared understanding. It wasn't just looking at the numbers; it was more like stopping to reflect on and appreciate the work that was done to improve our situation, our code quality, test coverage, and those kinds of things.

Quinn: Yeah, that's really cool. So the gamification, was that entirely developer-led, or was there any kind of top-down management wanting to do this?

Ryan: Yeah, it was entirely developer-led. One of the senior developers installed it for a bit of fun. He installed this plugin that did the leaderboard, and then directly next to it he put some fun, kind of silly meme that reminded people that this isn't some serious thing. This is a bit of fun. Okay, let's treat it as fun. Let's treat it as a way to reflect on what we're doing and how we're going.

Quinn: Nice. Yeah, that's good. I think it would be hard for some manager to take this to the board if there's some goofy meme next to it, so it's a good forcing function to keep it developer led.

Ryan: Yeah, yeah.

Quinn: So you were at this, I would say, very forward-thinking web development shop. You saw these practices that probably took another 10 years to get adopted more widely, and then you went and studied computer science. When you were starting your computer science studies, how did having been at this professional development shop and seeing these practices affect your studies?

Ryan: Yeah, that's a tough question. I took university very seriously. I'm pretty analytical about these things. I looked at it not in terms of the cost. I mean, in Australia we're very fortunate: university, I don't think, is as difficult to afford financially as in other countries. So I didn't look so much at the cost of university in terms of dollars. I looked at the opportunity cost, and to me that was huge. It was kind of like putting my career on hold to go and do something. I was really keen to learn more about algorithms, computer science concepts and fundamentals, and things like cryptography. I approached it thinking: how can I take what I know today and learn this new thing? How do I translate this theory and concept back to code that I've written? Or, with a theory about how teams should work together, how do I translate that back to how I've seen teams work together? So I felt that I got a lot more out of university because of that. It was definitely not the same experience that I think everyone has, but I thoroughly enjoyed it.

Quinn: Yeah, that's cool. I had a similar experience, having worked at a company as a developer and then going to university. I remember I'd write tests in MATLAB for the math classes, and I'd write tests in Python for our CS assignments, and a lot of my other classmates started to do that. So were you writing tests for all these other classes after you'd seen just how valuable that was?

Ryan: Yeah, absolutely. And I asked a lot of questions around tooling. Maybe other people didn't know these kinds of tools existed. I really got excited about C programming and things like Valgrind and GDB, and I tried to sell the tooling, I think, to other students in the class: you should really invest time in learning about these tools, because you're going to get this assignment done so much quicker. I guess in hindsight I was trying to explain development velocity to them and how they could ship things faster, but it was just in the context of their assignments at university.

Quinn: Yeah, that's really cool. And one nice thing about doing this in university is that the university probably has a site license for a lot of these tools, and if not, you can get them much cheaper as a student. And if some vendor doesn't have a student discount, you can probably email them and they'd be overjoyed to get it in the hands of students.

Ryan: Absolutely. I think it definitely gives you an advantage in that you can learn about the tools without the cost, and not just that, but information as well. Information's very powerful, and there are unfortunately some things still behind paywalls that you can fortunately get access to at university. But yeah, there are definitely companies leading the charge in giving students and education access to their tooling. Maybe this is a tough question for you, but I'm not sure if you're doing anything, or looking at doing anything, in that field.

Quinn: When we hear from students in university or high school who want to use Sourcegraph, if they have private code, then yeah, we will give them Sourcegraph. A lot of the time they're in the free tier anyway, so they can just search code and navigate code on a self-hosted instance. We also love to hire interns. And personally, whenever there's a student who has questions or wonders how to do things, those are the kinds of cases where, if I get an email at 9:00 PM on a Saturday night, I love to just jump on a Zoom with them and walk them through it. I think that everyone should have unlimited patience for students who are learning. I know that a lot of people were very generous to me when I was a student, and I try to pay that forward.

Ryan: Yeah, that's great. I hope more and more people can do that.

Quinn: Yeah. So after university, you eventually joined Cloudflare. What was the tooling situation like when you got to Cloudflare? Obviously a fantastic company.

Ryan: Yeah, I mean, Cloudflare is an incredible company. I got there because I'd used the product. I actually used it at this web development shop, and it kind of saved a website that was buckling under load; the infrastructure just wasn't standing up to it. We put Cloudflare in front of it and it saved the day for this website. Seeing that made me a really early believer in the product, and that's kind of how I got there. Through using the product, I wanted to learn more about the company, and I didn't really know what to expect when I got there. Moving from Melbourne to San Francisco was a pretty big move, a pretty big change. And yeah, it was just awesome. The greatest experience; I cannot speak highly enough of Cloudflare and what the team's doing there. When I got there, I started as an engineer, and I was primarily working in Go, so that was really exciting. Go was this language that I'd built a few things in, but I hadn't worked on anything that was a big production system in Go. I got there and picked it up pretty quickly and ran with it in a domain that I was very familiar with. There were a lot of things that moved very quickly. Obviously, there's a huge focus in Silicon Valley on shipping products and proving that a product is viable, and then you start investing in it and doubling down when it works. There's always this difficult balance of when you invest in tooling and when you invest in automation. Cloudflare has some incredible tools. They've got a lot of open source stuff that is quite popular, tools like CFSSL that get used quite broadly, so there's definitely some world-class tooling there. I learned a lot very quickly in my first six months about Go and the ecosystem there, and I think Go sets a high bar for what it means to have good tools for a programming language.

Quinn: So how many people were at Cloudflare when you joined?

Ryan: I think it was around 300 people in the company.

Quinn: How would you evaluate bringing in a tool from some company versus building it in house, using something open source, or other alternatives?

Ryan: Yeah. I think people are very thoughtful about those things when evaluating a tool. One of the interesting things is that the scale of requests coming in, and the scale of the system that's being built there, is so massive that the economics of some tooling change. Because there's a great, very talented team, the economics sometimes make sense to build something where at other companies it wouldn't make sense. So there tends to be, I think, a bias away from a lot of the cloud stuff, because it makes so much sense to do that, where other companies might have a bias towards cloud things. With tooling, I think there is in a lot of cases a pretty comprehensive vetting, comparison, and research process, where you come up with a list of the top tools that you're interested in, eliminate early on the things that clearly aren't a fit, and get it down to a few. Then you do a pretty thorough review and comparison of those tools for your particular use cases, and you end up selecting the tool that is right for you. Everyone's going to have different requirements. One thing that I think people at Cloudflare are very good at is understanding the problem, its scope, its context, and what they're trying to solve before jumping too far into finding a solution for it. That's definitely something I learned from the leaders there and the great engineers: if you can get that right, you tend to end up with better outcomes.

Quinn: Yeah. Did you yourself build any tools when you were there?

Ryan: Yeah, absolutely. I started off as an engineer on the billing team and eventually started managing that team. After a couple of years, I had this opportunity to join the dev tools team. I think someone said, we think you might be interested in this, because a lot of the things that you've done along with what you're doing on billing were also impactful for dev tools, so do you want to make this the thing that you're working on primarily? In billing, one of the things I had done very early on was some deployment automation work. My team was moving very quickly with our workflows. We weren't working off one monorepo, so we had quite a bit of autonomy and were able to move very, very quickly with continuous integration and continuous delivery, and that was working well for us. So the opportunity to do this in the team that supports the entire engineering org was awesome, and I jumped at it. For the last year and a half or so that I spent at Cloudflare, I was on this team and had the good fortune of being responsible for operating the CI/CD platform, helping build out the developer tools there, and helping people to be productive and increase their productivity.

Quinn: Okay, cool. Well, I would love to hear more about the CI/CD tooling you had and what you think is coming next there. But just to set the background for this, when did Cloudflare establish the dev tools team? How long in their history did it take them? How many people did they have at the point when they set that up?

Ryan: To be honest, that predated me, I think, in some ways, but it was very much an evolution. I think every organization is kind of like this. You might have a group of people who work on things. Maybe it's full time, maybe it's part time in their role, but even if you don't label it the team, it's the team. People know that these are the people to go to if you want to talk about tools, and these are the people who are writing tools, because it makes sense to write tools. So I think, not just at Cloudflare but at any company, at some point in time there are these people working on tools who are known as kind of that team. Eventually that evolves into a formalized team. Sometimes that team is partially responsible for tools and also partially responsible for other things, and then eventually it gets to the scale where it makes sense to have a dedicated team. I think possibly that's how it went, from the bits that I know. I think it's a very organic and good way to approach it, rather than trying to throw together a team too early that might not have full visibility across the org or exposure to systems that other teams are working on. It's nice when you've got someone coming from a product team to work on tooling, because they can directly relate to the pain of that product.

Quinn: So if there's an engineer listening to this who is at a company and thinks their company needs to start having a dedicated tools team, what would you recommend to that engineer? How do they get that set up? How do they sell that to the rest of the organization?

Ryan: Oh, that's a tough question. I think it's very dependent on the culture of that organization. Look at how you get anything done within an engineering team. Look at what has worked, not just for building a tooling team: if you've seen someone do something in your engineering organization that was successful, that achieved a successful outcome. It might be running some internal hack day or something, or doing some kind of presentation, or something more specific like open sourcing a particular tool that has been built. The paths within an organization to effecting change vary greatly, so try to look at how someone else did that and follow that path.

Quinn: You've seen how a lot of different companies run their dev tools teams, and specifically the CI/CD infrastructure that they have. What do you see working well? What do you see not working well? And if you're at a company that doesn't have a good CI/CD infrastructure, how do you improve it?

Ryan: Yeah, that's a big question.

Quinn: The million dollar question.

Ryan: Yeah, I mean, I think CI/CD has evolved over the last 20 years, and there have been waves of change, as I like to look at them. The first wave, second wave, third wave analogy comes less so because I'm a surfer and a bit more so because I'm into coffee, and that's how people talk about waves of coffee. And I think...

Quinn: I see. I see. That is something new I learned. Thank you.

Ryan: Yeah, fun facts. Recently, when people have asked me questions like, where are best practices, where have they come from, where do you think they're going, I've tried to bucket things into these three waves of CI/CD. Right now I feel like we're towards the end of the second wave, the third wave is coming, and I can take some guesses at what that might look like, or what I'd love to see it look like. So...

Quinn: Well, let's start with the first wave. What is the first wave of CI?

Ryan: Yeah. Before we jump into that, maybe it helps to provide a bit of background in case people aren't super familiar with what CI/CD actually means and what we're talking about. I like to provide that little bit of context. People kind of mix up CI/CD the tools with the theory behind it. The theory around continuous integration, and hopefully I won't butcher this, is that the term came about in the mid to late nineties with the idea that the longer code sits in a branch, the greater the risk of a merge conflict. So having a continuous integration workflow meant that you would try to continually integrate your code into your mainline. To do that, you needed tooling to support and achieve that workflow. You wanted to make sure that you could run your tests locally, that you could push code up and have it run on a build server, and that there was some kind of quality control in there: you could do things like static analysis, run your tests, and that kind of thing. Once that passes, you merge it into mainline. Maybe it's through a pull request, maybe it's automatic. That's the theory around the continuous integration side of things. Then there's continuous delivery and continuous deployment, which overlap a little bit. The idea of continuous delivery is that you're always ensuring your mainline is in a state that can be deployed and that your deployments are fully automated. The net outcome is that you're continually deploying, so you can't really continually deploy unless you have that continuous delivery workflow, process, and culture in place. So when people talk about CI/CD in the modern context, it means taking these concepts and the tools that help support them.

Quinn: Yeah. So for listeners who have grown up in a world where everyone is doing CI/CD and everyone accepts that it's good: was there ever a time when people didn't assume that it was good, when it wasn't inevitable?

Ryan: Yeah, I mean, I think if you were to ask someone who was working at a big production shop in the mid to late nineties, they'd probably be able to give you more context on that. Back then, I was still in my early days of writing code and hadn't had exposure to what people were doing in larger companies. But I would say, like with anything, there are skeptics when a new technology or a new process comes about, so there probably were skeptics of this at the time. And it was difficult to share knowledge back then. The internet wasn't as broadly used, and a lot of the way you learned things was through books.

Quinn: Yeah. And in fact, it's still not that easy to share best practices about these internal tools and internal practices among companies. Sharing this broad idea of CI, that's one thing, but one of the reasons why we're doing this podcast is to go deeper and share more of the best practices that people like you have learned through working at a lot of great companies. So we're trying to solve this. So, getting on to the first, second, and third waves, walk us through that.

Ryan: Yeah, so again, this is my rough bucketing of the story of CI/CD and how it's evolved. I see the first wave as build servers, which was how you would describe a CI system in the early days, or what we now refer to as a CI system or a CI/CD system. These build servers are basically glorified script runners. This is the late nineties, early to mid 2000s, when these systems came about. You have your tests and your builds, and they're running through bash scripts or makefiles or some language-specific tool. If you're in the Ruby world, that might be something like Rake. It's just running these scripts, doing your unit tests, doing integration tests. I say glorified script runners because a lot of these things literally have a web UI where you click start build, and it's just executing that build on that server and piping the output to the UI. It might capture those logs and capture some kind of build artifacts, since most of these builds are spitting out a binary or some deployable thing, and it presents the result in the web UI. In the case of tests, that'll tell you pass or fail, and you'll be able to jump into the log and see which tests failed. If you think about it, you could probably build one of these things pretty quickly yourself these days. But back then it was such a new concept, and pretty impressive, because it meant that you could run things on a server that's probably more powerful than your workstation, and it meant that you could run tests automatically and share the result with your entire team. It wasn't just on your laptop or workstation. So that's the first wave of CI. And I see the first wave of CD as disconnected.
Like, there was a time when people saw a build server as literally just doing tests and builds. As for the idea of automating your deployments, maybe some people were doing it through their CI systems, but some people started to build these custom deployment tools or deployment servers. And this came about a bit later, I think in the late 2000s, early to mid 2010s. The deployment servers were very similar in the sense that they were glorified script runners, but maybe they had a bit more of a modeled concept of what it means to deploy something, right? So you might have release stages, you might have rollbacks, and you want to see a log of when things were deployed, and have a button to click to quickly roll it back. Maybe you have some feature where you can globally disable deployments because it's a weekend. A really good example of this kind of tool, with a very nice UI, is Samson, which came out of Zendesk. Samson is particularly focused around deployments. Very early on, at this development shop I worked at, we had set up Jenkins as our CI system, our build server, and we'd set up Samson as our deployment server. So we'd kick off the tests in the build, and then we'd have the deployment server actually run Capistrano and handle our deployments.
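
To make the "glorified script runner" description concrete, here is a minimal Python sketch of the core of such a build server: run a command, capture its output as a log, and report pass or fail from the exit code. The names `BuildResult` and `run_build` are hypothetical and do not come from any tool mentioned in the episode; a real server would add a web UI, a queue, and artifact storage.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class BuildResult:
    passed: bool      # True when the script exited with status 0
    exit_code: int
    log: str          # combined stdout and stderr, as a build log would show


def run_build(command: list[str]) -> BuildResult:
    """Run one build or test script and capture its output and status."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same log stream
        text=True,
    )
    return BuildResult(
        passed=completed.returncode == 0,
        exit_code=completed.returncode,
        log=completed.stdout,
    )


if __name__ == "__main__":
    # Stand-in for "run the test script"; a real server would run make, rake, etc.
    result = run_build([sys.executable, "-c", "print('2 tests passed')"])
    print("PASS" if result.passed else "FAIL")
    print(result.log, end="")
```

Everything else a first wave server offered (history, shared results, artifacts) is bookkeeping layered around a call like this.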

Quinn: And so that would copy the PHP files to some production server. Was this running on a public cloud?

Ryan: Yeah. So this is kind of, I guess, pre-cloud, before people called it the cloud, or just starting to. But from memory it was just a tarball, pushing that over, I guess, SFTP (hopefully there was an S in there) and getting that up onto a server hosted, who knows, it might've been Rackspace or someone like that. There were multiple pretty large, well-known VPS providers back then that were kind of the rage. And then it would copy it across, and Capistrano could untar the file remotely and handle all that stuff. And it could do some pretty nice things. I think the more advanced setup would deploy it, untar it, and then switch a symlink over to the new files, and then the rollback mechanism could quickly switch that symlink back. So you didn't need to go through the entire process of uploading and untarring and all that stuff.
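
The symlink switch and rollback Ryan describes can be sketched roughly as follows. This is not Capistrano's code; it is a small Python illustration that assumes a POSIX filesystem, a `releases/` directory whose subdirectory names sort in deployment order, and a `current` symlink that the web server reads from. The function names `activate` and `rollback` are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def activate(root: Path, release: str) -> None:
    """Point root/current at root/releases/<release> with an atomic rename."""
    target = root / "releases" / release
    if not target.is_dir():
        raise FileNotFoundError(target)
    tmp_link = root / "current.tmp"
    if tmp_link.is_symlink():
        tmp_link.unlink()  # clean up after an interrupted earlier switch
    os.symlink(target, tmp_link, target_is_directory=True)
    os.replace(tmp_link, root / "current")  # rename over the old link


def rollback(root: Path) -> str:
    """Switch current back to the release just before the active one."""
    releases = sorted(p.name for p in (root / "releases").iterdir())
    active = Path(os.readlink(root / "current")).name
    index = releases.index(active)
    if index == 0:
        raise RuntimeError("no earlier release to roll back to")
    activate(root, releases[index - 1])
    return releases[index - 1]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("20200101", "20200102"):
            (root / "releases" / name).mkdir(parents=True)
        activate(root, "20200101")
        activate(root, "20200102")
        print("rolled back to", rollback(root))
```

Because the old release directory is still on disk, rolling back is only a link swap, which is why no re-upload or re-extraction is needed.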

Quinn: Yeah. What were some of the other tools that you call first wave CI and first wave CD?

Ryan: Yeah. So I actually did a bit of research on a list of CI systems. I anticipated this question, because you only know what you know, right? But it's quite interesting to look at which kinds of tools came about. As far as I could find, the first main CI system was in '97, called CA Harvest, by Catalyst Systems, which is now known as OpenMake Software. So in the late nineties there were definitely CI systems that existed. And then, as far as I can tell, there were systems like CruiseControl, Buildbot, Team Foundation Server, and TeamCity that came about over the next five to eight years. And the rise of Jenkins didn't come about, I think, until 2011.

Quinn: Wow. All right, so how do we get into the second wave? Why was the first wave not sufficient and perfect?

Ryan: You know, I don't think there's a clear cut between the first and second wave, but I think there's this clear evolution of CI systems, and there are a couple of themes that have clearly come about. I think what happened was containers, right? Docker happened, and people started to use containers in their builds. And then I think people were looking at containers and this concept of pipelines, which evolved over, I guess, the late 2000s and earlier in this decade. And I think there's this really nice overlap between containerization and this notion of a pipeline. To give you an idea of what we're talking about when we say pipeline, it's the idea that you can run one step, and it outputs some artifact, and then that artifact gets put into the next step. It's like this chain of build steps or builds. A perfect example of this might be that in your pipeline you have a step that runs your tests. The order differs completely depending on your scenario, but you might run all your tests, and then if they pass, you might run your build step, which outputs some binary, and then you might run a Docker build step, which takes that binary, copies it in, and builds a container that's ready to be deployed. And then you might have another thing in that pipeline that actually handles the deployment. I think these pipelines came about because, while people in the first wave had these deployment servers, there were other people trying to use CI to do deployments. To be able to do this chain, this pipeline, new functionality got added to CI systems, to the point where, if you can do a pipeline, you can do CI and CD really effectively in a single workflow, in a single system.
And when containerization came about as well, it meant that each of these steps in the pipeline could be handled with containers. If you're working in, say, the Node world and working with npm, you might have one container that contains your test harness, right? It installs all the things that you need to run your tests. And same with the Ruby world. You could say, I'm running this particular version of Ruby, or this particular version of Node, and I need this for my tests. And all of the services that my company is building are just going to use the same container to run their tests inside. And I could run that container both locally and on the CI system. So it's a good way to iterate on the test harness or the tests themselves, and have that consistent environment. So I see this second wave, which has come about in the last five years or so, as pipelines and containerization coming together into this really nice model where you can achieve all of the things that you need in your release pipeline, and you can do it with containers. So it's portable. It's a really nice setup, particularly because it gives you this baseline infrastructure that you can use across the teams in your engineering org. And you can reuse a lot of things, because each of the build steps in a pipeline can be reused, but plugged in and configured differently depending on each use case.
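
As a rough illustration of the "chain of build steps" idea, the Python sketch below models a pipeline as an ordered list of steps, where each step receives the artifacts produced so far and returns new ones. The step functions and artifact names (`app.bin`, `image.tar`) are hypothetical placeholders; in a real second wave system each step would run inside its own container rather than as an in-process function.

```python
from typing import Callable

# Each step receives the artifacts produced so far and returns new ones.
Artifacts = dict[str, bytes]
Step = Callable[[Artifacts], Artifacts]


def run_tests(artifacts: Artifacts) -> Artifacts:
    # Stand-in for a real test run; raising here would stop the pipeline.
    return {}


def build_binary(artifacts: Artifacts) -> Artifacts:
    # Stand-in for a compiler invocation that emits one binary.
    return {"app.bin": b"compiled-bytes"}


def build_image(artifacts: Artifacts) -> Artifacts:
    # Consumes the binary from the previous step, like a Docker build copying it in.
    return {"image.tar": b"layer:" + artifacts["app.bin"]}


def run_pipeline(steps: list[tuple[str, Step]]) -> Artifacts:
    """Run steps in order, handing each one the artifacts accumulated so far."""
    artifacts: Artifacts = {}
    for name, step in steps:
        print(f"running {name}")
        artifacts.update(step(artifacts))  # a raised exception aborts later steps
    return artifacts


if __name__ == "__main__":
    result = run_pipeline(
        [("test", run_tests), ("build", build_binary), ("image", build_image)]
    )
    print(sorted(result))
```

Reordering or reusing steps is then a matter of editing the list, which is the property that makes pipeline steps shareable across teams.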

Quinn: Yeah, I remember in first wave CI systems, if you updated system software on the machine that was running the build scripts, if you updated the Python version, then that would mean you could never actually run tests on older commits. So with pipelines being containerized, you have isolation, you have steps toward reproducibility, and it's a huge improvement. So what are some of the tools that are good second wave CI and CD systems?

Ryan: Yeah. For the second wave, I tried to map out what a timeline looks like and get an understanding of where this came from. One of the tools I learned about early on that was doing containerization well in CI, I think, was called Wercker. I think Oracle acquired it more recently. But Wercker, I think, came about in 2011 or 2012, and that was the first time you saw this use of containers in a pipeline and being able to do all these things. It wasn't until 2015, 2016 that I think other companies started to follow this model. Bitbucket Pipelines came out in 2016. And a lot of people today know what GitHub Actions are, right? A lot of people have heard about this, and it's exciting, because what you can do with GitHub Actions is pretty powerful. I think there are two versions of Actions: there was the very early one, and the more recent one, I think in August this year. And I think the newer one in August is YAML and lets you use containers or JavaScript. Previously I think it only let you use containers, potentially. But it's this notion that you have these reusable steps in a pipeline. And a lot of these systems, and this is probably part of the second wave as well, have this notion of pipelines as code, right? Configuration as code has been a thing for a while. Applying it to your server configuration is important. Applying it to your infrastructure is great. And now we're applying it to our build pipeline configuration, because pipelines are so common and can get very complex to manage. You don't want to be doing that through a UI, and you want to be able to resurrect that from code if you need to, rather than doing it all manually.
So I think there's this path of Wercker, Bitbucket Pipelines, GitHub Actions, and more of the modern and really nice CI systems like Buildkite. They do an exemplary job of pipelines, pipelines as code, and containers.

Quinn: So, starting with the first pipeline-based CI, and now some of the newer ones that are coming out, or the evolutions made to the early ones, how are pipeline CI systems getting better? What are we learning as we're starting to adopt these more?

Ryan: Yeah. I think the comment I made before about pipelines as code is a key piece of how they're getting better. The way we go about describing these pipelines and setting them up, I think that's been a bit of a struggle, right? You look at the Swagger specification: it went through a couple of versions, and then eventually the tooling and the ecosystem became so strong that it became OpenAPI, and now you've got this really nice OpenAPI spec that's getting better and better. But specifications take time, and you can't design the perfect specification without being attached to reality, right? You can't just do it all in theory. You need to actually build tools around it and iterate and iterate, and see what works and what doesn't, and make that better. So I think there's this really interesting thing where even GitHub Actions has evolved its format, because it's really hard to figure out what the right specification format for a pipeline is. And I think that leads into the third wave; it's a really good segue. To me, the third wave is really that there's going to be standardization of pipeline specification formats. I think somebody, or I hope somebody, will come out with some kind of standardized spec in which you can describe a pipeline of containers. And around that spec we could build tooling, much like we have all this tooling around Swagger and OpenAPI, that would automatically configure any CI/CD system to run your pipeline. And it's all off the standard spec. So your developers could be at any company and move to another company, and if you know how to write a spec at one company, you can write a spec at another company, and you don't have to worry about what the underlying CI/CD system is.
CI/CD systems have so many things that can give them strengths or weaknesses against the other systems. Some might do reporting better, some might have better UIs, and maybe there's one that has a really good mobile app, who knows? But the server itself is more about that central thing. It's about how the team operates it. Maybe it's a cloud-based solution, maybe it's on-prem. Maybe you have the ability to extend it with plugins and modules. Who knows? But I think the real challenge in the industry at the moment is that we're starting to see all these different pipeline formats, and people are going to have to learn new ones wherever they go. What people are really going to want to see is a standardized specification format. And not just that. One thing that does relate to the second wave, but would also be needed for this standardized pipeline specification to be effective, is that developers need to be able to run things locally. Say you have this pipeline specification format, and you have to write the file and push it up to Git or something, and then GitHub or Bitbucket or whatever picks it up. If you have to go through that cycle before you get feedback on whether you've configured it correctly, and that takes 10 or 20 minutes, or even worse, hours, then setting up your pipelines is going to take you so long. It's going to take days, maybe weeks, and that's not how it should be. What it should be is that you can write that little file locally, you can run some tool, and it can run that whole pipeline locally, and you know: okay, I've got this right, this is the way I want it. Now I'm going to publish this and get it configured in CI/CD, and it just picks up the file and automatically configures itself.
And I think that's a really essential part of making sure the ecosystem succeeds and that developers can be productive with it. You want something to just work for you with a small feedback loop, right? A really quick feedback loop as to whether or not it's going to work.

Quinn: So as a developer, I'd be writing some code in my editor, and with each keystroke, or once I save, it would somehow know which tests should be kicked off based on that change. And hopefully by the time I go and commit, it would have already run these. It would have recorded the current state of my working tree, and I would have confidence that everything is going to be green when I push it up. Is that what you're thinking?

Ryan: Yes, yes. And to clarify, before you run your tests, before you get into that feedback loop of whether or not your code is working and your tests are passing, there's this meta stage of setting up the pipeline itself, setting up the test harness and all that. With this pipeline configuration format, I'm talking about that meta stage, right? You've maybe set up some unit tests or whatever, but now you need to set it up in a way that you can configure in CI/CD. So there needs to be feedback on that pipeline configuration. You need to lint it. Say it's a YAML file, because YAML is popular, but it's also tricky to work with. You want some kind of tool that's going to lint that file for you and tell you whether or not you've even written it correctly. And then you want some tool that's going to actually try to run through that pipeline. Maybe you said step one spits out some kind of binary and step two does a Docker image build, and you're expecting to pass that binary through to the second step. You want the tool to actually try to do that and check that it works. Otherwise, if you apply that configuration to your CI/CD system and the first step doesn't actually produce a binary, it's like, okay, what now? Now I've got to go and fix something else. So you want to get that feedback as quickly as possible, so that it doesn't take you days to set up a CI/CD pipeline.
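
A local check of the kind Ryan asks for might start as small as the Python sketch below. It covers only the static half: it parses the file and verifies that every artifact a step needs is produced by an earlier step, without actually running anything. No standard pipeline format exists in the conversation, so the spec shape here (`steps`, `name`, `run`, `needs`, `produces`) is entirely hypothetical, and JSON is used instead of YAML only so the example depends on nothing outside the standard library.

```python
import json


def check_pipeline(spec_text: str) -> list[str]:
    """Return a list of problems found in a pipeline spec; empty means it looks OK."""
    try:
        spec = json.loads(spec_text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg} at line {err.lineno}"]

    steps = spec.get("steps") if isinstance(spec, dict) else None
    if not isinstance(steps, list) or not steps:
        return ["spec must contain a non-empty 'steps' list"]

    problems = []
    available = set()  # artifacts produced by the steps seen so far
    for index, step in enumerate(steps, start=1):
        if not isinstance(step, dict):
            problems.append(f"step {index}: must be an object")
            continue
        name = step.get("name", f"step {index}")
        if "run" not in step:
            problems.append(f"{name}: missing 'run' command")
        for artifact in step.get("needs", []):
            if artifact not in available:
                problems.append(
                    f"{name}: needs '{artifact}', which no earlier step produces"
                )
        available.update(step.get("produces", []))
    return problems


if __name__ == "__main__":
    spec = {"steps": [
        {"name": "test", "run": "make test"},
        {"name": "image", "run": "docker build .", "needs": ["app.bin"]},
    ]}
    for problem in check_pipeline(json.dumps(spec)):
        print(problem)
```

A check like this catches the "step one never produced a binary" mistake in well under a second, before anything is pushed; actually executing the steps locally would be the second, heavier half of the tool.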

Quinn: Yeah. One of the things I dread most is troubleshooting the pipeline. You push it up, you have to wait for it to be queued, and then it takes a long time to run, and you know that it could run so quickly if you could run it locally.

Ryan: Yeah, exactly. And I think that's something the better CI/CD systems right now are doing a really good job with: building that kind of tooling. But I think there's still opportunity there. With the third wave, if you have a standardized config format, a key part of making it successful is also building that tooling, so that actually using it becomes very easy and quick to set up, and it might even be better than what all the different CI/CD providers are doing today. I hope that's how someone approaches it, and I hope that's what we get to see. I should also mention the second part of the third wave. There are two parts that I see in this third wave. One is the standardization of pipelines, but with that standardization comes this notion of off-the-shelf tools, right? Off-the-shelf build steps. And I think this is where things get really exciting, because every time something's off the shelf, where it's convention over configuration and it works well but you can tweak it for your use case, that's where developer productivity skyrockets, right? It's quick to get it up and running. A lot of what goes on at companies is so repetitive across companies, right? Everyone's doing very similar things. And tooling that is pre-baked, that you can just pull off the shelf, that's where the quick wins are. So I think with the third wave, what we might start to see is that, within a tooling ecosystem, you know, Go or PHP or JavaScript, a lot of what these pipelines do is pretty similar, right?
The test harness that you're going to run is going to be pretty similar to what other people are going to run. So, much like what happened with containers, where you've got this hub that you can pull down containers from for each different language, there'd be a hub of build steps in a pipeline, or whatever you want to call each unit of that pipeline. And you can take that off the shelf. Maybe there are even templated pipelines, so you can say, I've got a project that's in Go, or I've got a project that's in Ruby, pick your language of choice. I can just take this pipeline off the shelf, and all the different things that I need to do to make sure I have good code quality, make sure I have automation, and can deploy to production, all of that can come off the shelf. I'm using standardized tooling in an open format, and I no longer have to spend days or weeks struggling with the configuration format, struggling with my CI/CD system, and iterating on how I want it set up. I can just grab something off the shelf. I can grab a couple of different parts, configure them in the pipeline as I see fit, and run with it. And it saves me a bunch of time. That's kind of what I'm hoping for in the third wave. And I think the analogy is that the third wave is usually a great one. So hopefully that's what we're looking at.

Quinn: Yeah, so that makes sense. The Go world already has a lot of standardization, as you said. You take a Go project, you can run go test, you can run go test -cover. How do you see other language ecosystems that have a lot more configurability and variability, like JavaScript, getting to a world where there's more standardization? Is the incentive of really easy pipelines enough?

Ryan: I mean, it's tricky, and I'm no expert. It's been quite a few years since I've been heavy on the JavaScript tooling. But once upon a time I was working pretty closely with JavaScript on a daily basis, and I think it's achievable. What I see this third wave having is this off-the-shelf pipeline where each of the steps follows a particular convention. I think there are clusters of conventions that people follow. It is possible that everyone is doing something completely different, but it is likely that there are clusters, like sub-communities within the JavaScript ecosystem, that use the same set of tools, right? And if we have this standardized tooling, then someone that's using, say, React and a particular set of React components has their pipeline set up in a particular way. However they're doing all the different things, minification and whatnot, they've picked this set: these are the tools, and this is what our pipeline looks like. If they publish that and share that, there's surely someone else out there doing the same or very similar things who can just grab that off the shelf and run with it. It needs to be done in a way that can support people's different setups. But it also would be great to have this notion of published, off-the-shelf pipelines that are set up and that you can run with.

Quinn: Do you see large-scale monorepo build systems like Google's Bazel and Facebook's Buck and other related systems helping to solve this problem and bring the third wave?

Ryan: Possibly, possibly. I think we'll see. You know, Bazel... is it "Bazel" or "basil"? It's...

Quinn: I heard it's basil.

Ryan: Bazel. Okay. Yeah, I've heard it a couple of different ways. Bazel, okay. So I've seen people use it. I've supported it, like Bazel remote cache. My team supported that. And I think there's more that can be done in terms of developer friendliness, ease of adoption, reducing the learning curve. It's really hard to build good developer tools. In theory it's easy, but the execution is super difficult, because everybody's different. Everybody has different preferences. Everybody comes into using a tool with a different set of experience, and so to pick something up and just be able to run with it, and so quickly grok new concepts, is really, really challenging. And these build tools, maybe they're built for specific use cases, and bridging the gap to other people's use cases can be very, very difficult. Maybe it's better that they focus on those use cases than try to be the one tool for everybody. Maybe that's their approach. And I'm not singling out Bazel or any particular tool. I think most tools are built for specific use cases, and it's hard to do a one-size-fits-all. And that's what I imagine the future is: you can pick those clusters of tools that are appropriate for your use case, and everyone's using a standard configuration format, but not everyone has to use the same cluster of tools.

Quinn: Do you see any efforts to come up with a standard pipeline definition format?

Ryan: Not yet. I haven't heard of any, but if anyone knows of any, please reach out to me on Twitter. I'm super interested to know if anyone's working on this kind of thing or thinking about it, or in particular if any CI/CD vendors are interested in that kind of thing. I'd love to get a conversation going around this, because I think there is a lot of potential here. Maybe someone's already started, maybe someone's already built it. But from what I've tried looking for, I haven't been able to find anything yet.

Quinn: Yeah. If I were a CI/CD vendor, I would look at this as an opportunity initially, because developers would see that this is, you know, aesthetically better. It means that the developer can run tests locally and can iterate more quickly. But eventually, as a vendor building this, I would worry that in the end I'm just turning this into a commodity, and the only thing that companies need is somewhere to execute this code. And the large cloud providers are always going to be able to do that more cheaply than some new startup. So how do you think this gets introduced? Is it an open source effort? Is it something that one of the big cloud giants does as a way to bring more execution to their cloud?

Ryan: I think the challenge for newcomers is, how do you grab market share from the incumbents? And I think this could be something that a newcomer could use as a strategy to make their solution easier to migrate to, right? Say there are a lot of companies out there using more established systems, and they want to do pipelines, and you provide this nice open format and tooling that allows them to run pipelines on their current CI/CD solution. Then the cost to switch to your CI/CD solution pretty much disappears from a "how do I configure my builds on it" perspective. That's a huge win for someone trying to grab market share. So hopefully one of the up-and-comers can lead the charge on this, because competitively it makes sense. But if not, maybe it's up to the open source community, finding time outside hours, or however we find time to work on open source things.

Quinn: I'll give a plug for Buildkite, which is what we use at Sourcegraph. It doesn't have an open standard definition, but it is very easy to work with, and we actually run our build in what I think is a common pattern. When the build runs, the first step is to generate the pipeline. The pipeline that we want to run depends on the branch and some other parameters. So it generates the pipeline and uploads it back to Buildkite, which then orchestrates the rest of the build. So there's some dynamic configuration of the build pipeline. I think that would be a challenge for any kind of open standard build definition, because at least at Sourcegraph we've said that what we want is a dynamic, imperative way of declaring the build pipeline instead of a YAML-based thing, like with GitHub Actions or CircleCI or Travis. How do you see that need being accommodated? Or would you tell us, hey, that's a bad idea?
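
The "first step generates the pipeline" pattern Quinn describes can be sketched as a small program that decides which steps to emit and prints the result for the CI system to consume. The Python below is not Sourcegraph's generator (theirs is the Go script linked in the show notes); the branch rule, the Markdown-only shortcut, and the `make` targets are all invented for illustration, and the `label`/`command` keys are only meant to resemble a typical step definition.

```python
import json


def generate_pipeline(branch: str, changed_files: list[str]) -> dict:
    """Build a pipeline definition whose steps depend on the branch and the change."""
    steps = [{"label": "unit tests", "command": "make test"}]

    # Only run the slower suite when something other than documentation changed.
    if any(not path.endswith(".md") for path in changed_files):
        steps.append({"label": "integration tests", "command": "make integration"})

    # Only the main branch gets a deploy step.
    if branch == "main":
        steps.append({"label": "deploy", "command": "make deploy"})

    return {"steps": steps}


if __name__ == "__main__":
    # In CI, this output would be handed to the vendor's pipeline upload mechanism.
    print(json.dumps(generate_pipeline("main", ["cmd/server/main.go"]), indent=2))
```

The trade-off Ryan raises next is visible even at this size: the steps that will run are no longer readable from a static file, only from running the generator with the same inputs.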

Ryan: I have opinions about dynamically generated pipelines. We could talk for days about this. There are particular use cases where it definitely makes sense. In general, though, I think the simplest solution generally wins, and as soon as you introduce dynamic generation of anything, it becomes like a whole layer abstracted from what actually runs on the metal, right? So it becomes more difficult to understand what is actually running. I generally, personally, stay away from those kinds of things, but I've seen really compelling use cases for dynamically generated pipelines. And you're right, it is a challenge. In a standardized format, it would definitely be a challenge. But also, the thing that's really a massive challenge with these pipelines as code is secrets. Secrets are very difficult to work with while making sure that you're keeping them secret. Take anyone that's set up some kind of production pipeline that is deploying somewhere, maybe to a cloud environment. When you're doing that deployment, you need a secret, right? You need some kind of API key to be able to orchestrate it, or whatever the thing is you're using, but you likely need secrets. Maybe it's credentials for storage systems or something like that as well. But generally, if you're pushing up to a Kubernetes cluster, you need to authenticate with it. So you need to store secrets in your pipeline somehow. So when writing a pipeline configuration format, secrets are a huge challenge, and pretty much everyone's going to need secrets. And then dynamic pipeline generation is also a challenge. But that's why I think each vendor doing its own thing is kind of a necessary step, because everyone needs to experiment and figure out which way works.
Vendors tend to look at what each other are doing and follow trends, and I think it'll become obvious what the best way to handle that is. And on Buildkite, that's cool that you guys are using it, because I really like what they're doing. I think they've done a great job. Their documentation is absolutely brilliant in how they've approached it. It's very thorough, very easy to understand. But maybe I have a bias, because they're based in Melbourne, which is where I'm from and where I am right now.

Quinn: Well, there must be something in the water, maybe in the waves over there.

Ryan: Well, maybe it's just because it was 43 degrees Celsius outside yesterday. Everyone's staying inside in the air conditioning and coding.

Quinn: Yeah. I mean, I hope that they're productive. 43, that's 109-degree Fahrenheit weather. Air conditioning is not going to work that well, so hopefully people are staying productive there. I know I wouldn't.

Ryan: We're doing our best.

Quinn: So, how do we get to a world where as a developer, I never need to wait for CI. I never need to wait for a build.

Ryan: Having been in a position of being responsible for operating a CI/CD system, being responsible for the tooling, there's this really interesting dynamic when working with engineering teams, because everybody wants builds to happen quickly. Everyone wants the deploys to go out faster. And depending on how the teams are structured, and this doesn't apply to just one company but to multiple companies that I've seen, there's a fine line of ownership over how quickly builds run: what the builds are actually doing versus how they're running in the runtime and the environment that they're running in. And the way that I've kind of explained where the ideal responsibility lies in terms of ownership is the former, you know, what the builds are actually doing. If you look at how long it takes your builds or your tests to run locally on your machine, and then you run them in the CI/CD environment, and there is a drastic difference (obviously hardware is going to be a bit different, but you can probably roughly estimate what the difference should be based on the different hardware), then for the person that's owning the infrastructure for CI/CD, there is probably something that they could be doing to improve that performance. But a lot of the time, with the most egregious hour-long, multi-hour tests or builds that I've seen, it's not because it's operating slowly on the CI/CD system. It's just because it's doing so many things, right? And there are a couple of techniques you can use to improve that.
Splitting things into a pipeline also, I think, helps a lot, because if you can split unit tests from integration tests, you can get much greater visibility. Maybe "integration tests" gets used for a lot of different things, but say you have tests where you spin up some kind of database, or you're holding some kind of state between tests. They tend to run a lot slower than a straight-up simple unit test that's just testing a unit of code and doing dependency injection, passing in mocks or whatever you want to call them. Splitting out the pipeline gives you greater visibility, usually, but there's a lot of optimization that can go into those things. And I think the most effective way that I've seen of improving the speed of your builds is through caching. It's tricky depending on what programming language you're working with, but say you're working with compiled languages, maybe it's C or Go or whatever, and you can compile an object file or some kind of binary, and separate parts of your code base can each build their piece. Then you have that one build that ties all those together. So you have this kind of dependency graph of your build, and you can actually cache each of those smaller pieces. It means when you make a change to a single line of code, often you don't need to go and rebuild all of those pieces, the entire graph. You only need to maybe run two of those builds: the specific smaller object, and then the final thing that pieces it all together. So build caching, I think, is a great tool.
It's somewhat difficult, but once you understand how to do it, if you can apply that to a code base, it can drastically improve your build times.
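
The build caching Ryan describes can be sketched roughly as follows. This is a minimal illustration in Go, not any particular build tool's implementation: the Target type, the cacheKey and staleTargets functions, and the four-target example graph are all hypothetical names made up for this sketch. It assumes each target's cache key is a hash of its own source plus the keys of its dependencies, and that the graph has no cycles.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Target is one node in the build graph: its own source content plus the
// names of the targets it depends on. (Hypothetical type for illustration.)
type Target struct {
	Source string
	Deps   []string
}

// cacheKey hashes a target's source together with the cache keys of its
// dependencies, so a change anywhere below a target changes that target's
// key too. It assumes the graph has no cycles.
func cacheKey(graph map[string]Target, name string, memo map[string]string) string {
	if k, ok := memo[name]; ok {
		return k
	}
	t := graph[name]
	deps := append([]string(nil), t.Deps...)
	sort.Strings(deps) // make the key independent of dependency order
	h := sha256.New()
	h.Write([]byte(t.Source))
	for _, d := range deps {
		h.Write([]byte{0}) // separator between fields
		h.Write([]byte(cacheKey(graph, d, memo)))
	}
	k := hex.EncodeToString(h.Sum(nil))
	memo[name] = k
	return k
}

// staleTargets returns, in sorted order, the targets whose current key is
// not present in the cache and therefore need rebuilding.
func staleTargets(graph map[string]Target, cache map[string]bool) []string {
	memo := map[string]string{}
	var stale []string
	for name := range graph {
		if !cache[cacheKey(graph, name, memo)] {
			stale = append(stale, name)
		}
	}
	sort.Strings(stale)
	return stale
}

// exampleGraph is a made-up project: three libraries and one final binary.
func exampleGraph() map[string]Target {
	return map[string]Target{
		"parser": {Source: "parser v1"},
		"server": {Source: "server v1"},
		"client": {Source: "client v1"},
		"app":    {Source: "main v1", Deps: []string{"parser", "server", "client"}},
	}
}

// rebuildAfterEdit warms the cache from the example graph, edits one
// target's source, and reports which targets must be rebuilt.
func rebuildAfterEdit(edited string) []string {
	graph := exampleGraph()
	cache := map[string]bool{}
	memo := map[string]string{}
	for name := range graph {
		cache[cacheKey(graph, name, memo)] = true
	}
	t := graph[edited]
	t.Source += " (edited)"
	graph[edited] = t
	return staleTargets(graph, cache)
}

func main() {
	fmt.Println(rebuildAfterEdit("parser")) // prints [app parser]
}
```

Editing one library invalidates only that library and the final binary that ties everything together, which matches the "maybe run two of those builds" case; the other two libraries are served from cache.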

Quinn: Yeah, I've benefited from that locally. I used to run individual Go tests, and now I just run "go test ./..." to run them all, and test caching saves a ton of time. But I've heard from a lot of companies saying they want that kind of thing in CI, to know, based on the change, what's the subset of tests that need to run. But I've heard from very few companies that actually have built that kind of system and have enough confidence in it to say that if this minimal build passes, then it's as reliable, or almost as reliable, as if the whole test suite ran. Have you seen companies do this well, and how much of an investment does it take to get there?

Ryan: Yeah. I don't think I've seen anyone do it at the idealistic level. I think the theory around this, again, is solid. It makes sense in theory, but in practice, executing on it extremely well is very difficult, right? Quite often there's low-hanging fruit in what you're doing in your builds and tests in your pipeline that will speed it up drastically for a small amount of effort. And something like only running a subset of tests because you've only changed a set of code, I feel like the complexity there and the amount of investment that goes into it, for the time saved, is much, much harder to justify. So it might not be feasible for every company to take that approach. But I think this comes down to language tooling and language ecosystems. I don't think there's a one-size-fits-all, more generic approach at the CI level. I think it's usually a bit closer to the code, because it's more specific to each language and use case. And probably less so for the acceptance tests, and more so for the unit tests and integration tests and things that are closer to the code. And I think the Ember.js community, this is a few years ago now, did some work to drastically improve the performance. It might've been compile times. You know, they'd, like, unify the code, or whatever the JavaScript word is for compiling. They had to do all this work, and they figured out, we can save ourselves from doing all of the work: we know, based on where this was changed, what is the minimal set of work we need to do. I think there was something done pretty impressively in that community, but I can't recall what it was called.
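
The "minimal set of work" idea from this exchange can be sketched as a walk over a reverse dependency graph. This is only an illustration in Go under stated assumptions: the affected function and the hand-written import map are hypothetical, and a real tool would derive the graph from the language's own tooling, which is where Ryan suggests the hard part lives.

```go
package main

import (
	"fmt"
	"sort"
)

// affected returns the changed packages plus everything that transitively
// imports them. imports maps a package to the packages it imports.
func affected(imports map[string][]string, changed []string) []string {
	// Invert the graph: for each package, record who imports it.
	importedBy := map[string][]string{}
	for pkg, deps := range imports {
		for _, d := range deps {
			importedBy[d] = append(importedBy[d], pkg)
		}
	}

	// Breadth-first walk from the changed packages up through importers.
	seen := map[string]bool{}
	queue := append([]string(nil), changed...)
	for len(queue) > 0 {
		pkg := queue[0]
		queue = queue[1:]
		if seen[pkg] {
			continue
		}
		seen[pkg] = true
		queue = append(queue, importedBy[pkg]...)
	}

	out := make([]string, 0, len(seen))
	for pkg := range seen {
		out = append(out, pkg)
	}
	sort.Strings(out)
	return out
}

// exampleImports is a made-up package graph used only for this sketch.
func exampleImports() map[string][]string {
	return map[string][]string{
		"api":     {"auth", "storage"},
		"auth":    {"storage"},
		"billing": {"storage"},
		"cli":     {"api"},
		"storage": {},
		"docs":    {},
	}
}

func main() {
	// Only auth changed, so only auth and its importers need their tests run.
	fmt.Println(affected(exampleImports(), []string{"auth"})) // prints [api auth cli]
}
```

The sketch only covers dependencies visible in the import graph. Tests that depend on shared databases, generated files, or configuration are not captured, which is one reason it is hard to trust a reduced test run as much as the full suite.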

Quinn: Well, we'll find it and we'll put it in the show notes.

Ryan: Yeah.

Quinn: Yeah. That is the holy grail of test running. So also in this third wave, I touched on it briefly earlier, but as a developer, as soon as I hit that last keystroke before I go and commit, theoretically, if all the code in my working tree could somehow be shipped up and tests could start running right then, by the time I write my commit message and push it up, we'd probably have had at least two, maybe five minutes of time to have started running the tests. This is kind of like that famous Instagram trick where, as soon as you take the photo on Instagram, while you're writing the message and tagging it, it's uploading the image in the background, so when you hit post, it's instant. It seems like that's another kind of benefit that we could get. And it can get expensive if, after every keystroke, maybe debounced, you're starting the entire test suite. But do you see any efforts there? And do you think that's something we'll get to?

Ryan: I actually would say that this is something that exists today. You look at a lot of development shops, engineering teams, they like to have nice laptops. I see a lot of companies where the standard issue is the full-spec MacBook Pro. Maybe I'm misrepresenting the worldview of this because of having worked at the companies that I've worked at, and I'm sure it's not like that for everyone in the world, but there are definitely a lot of companies that spend a lot of money on really nice, powerful machines for developers. And I think a really good use of those resources they've got is running those tests locally. I'm a big believer that anything you run in CI, you should be able to run locally and verify that it's working the same, or as close to the same as possible. Because then you can iterate on those tests and the harness and the way that it runs locally and quickly get that feedback, and not be reliant on the time it takes to push it up to Git, or, you know, version control, and the time it takes for a build to be enqueued, and the time it takes for it to actually run, and the time it takes for the results to be pushed back. I see this as like the speed of light argument with the internet. These are necessary steps that need to happen, and you can optimize them, but there's only a certain amount you can optimize them, and it's far quicker to just not do that, right, and just run it locally. And live reload was this thing that I think the front-end world probably mastered, I'm not sure how many years ago this was now, maybe six or seven. There was this idea that you change one line of code and your page would automatically refresh once the latest JavaScript had been, I don't know, maybe you didn't even build it and minify it or whatever.
But as soon as you changed that line of code, your browser refreshed and you got instant feedback on whether or not that change worked the way you wanted it. And it wasn't just JavaScript, it was CSS or SCSS or Less or whatever you were using. That was super impressive. I saw this at a web dev shop that I worked at, and I kind of looked at it and thought, okay, when I'm doing my Go code or whatever I'm working on, I need to have this same workflow where I hit save and instantly, in one terminal window, it kicks off my unit tests, and in another terminal window it's got a live binary reload, restarting the web server or restarting the application, and my page can actually refresh for me and I get instant feedback. It's this really nice small feedback loop as to whether or not things are working the way you expected them to. And I think that's a pretty effective use of your local machine and the resources it has.
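
The save-triggered test loop, with the debouncing Quinn mentions, could look something like this. It is a rough Go sketch, not a description of any existing watcher: the Debouncer type and the burstOfSaves helper are hypothetical, and the file-watching itself is left out and simulated by calling Trigger in a loop. It assumes the goal is to run the tests once after edits have paused for a short quiet period.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Debouncer delays a function until events have stopped arriving for a
// quiet period, so a burst of saves triggers only one run.
type Debouncer struct {
	mu    sync.Mutex
	timer *time.Timer
	wait  time.Duration
	fn    func()
}

// NewDebouncer returns a Debouncer that calls fn after wait has passed
// with no further events.
func NewDebouncer(wait time.Duration, fn func()) *Debouncer {
	return &Debouncer{wait: wait, fn: fn}
}

// Trigger records an event. Any pending run is cancelled and rescheduled.
func (d *Debouncer) Trigger() {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.timer != nil {
		d.timer.Stop()
	}
	d.timer = time.AfterFunc(d.wait, d.fn)
}

// burstOfSaves simulates five quick saves and reports how many times the
// "run the tests" callback fired. The callback here only counts; a real
// one might shell out to the project's test command.
func burstOfSaves() int {
	var mu sync.Mutex
	runs := 0
	d := NewDebouncer(200*time.Millisecond, func() {
		mu.Lock()
		runs++
		mu.Unlock()
	})
	for i := 0; i < 5; i++ {
		d.Trigger()
		time.Sleep(2 * time.Millisecond) // saves arrive faster than the quiet period
	}
	time.Sleep(600 * time.Millisecond) // give the final timer time to fire
	mu.Lock()
	defer mu.Unlock()
	return runs
}

func main() {
	fmt.Println("test runs after 5 quick saves:", burstOfSaves())
}
```

With the timings above, the five saves should collapse into a single run. The quiet period is a trade-off: shorter feels more instant, longer avoids starting a run that the next keystroke would make obsolete.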

Quinn: Yeah, that sounds amazing. You know what I would love is, for a few of the popular languages and stacks, if there was just a sample repo, a proof-of-concept repo, that had a little mini project and gave you all of these things. It had good pipeline definitions. It had a good test runner that could watch and only run the tests that were changed. You could run these things locally. And it doesn't matter what the actual project is; it's about all the tooling around it. I think for some languages you'd need a ton of work to get there. For other languages, you might not need any additional configuration at all, like for some Go projects. Do you know of any efforts like this, like the TodoMVC example in the front end that shows how different front-end frameworks compare?

Ryan: Yeah, that's a good question. I haven't really looked into that on the language-specific side. I think in the PHP world there's this website called PHP Done the Right Way, or PHP: The Right Way, something like that, and that's a really good look at what the best practices are for each of the things you could do. That's pretty good. But yeah, the front-end TodoMVC thing, pretty much anyone that's done any kind of front-end work knows about it, because it's a quick way to evaluate each of the different options for libraries or frameworks or whatever. It does a really good job of a baseline comparison between those, and it's also a good example of a really simple application. And that's something we could get better at with API-type systems or CLI-type systems that you're building with languages like Go. For me, I'm a huge fan of Go, and one of the things that really attracted me to the language was the tooling, right? A lot of this stuff has just been done so well. And I'm kind of starting to dabble in Rust, and one of the things I like about the Rust website is the entry-point view of using the language: you're not just writing Rust, you're writing Rust to build a CLI tool, or you're writing Rust to build some kind of API. They've structured the documentation in this really nice way where it's focused around the use cases and the things that the language might be good at. And I think, in general, it would be really nice to have one website that said, here is a list of 10 really popular languages for building APIs, and here is an example of an API that's been built in each of those.
I feel like I've tried to do this once or twice before, and you sit down and start writing the first example, and it's just like: am I the right person to do this? Am I going to be able to build a good reference solution? Because you're kind of trying to set this bar. Maybe it just takes someone to do it. Maybe we need to get the TodoMVC person, whoever made that, to learn all the different languages and build a great TodoMVC in all the different backend languages.

Quinn: Yeah, maybe we can get them on the podcast or maybe people are gonna listen to us talking and say, well, I can do it for us. And frankly, I think if someone feels the pain and they want to do this, they're probably the right person to go and put together this reference repository for that language.

Ryan: Yeah, absolutely. That'd be great.

Quinn: Yeah. Because, you know, the state of tooling is so bad that anyone who wants to improve it is probably going to make something that's way better than the status quo, and tons of people will be able to learn from that: how to get tasks, how to get pipeline definitions set up, how to get test watching, all the different tooling that's a widely accepted setup for a language. So if you had to give some advice to people that want to improve the tooling in their company, what kind of tips would you give them?

Ryan: I would say, as a general philosophy, make sure you have alignment on build-your-own versus off-the-shelf and open source. And if you go down the path of building your own, try to build it in a way that you would open source it. Because having that mindset, like, we're going to make this public, people are going to be able to scrutinize our approach, people are going to want documentation, how do we explain the broader concepts of what we're trying to do? That kind of philosophy and approach yields much better outcomes for tooling built in-house. And you tend to get more of that Unix philosophy of smaller components, where you can pipe output from one tool into the next tool. You tend to get more of those smaller, modular tools that go the distance; they last many years. Because some things over time are no longer relevant, but say you've got a tool that could have been built as a monolith and done 10 different things, and instead you've built 10 different smaller tools that can all be linked or chained together. Then at least, five years from now, maybe you just throw out one of those tools, as opposed to having to go through the process of deprecating all the features in the tool, right? And you can monitor usage of those tools, hopefully, and see which ones are more relevant. I don't think people talk about monoliths versus micro tools as a thing, or I haven't heard too many people talk about it as a thing, but a lot of the things that we see between the monolith web application or API service versus microservices, a lot of those problems and advantages of one approach or the other, I think they apply to tools as well. We just don't see as many examples of it because there aren't as many people building tools.

Quinn: Yeah. Another benefit of open sourcing it is if you love it and you get hooked on it when you're at your current company, if you ever move on, then you can bring it to the next company. You don't have to say goodbye to this amazing thing that you built.

Ryan: Yeah, absolutely.

Quinn: Any other tips you have?

Ryan: So I think there are probably four or five tips that I have for CI/CD systems and developer productivity and, I dunno, the DevOps word gets applied to a lot of things, but, you know, DevOps things. So maybe we'll run through each of those, and then after that I'll describe what I think good looks like in relation to those, and that kind of ties it all together.

Quinn: Great.

Ryan: So the first one. People have varying opinions on containers and containerizing things, and I'll try not to say containerize all the things, but your tests, your builds, your deployment scripts, your dev environment, all of these things can potentially benefit from containers, right? Because once you put it in a container, it gives you that ability to replicate the same setup on multiple machines. So you get the portability, right? You can make sure that things like the necessary dependencies that you need installed on your machine are just baked into that container, and it's pretty self-documenting then, because it forces you to be explicit about what all the things are that are needed to make this run. I don't know if you've experienced this, but I certainly have, where I've jumped into some code base, and it's Ruby or it's a JavaScript code base, and I haven't written Ruby in a few years, or I got a new machine a month ago and I haven't written JavaScript for a few months, so my machine's...

Quinn: Oh yeah. I know the pain.

Ryan: Yeah, exactly. I think a lot of people have experienced that pain, and so then it's like, oh, I've got to let the product manager know I'm going to be a day late on shipping this because I need to spend a day setting up my machine for JavaScript, right? Or, you know, some large amount of time just setting things up. And the nice thing about containers is, if you set up your containers right, you can usually just pull down that image and let it run your test harness, and all the things are built in. Actually, the way I came about realizing this was the last time that I had to do something with JavaScript. I realized I had a new Mac and I didn't have any of the things installed, and I needed to get a specific Node version or whatever. So I thought, this is all too difficult, I'm just going to use a Docker image. So I built a Docker image, based it off the official Node image for the particular version that I wanted, and then I did my npm install of whatever the tools were that I needed. And then I committed this Dockerfile back to the repository and I was like, hey, if you want to run these tests on your machine, all you need to do is type "docker run" and the image name, and you're good to go. That kind of practice, I started to use more and more, and I found it much easier for myself, and my future self, but also for collaborating across teams. It made it easier for people to jump in and start working straight away and be productive with my code base. So I would say that's probably tip number one. Number two, and kind of related to that: jumping into a code base needs good documentation. So once you've got those containers set up and you can run your test harness inside the container, and your builds and deploys, document how to actually run those.
Write down what that Docker command is, and write down that the only dependency you need on your machine is Docker, and maybe the minimum Docker version you need to run it, and have that in a README or whatever the preferred system of documentation is at your company. And for larger organizations, try to have some kind of effort around standardizing that one-pager of how to dev and how to run tests and how to run builds and how to run deploys. Get that into a standardized template, a format, so that teams get a nice nudge that they should make sure they include each of those things, but also so that when you're jumping into a code base that you haven't worked on before, you know what to expect in terms of documentation and where to find that information.
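
One way to keep "the Docker command" written down and repeatable is a tiny helper checked into the repository. The following Go sketch is an assumption-laden illustration rather than Ryan's actual setup: the dockerTestArgs function is hypothetical, it mounts the working directory into the container instead of baking the code into a custom image as he describes, and the "node:18" image and "npm test" command are example values only. The flags used (run, --rm, -v, -w) are standard docker CLI options.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// dockerTestArgs builds the arguments for running a test command inside a
// container, with the repository mounted at /src and used as the working
// directory. The container is removed when the command exits (--rm).
func dockerTestArgs(image, repoDir string, testCmd ...string) []string {
	args := []string{
		"run", "--rm",
		"-v", repoDir + ":/src",
		"-w", "/src",
		image,
	}
	return append(args, testCmd...)
}

func main() {
	dir, err := os.Getwd()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine working directory:", err)
		os.Exit(1)
	}
	// Example values: an official Node image tag and a typical test command.
	args := dockerTestArgs("node:18", dir, "npm", "test")

	// Print the command so it can be copied into the README; passing args
	// to os/exec would run it instead.
	fmt.Println("docker " + strings.Join(args, " "))
}
```

Printing the exact command keeps the README and the tooling from drifting apart, since both come from the same place.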

Quinn: So what is number three?

Ryan: So number three. I struggled with the order of these tips because they start to get very related. But I think number three is your branching strategy. So this notion of continuous integration, the theory that I mentioned before about the longer the code sits in a branch, the greater the risk of merge conflicts. We have the tools, but really, how you do your Git branching is pretty important. A lot of people use Git, but it's not the only tool out there for version control. Whatever branching strategy you're using, make sure you have alignment in your team about how you're using it and that you've been thoughtful about it. I think that's really important and easily overlooked. And sometimes it needs to evolve, right? Feel free to change it or talk about changing it. Don't let it just be the way it is because that's how someone set it up at the start. And think about how you release, and your release stages. A lot of companies have different types of release stages. Maybe it's just a simple, we have a staging environment, we have production. Often people hopefully have good dev environments as well. So dev, staging, then production, that's kind of the promotion of code, so that you can test it before it gets to prod and have confidence that it's going to work in prod. With your Git branching strategy, some people mirror the release stages, so they'll have that staging branch or next branch or whatever you'd call it. And some companies do it differently: everything merges to master, and you have some other way of working out the releases. Some companies, I think, have harder rules: once you build an artifact,
that same artifact must be used across your staging environment and your production environment, so that there's absolutely no chance that what you're releasing to production is different to what you proved works in staging. So just be thoughtful about your Git branching strategy, your release stages, and what your workflow is to get a release out the door. So that's number three.
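
The "same artifact in staging and production" rule can be enforced mechanically by comparing digests. This Go sketch is illustrative only: the Release type, its methods, and the stage names are hypothetical, and it assumes an artifact is identified by the SHA-256 of its bytes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digest returns the SHA-256 of an artifact's bytes as a hex string.
func digest(artifact []byte) string {
	sum := sha256.Sum256(artifact)
	return hex.EncodeToString(sum[:])
}

// Release records which artifact digest was verified in each stage.
type Release struct {
	verified map[string]string // stage name -> artifact digest
}

// NewRelease returns an empty release record.
func NewRelease() *Release {
	return &Release{verified: map[string]string{}}
}

// MarkVerified records that the given artifact passed checks in a stage.
func (r *Release) MarkVerified(stage string, artifact []byte) {
	r.verified[stage] = digest(artifact)
}

// Promote allows the artifact into stage "to" only if exactly the same
// bytes were previously verified in stage "from".
func (r *Release) Promote(from, to string, artifact []byte) error {
	want, ok := r.verified[from]
	if !ok {
		return fmt.Errorf("no artifact has been verified in %s", from)
	}
	if digest(artifact) != want {
		return fmt.Errorf("artifact differs from the one verified in %s", from)
	}
	r.verified[to] = want
	return nil
}

// promoteOK reports whether an artifact offered to production matches the
// one that was tested in staging.
func promoteOK(testedInStaging, offeredToProd string) bool {
	r := NewRelease()
	r.MarkVerified("staging", []byte(testedInStaging))
	return r.Promote("staging", "production", []byte(offeredToProd)) == nil
}

func main() {
	fmt.Println(promoteOK("build-123", "build-123")) // prints true
	fmt.Println(promoteOK("build-123", "rebuilt"))   // prints false
}
```

A rebuild from the same commit is rejected here on purpose: the point of the rule is that what ships is byte-for-byte what was tested, not something that should be equivalent.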

Quinn: Right. Makes sense.

Ryan: Number four, this is where it all comes together, I think, is automating running those in your CI/CD platform. People sometimes jump to the automation step, I think, quicker than is necessary or quicker than is going to pay dividends for them, in terms of the return on investment in automation. Hopefully it's not too much work to automate, but the value that you get from those things I mentioned earlier is, I think, much greater for the amount of effort you're putting in than the automation step. Because if your things are containerized and it's one command to run something, to do a deployment, even if you're doing that 10 times a day, if it's taking you 10 seconds, the value you're getting back from automating it in your actual CI/CD system isn't as great as the value from the effort to actually containerize it. But everyone's different. Everyone has different priorities around this stuff, but I think about three or four is where the automation step sits in the list. The thing that's super useful though, and I think the most value that comes out of this, is making sure that tests run on your PRs, if you're doing pull requests and code reviews, and that those tests inform reviewers. Or if you don't actually require reviewers to approve a PR before you merge it, in a lot of systems you can set it up so that the tests just need to pass. So if you've got a smaller team, you might have an agreement that we only merge if the tests pass. That's where I think the most value is. And then automated deployments encourage this culture of continuous delivery, because if you merge to master, you might automatically deploy it to staging or production or however you've got it set up. There's a ton of value there because you're able to get releases out the door much, much quicker.

Quinn: So when you say automation is valuable but not as valuable as containerizing, what build steps are you thinking of where people wouldn't automate them? Because you talked about running the tests in PRs, you talked about deployment, but what other types of steps might not be necessary to automate immediately?

Ryan: Yeah. I think, even in the Git, well, sorry, in the Go world, with tools as good as they are, there are some things that people just aren't running in their CI/CD systems, right? There are a lot of good code coverage or static analysis tools that provide super helpful information and useful statistics to track over time, but I dunno if it's just an afterthought or it's not front and center for people. There was originally a thing called , which was just a tool that ran a bunch of other tools. And I think the newer, more popular one now is golangci-lint, and that tool, instead of executing all the different tools, I think builds it all into one binary. And that has so many different cool static analysis tools. So not just running go test in CI, but running tools like that, that provide the additional feedback, and automating that, because running it locally, there might be a bunch of different things that you've got to run, and you might forget to run them. So making sure that you're running all of these things in CI is super useful. That said, in the Go community, I think there's a notion that the tools ideally are so good that you don't need to write, like, a Makefile to run all the tools. But then there are a lot of people in practice that write bash scripts or Makefiles or whatever to run go test with the parameters that they want, and then to run gometalinter or golangci-lint with the configuration that they want. But yeah, automating those kinds of things, I think there's a ton of value in that.

Quinn: Yeah. But it's not necessarily crucial, because they might print out a lot of errors or warnings and you don't want to address them all. And getting that workflow where you get down to zero warnings and never allow them in, that might be more than is needed. That's the point.

Ryan: Yeah. I think there's a real fine line between blocking a PR on something and adding that information to help the people reviewing it, right? Sometimes you might have a situation where you don't require other people to approve it, but the fact that the PR is automatically running your full test suite and all your code coverage tools and static analysis tools is enough to just inform you, as the author of that change, that, oh yeah, I actually reduced code coverage. And maybe we haven't started blocking on code coverage dropping below a certain percent yet, but as a practice for myself, I want to try and get that number up. It's that useful, informative nature that's very helpful. And that's coming back to that story of the gamification. I think that's where that was really powerful: it built this culture of thoughtfulness where people were really thoughtful about the impact they were having on code quality. Even if it wasn't a hard requirement, it would just get them to think about it, and often that's enough to help raise the bar.
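
The distinction between blocking a PR and merely informing the author can be made explicit in whatever runs the checks. Below is a minimal Go sketch under stated assumptions: the Check and Result types, the evaluate function, and the choice of which check blocks are all hypothetical. The two commands shown, "go test ./..." and "golangci-lint run", are real invocations, but here they are only passed to a fake runner; a real runner would execute them with os/exec.

```go
package main

import "fmt"

// Check is one thing CI runs against a pull request. Blocking checks
// prevent a merge when they fail; advisory checks are only reported.
type Check struct {
	Name     string
	Cmd      []string
	Blocking bool
}

// Result pairs a check with whether it passed.
type Result struct {
	Check  Check
	Passed bool
}

// evaluate runs every check using the supplied run function and reports
// all results plus whether the pull request may be merged.
func evaluate(checks []Check, run func(cmd []string) bool) (results []Result, mergeable bool) {
	mergeable = true
	for _, c := range checks {
		passed := run(c.Cmd)
		results = append(results, Result{Check: c, Passed: passed})
		if !passed && c.Blocking {
			mergeable = false
		}
	}
	return results, mergeable
}

// defaultChecks is an example policy: tests block, static analysis informs.
func defaultChecks() []Check {
	return []Check{
		{Name: "unit tests", Cmd: []string{"go", "test", "./..."}, Blocking: true},
		{Name: "static analysis", Cmd: []string{"golangci-lint", "run"}, Blocking: false},
	}
}

// simulate evaluates the default checks with a fake runner that fails any
// command whose program name is listed in failing.
func simulate(failing ...string) bool {
	fail := map[string]bool{}
	for _, f := range failing {
		fail[f] = true
	}
	_, ok := evaluate(defaultChecks(), func(cmd []string) bool { return !fail[cmd[0]] })
	return ok
}

func main() {
	fmt.Println("lint fails, mergeable:", simulate("golangci-lint")) // prints true
	fmt.Println("tests fail, mergeable:", simulate("go"))            // prints false
}
```

Moving a check from advisory to blocking is then a one-line policy change, which fits the gradual approach of reporting first and enforcing later.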

Quinn: Yeah, that makes total sense. We've heard this from a lot of our customers at Sourcegraph, that there are a lot of these awesome static analysis tools that they don't necessarily want to treat as pass/fail. They want to use them as diagnostics, and they want to gradually, asymptotically reach passing. But you can't just throw that in CI and block every single build until everything is green in this very strict tool. You need to work toward that gradually. I think that is a gap in today's CI tools, and it'll be interesting to see what can solve that. It's something we're looking at solving in Sourcegraph itself, actually. So stay tuned.

Ryan: Interesting. I'm really excited to see what you guys can do there. I think it's not just the CI/CD tools as well. I think it's the source code repository tooling, GitHub and Bitbucket and GitLab and all those tools. I think in that UI there is a lot of opportunity to improve and provide that additional feedback. And I guess it makes sense that Sourcegraph would approach this, because you've got those browser plugins, and you've got those plugins for Bitbucket and GitHub that can expose additional information. I think there's a huge opportunity in that space.

Quinn: All right, so saving the best for last, what is tip number five?

Ryan: Yeah. Tip number five. So this is related in a very minor way, but the headline is configuration as code. I think a lot of people know about this as a concept, but there are two specifics in terms of the value that it delivers. Infrastructure as code is super, super valuable. It's tough to do early on when you have a small team and you're moving quickly and setting up infrastructure, but being able to repeat an entire environment is super, super useful. So if you can spin up and tear down your entire set of infrastructure, maybe it's AWS or Google Cloud or whatever account you have, you set up a new account for a new release stage, and you can just go and spin everything up again and tweak a couple of environment variables and have different secrets. That ability is super powerful. And I think the way that that is powerful is also kind of the same way pipelines as code is. So I think tip number five is, if you have the tooling to write pipelines as code, or somehow put your CI/CD configuration in code, absolutely you should be trying to do that, because it's great. It's a form of documentation. It makes it accessible to people. Sometimes people don't even have access to the configuration options on a build config in the CI/CD system, right? One person might get access to configure it, but then there might be a whole team that's writing code that uses that pipeline that doesn't actually get visibility into how it was configured. And then it creates this bottleneck on that one person that can go in and check settings and things like that. So pipelines as code is kind of good documentation. But it also means, again, maybe your company operates multiple CI/CD environments, and maybe you want to iterate on changing the configuration in your test CI/CD system and then release it to production at a later stage. So it gives you that kind of ability as well.
So infrastructure as code, pipelines as code, just in general all the configuration as code is super helpful.
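
Pipelines as code can be as simple as a small program, checked into the repository, that emits the pipeline definition. The following Go sketch is a hypothetical illustration: the Step and Pipeline types, the JSON field names, and the "./deploy.sh production" command are invented for this example and do not follow any specific CI system's schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Step is one command the CI system should run. (Hypothetical schema.)
type Step struct {
	Label   string `json:"label"`
	Command string `json:"command"`
}

// Pipeline is an ordered list of steps.
type Pipeline struct {
	Steps []Step `json:"steps"`
}

// buildPipeline returns the pipeline for a release stage. Because it is
// ordinary code in the repository, anyone on the team can read it, review
// changes to it, and try it against a test CI environment first.
func buildPipeline(stage string) Pipeline {
	p := Pipeline{Steps: []Step{
		{Label: "unit tests", Command: "go test ./..."},
		{Label: "lint", Command: "golangci-lint run"},
	}}
	if stage == "production" {
		// Placeholder deploy script; the name is an assumption for this sketch.
		p.Steps = append(p.Steps, Step{Label: "deploy", Command: "./deploy.sh production"})
	}
	return p
}

func main() {
	out, err := json.MarshalIndent(buildPipeline("production"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Since the definition lives next to the code it builds, the one-person bottleneck on CI settings goes away: the configuration is visible in the repository and changes go through the same review as any other change.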

Quinn: Yeah. When do you think we'll see the third wave more widely adopted by companies?

Ryan: I think it's going to take somebody in the industry to really drive the effort to consolidate things into a good standard. You know, who knows? Maybe this podcast is the start of it.

Quinn: Yeah. Maybe there's someone out there listening, and if you are, then how would someone reach out to you on Twitter to get in touch?

Ryan: Yeah, absolutely. So my Twitter handle is ryan0x44: R-Y-A-N, zero, X, four, four. And don't ask me about the 0x44. It's just Ryan D, if you can work out where the 0x44 comes from.

Quinn: Ah, I see. No need to ask then, so that's great. Well, my guest has been Ryan Djurovich. Thank you, Ryan, for joining us on the Sourcegraph podcast.

Ryan: Yeah. Thank you so much for having me. It's been great.
