I think that’s a good question, and if you think about who’s building these systems, they’re mainly developers these days. I would like to see more ops people, and we do see them building features around it, but I think you’re thinking about this world where the developers are now spending a lot more time in operations, building operations-specific tools. Stepping back, it’s just all software. This software just happens to deploy other software. It’s kind of like writing your own unit tests, right? You’re writing software to test your application, so in this case it’s software to manage applications, but it’s just all software.
As a developer, you may be less interested in deploying and managing a Kubernetes cluster. There are different roles on the team for that, and if you’re in a position where you have a dedicated group of people who are using a hosted system, that’s great. As a developer, what you care about is “I have this application running on my local machine, and Docker’s a fantastic piece of software that lets you take a Linux machine and abstract away the needs of SSH, systemd unit files, logging – all of that kind of goes away, but it’s still there underneath the covers.” Docker sits on top of a single machine and gives you this really nice API: you package the app, push it to a repository, and run it anywhere you find Docker. This is perfect.
But in production, you have a lot more concerns than just starting and stopping an application. Who’s going to collect the logs and push them to a central place? How do you express the need of “I want this application to run across multiple machines, multiple data centers”? That particular set of requirements needs a higher-level tool or language to express it and enforce it. With a lot of those concerns, you start to get into this idea of clustering. I have multiple machines, and the machines that we’re using these days, these aren’t mainframes; they’re not built to last forever. So we’re dealing with machines that are going to fail. If you’re in a cloud provider, the VMs are ephemeral, so you have to plan for the machines that you’re running on being blown away at any given time. They can die or go through live migration, so you probably want a system that can account for that.
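To make the "machines are going to fail" point concrete, here is a minimal sketch in Python of a system that accounts for failure: it compares how many copies of an app should be running with how many actually are on healthy machines, and starts the missing ones. The `Machine` class, the `reconcile` function, and the VM names are all hypothetical, made up for illustration; this is a toy, not how Kubernetes itself is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    # Hypothetical stand-in for a VM in a cluster.
    name: str
    healthy: bool = True
    apps: list = field(default_factory=list)


def reconcile(machines, app, desired):
    """Ensure `desired` copies of `app` are running on healthy machines."""
    # Anything that was running on a failed machine is gone.
    for m in machines:
        if not m.healthy:
            m.apps.clear()

    healthy = [m for m in machines if m.healthy]
    if not healthy:
        raise RuntimeError("no healthy machines available")

    running = sum(m.apps.count(app) for m in healthy)

    # Start each missing copy on the least loaded healthy machine.
    for _ in range(desired - running):
        target = min(healthy, key=lambda m: len(m.apps))
        target.apps.append(app)

    return {m.name: list(m.apps) for m in machines}


machines = [Machine("vm-a"), Machine("vm-b"), Machine("vm-c")]
reconcile(machines, "web", 3)        # initial placement, one copy per VM
machines[0].healthy = False          # vm-a is blown away
placement = reconcile(machines, "web", 3)
print(placement)
```

The design choice worth noticing is that nobody tells the system where to move the lost copy. It is only told the desired count, and it works out the placement again every time it runs.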
[00:27:54.03] From history, in general, the core principle is that even if you’re not using Kubernetes, your team is going to have to build something very similar. How do I decide which machine my application runs on? Well, if you have a DevOps team, depending on how they do things, they may be recording that decision in a spreadsheet, or in a tool like Ansible where you say, “This server runs the database. This server runs my web application.” That would be manual scheduling, right? With that concept of scheduling, as a human you would say “Well, we know that’s the database server, because it has this name or it has this storage.” That act of scheduling in Kubernetes is an automated process where you as a developer can craft one of these manifests, for example a deployment manifest. You can say, “My application needs one CPU and let’s say 16 megs of RAM. That’s enough to handle this many requests per second. If you want to scale above that, then give me more of those instances running and then we can actually scale horizontally.” So as a developer, there’s that dev-test cycle, which is great for a single machine, but when it goes to production, as a developer you need a new set of concepts to express “I need five copies of this application running, with these resource requirements. And oh, I’d also like to expose these to our customers over this particular port, not that port.”
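As a rough illustration of what such a manifest expresses, the sketch below builds a Deployment manifest as a plain Python dictionary and prints it as JSON. The field names follow the Kubernetes `apps/v1` Deployment schema; the helper `deployment_manifest`, the app name `my-app`, the image `example.com/my-app:1.0`, and port 8080 are assumptions chosen for the example, and "16 megs" is written here as `16Mi`.

```python
import json


def deployment_manifest(name, image, replicas, cpu, memory, port):
    """Build a Deployment manifest (hypothetical helper, for illustration)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            # "I need five copies of this application running"
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # "one CPU and let's say 16 megs of RAM"
                            "resources": {
                                "requests": {"cpu": cpu, "memory": memory}
                            },
                            # "over this particular port, not that port"
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


# All concrete values below are placeholders, not from the interview.
manifest = deployment_manifest(
    name="my-app",
    image="example.com/my-app:1.0",
    replicas=5,
    cpu="1",
    memory="16Mi",
    port=8080,
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and submitted with `kubectl apply -f`. Note that `containerPort` only declares the port the container listens on; actually exposing the app to customers would typically involve a separate Service object, which this sketch leaves out.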
So if you think about it, Kubernetes takes a DevOps team and rolls it into a system, and in return gives you, the developer, an API that you can use to express what your application needs to run in production.