What job interviews taught me about Kubernetes

Original link: https://notnotp.com/notes/what-job-interviews-taught-me-about-kubernetes/

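A recent job search shows that Kubernetes (K8s) has become the industry standard, even at small startups that don't need its sophisticated scaling capabilities. Although K8s is usually seen as a technical tool for large-scale workloads, CTOs are increasingly adopting it for its organizational benefits rather than its technical advantages.

Three main factors drive this shift:

1. **Uniformity:** K8s makes every service follow a standardized deployment process, eliminating "cursed" one-off setups.
2. **Standardized knowledge:** Moving architectural knowledge out of individuals' heads and into YAML files and Helm charts speeds up onboarding and makes on-call rotations more effective.
3. **Traceability:** K8s pairs naturally with GitOps, which enables compliance, audit trails, and safe deployment workflows, and prevents "shadow" infrastructure.

Despite these benefits, the author cautions that K8s brings significant debugging overhead. Early-stage startups should prioritize product development over infrastructure complexity. Once the team grows beyond a single engineer, however, the organizational advantage of K8s in recording and enforcing standards becomes indispensable. Widely available managed services and a growing talent pool have made K8s the lingua franca of modern software deployment.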

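This Hacker News discussion explores the ongoing debate about Kubernetes (K8s) in startup environments. Some praise it as a "boring", standardized, career-friendly choice, while others argue that it introduces unnecessary complexity and maintenance burden for small teams.

**Main points of contention:**

* **The "standardization" view:** Supporters see K8s as a "lingua franca" that solves deployment consistency, simplifies hiring, and enforces infrastructure-as-code (IaC) best practices. Managed services (such as EKS and GKE) are considered to have lowered the barrier to entry significantly.
* **The cost of complexity:** Critics argue that K8s does not work "out of the box". Users must manage numerous controllers (Ingress, Cert-Manager, DNS), handle frequent breaking upgrades, and deal with deep network architecture that most startups don't need.
* **The AI factor:** A recurring theme is that large language models (LLMs) have changed the landscape by making it easier to generate manifests and troubleshoot, potentially lowering the technical bar for managing a cluster.
* **Alternatives:** There is strong interest in "simpler" solutions such as NixOS, Docker Compose, or specialized tools (such as Nomad and Kamal), which offer similar declarative-configuration benefits while avoiding the heavy control-plane overhead of Kubernetes.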

Original article

So I've been job hunting lately. Reading job postings, doing interviews, talking to engineering teams at like a dozen companies. And I noticed something compared to five years ago when I was last doing this: literally everyone is on Kubernetes now. Every single company I talked to.

Last time I was job hunting that wasn't the case at all. There were basically three camps: the rare Kubernetes adopters, the systemd-on-VM/VPS/EC2 crowd, and the serverless people (Lambda, Cloud Run, etc.).

That surprised me, because where I work we have actual Big Tech-scale problems, so K8s makes obvious sense for us. But a 10-person startup with two services? None of these places were doing microservices or anything close to high scale. So I asked why.

Spoiler: they don't care much about the technical side of K8s.

Why?

A technical interview is actually a great place to ask why, especially when you're talking directly to the CTO. So I did. The answers were basically the same everywhere.

Uniformity

First one was uniformity. Every service deploys the same way. No one secretly knowing that the payments service runs on some bare VM with a cursed bash script from 2019 while the API is on Docker Compose because nobody ever touched it. One way to deploy, for everything.
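To make "one way to deploy, for everything" concrete, here is a minimal sketch of what that uniformity can look like: one function that renders the same Deployment shape for every service. Everything named here is an assumption for illustration (the `deployment_manifest` helper, the service names, the `registry.example.com` images, the default port); none of it comes from the companies in this post. The output is JSON, which `kubectl apply -f` accepts alongside YAML.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int = 1, port: int = 8080) -> dict:
    """Build a minimal apps/v1 Deployment for a single service (hypothetical helper)."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels below.
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


# Hypothetical services: both go through exactly the same code path.
SERVICES = {
    "payments": "registry.example.com/payments:1.4.2",
    "api": "registry.example.com/api:2.0.0",
}

manifests = [deployment_manifest(name, image) for name, image in SERVICES.items()]

if __name__ == "__main__":
    for manifest in manifests:
        print(json.dumps(manifest, indent=2))
```

In practice a Helm chart usually plays this templating role; the point of the sketch is only that the payments service and the API differ in their values, not in how they are deployed.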

Standardized knowledge

Second was shared, hireable knowledge. K8s is basically a lingua franca now. My first day at my current job, I pulled up the repo with the Helm charts and Kube configs and had a solid picture of the whole architecture within an hour. The knowledge is in the YAML, not stuck in someone's head. Lose someone, their replacement isn't spending three weeks digging through docs trying to figure out how anything runs.

At my current company, on-call SREs can keep any service up even if they've never touched it before. They know Kubernetes, and Kubernetes patterns are the same everywhere for all teams. Try doing that with a bunch of VMs where every service is set up differently. (Caveat: this only holds if nobody went exotic with the setup, of course.)

Tracing who does what

Third was traceability (with or without compliance). At my current company, nobody can just kubectl apply -f something straight to the cluster. You push a Helm chart to git, there's a trace, there's an MR approval process, then FluxCD or ArgoCD handles the actual deployment. Nothing happens in the shadows. That composes really well with compliance: it's basically how we ace ISO certifications. And since GitOps pairs naturally with Kubernetes, you get all of that almost for free.
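The idea behind that workflow can be sketched as a toy reconcile step: git holds the desired state, the cluster holds the actual state, and a controller computes whatever is needed to close the gap. This is a simplified illustration, not how FluxCD or ArgoCD are implemented; the `reconcile` function and the resource names are hypothetical, and it assumes a setup where pruning and drift correction are enabled.

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Return the actions needed to make `actual` match `desired`.

    Both arguments map a resource name to its spec. Anything that exists
    in the cluster but not in git is treated as drift and deleted.
    """
    actions = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Desired state: what was merged to git after MR approval.
desired = {
    "api": {"image": "api:2.0.0", "replicas": 2},
    "payments": {"image": "payments:1.4.2", "replicas": 1},
}

# Actual state: someone scaled "api" by hand and left a debug pod behind.
actual = {
    "api": {"image": "api:2.0.0", "replicas": 5},
    "debug-shell": {"image": "busybox", "replicas": 1},
}

if __name__ == "__main__":
    for action, name in reconcile(desired, actual):
        print(action, name)
```

With the sample data, the manual scale-up of `api` is reverted, the missing `payments` service is created, and the stray `debug-shell` is removed. Every lasting change has to go through git, which is where the audit trail comes from.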

What I took from it

The CTOs I talked to aren't making a dumb choice. They're solving real problems.

I was focused on the technical side only, and for me, Kube has always been a technical solution to technical problems. But it looks like a lot of CTOs are interested primarily in the non-tech benefits. More than I thought. Their technical problems just don't require it. I bet you won't find any topologySpreadConstraints in their manifests; they don't care. No HPA, no Pod Disruption Budgets, no node affinity rules. Just the same number of nodes as they'd otherwise have VMs. But they've accepted the price of running a complex piece of software in exchange for the organizational benefits.
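That bet is easy to check mechanically. Below is a rough sketch of a scanner that reports which of those features a manifest uses, given manifests already parsed into dicts. The `advanced_features` helper is hypothetical, and it only looks at the usual pod template path (`spec.template.spec`) used by Deployments, StatefulSets, DaemonSets, and Jobs; CronJobs nest their pod spec deeper and are not covered.

```python
# Standalone resource kinds that signal scaling or availability tuning.
ADVANCED_KINDS = {"HorizontalPodAutoscaler", "PodDisruptionBudget"}


def advanced_features(manifest: dict) -> set:
    """Return the names of 'scale-oriented' features a manifest uses."""
    found = set()

    kind = manifest.get("kind")
    if kind in ADVANCED_KINDS:
        found.add(kind)

    # Pod-level scheduling hints live under spec.template.spec.
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    if pod_spec.get("topologySpreadConstraints"):
        found.add("topologySpreadConstraints")
    if pod_spec.get("affinity", {}).get("nodeAffinity"):
        found.add("nodeAffinity")

    return found


# A plain Deployment of the kind described above: nothing fancy.
plain = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "api"},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "api"}},
        "template": {
            "metadata": {"labels": {"app": "api"}},
            "spec": {"containers": [{"name": "api", "image": "api:2.0.0"}]},
        },
    },
}

if __name__ == "__main__":
    print(advanced_features(plain) or "no advanced features")
```

An empty result across a whole repo would match the pattern described here: Kubernetes used as a deployment convention, not as a scaling engine.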

Honestly, I think it's mostly fine. But I still think most companies should start without it. Clusters are genuinely hard to debug when stuff goes wrong, and at that stage you want your energy on the product, not the infra. When you're still pitching to your next big customer, spinning up a VPS and doing a dirty git pull is a totally valid emergency fix. Suboptimal, sure. But fast, and you know exactly what's happening. You really don't want to spend two hours figuring out why your pod is stuck in CrashLoopBackOff right before a customer call.

Why the shift happened recently

I still don't totally get why the shift happened when it did. Five years ago all three camps were doing fine. Now the VM+systemd crowd has basically disappeared from job postings, serverless stayed niche, and K8s just won.

My best guesses: managed K8s (EKS, GKE, AKS) got mature and the talent pool flipped: enough people learned it that hiring for anything else became the harder choice. And Helm made "just use someone else's chart" a real option. But I'm not certain. If you were there for the shift and have a better theory, I'd genuinely like to know.

When to use Kubernetes

My personal threshold would be the moment the CTO isn't the only engineer anymore. As soon as a second person shows up, the problems K8s solves become real. Now you've got someone who didn't set up the servers but needs to deploy. Someone who needs proper access controls, not SSH keys to everything. Someone who'll leave eventually and take everything they know with them. That's when you want the system to hold the knowledge, not people.
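As one illustration of "proper access controls, not SSH keys to everything", here is a sketch of a namespaced RBAC Role and RoleBinding that lets a second engineer update Deployments and read pods and their logs, and nothing else. The role name `deployer`, the user `alice`, the namespace `prod`, and the choice of verbs are all assumptions made up for this example; the right permissions depend on the team.

```python
import json

RBAC_GROUP = "rbac.authorization.k8s.io"


def deployer_role(namespace: str) -> dict:
    """Role limited to updating Deployments and reading pods and logs (hypothetical)."""
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "Role",
        "metadata": {"name": "deployer", "namespace": namespace},
        "rules": [
            {
                "apiGroups": ["apps"],
                "resources": ["deployments"],
                "verbs": ["get", "list", "watch", "update", "patch"],
            },
            {
                # The empty string is the core API group, where pods live.
                "apiGroups": [""],
                "resources": ["pods", "pods/log"],
                "verbs": ["get", "list", "watch"],
            },
        ],
    }


def bind_role(user: str, namespace: str) -> dict:
    """RoleBinding that grants the 'deployer' Role to a single user."""
    return {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"deployer-{user}", "namespace": namespace},
        "subjects": [{"kind": "User", "name": user, "apiGroup": RBAC_GROUP}],
        "roleRef": {"kind": "Role", "name": "deployer", "apiGroup": RBAC_GROUP},
    }


if __name__ == "__main__":
    for manifest in (deployer_role("prod"), bind_role("alice", "prod")):
        print(json.dumps(manifest, indent=2))
```

When that person leaves, removing one RoleBinding revokes their access, and the Role itself stays in the repo as a record of what a deployer is allowed to do.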
