The MetalBear engineering team shares their day-to-day experience using AI coding tools such as Claude Code, ChatGPT, and Gemini while building mirrord, their Kubernetes development tool. The post focuses on how AI helps with understanding unfamiliar code, and offers insights for real-world software development.
How Our Engineering Team Uses AI
Posted January 15, 2026 by Arsh Sharma - 10 Min Read
AI tools are everywhere right now, and our engineering team uses them daily. In this post, we’re sharing how we actually use AI coding tools and agents while building mirrord: what’s been useful and what hasn’t. If you’re looking for ways to leverage AI in real-world software development, hopefully some of this will be useful for you.
What do we do?
But first, some context about our team and the product we’re building. At MetalBear, we’re a completely remote company building mirrord, which is a Kubernetes development tool written in Rust. mirrord is not a typical SaaS CRUD microservice app. It’s a local tool that communicates with your on-prem Kubernetes environment, including components like a layer that injects itself into your process, an ephemeral agent that runs in Kubernetes, a Kubernetes operator, and lots of glue.
We raised our seed round a couple of months ago and are a team of 34 at the time of writing, 15 of whom make up the engineering team. We don’t mandate any AI tooling, but we do strongly encourage it, and engineers are free to pick whatever works for them and experiment. We have a Slack channel where people publicly share how they use AI, in the hope that their use cases are relevant to others on the team. The sections below walk through a few examples that were shared recently, covering Claude Code, ChatGPT, and Gemini.
Where AI helps the most
Getting oriented in unfamiliar code
One of the most consistent and least controversial ways we’ve been using AI is as an entry point into unfamiliar code. This is especially useful when understanding a new area of the codebase, coming back to something that hasn’t been touched in a while, or trying to understand code in external libraries. Instead of starting by opening files and reading through the code manually, people often use tools like Claude Code or Cursor to get a high-level explanation of how a certain part of the system is structured and how the pieces relate to each other. A prompt might look something like:
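As a hypothetical illustration (the crate names follow mirrord’s public repository layout, but the specific task is invented), such a prompt might read:

```text
I'm new to the part of this codebase that handles file operations.
Can you give me a high-level overview of how the mirrord-layer crate
intercepts file syscalls and forwards them to the agent? I don't need
line-by-line detail, just how the main pieces relate to each other.
```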
It’s important to note here that engineers aren’t asking the AI to explain mirrord as a whole, or to be an authority on the architecture. That simply doesn’t work for a codebase as large as ours. They’re using it to form an initial mental model for a specific part of the system they’ll be working on. So even if that model is incomplete or slightly wrong, it still provides a useful starting point and makes the next step, reading the actual code, much easier.
Exploring ideas and alternatives
Another area where AI has been useful for us is early in the development process, during the planning stage before any approach has been chosen for solving a problem. Engineers often use it to explore ideas by describing the feature they want to implement or a bug they’re trying to fix, and seeing what kinds of approaches the model suggests. Having AI lay out a few different options can surface trade-offs earlier, or help rule out directions they don’t want to pursue, without paying the full cost of writing and rewriting code. A prompt in this case might look something like:
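As a hypothetical sketch (the bug described is invented for illustration, not a real item from our backlog), an exploration prompt might look like:

```text
We have a bug where environment variables set in the pod spec aren't
visible to the local process when using mirrord. Before I start, can
you suggest two or three different approaches for fixing this, with
the trade-offs of each? Don't write any code yet.
```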
That said, objections have also been raised internally about using AI this way. Once a model proposes a concrete solution, it can unintentionally narrow your thinking. Even a mediocre solution can anchor your brain and make it harder to explore better alternatives on your own.
Scripts
If there’s one area where everyone on the team agrees AI consistently delivers value, it’s scripts. For debugging scenarios or local workflows, being able to describe what you need and have a working script generated for you can save a huge amount of time. One of the engineers used a few prompts to create a reusable PowerShell function they needed:
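As a hypothetical sketch (the function name and behavior are assumptions, not the engineer’s actual script), a prompt for that kind of reusable PowerShell helper might look like:

```text
Write a PowerShell function called Restart-DebugPod that takes a pod
name and namespace, deletes the pod, and waits until the replacement
pod is Running. Make it reusable: parameters with defaults, and a
clear error message if kubectl isn't available.
```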
But there’s another benefit besides time-saving. These AI-generated scripts tend to be more structured and readable by default compared to what an engineer would write, because spending extra time on a throwaway script usually isn’t worth it for them. This makes them much easier to tweak, extend, and reuse later when similar needs come up. Over time, many of these scripts have stopped being one-offs and instead become part of a small personal toolkit that gets reused again and again across debugging sessions.
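To make that concrete, here’s a minimal Python sketch of the kind of structured throwaway script AI tools tend to produce, assuming a hypothetical need to summarize log levels in a saved pod-log dump (the file name and log format are assumptions):

```python
#!/usr/bin/env python3
"""Summarize how often each log level appears in a dumped pod log."""
from collections import Counter
from pathlib import Path
import re
import sys

# Matches common log level keywords anywhere in a line.
LEVEL_RE = re.compile(r"\b(TRACE|DEBUG|INFO|WARN|ERROR)\b")

def summarize(log_text: str) -> Counter:
    """Count occurrences of each log level in the given text."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = LEVEL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def main() -> None:
    if len(sys.argv) != 2 or not Path(sys.argv[1]).exists():
        sys.exit("usage: summarize_logs.py <pod-log-file>")
    for level, count in summarize(Path(sys.argv[1]).read_text()).most_common():
        print(f"{level:5} {count}")

if __name__ == "__main__":
    main()
```

The point isn’t the logic, which is trivial, but the shape: named functions, a usage message, and obvious extension points, which is exactly what makes these scripts easy to fold into a personal toolkit later.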
Where AI struggles
Complex architectures
mirrord has a complex and fairly unusual architecture, which general-purpose LLMs struggle with. If you ask an AI tool to do anything that requires full context of how mirrord works, it will most likely fail. We’ve had very few instances of someone on the team successfully using AI to fix a bug without much manual intervention. And we’re still very far from letting fully autonomous agents loose on our codebase, even with a human reviewing the output, because fixing the generated code often takes more work than writing it yourself.
That said, some engineers have had better results by explicitly giving the model persistent architectural context. In practice, this means maintaining internal CLAUDE.md or AGENTS.md files that describe mirrord’s structure and its major components. These files aren’t static, and engineers use the models themselves to keep them updated for future use. In case you’re curious, this is what one of those files looks like currently:
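As a hedged sketch rather than the actual file (the component descriptions follow mirrord’s public architecture, but the wording is illustrative), such a file might open with something like:

```markdown
# mirrord architecture notes for AI agents

- `mirrord-layer`: injected into the user's local process; intercepts
  operations like file access and networking and forwards them to the agent.
- `mirrord-agent`: ephemeral pod spawned in the target Kubernetes
  cluster; executes forwarded operations and streams results back.
- The layer and agent communicate over an internal protocol; changes
  to message types must stay backward compatible.
```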
If you check out the full file here, you’ll see it takes a lot of context to get the model to a decent state, and even then that’s still not enough for it to be trusted entirely.
Long-running reasoning
Another place where AI tools consistently struggle is when the scope becomes large or the context stretches over time. Models will often forget why they made an earlier decision in the same session. It’s common to see them fix a bug in one place and accidentally break something unrelated elsewhere, simply because they lost track of an earlier constraint. This makes them unreliable for iterative changes unless the engineer is carefully tracking the logic themselves. The output often looks plausible at first glance, which makes these failures easy to miss if you’re not paying close attention.
The performance of different models has also varied in this area for our engineering team.
So is AI changing how we build software at MetalBear?
AI hasn’t replaced engineers on our team, and it hasn’t removed the need to deeply understand the systems we’re building. It hasn’t magically solved complex architectural problems, and it certainly hasn’t made it safe to hand over large parts of the codebase to fully autonomous agents. What it has done is reduce friction and save time.
It helps engineers get oriented faster, explore ideas earlier, write handy scripts, and offload a lot of mechanical or repetitive work. The biggest difference we’ve seen isn’t which model people use, but how intentionally they use it. The engineers getting the most value are the ones who scope problems tightly and control the context they give the model.
So yes, AI is kind of changing how we build software, but not in the way most marketing would have you believe, at least for a low-level, deeply technical product like ours. Right now, for us, AI is best thought of as a powerful tool around the edges of software development. It’s very good at accelerating parts of the process that are tedious or exploratory, but weak in areas that require deeper understanding.