
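AI Zealotry
A senior engineer in the open-source Python data ecosystem encourages experienced developers to embrace AI, emphasizing that it boosts productivity and opens up new areas, while acknowledging reasonable concerns similar to those raised by the move from assembly language to compilers.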
I develop with AI today. It's great.
There are many articles you can read on why AI is great (or terrible) or how to
use it. This is mine. I focus on the experience of a senior engineer (and why
we in particular should use AI), on my experience operating within the OSS
Python Data world, and on practical suggestions that I've found myself
repeating to colleagues.
This article contains lessons learned of two types: big ideas and practical tips.
We'll interleave these two. I'm hopeful that this approach will make this more fun.
Why AI
AI development is more fun. I do more of what I like (think, experiment,
write) and less of what I don't like (wrestle with computers).
I feel that I can both move faster and operate in areas that were previously
inaccessible to me (like frontend).
Experienced developers should all be doing this. We're good enough to avoid AI
Slop, and there's so much we can accomplish today.
I like this quote from this blog:
I get it, you’re too good to vibe code. You’re a senior developer who has been doing this for 20 years and knows the system like the back of your hand.
[...]
No, you’re not too good to vibe code. In fact, you’re the only person who should be vibe coding.
I think that really good engineers, the kind that think hard before writing,
can have a tremendous impact and fun while developing with AI. I
wouldn't ever go back.
Why Not AI
That being said, there are some serious costs and reasonable reservations about AI
development. Let's start by listing those concerns:
These are super-valid concerns. They're also concerns that I suspect came
around when we developed compilers and people stopped writing assembly by hand,
instead trusting programs like gcc to pump out instruction after instruction
of shitty machine code.
We lost a deeper understanding as developers when we stopped writing assembly
but we gained a ton too. As in any transition, we need to navigate the
situation to capture the advantages while losing only a little, balancing the
costs and benefits of a new technology.
This article is how I've been navigating this transition personally.
Big Idea: Minimize Interruptions / Climb Abstraction Hierarchy
Early in my use of Claude Code (or Cursor), many of my interactions consisted of saying
"Yes, it's ok to run that". This was frustrating and dehumanizing. Mostly my
job was to enable AI, rather than the other way around.
There are many tricks to resolve this (see below), but more broadly "stop doing
simple shit" has been a mantra that I've found myself constantly coming
back to. The more I identify and reject simple tasks and add automation to my
workflow, the higher an abstraction I'm able to climb to and the more
effectively I'm able to work. Our goal in programming is to climb an
abstraction ladder and gain more intellectual leverage. This requires thought
and consistent attention.
Fortunately AI can help with this. If you complain and say "I'm always doing
X" it'll suggest solutions like what I'll talk about below, but more tailored
to your situation.
Tip: Hooks
AI developers, like human developers, benefit from structure.
Most people start with an AGENTS.md or CLAUDE.md file. This is a great
start, but I find that the AI agent often forgets what's in there. The real
solution for me here (at least for Claude Code) is
Hooks.
First, let's outline a couple of annoyingly common problems.
Example Problem: Ignoring instructions in CLAUDE.md
Let's say you tell AI that you want to run tests with uv:
when running tests, use uv run pytest tests
While this works sometimes, AI often decides to run pytest directly, without uv.
While the agents read CLAUDE.md, they don't always follow the instructions.
And so you're stuck saying "no, use uv" over and over again. Gah.
Solution: Hooks
Here's a hook that catches pytest commands missing uv run. You could put
something like this in ~/.claude/settings.json:
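As a rough illustration, the hook entry in settings.json could point a PreToolUse hook for the Bash tool at a small script. The sketch below is written in Python and rests on assumptions: that Claude Code hands the pending tool call to the hook as JSON on stdin, with the shell command under tool_input.command, and that exiting with status 2 blocks the call and returns stderr to the agent. The function names (needs_uv, decide, main) are hypothetical.

```python
import json
import re
import sys

# Matches a pytest invocation at the start of a command or right after a
# shell separator (&&, ||, ;, |), covering `pytest` and `python -m pytest`.
BARE_PYTEST = re.compile(r"(?:^|&&|\|\||;|\|)\s*(?:python3?\s+-m\s+)?pytest\b")


def needs_uv(command: str) -> bool:
    """Return True when the command runs pytest without going through `uv run`."""
    return bool(BARE_PYTEST.search(command))


def decide(payload: dict) -> tuple[int, str]:
    """Map a hook payload to (exit_code, message_for_the_agent)."""
    # Assumed payload shape: {"tool_name": "Bash", "tool_input": {"command": ...}}
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    if needs_uv(command):
        # Assumed convention: exit status 2 blocks the tool call.
        return 2, "Use `uv run pytest` instead of calling pytest directly."
    return 0, ""


def main() -> int:
    """Read the payload from stdin, report any objection on stderr."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code
```

Run as a hook, the script would end by calling sys.exit(main()), so that a rejected command exits with status 2 and the message on stderr tells the agent what to run instead.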
There, we've just automated that annoying task for you forever.
I don't actually do this, though. I let Claude fail, and then it finds the
right approach. Mostly this works because I've gotten good at giving Claude
fairly broad-yet-safe permissions, which is coming up next.
Example Problem: Incomplete Permissions
Even worse, Claude often asks for permission to do things that are just
slightly different from what you've already granted.
You allow uv run pytest *, but Claude keeps finding variants:
Claude Code's permission language sucks. It only supports prefixes, while I
wish it could handle regexes, or maybe even just arbitrary Python code.
Solution: Hooks for permissions
I have a complex Python script as a hook which overrides the permission
system. It uses regexes, but also arbitrary Python code as logic. This allows
me to encode arbitrary combinations of rules. It's great.
On the rare occasion when Claude asks me for permission for something new, I
have a running Claude agent that thinks about this file and considers if it
should update the permission script.
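The regex-plus-logic idea described above can be sketched in a few lines of Python. This is not the script itself, only an illustration under assumptions: the rule lists, the normalize and decide functions, and the returned strings ("allow", "deny", "ask") are all hypothetical, and wiring the result into Claude Code's hook output is left out.

```python
import re

# Hypothetical rules, matched against the start of each sub-command.
ALLOW = [re.compile(p) for p in (
    r"uv run (?:pytest|ruff|mypy)\b",
    r"git (?:status|diff|log|show)\b",
    r"(?:ls|cat|head|tail|grep|rg|wc)\b",
)]
DENY = [re.compile(p) for p in (
    r"rm\s+-[a-zA-Z]*[rf]",
    r"git push\b.*--force",
    r"sudo\b",
)]

SEPARATORS = re.compile(r"&&|\|\||;|\|")
ENV_PREFIX = re.compile(r"^(?:[A-Za-z_][A-Za-z0-9_]*=\S+\s+)+")
TIMEOUT_PREFIX = re.compile(r"^timeout\s+\d+[smh]?\s+")


def normalize(part: str) -> str:
    """Strip wrappers that a plain prefix rule cannot see past."""
    part = part.strip()
    part = ENV_PREFIX.sub("", part)      # FOO=1 BAR=2 cmd -> cmd
    part = TIMEOUT_PREFIX.sub("", part)  # timeout 60 cmd  -> cmd
    return part


def decide(command: str) -> str:
    """Return "allow", "deny", or "ask" for a shell command string."""
    # Naive split: separators inside quotes also split the command, which
    # errs on the side of "ask" rather than a wrong "allow".
    parts = [normalize(p) for p in SEPARATORS.split(command) if p.strip()]
    if not parts:
        return "ask"
    if any(rule.match(part) for part in parts for rule in DENY):
        return "deny"
    if all(any(rule.match(part) for rule in ALLOW) for part in parts):
        return "allow"
    return "ask"
```

With these rules, a variant such as `PYTHONPATH=. timeout 60 uv run pytest tests | tail -20` is allowed without a prompt, because every piece of the pipeline matches an allow rule once the wrappers are stripped, while anything unrecognized still falls back to asking.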
Solution: Hooks for sounds
My personal favorite hooks though are these:
They play subtle little sounds whenever Claude is either done, or needs input
from me. This lets me ignore Claude when it's busy. Previously I found that I
was constantly checking back in with Claude to see if it was done, and that
action was dehumanizing, so I automated it by asking Claude to play a sound.
Hooks are great. There are more ways to provide structure (Skills, Commands)
but I've found that Hooks are the most dependable, a great starting place, and
often augment any other structure that I put in place (like Skills).
Big Idea: Build Confidence Without Looking at Code
In a recent large AI-assisted PR a frustrated reviewer said the following:
To me, this [size of PR] implies that either
It's a valid problem, even in single-person projects. We're able to generate
code far more quickly than we're able to read it. How should we handle
review? Everyone needs to figure this out for themselves, but my answer is
"find other ways to build confidence".
We already do this today with human-written code. I review some code very
closely, and other code less so. Sometimes I rely on a combination of tests,
familiarity with a well-known author, and a quick glance at the code before
saying "sure, seems fine" and pressing the green button. I might also ask
"Have you thought of X" and see what they say.
Trusting code without reading all of it isn't new; we're just now in a state
where we need to review 10x more code, and so we need to get much better at
establishing confidence that something works without paying human attention all
the time.
We can augment our ability to write code with AI. We can augment our ability
to review code with AI too.
Tip: Self-review
Testing
Mostly I establish confidence in AI-generated work by investing heavily in
tests and benchmarks, the same as I would with humans, just more so. TDD is
baked into most of the prompting structure I have with agents.
Remember that this is way cheaper than it used to be. Now rather than write a
benchmark I can type
How does this compare in performance to the old version? I'm particularly
interested in memory use.
And that's it. If it's bad, the agent will say so (and then diligently work to
make it good).
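For a sense of what such a request can produce, here is a minimal sketch of a memory comparison using the standard library's tracemalloc module. The two implementations, old_version and new_version, are hypothetical stand-ins and not code from any real project.

```python
import tracemalloc
from typing import Callable


def peak_memory(func: Callable[[], object]) -> int:
    """Run func once and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def old_version(n: int = 200_000) -> int:
    # Hypothetical "before": materializes every square in a list.
    return sum([i * i for i in range(n)])


def new_version(n: int = 200_000) -> int:
    # Hypothetical "after": streams squares through a generator.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    old_peak = peak_memory(old_version)
    new_peak = peak_memory(new_version)
    print(f"old peak: {old_peak / 1e6:.2f} MB")
    print(f"new peak: {new_peak / 1e6:.2f} MB")
```

The point is less the specific numbers than having a repeatable measurement that the agent can rerun after each change.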
Grilling
Additionally, if I'm nervous about something subtle like "Is it possible this change
might unexpectedly affect performance in this other feature?" then I'll ask the
AI exactly that question:
Is it possible this change might unexpectedly affect performance in this other feature?
And it'll just go and investigate exactly that question. Unlike human authors,
the AI has no ego at stake in its work, and isn't in the least bit lazy. It's
our job to ask "Have you thought of X" and its job to go learn if that might
be an issue. Don't trust its answer? Ask it to prove it to you.
AI has flaws, but it is diligent, and it lacks ego. If you question it, it'll
investigate thoroughly and critique its own work honestly.
Simplifying
Also, my favorite command:
Let's review our work and see if there is anything we can simplify or clean up
Before Opus 4.5 came out this was essential. Now it's merely nice. I've
turned this into a /cleanup command and integrated it into most of my Skills
as a final phase in development.
Tech debt
From time to time I also ask a fresh agent to do a full review of the project,
with an eye to cleaning up technical debt. I tell it to review everything and
think hard. It takes a while, but it often comes back with a nice list of work
for itself, which it then of course diligently performs.
AI creates technical debt, but it can clean some of it up too.
(at least at a certain granularity)
Feedback
In general we want to give our agents good automated feedback. Tests do this,
benchmarks do this, prompting them to assess themselves does this, asking them
to explain things to us and have us weigh in on high level topics does this.
LLMs are smart enough today that if they're given enough of the right feedback
they converge on a good solution as well as, or better than, a senior human engineer
(that's my experience at least).
Our job is to construct a system that gives them the right feedback at the
right time, hopefully without our intervention. This is the same job we have
when we build human teams; now it's just more impactful to do well.
Cursor vs Terminal Tools
I started AI development with Cursor. It was great having the AI experience
inside a VSCode-like editor, where I could see everything that was going on.
When I saw terminal-based tools like Claude Code I thought "whoa, that doesn't
seem sensible, I need to see what's going on".
Today I code with Claude Code, git diff, and occasionally vim. I don't
feel a need to OK every change in the diff. I've got more important
things to do. I suspect that you do too.
Big Idea: Drop Python. Use Rust and TypeScript.
I deeply respect the philosophical position of Python, which I'll state as
follows:
Prioritize human performance over compute performance.
By optimizing for ease and iteration speed we're able to search solution
space more broadly and more quickly, finding much better solutions, making
that 100x drop in performance negligible.
Python was a bold bet, and a bet that paid off amazingly well. No one expected
this silly dynamic language originally designed for education to become the
world's juggernaut in performance software.
With AI though, the usability benefits of Python no longer apply as strongly,
and we're more free to choose different ecosystems.
Personally, I use ...
Regarding TypeScript, I still love easy interaction tools like rich and
textual, but when the entire React ecosystem is a sentence away and when you
get to use things like, you know, fonts, there's really no comparison. Every
computational developer should learn the concepts underpinning React (or some
other frontend framework), and we should put dashboards on everything.
Of course, I still hook into Python for the ecosystem. Everything is
Python-importable and I still use the protocols and design patterns developed
by the Python data community. Those are the durable assets of Python. Not the
code or the language; those will die. Rest in peace dear friend.
Big Idea: Think Hard. Write Clearly.
As an introductory project, I rewrote NumPy in Rust.
It was great fun.
It was also much easier than I expected (I expected it to be impossible).
It was easy for a few reasons (good test suite, well-reasoned abstractions) but
mostly it was because:
NEPs: NumPy's Enhancement Proposals / design documentation is thorough and extremely clear.
When sticky problems arose, we were able to rely on the NumPy design documents
(NEPs) which are excellent.
The NumPy team thought hard and wrote clearly, two hallmarks of
excellent developers. This made the job of reimplementation relatively trivial.
The NumPy development community is famous for doing this well. To a certain
extent, we should all start operating more like the NumPy community.
Tip: plans/ and docs/ directories
I keep two directories in each repository: plans/ and docs/.
Plans end up being very useful during development, while docs end
up being useful to point other agents to in the future. Claude Code creates planning documents in /tmp by default in planning mode, but I find that bringing those documents into the repository improves engagement, both from it and from me.
Docs end up being tricky. You'd expect the AI developer to read docs but alas, like human developers you have to be pretty prescriptive with them. Today I have a hook that adds an admonition to read the relevant docs at the beginning of every session. It looks like this:
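A hook along these lines could be a short script. The Python sketch below assumes it is registered as a SessionStart hook and that whatever it prints to stdout is added to the agent's context. The wording of the admonition and the build_admonition helper are hypothetical.

```python
from pathlib import Path


def build_admonition(repo_root: Path) -> str:
    """Compose a reminder pointing the agent at docs/README.md, if it exists."""
    index = repo_root / "docs" / "README.md"
    if not index.is_file():
        # No index, nothing to point at: stay silent.
        return ""
    return (
        "Before starting work, read docs/README.md and then any documents "
        "it lists that are relevant to the current task."
    )


if __name__ == "__main__":
    # Assumes the hook runs with the repository root as working directory.
    message = build_admonition(Path.cwd())
    if message:
        print(message)
```

Keeping the reminder short and pointing at a single index file means the hook itself rarely needs to change as the documentation grows.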
I then keep docs/README.md updated as a sort of index over my documents. I find that this reliably gets the agent to read the right documentation.
I've also found that my normal writing style (brutal concision + front-loading important content to maintain attention span) isn't necessary with AI. You really can just shove information at them and they absorb it. It's nice 🙂
Big Idea: Take Long Walks
Historically software engineers had to both think well and execute well.
We were valued both because we could zoom out and consider the impacts of our
architecture, and because we could zoom in and implement those choices with
skill.
Our ability to zoom in and implement code is now obsolete. Our ability to zoom
out and think well is not. On the contrary, our ability to think well is now
10x more valuable than it was before, because implementation is now mostly
free.
And so it's now more important than ever to hone our craft of thought. This
probably means less caffeine and more walks through the park.
Final Thoughts
The craft of authoring code has transformed time and time again during our
lives. We remember when object-oriented programming was cool, or when TDD became a thing,
or reactive programming models, or dynamically typed languages, or ML, or ...
As programmers we've opted into a system which changes by its very nature.
Our job is to automate our job, and to continuously climb the ladder of
abstraction. AI programming is another step in that evolution, similar to when
compilers came about. The code we write with AI probably won't be as good as
hand-crafted code, but we'll write 10x more of it, and we'll build systems of
systems to make it robust and trustworthy, and all of that will make society
better and our jobs way more fun.
I'm looking forward to having way more fun.
Appendix: Permissions file
After writing this, a couple of friends asked me for a copy of my regex/Python code
that replaces Claude's permission system. I'll include it below, but really,
you don't need it. Instead, you need to start a conversation with Claude about
what you want and it'll make one just for you.
Code is free these days. Extending the "AI is like Compilers" analogy, asking
for someone else's script is kind of like asking for someone else's compiled
binary. There's no need; just make it yourself. It's trivial.
Here was my original prompt to Claude Code:
I recently wrote this reddit post
https://www.reddit.com/r/ClaudeAI/comments/1puqrvc/claude_code_annoyingly_asking_for_permissions/
I'm wondering if you have any suggestions on how to resolve this? Adding stuff to CLAUDE.md or permissions to settings.json doesn't seem to be working well enough.
That, along with subsequent conversation as I've been working, resulted in
this Python script.
But really, you're better off working with Claude to make one just for you.
Code is free now.