Agentic Coding is a Trap
Remaining vigilant about cognitive debt and atrophy.
"AI does the coding, and the human in the loop is the orchestrator"
This is the sentiment being hyped up around the industry currently: traditional coding is all but dead, and Spec Driven Development (SDD) is the future. You generate a plan, and disconnect from writing any code. The agents know better, and handle all the implementation. You are there as the expert, to provide "good taste", review the outputs, and constantly steer the agent(s) to execute the plan that you meticulously put together.
The workflow takes many shapes at this point, but in general, it is a process where someone defines the project's requirements (simultaneously at a micro and macro level), generates a plan, and then pulls the slot machine lever over and over, iterating and reiterating with often multiple agent instances until it's done. All the while, putting a growing distance between the "orchestrator" and the code that is being generated and committed.
Coding agents are helpful, and powerful, but there are already some quantifiable trade-offs that need to be discussed:
- An increase in the complexity of the surrounding systems to mitigate the increased ambiguity of AI's non-determinism.
- Atrophying skills for a wide swath of the population.
- Vendor lock-in for individuals and entire teams (Claude Code outages have already had entire teams at a stand-still).
- Fluctuating and increasing costs to access the tools. An employee's cost is fixed; tokens are a constantly moving target.
Being successful with this approach to coding agents hinges on a rather crucial element: only a skilled developer who's thinking critically, and comfortable operating at the architectural level, can spot issues in the thousands of lines of generated code, before they become a problem.
Yet, in an ironic twist of fate, it's the individual's critical thinking skills and cognitive clarity that AI tooling has now been proven to impact negatively.
Not Just Another "Abstraction"
A common refrain we hear in the community is that programmers are just "moving up the stack" and into a different type of abstraction. Whether or not these tools are really an abstraction layer in the first place is not a settled matter; a higher level of ambiguity is not a higher level of abstraction.
If we put that to the side though, it is true that programmers tend to be wary of new languages and new ways of programming. When FORTRAN was released, programmers were skeptical of it, too. They had similar claims: it was likely to introduce more bugs and instability, and writing assembly directly was more efficient. Later, there would be discourse around the integration of compilers introducing too much "magic" into the process. These were normative arguments around a fear of what might be lost if these new technologies were embraced.
The difference with what is happening today is that those previous fears were speculative and theoretical. In just the few short years that AI tooling has existed, we are already seeing significant impacts, and not only on junior developers: even those with a decade (or more) of experience are affected.
Junior developers are faced with an even steeper climb, as we truncate their ability to work with code and replace it with reviewing generated code. Reviewing code is important, but it's only 50% of the learning process, at best. Without the friction and challenges that come with working with code directly, their ability to learn is seriously diminished.
Studying this phenomenon takes time, so anecdotal evidence is important to gather to get a real-time view of the situation. But it has also been studied, and there are numerous reports reinforcing that this is a real phenomenon.
It actually is different this time.
When a C++ developer moved to Java or Python, they didn't complain of brain fog. When a sysadmin moved to AWS, they didn't feel like they were losing their ability to understand networking.
A Senior Engineer losing their coding edge and becoming "rusty" over time as they move into managerial roles and practice coding less is not a new phenomenon. This was the natural progression of expertise: an engineer who had decades of coding, friction, and experience logged would have the time and experience to solidify those skills and wisdom. And they could apply that wisdom when their job became less about syntax, and more about higher-level architectural decisions. Those individuals are not only exceedingly rare, but you won't get the next wave of seniors if we're all abdicating the friction of writing, problem-solving, and debugging.
What is happening right now is a trend where developers, who've never had that longevity or the 30+ years of friction that led to that deep understanding, are being moved into higher-level workflows requiring the same skills to manage the AI agents that the senior engineer took decades to obtain.
However, Senior Engineers aren't immune, either. Simon Willison, a developer with nearly 30 years of experience, has reported not having a "firm mental model of what the applications can do and how they work, which means each additional feature becomes harder to reason about".
The "Skilled" Orchestrator Problem
Buried in a recent study by Anthropic was a surprisingly honest moment when speaking about the risks of engaging with coding agents on a regular basis:
One reason that the atrophy of coding skills is concerning is the “paradox of supervision” ... effectively using Claude requires supervision, and supervising Claude requires the very coding skills that may atrophy from AI overuse.
Sandor Nyako, a Director of Software Engineering at LinkedIn who oversees 50 engineers, has noticed this atrophy spreading throughout his organization and asked his team not to use AI tools for "tasks that require critical thinking or problem-solving."
"To grow skills, people need to go through hardship. They need to develop the muscle to think through problems," he said. "How would someone question if AI is accurate if they don't have critical thinking?"
There is also the question of what constitutes "overuse". We already have evidence, both data-driven and anecdotal, that these skills can atrophy and dissipate rather quickly (within months in some cases).
This is the contradiction that has many AI boosters talking out of both sides of their mouths: The use of coding agents is actively diminishing the very skills needed to effectively manage the coding agents.
LLMs accelerate the wrong parts.
Contrary to the current narrative that is being espoused, we didn’t necessarily need to write code faster. Especially code we didn’t fully understand, and particularly in huge swaths that we couldn't review in reasonable time frames.
Before AI, a (good) developer's priority list might look like:
- Understanding of the code and its relation to the codebase
- If the code is aligned with the documented and efficient standards
- As few lines of code as needed to accomplish the goal (while maintaining readability)
- Turnaround time
Agentic coding, and LLMs in general, completely invert this list.
Their capabilities and usage tend to focus on speed by increasing the amount of code that can be generated in a specified time frame. Speed is a natural byproduct of high aptitude. When it's forced, it always leads to lower accuracy. The integration of these tools doesn't tend to focus much on deeper understanding or conciseness.
Can they be used that way? Yes, with determination, they certainly can be.
Are they? No, not really; the forced mandates and hype around token usage across organizations are demonstrating as much.
Coding === Planning
There is a divide between developers that isn’t highlighted as much: Some of us plan, and think, better with code. Thinking and working in code isn't just meaningless drudgery; it forces you to think about things on a technical level that involves everything from security to performance to user experience to maintainability.
In a recent interview discussing "Spec Driven Development", Dax, the creator of OpenCode (an open-source coding agent, no less) was quoted saying:
“When working on something new or something challenging, me typing out code is the process by which I figure out what we should even be doing.
I have a really tough time just sitting there, writing out a giant spec on exactly how the feature should work.
I like writing out types. I like writing out how some of the functions might play together. I like playing with folder structure to see what the different concepts should be. And this is all stuff that I think most people—most programmers—have always done. I don't really see a good reason why I would stop that personally, because it's how I figure out what to do.”
What you say is often not what you mean, and LLMs fill in ambiguity with assumptions (or hallucinations), which leads to more review, more agent revisions, more tokens burned, and more disconnection from what is being created. Conversely, you can marvel at the most beautiful, unambiguous, perfectly structured prompt you've ever written, and the LLM can still output a hallucinated method because it is fundamentally a next-token-prediction engine, not a compiler. You cannot replace a deterministic system with a probabilistic one and expect zero ambiguity.
Even the most AI-enthusiastic senior developers are starting to see this disconnection as a looming and growing issue.
"Vendor Lock-In"
When I was browsing LinkedIn during the Claude outage that occurred a bit ago, I noticed numerous posts highlighting that certain developers and engineering teams were at a standstill. Their workflows, their own coding abilities, had already reached a point where they were largely dependent on these vendors. What used to be a skill that they could execute with just a keyboard and text editor suddenly required a subscription to an AI model provider.
It's not unreasonable to play this pattern forward: we could be creating an industry where you need to pay for token consumption to accomplish something that used to be the product of your own critical thinking and problem-solving abilities. This would resemble a type of "vendor lock-in", but for an entire industry skillset (and I'm sure the model providers are gleefully rubbing their hands in anticipation of that). The financial, and intellectual, rug-pull could come at any moment, and local LLMs are nowhere near ready to scale to absorb that level of usage. So, while I am not referring to any specific model provider, the term still fits, at least loosely, the phenomenon we're seeing with over-reliance on these systems.
This isn't theoretical conjecture; it's being reported on right now. Even the model providers themselves are bringing it to light. Yet another Anthropic study showed a precipitous 47% drop-off in debugging skills:
“Incorporating AI aggressively into the workplace—especially in software engineering—inevitably comes with trade-offs...developers may lean on AI to deliver quick results at the expense of building critical skills—most notably, the ability to debug when things go wrong.”
You can't predict your token cost.
Model providers are heavily subsidized, and the models themselves are built on shifting sands. Every new model release follows the same pattern of high benchmarks, followed by hype, followed by the reality of usage and everyone complaining of them being "nerfed" and burning through 2x-3x as many tokens to get the same job done.
You know how much your employees cost; you have no idea how much your token costs will be day to day, month to month, year to year. If your entire team is using agentic coding as the default, your expense account will need to remain highly nimble. As Primeagen said recently: "when you use these fully agentic workflows, the model providers essentially own you".
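To make the budgeting problem concrete, here is a rough back-of-the-envelope sketch. Every number in it (team size, tokens per developer per day, price per million tokens, workdays) is a made-up assumption for illustration, not real pricing from any provider; only the 2x-3x multiplier comes from the complaint described above.

```python
def monthly_token_cost(developers, tokens_per_dev_per_day,
                       price_per_million_tokens, workdays=21,
                       usage_multiplier=1.0):
    """Estimate a team's monthly token spend.

    All inputs are hypothetical; usage_multiplier models the
    "same job, 2x-3x the tokens" effect after a model change.
    """
    tokens = developers * tokens_per_dev_per_day * workdays * usage_multiplier
    return tokens / 1_000_000 * price_per_million_tokens


# Assumed figures: 10 developers, 2M tokens each per day, $10 per 1M tokens.
baseline = monthly_token_cost(10, 2_000_000, 10.0)
worse = monthly_token_cost(10, 2_000_000, 10.0, usage_multiplier=2.0)
worst = monthly_token_cost(10, 2_000_000, 10.0, usage_multiplier=3.0)

print(f"baseline: ${baseline:,.0f}  2x: ${worse:,.0f}  3x: ${worst:,.0f}")
```

Under these assumptions the same team, doing the same work, swings between one and three times its baseline bill without any change in headcount, which is the kind of variance a fixed salary never produces.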
There's a way to avoid all of this, of course. LLMs are a powerhouse technological advancement, and when used responsibly, they can be a stellar tool for learning and upskilling. They enable me to dive deeper and wider into concepts and techniques, expanding understanding and enabling exploration of new ideas that used to be more arduous and time consuming to experiment with. This is where I think they will offer the industry the most long-term value.
My Approach: Demote AI's role
I'm certainly not advocating for typing code out manually. Programmers have always been looking for ways to create code without having to write code. This is why we even have Emmet, autocomplete, and snippets in the first place. Even COBOL was designed to encapsulate more instructions with less writing by using "English-like" words such as MOVE and WRITE. jQuery's motto was "write less, do more". LLMs are another addition to this array of code generation tools.
What I am advocating for, though, is leveraging LLMs and coding agents as secondary processes. A way that doesn't sacrifice the individual's skills at the altar of productivity. You can flip the script and lean on them to brainstorm the planning parts of the process while staying actively engaged throughout implementation, delegating to them on an as-needed basis. You can leverage the productivity gains, and mitigate the comprehension debt.
My daily workflow:
- I use LLMs to help generate specs and plans, while I facilitate the implementation. This is an inversion of the "orchestration" workflow; I am still manually coding anywhere from 20% to 100%, depending on the task.
- I very often am writing pseudo-code when I do engage with the models, closing the distance between the request and the generated code.
- I use the models as delegation utilities for ad-hoc code generation and interactive documentation, as well as research tools so that I can constantly ask questions, iterate, refactor, and gain clarity around my approaches.
- I never generate more than I can review in a sitting. If it's too much to review, I slow down and split the task up, manually refactoring where needed to ensure a comprehensive understanding of the end result.
- I never ask an LLM or agent to implement something that I've never done before or couldn't do on my own, except perhaps purely for educational or tutorial purposes (and often discarded afterwards).
If I had to TL;DR this list, it would be: Use them like the Ship's Computer, not Data.
(any Star Trek fans should get the reference)
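As an illustration of the pseudo-code habit from the workflow list, a request might look something like the sketch below. The `Invoice` type and `unpaid_total` function are hypothetical names invented for this example: the types, signature, and pseudo-code comments are the part written by hand, and the body is small enough to write or review in one sitting.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    # Hypothetical domain type, written by hand before involving a model.
    customer_id: str
    amount_cents: int
    paid: bool = False


def unpaid_total(invoices: list[Invoice]) -> int:
    """Return the sum of amount_cents over unpaid invoices.

    Pseudo-code given alongside the request:
        total = 0
        for each invoice:
            skip it if paid
            add amount_cents to total
        return total
    """
    total = 0
    for invoice in invoices:
        if invoice.paid:
            continue  # paid invoices do not count toward the total
        total += invoice.amount_cents
    return total


sample = [
    Invoice("a", 500),
    Invoice("b", 250, paid=True),
    Invoice("c", 100),
]
result = unpaid_total(sample)
```

Because the shape of the solution is already pinned down, there is little room for the model to fill gaps with assumptions, and the generated body can be checked line by line against the pseudo-code.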
I'm not going faster, but I'm doing better quality work.
The productivity gains from these models are real, and so is the friction and understanding that come from engaging with the work on a tangible and frequent basis.
Despite the countless failed attempts at trying to democratize coding while not understanding coding, we're faced with the reality that you cannot understand code without engaging with it. And it's become clear that if you don't keep engaging and writing it, you can lose touch with that understanding, which will in turn make you a less capable orchestrator in the first place, rendering this phase of AI coding a strange and needlessly stressful interlude.
Perhaps I am worrying too much, but history contains lessons.
This all feels similar, though, like another large experiment we're running on ourselves. We've been through a similar period with the introduction of social media without understanding the long-term implications, and we're now faced with attention deficit (amongst many other issues) on a wide scale.
This time, we're gambling with something much riskier.
“People who go all in on AI agents now are guaranteeing their obsolescence. If you outsource all your thinking to computers, you stop upskilling, learning, and becoming more competent.”
– Jeremy Howard, creator of fast.ai