Finding your value as a software engineer

When code is free, what is the engineer paid for?

I have used coding agents long enough that the novelty has worn off and a less comfortable question sits where it used to be. What am I doing here, exactly?

I can prompt my way to a working app in an afternoon. Components, routes, state, half-decent styling, all generated, often faster than I would have typed the boilerplate. Every time I ship something this way the same thought lands. If a tool can do most of what I do, what is left that is actually mine?

This is my attempt at an honest answer.

Code is becoming free

The economics shifted and most teams have not priced it in yet.

For most of the history of software, code was the scarce, expensive thing. You hired engineers because writing code was hard, slow, and required years of training. The output of an engineer's day, the lines committed, was a meaningful unit of value.

That is no longer true. The marginal cost of producing a plausible-looking line of code has collapsed to roughly zero. I can describe a feature and get a working implementation in seconds. I can ask for three variations and pick one.

But plausible-looking code is not the same as shipped value, and developers themselves turn out to be bad at telling the difference. METR's randomised trial of experienced open-source developers in mid-2025 found that engineers using AI tools believed they were twenty percent faster. They were measurably nineteen percent slower.

Their follow-up in February 2026 tried to walk the result partway back as tools and habits matured, but the researchers called the new data unreliable. Between thirty and fifty percent of participants refused to do tasks without AI. A study designed to measure how much AI helps could not run its no-AI condition, because the developers would not work that way. That is itself a data point.

Nobody has a clean number yet for whether these tools speed up experienced engineers on real work. What survives across both studies is the perception gap. Developers cannot accurately tell, in the moment, whether the tools are speeding them up.

When supply explodes, value moves elsewhere. The question is where.

Three million lines

Earlier this year the CEO of Cursor posted on X that his team had built a browser from scratch with GPT-5.2. Hundreds of agents running uninterrupted for a week, three million lines of Rust across thousands of files, a from-scratch HTML parser, CSS cascade, layout, paint, and a custom JavaScript VM. He said it "kind of works."

Anyone who has shipped a browser, or read enough of one, knows this is not what shipping a browser looks like. A Servo maintainer called the design "uniquely bad" and incapable of ever supporting a real-world web engine. Most of the work in a real browser is not the rendering pipeline you can name in a post. It is the long tail of standards, accessibility, security, and the twenty years of edge cases the public web contains. Three million lines is not progress toward that.

The model did not know the work was misframed. The CEO either did not know or did not say. The recognition that something is wrong, even when it compiles and "kind of works" and the line count is impressive, is the part that was missing. That gap, between plausible and correct, is where the value lives now.

It was technically coherent. It was a press release in the shape of a codebase.

Knowing what good looks like

A lot of what fills the gap is taste. The ability to feel, before you can fully articulate why, that a button is in the wrong place, that an API surface will be painful to live with, that a particular abstraction will bite you in six months.

Taste is hard to defend on a resume. You end up sounding like Rick Rubin in that clip everyone passed around: "I have no technical ability, and I have minimal knowledge of music." Fine if you are producing the Beastie Boys. Less compelling when you are explaining to your manager why you felt the dropdown should go on the left.

But taste is not magic. It is pattern recognition built from years of shipping things, watching them break, and learning what actually works. The mechanism looks like intuition from outside. It is compressed experience.

That compression is what the model does not have. It has the average of every opinion ever expressed. You have yours, sharpened against reality.

But isn't this just cope?

The version of this argument I find least convincing is mine. There is an obvious counter. Senior engineers have a strong incentive to insist that the part they are good at, the judgment and the taste, is the part that survives. Models eat the typing, our jobs are safe, the kids will be fine. That is a comfortable story for someone whose career was built before models could ship plausible code on demand.

The counter is partly right. Some of what gets called "judgment" is preference dressed up. Some of it will get absorbed by better models faster than the seniors writing thinkpieces would like. Anyone telling you the line between what a model can do and what an engineer does is fixed is selling something.

What I will defend is narrower. Recognising that a result is wrong, when nothing in the result is obviously wrong, is hard, and getting harder as outputs get more polished. That is the failure mode the Cursor browser demonstrates. Whether you call it judgment, taste, or scar tissue, someone has to do it. Right now that someone is a person who has shipped enough to know.

In ten years that may not be true. For now it is.

The junior engineer problem

The part I am worried about is what this does to people entering the field now.

Junior engineers have a real problem. The instincts I am describing come only from doing work that goes wrong and then fixing it. The model is eating the grunt work that used to build those instincts. The reps are how you build the model in your head, and the agent makes it very easy to skip the reps.

There is a second-order problem too. Junior engineers used to learn taste by reading the code around them: production code, peer PRs, internal libraries written by someone slightly more senior. A lot of that code was mid-quality. Reading mid-quality code is how you learn what bad smells like, because you are the one cleaning it up six months later.

As more code in any given codebase is generated and accepted on the basis of "looks reasonable," the supply of mid-quality code juniors used to learn from is drying up. The training set for taste is being replaced with a uniform layer of plausible.

I do not have a clean answer. A few things I would say to anyone starting now.

Write a parser by hand once. Not because it is the right way to write a parser. Because you need to feel how a small change to the grammar makes the code want to fall apart, and an agent will not let you feel that.

Read one production codebase end-to-end in a language you do not know, without an agent to summarise. The point is the slowness.

Take an on-call shift before you take an agent shift. The fastest way to develop a sense for what is risky in a system is to be paged at 3am when the risky thing happens. Agents do not have phones.

Ship something without an LLM in the loop for a month. Notice what you forgot you knew.

None of this is an argument for being a Luddite about the tools. It is that the tools are letting people skip exactly the work the tools were trained on, and the bill for that lands in five years on the people who skipped it.

What is left

The job is shifting from execution to judgment, from producing code to selecting which code is right. Selection means looking at three model-suggested implementations and saying "none of these, here is what we actually need, and here is why." That "why" is most of the job now.

The other half of the new judgment is writing. Specifying a problem precisely enough that the implementation is obvious, and then knowing whether what came back is any good, is a writing problem dressed as an engineering problem. Engineers who can write a tight spec win twice. Once with humans, once with models. The bottleneck moves from typing code to typing the things that produce code.

None of this is new. What is new is that the skill is no longer bundled with the typing. It has to stand on its own now, and that takes deliberate practice.

If you are entering the field, the worry is real, and the way through it is to do the slow work the tools would let you skip. If you have been doing this for a while, the worry is whether the next generation will be able to inherit the thing you do not know how to fully describe.

That is the version of this job I am trying to be good at. Ask me again in 2030.
