AIs don’t produce well-organized code. They duplicate effort, which is tech debt. Maybe one day they will be able to clear their own tech debt. And who knows, maybe they’ll still be heavily subsidized by VC money then.
You can organise the code well once, template that and put guardrails in place for it to follow the structure you and the team have agreed is good. The engineering task becomes building the system that is capable of building the system to a high standard.
Skill issue. We don't have that problem. The opposite is true. Every time the harness does something we're not happy with we figure out how to engineer out that failure mode. Tech debt decreases.
Having them clear it is trivial. I have my harness refactor automatically on a steady cadence, something I could never afford to take the time to do manually.
So you have an AI refactor AI-generated code? What am I missing here? If AI is the cause of the tech debt because it doesn't write great code, won't you just end up with more tech debt when you ask AI to refactor it?
If a human produces tech debt, do you think a human can't refactor?
Most of the time a human works over code multiple times, and still produces tech debt.
Give an AI agent enough time, by prompting it multiple times, and explicit instructions to look for and address tech debt of various forms, and it will.
A human has taste. They learn over time from codebase patterns and develop a sense of when an abstraction can be reused, improved or refactored. Agents often generate repeated code because the original file wasn't added to the context, and it's up to a human reviewer to recognise this.
In my experience, an agent will rarely recognise a common pattern and lift it into a new abstraction. It takes a human with taste and experience to do that. For example, an agent will happily add a large number of `if` branches in different places in the codebase where a strategy pattern or an enum would be better (depending on the language).
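To make that concrete, here's a minimal Python sketch (the names and rates are invented for illustration): the scattered-branch style an agent tends to repeat at every call site, next to the table-driven version a human reviewer might lift it into.

```python
from enum import Enum


class Shipping(Enum):
    STANDARD = "standard"
    EXPRESS = "express"
    OVERNIGHT = "overnight"


# Branchy style: this chain tends to get re-generated, slightly
# differently, at every call site the agent touches.
def cost_branchy(method: str, weight: float) -> float:
    if method == "standard":
        return 1.0 * weight
    elif method == "express":
        return 2.5 * weight
    elif method == "overnight":
        return 5.0 * weight
    raise ValueError(f"unknown shipping method: {method}")


# Lifted abstraction: one rate table, reused everywhere; adding a
# new method means one new enum member and one new table entry.
RATES = {
    Shipping.STANDARD: 1.0,
    Shipping.EXPRESS: 2.5,
    Shipping.OVERNIGHT: 5.0,
}


def cost(method: Shipping, weight: float) -> float:
    return RATES[method] * weight
```

Both versions give the same answers; the difference is that the second centralises the knowledge, which is exactly the step agents tend to skip unless the original file is in context.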
If you have a working prompt or harness that ameliorates this, I'd be glad to see it.
Yeah, I must be missing something again. Comparing human to AI here seems fundamentally wrong. A human will learn over time and improve their mental model of a problem and their ability to code. An AI agent is, for the most part, fixed by its model. I just don't see how pointing an agent at AI-generated code to refactor, without direct human guidance, results in better code.
Maybe you can describe what the various forms of tech debt are that you are talking about?
> Yeah, I must be missing something again. Comparing human to AI here seems fundamentally wrong. A human will learn over time and improve their mental model of a problem and their ability to code. An AI agent is, for the most part, fixed by its model. I just don't see how pointing an agent at AI-generated code to refactor, without direct human guidance, results in better code.
There is no need to improve their mental model of a problem or their ability to code to recognise the refactoring opportunities that already exist in the code. It only takes sufficient skill, and effort invested in refactoring. The way to get a model to invest that effort is to ask it. As many times as you're willing to.
> Maybe you can describe what the various forms of tech debt are that you are talking about?
Any. Whether or not you need to prompt much to address it depends on consistency. In general I have a simple agent whose instructions are just to look for opportunities to refactor, and to do one targeted refactor per run. All the frontier models know well enough what good looks like that it's unnecessary to give them more than that.
The best way to convince yourself of this is to try it. Ask Claude Code or Codex to "Explore the code base and create a plan for one concrete refactor that improves the quality of the code. The plan should include specific steps, as well as a test plan." Repeat as many times as you care to. Or, if in Claude Code, run /agents and tell Claude Code you want it to create an agent to do that, then tell it to invoke that agent however many times you want.
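For reference, a standing refactor agent like that can be sketched as a Claude Code subagent file under `.claude/agents/` (markdown with YAML frontmatter); the exact name, tool list, and wording below are my assumptions, not a canonical setup.

```markdown
---
name: refactorer
description: Finds and performs one targeted refactor per run.
tools: Read, Grep, Glob, Edit, Bash
---

Explore the code base and pick one concrete refactor that improves the
quality of the code. Write a plan with specific steps and a test plan,
then execute it and run the tests. Do exactly one refactor per invocation.
```

Invoking it on a steady cadence (manually, or from a scheduled job) is what gives you the "refactor automatically" loop described above.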
It has even less copyright protection. A prompt text you write, if sufficiently creative, is copyrighted. The output of an AI, no matter how “creative”, is always public domain.
All large language model output, unless it is infringing someone else’s copyright, is public domain. Did you read the quotes from the copyright office I linked to?