The limits and possibilities of AI Assistants
The discourse on AI-generated code is stuck in a false dichotomy peddled by both proponents and detractors.
Whether it comes from hype merchants or from skeptics, the discussion frequently makes it sound as if you should stop thinking about the code and architecture and simply accept it as is once it compiles and seemingly works.
Nothing could be further from the truth: AI-generated code does not absolve the developer from thinking about code, architecture, tests, refactoring, edge-cases etc. All it absolves us from doing is some of the typing!
I generate a lot of my code these days. I rarely accept it at face value: I almost always spend significant time refactoring what I’ve received and thinking about the edge-cases and tests I’ll need, and I have an active discussion with my AI assistant about these things. What generated code has allowed me to do is focus my attention on what matters, and spend less energy typing out boilerplate.
What is an LLM capable of, really?
LLMs, somewhat simplified, are not real intelligence, but a combination of three things (a toy sketch follows the list):
Autocomplete on steroids.
A compressed, lossy search across the LLM’s training corpus (the compression and the loss are where hallucinations come from).
Conversational search across the above, predicting the most likely/wanted response to a query.
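To make the “autocomplete on steroids” point concrete, here is a toy sketch. It assumes the Hugging Face transformers library and the small gpt2 checkpoint purely for illustration; any text-generation model behaves the same way at this level, continuing the prompt with whatever it predicts is most likely.

```python
# Toy illustration of "autocomplete on steroids": the model only ever
# predicts a likely continuation of the text it has seen so far.
# Assumes the Hugging Face `transformers` library and the small `gpt2`
# checkpoint, chosen here purely for illustration.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "def fibonacci(n):"
result = generator(prompt, max_new_tokens=30, do_sample=False)

# Whatever comes back is "most likely", not "verified correct".
print(result[0]["generated_text"])
```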
The last point is key. We should not anthropomorphize AI, but if we are going to do it anyway, I think we can translate “the most likely/wanted response” into thinking of an LLM as a sycophant: it will tell you what you want to hear.
The second-order consequence of the LLM as a sycophant is that if you ask it to write your unit tests, there is a high chance it will generate tests that pass if at all possible, even if the code under test contains obvious bugs (see the sketch after the list below). This means a few things:
You can never fully trust tests written by an LLM without thorough human inspection.
You can increase trust by prompting with a high degree of specificity about the criteria the code must satisfy.
You need to question edge-cases that are likely not covered.
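Here is a minimal, hypothetical sketch of that difference. The parse_price function and its tests are invented for this post, not output from any particular model; the point is only to show how a test that mirrors the happy path can pass while obvious edge-case bugs remain.

```python
# Hypothetical example: a price parser with a deliberate edge-case bug.

def parse_price(text: str) -> int:
    """Convert a price like "19.99" to cents."""
    dollars, _, cents = text.partition(".")
    return int(dollars) * 100 + int(cents)  # bug: "19.9" becomes 1909, not 1990

# The kind of test a sycophantic assistant tends to produce: it mirrors the
# implementation's happy path, so it passes and proves very little.
def test_parse_price_happy_path():
    assert parse_price("19.99") == 1999

# Tests written against explicit criteria and edge-cases expose the bug.
def test_parse_price_single_decimal():
    assert parse_price("19.9") == 1990   # fails: the code returns 1909

def test_parse_price_no_decimal():
    assert parse_price("19") == 1900     # fails: int("") raises ValueError
```

The first test is roughly what you get when you only ask “write a test for this function”; the other two are what prompting with specific criteria and questioning uncovered edge-cases look like in practice.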
Now, if you do the last two things, the line between code and natural language starts to blur, doesn’t it? This doesn’t mean we can’t work this way, or that it isn’t useful. However, it does mean that we still need an in-depth awareness of the problem we are solving, and of how it interacts with the code we are using, writing and generating.
I discussed tests here, but all of the above obviously applies to implementation code as well! Just because you didn’t test it, or never saw a failing test, doesn’t mean there aren’t bugs lurking.
AI only helps with autocomplete?
In a word: No.
As I previously wrote, I think we are only starting to scratch the surface of what AI assistants are capable of, but we also need to learn to work with them, and understand their limitations.
Some ways I have found AI Assistants to help:
Code generation
Inspection/review: did we catch all edge-cases? Are there any issues?
Interactive documentation: How does this code or API work? If I do A, are my assumptions correct?
Getting unstuck: “better, contextual StackOverflow”
Architecture feedback: describe my architectural ideas, get feedback on trade-offs.
But remember what I said earlier: your Assistant is a sycophant! You need to weigh all feedback and generated code with a critical mind. The model might just be telling you what it thinks you want to hear, not what is most useful.
So how much time and cognitive load do I save?
This is the billion-dollar question.
My feeling is that, like many other technological improvements over the last 50 years, this one once again reduces the time spent on, and the cognitive load of, low-value work.
Calculators did not replace mathematicians or undermine the value of mathematical knowledge.
AI Assistants save software engineers typing and the effort of remembering APIs and documentation, and they provide interactive feedback on your work without always requiring a physical human as your pair-programming partner.
The result is, we will spend less time typing out code, and more time thinking about the problem to be solved. This will certainly reduce the time and cost of producing software. But as any seasoned professional will know, conceptualizing a solution, and in particular, writing the code, is only a small portion of the problem software engineers solve.
Conclusion
If AI Assistants only touch a small surface area of what software engineers do, are they useful?
The answer to that is a resounding “Yes!”. They are useful in the same way calculators and spreadsheets are useful to mathematicians and statisticians.
They won’t do the high-level thinking and problem-solving for us, but they will certainly aid us in doing it, by providing feedback and reducing the cognitive load of low-level details.
Personally, more often than not, I have an idea of the direction I want to take to solve a problem, but in the past, the path from idea to solution was broken up by a LOT of typing code, digging through documentation, and getting stuck on small but non-obvious bugs.
If I can do less of that grunt work, and more of the fulfilling thinking, I will not only be more productive, but also a lot happier in what I do. I think this goes for almost anyone.
We should embrace our brave new world, but keep our expectations grounded in what the tools can and cannot do at any given point in their progress. This is how we ensure we stay at the leading edge, without wasting time going down blind alleys of false promise.