Cameron Hotchkies
3 min read

Categories

  • ai
  • coding

Tags

  • ai
  • llm
  • quality

Using LLMs or AI to develop software isn’t new-new, but it’s still new-ish and nascent. Over the past year, my usage has gone from “this chatbot is a stupid party trick” to “this auto-complete is getting better” to “hey $LLM_OF_THE_WEEK, write me a whole app that does… something”.

I’m constantly vacillating between “the game is forever changed” and “what a pile of junk, if this is the future, humanity is cooked”. We are all still learning what works and what doesn’t, what is vaporware nonsense, and what vaporware nonsense just started to work this morning.

About six months ago, I was responding to an email about vibe coding (in case the term dies soon, that’s coding using only prompts) and an analogy struck me and stuck with me.

Vibe-coded software is great for what it is: getting an idea into reality. It’s akin to fast fashion clothing, whereas traditional software is closer to a bespoke tailored suit.

I like this analogy because fast fashion isn’t inherently bad¹. If fashion changes drastically between seasons, why spend resources on making something durable that will never be worn again? We can now do the same with software. It’s become cheaper to greenfield a whole product with a fresh and improved series of prompts.

The other end of the spectrum is the system intended to be durable, generational, repairable: the heirloom-quality tool. This is for software that needs that extra care and craft, and it’s where I think in-depth engineering will continue to thrive, reinforcing the critical cores with the added strength of human ingenuity and pattern recognition.

Where I think we, as an industry, need to do better is recognizing where each approach fits the task. You could make a peanut butter and jelly sandwich with a screwdriver, but it’s never going to be the right tool for the job. I don’t care if it’s a Phillips, a flathead, or a hex driver.

I asked an LLM to review this post for grammar. It replied with “The PB&J/screwdriver metaphor doesn’t quite work—screwdrivers spread just fine, which undercuts your point about wrong tools,” proving the robots don’t understand peanut butter.

As of this writing, my experience with AI-driven code is that it gravitates away from reusable code. This should be expected: reuse requires context, and context is expensive. That already creates a natural tension for code that has a large impact on the user base.

Since LLMs run in a limited-lifespan sandbox, the constant reset between sessions makes it impossible for them to genuinely understand the scope of any change they introduce outside the local environment. Their universe is defined by the narrative of the immediate prompt. I feel this runs deeper than chaining session contexts together. As engineers, our job is evolving into acting as higher-order context filters, with visibility into scope and consequences outside ourselves.


One interesting thought as we start to treat the whole artifact as disposable: we care less about the insides. There are already whole suites of scripts I’ve generated whose contents I’ve never checked once they accomplished the actual task. Build, apply, delete.

A friend of mine pointed out this is reminiscent of neural networks in general, a confusing black box whose inner workings we barely understand. If that is true, am I now simply the product manager? Does that make the LLM (and its subagents) the engineering staff building the system? If that is the case, we have functionally shipped the org chart.

And with that, Conway’s law has struck again.

  1. Except for the environmental impacts. Those are bad.