After the Chrome plugin moment — after I had seen the AI reason its way to the right UI decision without being asked — I needed to show someone.
Not to prove something. More to check if I was seeing what I thought I was seeing.
I showed it to two friends: The Quiet One and The Django Guy. Both of them I hold in high regard as engineers — the kind who don't just write code but challenge you on the product thinking behind it. If they were impressed, the impression would mean something.
They were not impressed.
Their pushback was the intelligent kind. Not dismissal — engagement. Especially from The Quiet One.
Yes, AI generates code. It's good at small snippets and functions. But a language model that guesses the next word from probabilities: are you really betting on that for a complex application? Front-end, back-end, database, the whole thing?
It was a fair question. And I had heard versions of it before — from myself, not long ago.
We went back and forth. The debate was good. But at some point I recognised the shape of it: people talking about what a thing might or might not be able to do, without actually testing it.
I had been in that conversation before. On the other side of it.
So I said: stop arguing. Start wondering. Why don't we just build something?
We had a candidate sitting in a drawer.
A while back, the three of us had started building something called ListBee. The idea came from a book — The Checklist Manifesto — which made a compelling case for how checklists, applied to complex processes, produce dramatically better outcomes. The book covered aviation safety, surgical procedures, construction — domain after domain where a simple list had reduced catastrophic failures.
I was struck by the argument. And struck by the gap: there was no good software for building and managing checklists at scale. Not really. Not in the way the book suggested they should be used — as living, collaborative, auditable process tools, not just to-do lists.
So we had designed it. Discussed it. Started it.
And then let it go. Day jobs. Energy. The gap between a good idea and the sustained effort of building, maintaining, and running a product.
But now we had AI. And AI was claiming it could collapse that effort.
So let's see.
We created a WhatsApp group called Weekend Pairing. The rule was simple: Saturdays, a couple of hours, we pair on building the new version of ListBee. See how far we get.
If memory serves, we had a working version in about three weekends. Four to five sessions of roughly two hours each, so call it eight to ten hours of productive pairing. And at the end of it, we had something that met the standard engineering bar: good structure, readable code, tests.
Not a demo. Not a proof-of-concept. A working application.
Here's the part that surprised us the most: we ended up building three separate variations.
The Django Guy was bullish on Django and wanted to build in that stack — and also, quietly, wanted to know if AI was better at some stacks than others. So he built a Django version.
The Quiet One had a different approach. He wanted to go step by step: mobile-first, methodical, incremental. His version reflected that sensibility.
And then there was the React version.
Three codebases, three approaches, three sets of opinions about how the thing should be built. All working. All built in roughly the same amount of time.
The AI had no strong preference about any of it. It just built what you asked it to build, in the way you asked it to build it. Which was either deeply useful or slightly unsettling, depending on how you looked at it.
What I came away with was not just a stronger belief in AI's potential — though I had that. It was something more specific.
I had seen it handle not just syntax and structure but intent. When we explained what we wanted ListBee to do, it reasoned about that intent, not just the literal request. When something in the design was ambiguous, it made a decision and explained why. When something was clearly wrong, it flagged it.
This was — I want to be careful here — two years ago. The models were much weaker than they are now.
And it still did this.
By the end of those weekends, I had converted two more sceptics. Not by arguing. By building. There is no better counter-argument than a working application.
But the more interesting question came after.
Once we had done it, once we had seen how fast, how solid, how navigable the build was, we started asking a different kind of question. Not "can AI do this?" We knew it could. The question was: what does this mean?
What happens when the cost of building — the time, the effort, the months of sprints and the late nights — collapses almost to zero?
What does that do to how we think about software? To how we staff teams? To what gets built and what gets abandoned?
We started discussing it there, among the three of us. And then we took it back into the company — what does this mean for us, for the way we work, for the SDLC itself?
That is the next story.