Fetched Decoded Executed #2
Because who doesn’t like to navel-gaze?
As an academic, it’s my job to not just do research, teach students, and grumble about mindless admin, but also to try and propagate knowledge more widely. Fetch Decode Execute is my ongoing attempt to do so. Last year, I bookended my ramblings with a summary of what I’d covered and a look forward to what might happen next year, and named it Fetched Decoded Executed #1. Here’s more of the same, but a year later!
What did I talk about this year?
Well, mostly about LLMs and generative AI (yes, apparently it’s a thing), with a smattering of old-school machine learning and the occasional bit of broader computer science. For instance:
I made disparaging remarks about AI-assisted software development in Vibe coding: is this really the future? and Argh, we’re drowning in technical debt!!!
I showed considerable disrespect in Reasoning: a small step for LLMs, The boring truth about DeepSeek, and The boring truth about AlphaEvolve.
I spread fear in Emerging security risks of GenAI in ML.
I chastised my fellow ML practitioners in Too little data, too much confidence and Stop using 42 as a random seed.
I railed at the limitations of LLMs in ML Pitfalls #2: LLMs can’t spot ML pitfalls and ML Pitfalls #3: Data Contamination in LLMs.
I laid into big tech in Big tech did not invent modern AI.
I complained about computer literacy in the media in “Big Computers” will save us all! and Operating systems are worthy of our attention.
And I had another go at my nemesis, Python, in Can Python be replaced in machine learning?
All in all, a pretty positive year, I think.
Where has AI been heading this year?
What, you didn’t bother reading all that? And you want me to summarise the main directions of travel this year in AI? Okay, it’s either that or more mindless admin…
Reasoning is an obvious one. LLMs now spout out reams of text saying why they think what they think. Well, except when this text is hidden from the user. And when it’s not just a hallucination. But it has, in general, improved the abilities of LLMs. See Reasoning: a small step for LLMs for more.
Agentic approaches are another biggie. LLMs can now use tools, or at least pretend to. This means they can interact with the world, which can be useful. It can also be disastrous, and has seriously opened up the AI attack surface. See Plumbing LLMs into the world and Emerging security risks of GenAI in ML.
But they don’t necessarily need to use tools, because LLMs are increasingly learning to do things that other software used to do. Or at least attempting to. I talked about this move from “let’s use human-written software as much as possible” to “let’s get LLMs to do everything” in Will everything become a neural network?
Then there’s vibe coding. LLMs generating code has moved to the next level. They can now write entire applications. But the application may not do what you want, and the code may be chock-full of vulnerabilities; see Vibe coding: is this really the future? and Argh, we’re drowning in technical debt!!!
Beyond this, I think AI has started to undergo a reality check. Yes, there’s still a depressing amount of hype and boosterism, often led by the self-interests of big tech and the naivety of governments and media, but there’s also been plenty of serious work on understanding the limitations of current models and thinking about how these might be addressed.
What will the next year bring?
People have been talking about bubbles bursting. This feels a little negative to me. Yes, I think there will be a contraction: the reality check will continue, and a lot of the venture-capital-led excess will drain away. But generative AI is not an empty vessel, and there’s plenty of potential yet to explore.
I suspect there will be a lot more focus on addressing the core pathologies of current models. This includes hallucinations, which are a big factor behind mistrust; for instance, I wouldn’t be surprised to see white-box uncertainty quantification approaches rolled out commercially. It also includes explainability; whilst I think this is a more fundamental limitation of complex models, my prediction is that big tech will try hard to crack this nut, and might make some meaningful progress.
I also think there will be increasing focus on out-of-distribution behaviour, i.e. getting generative AI models to do things significantly beyond their training data. We’ve seen the start of this already with DeepMind’s AlphaEvolve (see The boring truth about AlphaEvolve), which uses an evolutionary framework to achieve this. Maybe I’m a bit biased by my background here, but I think there’s a lot more mileage in combining LLMs with broader AI approaches.
Last year, I predicted that small models would be a thing this year. This has happened to some degree, e.g. Apple’s products now have an on-device LLM (though they don’t talk about this much). For many use cases, having to query a remote model is problematic. Smaller models have been getting more capable, and I suspect this trend will continue into next year, with small specialist models becoming a better solution for many applications than large generalist models.
And generative AI will continue to take over the world. And at some point, I hope there will be an honest conversation about where we want this to stop. Sure, we can replace all our jobs with AI, but should we? If not, are we ready for a universal basic income funded by the productivity increases driven by AI? Or will we just settle for a dystopian future where tech barons own everything and the rest of us survive off their crumbs? You know, that kind of thing.
Finally, a bit of reflection
As tradition now dictates, I’ll finish off with a bit of reflection on the writing of this Substack. It’s now up to a heady 133 subscribers, with most posts getting in the region of 150-300 views. Yes, not exactly a threat to the blogosphere, but it’s about 50% up on this time last year. I’m happy when anyone wants to listen to the things I have to say, and really appreciate those who have stuck around for the long term. The most-read post, with about 900 views, was Stop using 42 as a random seed. It seems to have reached the number 2 spot in Google for “42 random seed”, though sadly people are still very much using 42 as a random seed. This was followed by ML Pitfalls #1: Classification metrics and Can Python be replaced in machine learning? Those with the least views were generally published during the summer holidays — it’s almost as if people had better things to do. But if you want to give the readership stragglers a bit of love, then check out ML Pitfalls #3: Data Contamination in LLMs and Will everything become a neural network? Finally, I know I shouldn’t have favourites, but mine was Big tech did not invent modern AI. Sock it to the man!
Anyway, enough navel-gazing. For those of you indulging in festivities, have a good break. And I wish all of you a Happy New Year when it comes.


