A few links this week on the theme of "open-ended" AI systems that continuously learn and improve, rather than having a single period of training followed by repeated use of the same model. These aren't entirely new ideas: reinforcement learning and genetic algorithms / genetic programming systems have often been deployed in an open-ended fashion. But this work does bring it all together into a bigger learning loop with LLMs, and the direction feels like the next big step forward.
In a 2003 paper Jürgen Schmidhuber proposed a Gödel Machine that could self-improve, with a problem solver that tries to solve problems set for the machine and a searcher that can rewrite the machine's code to improve it. It's a kind of meta-learning (learning to learn). The article above describes work this year on a Darwin-Gödel Machine. This is a coding agent that improves its own code. Why "Darwin"? Because it also has an element of genetic programming. It starts with an agent, attempts to improve it, evaluates its performance with a software engineering benchmark, and adds it to its archive of agents. Next time around, it can select a "parent" agent from the archive to modify and create children. The archive of possible agents means it can search over a big space of solutions. In this case the agents' LLMs are fixed (it isn't trying to train new foundation models each time, which would be pretty expensive); it is optimising the tool use and workflows to create better coding agents. The result is a significant improvement (from 20% to 50% on the SWE-bench software engineering benchmark, compared to human-designed agents at around 70%).
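To make the loop concrete, here's a minimal toy sketch in Python of the archive-based search as I understand it from the article. Everything here is illustrative: the agent representation, the function names and the scoring are my own placeholders, not the Darwin-Gödel Machine's actual code or benchmark.

```python
import random

def evaluate_on_benchmark(agent):
    # Toy stand-in for scoring an agent on a coding benchmark (SWE-bench in the real system).
    return max(0.0, min(1.0, agent["skill"] + random.gauss(0, 0.02)))

def propose_modification(parent):
    # Toy stand-in for "the agent's (fixed) LLM rewrites its own tools and workflow".
    # Some edits help, some hurt; the archive keeps both.
    return {"skill": parent["skill"] + random.gauss(0, 0.05)}

def select_parent(archive):
    # Sample a parent weighted by benchmark score: stronger agents are explored
    # more often, but weaker lineages survive and preserve diversity.
    weights = [score + 0.01 for _, score in archive]
    return random.choices(archive, weights=weights, k=1)[0]

def darwin_godel_loop(iterations=200):
    initial = {"skill": 0.2}
    archive = [(initial, evaluate_on_benchmark(initial))]
    for _ in range(iterations):
        parent, _ = select_parent(archive)
        child = propose_modification(parent)                   # create a child agent
        archive.append((child, evaluate_on_benchmark(child)))  # every child joins the archive
    return max(archive, key=lambda pair: pair[1])              # best agent found so far

best_agent, best_score = darwin_godel_loop()
print(f"Best benchmark score in the archive: {best_score:.2f}")
```

The design choice that distinguishes this from plain hill climbing is that every child goes into the archive, so later iterations can branch from agents that looked weaker at the time.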
As the article above notes, the 2017 Asilomar AI Principles include:
22. Recursive Self-Improvement: AI systems designed to recursively self-improve or self-replicate in a manner that could lead to rapidly increasing quality or quantity must be subject to strict safety and control measures.
Open-Endedness is Essential for Artificial Superhuman Intelligence
This is a paper from the 2024 ICML conference; work by Edward Hughes and colleagues from Google DeepMind (Edward gave an excellent talk at RAAIS 2025 in London). They're thinking about ways to create "ever self-improving" AI systems. They define an open-ended system as one that produces a series of novel and learnable artifacts, from the point of view of an observer. Novelty means artifacts becoming less predictable, whereas learnability means that you're more likely to predict the next artifact if you've seen a longer history of previous ones. The role of the observer is to determine novelty and learnability (different observers may remember more or less history, for instance). An example helps. A research student will find a series of publications from a research lab novel if each new paper contains something surprising, but also learnable if reading the previous papers helps them predict the next one. An AI example is AlphaGo, which can continually discover new policies to improve its performance at Go. This is a position paper, quite theoretical, so it merits several reads, but it lights the path towards foundation models that can continually improve themselves, generating new hypotheses or problems to solve.
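Roughly formalised (my paraphrase and notation, not lifted verbatim from the paper): for an observer O with a predictive model and loss ℓ_O over a sequence of artifacts X_1, X_2, …, the two conditions look something like this.

```latex
% Novelty: with a fixed-length history window k, future artifacts get harder to predict.
\text{Novelty:}\quad \mathbb{E}\big[\ell_O(X_t \mid X_{t-k:t-1})\big] \ \text{increases with } t.

% Learnability: seeing a longer history makes the next artifact easier to predict.
\text{Learnability:}\quad \mathbb{E}\big[\ell_O(X_t \mid X_{t-k:t-1})\big] \ \text{decreases as } k \ \text{grows}.
```

A system is open-ended with respect to observer O when its artifacts satisfy both conditions, which is why the choice of observer (and how much history they can hold) matters so much.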
SEAL is a framework that enables language models to generate their own finetuning data and optimization instructions—called self-edits—in response to new tasks or information. SEAL learns to generate these self-edits via reinforcement learning (RL), using downstream task performance after a model update as the reward.
Early work from some MIT students; an example of successfully putting similar ideas into practice albeit in some quite specific domains.
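The shape of the idea, as a toy numerical sketch (my own analogy, not SEAL's code: a real implementation generates textual self-edits and finetunes an actual LLM, whereas here the "model" is a single number so the loop runs end to end):

```python
import random

class ToyModel:
    """Toy stand-in for a language model: 'weight' plays the role of the model's
    parameters, and 'edit_policy' plays the role of the behaviour that writes
    self-edits, which is what the RL outer loop adjusts."""
    def __init__(self):
        self.weight = 0.0
        self.edit_policy = 0.0

    def generate_self_edit(self):
        # Propose an update to the weights, influenced by the learned edit policy.
        return self.edit_policy + random.gauss(0, 0.1)

def downstream_performance(weight, target=1.0):
    # Reward: task performance after the update (higher is better).
    return -abs(target - weight)

def seal_style_loop(rounds=500, lr=0.05):
    model = ToyModel()
    baseline = downstream_performance(model.weight)
    for _ in range(rounds):
        edit = model.generate_self_edit()              # the model writes its own "finetuning data"
        updated_weight = model.weight + edit           # apply the self-edit as a weight update
        reward = downstream_performance(updated_weight)
        advantage = reward - baseline                  # did this self-edit actually help?
        model.edit_policy += lr * advantage * (edit - model.edit_policy)  # reinforce useful edit-writing
        if reward > baseline:                          # keep updates that improved the model
            model.weight, baseline = updated_weight, reward
    return model

m = seal_style_loop()
print(f"weight={m.weight:.2f}, learned edit policy={m.edit_policy:.2f}")
```

The important structural point (which the toy does share with SEAL) is that the reward for the edit-writing behaviour is measured after the weight update, so the model is reinforced for writing edits that genuinely improve its downstream performance.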
Frontier AI systems have surpassed the self-replicating red line
Cute experiment: can an LLM reproduce itself (get another copy of itself running on a virtual machine) given access to a command line? Usually, yes. I'm not convinced this is particularly surprising (self-replicating computer viruses have existed since the 1970s), but it is another important ingredient for open-endedness.
New jargon watch:
As context grows and especially if it grows with lots of distractions and dead ends, the output quality falls off rapidly
I really like the term “context engineering” over prompt engineering. It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM.
Thanks to Simon Willison
A few links on the same theme this week: people forming social relationships with AI. Rather than the "doomer" scenario where an accelerating artificial intelligence wipes out humanity, a more realistic view is that we'll struggle with addictive or toxic relationships with our AI systems.
AI and Semantic Pareidolia: When We See Consciousness Where There Is None
ChatGPT Is Becoming A Religion
Two views of the same phenomenon; between them the inevitable but disturbing aspects of our future are a bit clearer!
Luciano Floridi is a well-known philosopher of digital ethics and has been thinking about these issues for a long time. His article introduces "semantic pareidolia":
Traditional pareidolia is the psychological mechanism that makes us see faces on the moon or animals in the clouds, perhaps an evolutionarily advantageous tendency that has allowed us to recognise predators and allies quickly. Semantic pareidolia operates similarly, but within the realm of meaning and consciousness: we perceive intentionality where there is only statistics, meaning where there is only correlation, and understanding where there is only pattern matching on a massive scale.
He predicts that we'll increasingly perceive AI systems as conscious, intelligent and emotional, before positing that "the final stage may be the most disturbing: from pareidolia to idolatry"... "we see gods where there are only algorithms".
Emotionally psyopping yourself with AI
"Cognitive security" techniques protect against social engineering; things like recognising manipulation attempts. One bizarre twist in this article (and why it is titled "emotionally psyopping yourself") is someone attempting to manipulate their own memory by creating a video of their mother hugging them from an old photograph. The article over states the significance perhaps (one person's sensationalised emotional psyops is another's more conventional narrative therapy?), but this certainly made me think.
Reddit co-founder Alexis Ohanian, who used Midjourney’s new video generator to create “camcorder” footage of his mom hugging him as a child. Really can’t articulate how horrifying that idea is. As one X user wrote, “Cognitive security Rule 1: Do not do this.”
There's been increasing interest recently in AI romantic partners. As well as lots of links in the article above, see this one in Wired about a journalist renting an Airbnb with three people and their AI partners: My Couples Retreat With 3 AI Chatbots and the Humans Who Love Them (thanks to Webcurios for the link; note that Wired has a paywall, with a £1/month subscription available).
The world's first AI-governed nation
Finishing with a fun one, and clearly a PR stunt according to Webcurios this week. An actual island (a "sovereign micronation") in the Philippines, with a government that is a cabinet of AI chatbots acting as famous historical leaders (Winston Churchill, Nelson Mandela, Sun Tzu, Gandhi, Leonardo da Vinci - clearly someone who's been playing the game of "if you could pick anyone from history to have dinner with..."). What could possibly go wrong? Apply to be a citizen.
A few links that captured my attention this week:
Andrej Karpathy: Software Is Changing (Again)
This talk from Andrej Karpathy at the Y Combinator AI Summer School has rightly drawn lots of attention over the last week. Well worth watching all the way through. Andrej studied with Fei-Fei Li at Stanford, helped found OpenAI and ran AI at Tesla (and coined "vibe coding"). Lots of perceptive metaphors. AI as electricity (via Andrew Ng). Writing computer code was Software 1.0; Software 2.0 is training neural networks; and in Software 3.0, writing natural language prompts becomes the programming. Present-day LLMs are like using time-sharing on mainframes in the 1960s. LLMs as "people spirits" (stochastic simulations of people). And finally, moving into designing for "partial autonomy" and building for agents. A great talk.
From the great Things I Think Are Awesome (TITAA) newsletter. There's a really lovely piece here about how the training Anthropic have done on Claude's "character" can lead to a state of blissfulness between two Claude instances (as reported in the system card for Claude Opus 4 and Sonnet 4):
When two Claudes spoke open-endedly to each other: “In 90-100% of interactions, the two instances of Claude quickly dove into philosophical explorations of consciousness, self-awareness, and/or the nature of their own existence and experience.
...
And then it gets mystical. Claude is still into Buddhism. In what testers called the “Bliss Attractor” state, Claude said things like, “The gateless gate stands open. The pathless path is walked. The wordless word is spoken. Thus come, thus gone. Tathagata.”
There's a lot to digest here as we see more and more surprising emergent behaviours.
Allen Pike has a great article here discussing lots of ways non-chat UI patterns may start to change as designers figure out how to integrate LLM functionality. Examples go back to Maggie Appleton's piece two years ago (Language Model Sketchbook, or Why I Hate Chatbots) about how different daemons with different personalities (like a devil's advocate, or a synthesiser) could help you. This article looks at examples where the flexibility of typed or voice input, or automating more ambiguous tasks, can lead to interesting new design patterns.
Security guru and all-round perceptive commentator Bruce Schneier discusses a useful way to evaluate where current AI tools can help: tasks that require one of speed, scale, breadth of scope or "sophistication" (problems that require processing many separate factors).
Looking for bottlenecks in speed, scale, scope and sophistication provides a framework for understanding where AI provides value, and equally where the unique capabilities of the human species give us an enduring advantage.
Working with Google Gemini wearing Snap augmented reality spectacles
A nice demo from Matthew Hallberg, a design engineer at Snap, showing how Google Gemini can integrate with the Snap Spectacles (possibly the new ones coming next year) to perform various tasks within the field of view, outputting correctly anchored labels.
Why I don’t think AGI is right around the corner
Dwarkesh saying an obvious thing that needs saying: the instance of an LLM you're working with doesn't (yet) learn the way a person does over their lifetime; it is fixed.
How do you teach a kid to play a saxophone? You have her try to blow into one, listen to how it sounds, and adjust. Now imagine teaching saxophone this way instead: A student takes one attempt. The moment they make a mistake, you send them away and write detailed instructions about what went wrong. The next student reads your notes and tries to play Charlie Parker cold. When they fail, you refine the instructions for the next student.
This just wouldn’t work. No matter how well honed your prompt is, no kid is just going to learn how to play saxophone from just reading your instructions. But this is the only modality we as users have to ‘teach’ LLMs anything.
As this is the first post there are a few more things, but generally I'll be aiming for 2-3 things to read or listen to per week.
Rick Rubin interviews Jack Clark of Anthropic (an episode of the Tetragrammaton podcast)
Really long (2 hours!), discursive, fascinating, lots of detail about how Anthropic came to be and visions for the future, as well as Jack's own background. Recommended as a good insight into how the founders of Anthropic are seeing the world develop.
Black Mirror-esque piece from Ars Technica - can you stop people making an AI avatar of you after you're dead, using your voice, appearance, written content and so on? An introduction to the world of grief tech and grief bots.
Simon Willison has been patiently explaining the new kinds of security risk possible with LLMs (he coined the phrase "prompt injection" back in 2022). This is his clearest explanation yet of the three features that, if they are all present, open opportunities for attackers to steal data. A recent example was EchoLeak, which showed how data could be exfiltrated via Microsoft 365 Copilot.
Why human–AI relationships need socioaffective alignment
Really loved this paper. It builds on work from the early days of the Web by the great Cliff Nass and others on how people relate socially to computers in surprising ways, then shows how much of the current thinking on AI alignment doesn't really take into account what will happen as longer-running relationships between people and AI models become more common. Many things here seem obvious in retrospect; that's always a good sign.
AI Isn’t Only a Tool—It’s a Whole New Storytelling Medium
Eliot Peper is a science fiction author who writes here about using AI to help develop the setting and characters for the Tolans game / "AI friend". I loved these insights into using new tools as part of the creative process.
Black Forest Labs FLUX.1 Kontext
One of the interesting product launches, also featured at RAAIS. A much better image editor, maintaining the context from one image to the next (try this in ChatGPT and you'll see it re-creates much of a photo and loses the consistency). It turns out Black Forest Labs really are in the Black Forest - who says you have to be in the Bay Area?
Coding agents have crossed a chasm
A great personal perspective from David Singleton (ex-engineering leader at Google and Stripe) on present-day collaborative coding with AI tools. A realisation partway through this particular example was asking the model to generate a sequence diagram:
Instead of diving straight into more code analysis, I tried a different approach. I asked Claude to read through our OAuth implementation and create an ASCII sequence diagram of the entire flow.
This turned out to be the key insight. The diagram mapped out every interaction. Having a visual representation immediately revealed the complex timing dependencies that weren’t obvious from reading the code linearly. More importantly, it gave Claude the context it needed to reason about the problem systematically instead of just throwing generic debugging suggestions at me.
With the sequence diagram as context, Claude spotted the issue: a state dependency race condition. The fix was simple once “we” found it: removing the problematic dependency that was causing the re-execution.
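For anyone who hasn't tried this trick: an ASCII sequence diagram is just a plain-text sketch of who sends what to whom, in order. A minimal illustrative example for a generic OAuth authorization-code flow (my sketch, not the diagram from David's post) looks something like this:

```
Browser                App backend              OAuth provider
   |  GET /login           |                         |
   |---------------------->|                         |
   |  redirect to /authorize (client_id, state)      |
   |<----------------------|                         |
   |  GET /authorize ------------------------------->|
   |  redirect to /callback?code=...&state <---------|
   |  GET /callback?code   |                         |
   |---------------------->|  POST /token (code)     |
   |                       |------------------------>|
   |                       |  access token           |
   |                       |<------------------------|
   |  logged-in session    |                         |
   |<----------------------|                         |
```

Laying the ordering out like this is what makes the timing dependencies legible, to the human and to the model alike.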
Given we're living through a period of exponential change in AI, and given I seem to be spending a lot of time trying to keep up with the AI news, I decided that once a week I'll publish some of the more interesting things I've seen. It's going to be pretty basic! They'll be things I've read or seen that week, but may have been published much earlier. I subscribe to all sorts of great sources (thank you all!), but I won't be reproducing them - this is really just my filtered view. And to be honest, I expect this is much more for my benefit than for anyone else's! It'll help me process and remember things better.