I don't normally post the big, obvious news stories here, as there's plenty of sources for those. But... what a week! Look at the sheer number of significant new launches:
- Anthropic: just a new 4.1 version of Opus, so perhaps more to come soon...
- Google: the Jules asynchronous coding agent is now fully released, competing with OpenAI Codex, Claude Code, GitHub Copilot, and LangChain's OpenSWE, which also came out this week. AI Mode in Google Search launches in the UK (and remember, Google already has the whole web crawled regularly). Also SensorLM, a new foundation model trained on 2.5M person-days of Pixel Watch and Fitbit data from 100K people, which can recognise activities and generate captions.
- ElevenLabs: a new music generator, with deals announced with Merlin Network (representing many independent artists and 15% of the global recorded music market) and Kobalt Music Group (8,000 artists, including many big names, from Paul McCartney to the Pet Shop Boys). It isn't yet clear how the rights to songs used as training data actually work, or how much of these catalogues could be included in future. The demos are impressive.
How might we accommodate both needs: the generous, informative, helpful assistant and the critical teacher and interlocutor?
She raises important questions. Is it the responsibility of the foundation labs to help you become a better thinker, rather than attempting the thinking for you? Will the more agreeable, borderline-sycophantic personas win out in the marketplace, or is there a place for a tool that challenges? I also believe the vast majority of users won't be fine-tuning prompts, let alone crafting different personas, so in the end whoever controls the default interface will, like Google's first page of search results, have undue influence.
Genie 3: A new frontier for world models
This feels like a big deal that I don't fully understand yet. Google DeepMind are continuing to work on models that effectively simulate 3D worlds you can navigate around (with no underlying 3D model or game engine). These systems seemed like quirky demos last year, without clear applications. You can't try Genie 3 out for yourself yet, but the demos are remarkable: real-time rendering of the next frames, with an apparent ability to "remember" the environment. In early versions of this kind of technology, you'd see an object, look the other way, look back, and it would most likely be gone or replaced by something entirely different. When generating the world frame by frame, it is hard for an AI system to keep any continuity. Genie 3 seems to have solved it, for minutes at a time. I am still unclear on the applications. GDM discuss using these generated environments to train AI agents, and that makes sense. But surely there's more.
Veo 3 Just Lost Its Crown to this AI Video Tool
Another recommendation - AI Film News from Curious Refuge is a really detailed roundup with demonstrations of new features and products. This week they discussed Genie 3 but also spent time on the Seedance video generator from ByteDance. They claim it beats Veo 3 (to me the two look pretty close, though Seedance does score higher on benchmarks).
tokens are getting more expensive
A forthright, opinionated treatise on how people only want the latest, best models, and the latest, best models consume more tokens. People prefer a flat monthly price and may not tolerate per-token fees, but that isn't sustainable: if a single deep research query costs the AI company $1 and they're charging $20 a month, the economics break down after just 20 such queries per user. Worth reading.
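The flat-rate arithmetic above can be sketched in a few lines. This is a toy model with illustrative numbers (the $20 subscription and $1-per-query cost from the example, plus hypothetical usage levels), not figures from the article:

```python
def monthly_margin(subscription_price: float,
                   queries_per_month: int,
                   cost_per_query: float) -> float:
    """Provider's margin on one subscriber for the month."""
    return subscription_price - queries_per_month * cost_per_query

# A $20/month plan where each deep research query costs the provider $1:
print(monthly_margin(20.0, 5, 1.0))    # light user: provider keeps $15
print(monthly_margin(20.0, 100, 1.0))  # heavy user: provider loses $80
```

The break-even point is simply subscription price divided by per-query cost, so heavy users are subsidised by light ones until usage grows past it.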
Wonderful skewering of current AI debates :).