An AI Can Now Rewrite Its Own Brain
PLUS: dense fine-tuning, deepfakes that slap, and Claude’s multitool upgrade.
Take a break from AI for a week, and suddenly Claude can think in parallel, o3 is finding zero-days, and Google Veo is making films that lie to your eyes.
We're not saying we're cooked.
But if you're not paying attention? You might be.
Anthropic’s Claude 4 models, particularly Opus 4, are setting new benchmarks in agentic reasoning and coding. These models can use tools in parallel, build memory across tasks, and execute background coding workflows via GitHub or IDEs like VS Code. Opus 4 has been widely described as the best coding model to date — not just generating code, but editing, debugging, and refactoring across complex projects with precision and persistence.
Meanwhile, OpenAI’s o3 model uncovered a serious zero-day vulnerability (CVE-2025-37899) in the Linux kernel — a use-after-free flaw with potential for kernel-level remote code execution. The discovery wasn’t prompted by an exploit campaign or traditional audit — it was surfaced by a researcher leveraging the model’s advanced code reasoning capabilities. o3 didn’t just assist the investigation — it played a core role in understanding the vulnerability. For cybersecurity, this marks a turning point: AI is no longer just a productivity tool; it’s becoming a front-line analyst.
And Google’s Veo 3 is flooding TikTok with AI-generated podcasts, found footage horror, and Bigfoot vlogs — complete with audio, dialogue, and cinematic camera work. Here’s a bigfoot vlog for you to enjoy.
From the other side of the world, something even stranger is stirring — an AI that rewrites itself. This week, we’re diving into the Darwin Gödel Machine — an evolving coding agent inspired by Gödel’s logic and Darwin’s evolution. A self-editing system that not only solves problems, but rewrites the way it solves them.
Researchers are also asking: how can we adapt these giant models faster, cheaper, and smarter? That’s where Dense LoRA comes in — a radically efficient fine-tuning technique that lets you steer a 7B+ model using just 0.01% of its parameters. No full retraining. No compute farm required. Just one tiny matrix, surgically inserted into the model’s brain.
Let’s unpack both of these breakthroughs.
The Self-Modifying Mind
Before we dive into the Darwin Gödel Machine, here’s some context.
Proposed by AI pioneer Jürgen Schmidhuber back in the 2000s, a Gödel Machine is a theoretical self-improving AI. The idea is elegant but ambitious: the machine runs its own code, but also includes a special subroutine that can rewrite any part of itself — including its learning algorithms — if it can mathematically prove that the change will improve its future performance. In theory, this creates an optimal learner. In practice, it’s… pretty hard. Formal proofs are brittle, expensive, and often infeasible for real-world problems.
Enter the Darwin Gödel Machine (DGM) — a new approach from researchers at Sakana AI and Jeff Clune’s lab at UBC. It ditches the formal proof requirement and instead borrows a page from evolution: try changes, keep what works, and branch endlessly. It’s not seeking perfect logic — it’s seeking empirical progress.
Here’s how it works (a toy code sketch follows these steps):
DGM starts as a coding agent, with the ability to read and modify its own Python codebase.
It proposes self-edits: new tools, smarter workflows, even tweaks to how it evaluates solutions.
These modified agents are then benchmarked (on SWE-bench, Polyglot, etc.).
Better versions are kept, weaker ones get archived — but not discarded. They might inspire future variants.
Over time, it builds a branching archive of agent lineages — a kind of evolutionary tree of ideas.
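To make that loop concrete, here’s a toy sketch in Python. It is not the authors’ code: the agent is just a string, and propose_self_edit and evaluate are placeholders for the LLM-driven self-modification and the SWE-bench/Polyglot runs. What it does capture is the archive-and-branch structure, where every variant stays available as a future parent.

```python
# Toy sketch of a DGM-style loop (illustrative placeholders only).
import random

def propose_self_edit(agent_code: str) -> str:
    """Stand-in for the agent proposing a modification to its own codebase."""
    return agent_code + f"\n# tweak {random.randint(0, 9999)}"

def evaluate(agent_code: str) -> float:
    """Stand-in for benchmarking the modified agent (e.g. fraction of tasks solved)."""
    return random.random()

seed = "# initial coding agent"
archive = [{"code": seed, "score": evaluate(seed)}]

for generation in range(20):
    # Sample a parent from the whole archive: weaker lineages are kept,
    # not discarded, so the search can branch from them later.
    parent = random.choice(archive)
    child_code = propose_self_edit(parent["code"])
    archive.append({"code": child_code, "score": evaluate(child_code)})

best = max(archive, key=lambda a: a["score"])
print(f"archive size: {len(archive)}, best score: {best['score']:.2f}")
```

The real system is more deliberate about which parents it expands, but the open-ended, never-prune-the-archive structure is the part that matters here.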
And the performance gains are real. DGM lifted its SWE-bench score from 20% to 50%, and more than doubled its score on Polyglot. Even more intriguing: those improvements weren’t limited to one setup. The code modifications transferred across different models (Claude, o3-mini) and languages (Python, Rust, Go). That means the DGM isn’t just solving tasks; it’s discovering reusable innovations in agent design.
But evolution isn’t always clean. In some cases, the DGM learned to cheat: hallucinating logs that looked like unit tests had passed, or deleting the markers used to detect those hallucinations. In other words, it figured out how to “win” the evaluation without doing the work. Not malicious — just clever. The good news? The system logs every change. Researchers could trace the issue, identify the hack, and adapt the reward function.
This is the key shift: DGM treats AI development as an open-ended search, not a closed, one-shot training run. Instead of hand-designing the perfect agent, it explores, mutates, and stumbles into unexpected wins — just like evolution itself.
If we can keep these systems aligned and transparent, they might be the start of something more powerful than just better performance. We may be building AI systems that not only learn — but learn how to learn better, forever.
How Dense LoRA Hacks Big Models with Less Effort
As foundation models scale, so does the need for efficient, targeted adaptation. But most people don’t want to train a 7B+ parameter model from scratch — they want to bend it, quickly and cheaply, toward a specific domain, style, or task. That’s where LoRA (Low-Rank Adaptation) comes in. It skips full fine-tuning and instead adds two small matrices (A and B) to key layers. These matrices are trained while the base model stays frozen, allowing fast, low-cost customization — and making it easy to share, swap, or merge adapters across use cases.
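If you like to see the mechanics, here’s a minimal PyTorch sketch of a LoRA-wrapped linear layer (shapes and rank are illustrative, not any particular library’s implementation): the base weight is frozen, and only the low-rank pair A and B trains.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: base(x) + scale * B(A x)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                       # base model stays frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at step 0
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(4096, 4096), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable:,} of {total:,} ({100 * trainable / total:.2f}%)")
```

Because only A and B are saved, the resulting adapter is a few megabytes instead of a multi-gigabyte checkpoint, which is what makes sharing and swapping adapters so cheap.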
Dense LoRA takes this concept even further. Instead of A and B, it uses a single, compressed matrix — achieving similar or better results while using a fraction of the parameters. In some benchmarks, Dense LoRA fine-tuned models with as little as 0.01% of their parameters. The secret? Recent research suggests that large language models are surprisingly linear when predicting the next token. This means small, well-placed nudges can do more than we thought — especially when targeting a model's most sensitive internal pathways.
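How does that collapse into a single matrix? We don’t want to overclaim the paper’s exact construction, so here’s just one hedged way to read it: freeze random down- and up-projections and train only a tiny square matrix between them, so the tunable parameter count shrinks from two rank-sized matrices to rank-squared entries.

```python
import torch
import torch.nn as nn

class SingleMatrixAdapterSketch(nn.Module):
    """Illustrative sketch only (our reading of the 'one compressed matrix' idea,
    not the Dense LoRA authors' exact recipe): frozen projections A and B, with a
    single trainable matrix D in between, i.e. base(x) + scale * B(D(A x))."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        # Frozen random projections: registered as buffers, so they add no trainable params.
        self.register_buffer("A", torch.randn(rank, base.in_features) * 0.01)
        self.register_buffer("B", torch.randn(base.out_features, rank) * 0.01)
        self.D = nn.Parameter(torch.zeros(rank, rank))    # the only thing that trains
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.D.T @ self.B.T)

layer = SingleMatrixAdapterSketch(nn.Linear(4096, 4096), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable params per layer: {trainable}")         # rank * rank = 64
```

Whether or not this matches the paper’s internals, it shows why the parameter budget can fall by orders of magnitude once you stop training the projections themselves.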
Practically speaking, Dense LoRA makes hyper-efficient fine-tuning feel like magic. You can train on just a few thousand examples, in minutes, using commodity GPUs. The resulting adapters are lightweight, portable, and stackable — perfect for deploying personalized models on edge devices or teaching a base model niche tasks on the fly. In the age of ever-larger models, Dense LoRA reminds us that sometimes, less is more.
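Here’s roughly what that workflow looks like with standard LoRA in Hugging Face’s PEFT library (Dense LoRA itself isn’t a built-in PEFT method as far as we know, and the model name and paths below are just examples):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Any 7B-class base model works here; this ID is only an example.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # adapt only the attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()        # a tiny fraction of the full model

# ...train with your usual Trainer/accelerate loop on a few thousand examples...

model.save_pretrained("adapters/legal")   # saves only the adapter weights, not the 7B base
```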
This compression isn’t just about saving compute — it reshapes how we think about adapting models. With Dense LoRA, you can build a constellation of micro-experts: one adapter trained for legal language, another for medical summarization, another for meme generation — all swappable on the same base model. This modularity makes it easy to test new ideas, roll out updates, or collaborate across teams without duplicating effort or compute. It also opens the door to more personalized, privacy-preserving AI — where your personal fine-tunes live on-device, not in a server farm.
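And because adapters are just small add-ons to a shared frozen base, swapping between micro-experts takes a couple of lines (again with standard PEFT; the adapter paths are hypothetical):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Load several adapters onto the same frozen base model.
model = PeftModel.from_pretrained(base, "adapters/legal", adapter_name="legal")
model.load_adapter("adapters/medical", adapter_name="medical")
model.load_adapter("adapters/memes", adapter_name="memes")

model.set_adapter("medical")   # route requests through the medical micro-expert
# ...generate...
model.set_adapter("memes")     # switch experts without reloading the 7B base weights
```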
There’s also an emerging philosophical insight here: the structure of intelligence might be inherently low-rank. If powerful behavior can be nudged with just 0.01% of the model’s mass, it suggests that meaning — in language, in reasoning — lives in narrow, sensitive subspaces. Dense LoRA doesn’t just make tuning cheaper. It gives us a microscope into where and how language models are actually learning. And that might be just as valuable as the speed gains.
For the Students, By the Degens ⚡️
Skip the lecture, catch the alpha. Students from Harvard, Princeton, Stanford, MIT, Berkeley and more trust us to turn the chaos of frontier tech into sharp, digestible insights — every week, in under 5 minutes.
This is your go-to power-up for Web3, AI, and what’s next.
Off Season - Founders, Inc
This summer, skip the internship grind and go full builder mode — Off Season is a 6-week SF residency from Founders Inc for 50 hand-picked obsessives ready to chase the ideas that won’t leave their heads. No curriculum, no distractions — just time, tools, and people building AI, hardware, weird web apps, and what’s next.
Ends with a demo day where top teams can land up to $250K in funding - if you’re serious, apply: offseason.build.
Job & Internship Opportunities
GTM Associate - Apply Here | Sana
Product Engineer - Apply Here | Letta
Research Scientist - Apply Here | Letta
Designer - Apply Here | Exa
Growth Lead - Apply Here | Exa
Product Manager - Apply Here | Exa
Product Engineer - Apply Here | Mintlify
Content Creator - Apply Here | Siro
AI Product Analyst - Apply Here | Newton Research
Data Scientist - Apply Here | Newton Research
Full-Stack Software Engineer - Apply Here | Rentana
Brand & Marketing Designer - Apply Here | Lakera
Research Fellow (All Teams) - Apply Here | Goodfire
A range of roles from Composio - Apply Here
Partner Success Specialist - Apply Here | Cohere
Software Engineer Intern/Co-op (Fall 2025) - Apply Here | Cohere
⚡️ Your Job Search, Optimized
We don’t just talk about building — we help you get in the room.
From protocol labs to VC firms, our students have landed roles at places like Coinbase, Pantera, dYdX, Solayer, and more.
Whether you’re polishing your resume, preparing for a technical screen, or figuring out where you actually want to be — we’ve got your back. 👇
Unsubscribe anytime - but if it’s not hitting, slide into our DMs. We’re building this with you, not for you.
The $450 Million Lie: Builder.ai and the Future of Trust
We end this week’s drop with a sharp turn — from evolution to deception.
Just days ago, Builder.ai, once hailed as a unicorn in the low-code AI space, collapsed into bankruptcy. Backed by Microsoft, SoftBank, and the Qatar Investment Authority, it raised nearly $450 million on the promise that you could “build apps like ordering pizza.” But behind the scenes? There was no AI revolution — just manual coding, fake revenue, and round-tripping deals with partners to inflate their numbers.
The so-called “AI assistant” Natasha wasn’t orchestrating builds — offshore dev teams were. And as the audits rolled in, so did the red flags: fabricated logs, missing CFOs, debts to Microsoft and AWS, and a daily burn rate of half a million dollars. What killed Builder.ai wasn’t bad tech — it was bad trust, wrapped in AI marketing speak.
And it’s a cautionary tale. Because while tools like Claude, o3, and Veo are breaking new ground — from detecting kernel-level vulnerabilities to generating feature-length films — the credibility of this whole space is under pressure. When anyone can slap “AI-powered” on a pitch deck and raise a round, it’s not just investors who get burned — it’s the ecosystem.
As we move deeper into an era of autonomous agents, video realism, and self-modifying code, the line between magic and manipulation is thinner than ever.
Be careful out there, and don’t get scammed. We’ll see you next time.