Free Republic
Browse · Search
General/Chat
Topics · Post Article

Building a C compiler with a team of parallel Claudes
anthropic ^ | 02 05 2026 | Nicholas Carlini

Posted on 02/06/2026 6:02:22 AM PST by yesthatjallen

I've been experimenting with a new approach to supervising language models that we’re calling "agent teams."

With agent teams, multiple Claude instances work in parallel on a shared codebase without active human intervention. This approach dramatically expands the scope of what's achievable with LLM agents.

To stress test it, I tasked 16 agents with writing a Rust-based C compiler, from scratch, capable of compiling the Linux kernel. Over nearly 2,000 Claude Code sessions and $20,000 in API costs, the agent team produced a 100,000-line compiler that can build Linux 6.9 on x86, ARM, and RISC-V.

The compiler is an interesting artifact on its own, but I focus here on what I learned about designing harnesses for long-running autonomous agent teams: how to write tests that keep agents on track without human oversight, how to structure work so multiple agents can make progress in parallel, and where this approach hits its ceiling.

Enabling long-running Claudes

Existing agent scaffolds like Claude Code require an operator to be online and available to work jointly. If you ask for a solution to a long and complex problem, the model may solve part of it, but eventually it will stop and wait for continued input—a question, a status update, or a request for clarification.

SNIP

(Excerpt) Read more at anthropic.com ...


TOPICS: Computers/Internet
KEYWORDS: agents; agentteams; ai; anthropic; arm; autonomous; c; ccompiler; claude; claudes; claudistic; claudistics; linux; llm; nicholascarlini; notjustariverinegypt; parallel; riscv; rust; x86
To: catnipman

You can run LLMs on your local machine. 80% of what most people need can be accommodated by running the model on your machine.

Then use the proprietary models when you need the power. But most simple tasks can be done with an open-source model running under Ollama.


21 posted on 02/06/2026 7:03:54 AM PST by dfwgator ("I am Charlie Kirk!")
[ Post Reply | Private Reply | To 19 | View Replies]

To: Resolute Conservative

Use the AI to refine specs and test cases and thoroughly review them.

I’ve been looking at the BMAD method, which looks promising as an approach to developing software, since it focuses up front on building the requirements documents.


22 posted on 02/06/2026 7:05:25 AM PST by dfwgator ("I am Charlie Kirk!")
[ Post Reply | Private Reply | To 20 | View Replies]

To: cymbeline

Good questions. However, I think you’re on target with the academic exercise comment.


23 posted on 02/06/2026 7:06:33 AM PST by 556x45
[ Post Reply | Private Reply | To 13 | View Replies]

To: grey_whiskers
"What happens when the AI companies have to start charging enough for tokens to pay their actual infrastructure cost?"

After every other input, the AI will play a 90 second video ad.

24 posted on 02/06/2026 7:11:17 AM PST by alancarp (George Orwell was an optimist.)
[ Post Reply | Private Reply | To 4 | View Replies]

To: alancarp

Ads in ChatGPT

https://help.openai.com/en/articles/20001047-ads-in-chatgpt


25 posted on 02/06/2026 7:12:29 AM PST by dfwgator ("I am Charlie Kirk!")
[ Post Reply | Private Reply | To 24 | View Replies]

To: CodeToad

[I’ve been wondering that myself. Considering the money being spent on AI data centers, how much work will they actually generate that is billed at a profitable rate?]

As long as we subsidize the power generation it will be profitable.


26 posted on 02/06/2026 7:30:34 AM PST by kvanbrunt2
[ Post Reply | Private Reply | To 6 | View Replies]

To: cymbeline

Exactly. It just would create bloated executables.


27 posted on 02/06/2026 7:37:57 AM PST by ImJustAnotherOkie
[ Post Reply | Private Reply | To 13 | View Replies]

To: yesthatjallen

Fun article.

Now I know why my lights were browning out the other night!

Is Claude working on the documentation, or will he get around to it later?

“Over nearly 2,000 Claude Code sessions across two weeks, Opus 4.6 consumed 2 billion input tokens and generated 140 million output tokens, a total cost just under $20,000.”

I would like to see a detailed, line-item budget covering total energy usage, cooling cost, percentage of total costs, and administration.


28 posted on 02/06/2026 7:41:43 AM PST by DUMBGRUNT ( "The enemy has overrun us. We are blowing up everything. Vive la France!"Dien Bien Phu last messag)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Dr. Sivana

A great C compiler produces outstanding assembly code, complete with an understanding of the cache, out-of-order execution, and all the intricacies of the target CPU. As a guy who learned BASIC and assembly at the same time I get what you’re saying, but yes, modern C compilers are good enough.


29 posted on 02/06/2026 7:53:43 AM PST by FrankRizzo890
[ Post Reply | Private Reply | To 3 | View Replies]

To: grey_whiskers
What happens when the AI companies have to start charging enough for tokens to pay their actual infrastructure cost?

The Feds will bail them out and print more money to cover it.

30 posted on 02/06/2026 7:58:08 AM PST by montag813
[ Post Reply | Private Reply | To 4 | View Replies]

To: montag813
The Feds will bail them out and print more money to cover it.

Weimar and collapse.

31 posted on 02/06/2026 8:08:55 AM PST by grey_whiskers (The opinions are solely those of the author and are subject to change without notice.)
[ Post Reply | Private Reply | To 30 | View Replies]

To: Resolute Conservative

[Luckily, I will expire before the wave comes ashore. I have been in software for 20+ years and this latest push with AI is junk. Every time they task me with using it my velocity slows down because I am constantly correcting the AI. So, I turn it off or avoid it, except for looking up simple syntax issues that I have forgotten over the years.]


A friend formerly in the trade has been testing a few of these things. Problem is in code maintenance. Every iteration is different.


32 posted on 02/06/2026 8:18:27 AM PST by Zhang Fei (My dad had a Delta 88. That was a car. It was like driving your living room)
[ Post Reply | Private Reply | To 20 | View Replies]

To: kvanbrunt2; CodeToad
I wish I'd saved it, but in the last week I came across an article by an analyst for a household name bank, saying that continuing CapEx for AI would be $1.5 TRILLION by the end of 2028.

There is no way in this world or the next that they can generate that kind of revenue.

33 posted on 02/06/2026 9:49:05 AM PST by grey_whiskers (The opinions are solely those of the author and are subject to change without notice.)
[ Post Reply | Private Reply | To 26 | View Replies]

To: grey_whiskers

I saw a similar article touting the $1.5T CapEx. The OpEx was also in the few hundreds of billions already with limited revenue for it. They seem to be following the Build It and They Will Come model.


34 posted on 02/06/2026 9:52:32 AM PST by CodeToad
[ Post Reply | Private Reply | To 33 | View Replies]

To: yesthatjallen
Given that the LLMs were almost certainly trained on the Clang and GNU compiler source files, this doesn't seem like such a big achievement.

It seems more like a port of the existing compilers to Rust, which while a worthwhile task isn't the same as the claim that the AI tool wrote a compiler from scratch.

If the task was "write a C compiler that will compile the Linux sources" without any restrictions on the use of existing source code then any barely competent coder could take the GNU or Clang source and "write" a C compiler.

A more interesting test for the AI tools is to write something original.

35 posted on 02/06/2026 10:59:46 AM PST by freeandfreezing
[ Post Reply | Private Reply | To 1 | View Replies]

To: freeandfreezing
LOL, from the article:

"The fix was to use GCC as an online known-good compiler oracle to compare against. I wrote a new test harness that randomly compiled most of the kernel using GCC, and only the remaining files with Claude's C Compiler. If the kernel worked, then the problem wasn’t in Claude’s subset of the files. If it broke, then it could further refine by re-compiling some of these files with GCC. This let each agent work in parallel, fixing different bugs in different files, until Claude's compiler could eventually compile all files."

So much for the AI tool: its feedback is coming from a known correct implementation of the same functionality. Good for porting, but probably not so good for writing original code.
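
For anyone wondering what that oracle harness boils down to, here is a minimal sketch, not the article's actual code. It swaps the random sampling the article describes for plain bisection, and it assumes the build breaks exactly when at least one miscompiled file is built with the compiler under test. The names find_miscompiled_file and build_and_boot are hypothetical; build_and_boot stands in for "compile this subset with the new compiler, everything else with GCC, and report whether the kernel works."

```python
from typing import Callable, Optional, Sequence, Set


def find_miscompiled_file(
    files: Sequence[str],
    build_and_boot: Callable[[Set[str]], bool],
) -> Optional[str]:
    """Bisect down to one file that breaks the build.

    build_and_boot(subset) is a hypothetical hook: it should compile
    `subset` with the compiler under test, compile every other file
    with the known-good oracle (GCC in the article), and return True
    if the resulting kernel works.

    Assumption: the build fails if and only if the subset contains at
    least one miscompiled file.
    """
    suspects = list(files)

    # If the whole set works with the new compiler, there is nothing to find.
    if build_and_boot(set(suspects)):
        return None

    # Invariant: building `suspects` with the new compiler fails.
    while len(suspects) > 1:
        mid = len(suspects) // 2
        first_half = suspects[:mid]
        if build_and_boot(set(first_half)):
            # First half is clean, so a culprit must be in the second half.
            suspects = suspects[mid:]
        else:
            # First half already breaks the build; keep narrowing there.
            suspects = first_half
    return suspects[0]


if __name__ == "__main__":
    # Simulated kernel: 16 source files, one of which is miscompiled.
    files = [f"drivers/file{i}.c" for i in range(16)]
    broken = {"drivers/file11.c"}
    print(find_miscompiled_file(files, lambda subset: not (subset & broken)))
```

With several bad files this finds one of them per run, which fits the article's point that each agent could chase a different bug in a different file.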

36 posted on 02/06/2026 11:07:57 AM PST by freeandfreezing
[ Post Reply | Private Reply | To 35 | View Replies]

To: yesthatjallen


Thanks to AI, Ralph Wiggum is now the most popular Simpsons character.
37 posted on 02/06/2026 11:12:11 AM PST by dfwgator ("I am Charlie Kirk!")
[ Post Reply | Private Reply | To 1 | View Replies]

To: freeandfreezing
And even then Claude cheats and depends on GCC:

"As one particularly challenging example, Opus was unable to implement a 16-bit x86 code generator needed to boot into 16-bit real mode. While the compiler can output correct 16-bit x86 via the 66/67 opcode prefixes, the resulting compiled output is over 60kb, far exceeding the 32k code limit enforced by Linux. Instead, Claude simply cheats here and calls out to GCC for this phase (This is only the case for x86. For ARM or RISC-V, Claude’s compiler can compile completely by itself.)"

Hmm, I think most programmers using the GCC code as a base could "write" a C compiler that "calls out" to GCC when needed. I think I could "write" that compiler in bash.

38 posted on 02/06/2026 11:15:58 AM PST by freeandfreezing
[ Post Reply | Private Reply | To 36 | View Replies]

To: yesthatjallen
"...(... I did see Claude 'pkill -9 bash' on accident, thus killing itself and ending the loop. Whoops!)..."

Suicidal software, the new frontier.

39 posted on 02/06/2026 11:42:21 AM PST by Paal Gulli
[ Post Reply | Private Reply | To 1 | View Replies]

To: Paal Gulli

I definitely chuckled at that when I read the piece (which blows my mind, BTW).

Shows my age - and I was never an engineer anyway, but got somewhat forced into a role I wasn’t right for in a company making a digital move without hiring the right people. Also shows how horrid our infrastructure was.

But anyway... There was a constant little help agent/process that people abused - basically, used to dump logs to an email box (and at the time? MINE!)

I don’t mind helping but I got extremely annoyed to get the same damn error from the same damn people - especially when I took the time to explain what the error meant and how they could solve it themselves without dumping it to my team.

Anyway... root process I couldn’t touch.

But one day? I discovered I actually DID have a permission level to kill user sessions!

You can see where I’m going with this - and I wasn’t cruel, but I created a pretty simple log check against user IDs and some common codes. You got *two* in a week. *Three* times on the same error? Your user ID session just got terminated rather than spamming me, and I also managed to work in a trap to force a re-login to download the log and then challenge “Unread log”.

Like I said - shows both my age and how horrid our infrastructure is... but it took months until someone said “What the hell is spmxcushion.sh and who did this??!”

Spam-Executioner... It was a process to punish users that abused the “send log for help” option.

The best part was they’d have never found out if I hadn’t said “I did that” and pointed out 1) whoever had control on permission levels totally screwed up, 2) count yourself lucky I’m not nefarious and didn’t abuse 1, 3) I’d have done it “better” if not for 1, and 4) here is the list of people who waste my team’s time by not listening.

Fun stuff.


40 posted on 02/06/2026 12:42:28 PM PST by Capn Hayek (Capital is not responsible for Labor's lack of planning)
[ Post Reply | Private Reply | To 39 | View Replies]



Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.

FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson