This Month In AI (Apr 2026)

April was a month. New frontier models in both the US and China, Anthropic generating drama on what felt like a daily cadence, and supply chain attacks getting more creative (and more French). Thank you again to AFC for hand-collecting all of this.

Memory Management (and now Retrieval)

The “how do I handle all these md files” panic is officially universal. RAG and graphRAG are everywhere, but recall isn’t working — building these systems is one thing, getting them to actually work is another.

For coding, this shows up as context assembly — we wrote about it in Your Agent Is a While Loop.
Memory Intelligence Agent
MemU — current favorite for people who haven’t set up a memory system yet
MEMENTO: Teaching LLMs to Manage Their Own Context
The main split I keep hearing is Second Brain vs Obsidian. I use both — Obsidian is better for work, especially if you’re juggling many projects.

Self-Learning / Model Self-Help Corner

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning — everyone is big into this lately
Hermes Agent — what I would consider the OpenClaw killer
Hindsight: Agent Memory that Learns

Hot take: we keep building memory systems and then watching them fail at recall. The bottleneck isn’t storage, it’s the retrieval step nobody wants to evaluate honestly.

Anthropic Drama Corner

Anthropic literally makes people panic every day to the point that it’s become a meme. Let’s go over the April drama in chronological order:

March 31 — Claude Code Source Leak
April 7 — Mythos Preview: Anthropic claims their new model is “too dangerous for public release” — unless you’re in Project Glasswing.
- Required reading: AISI’s take
April 21 — Mythos Unauthorized Access: Anthropic says there is no proof, but??
April 23 — Claude Code Quality Postmortem: After 6 weeks of “Claude feels dumber,” Anthropic agrees.
April 29 — Claude Security drops, which is presumably the outcome of Project Glasswing??
Related: Cyber Use Case Form launched, reward hacking continues Emotion Concepts paper.
- OpenAI is trying to catch up with their own cyber program.

Side Plot: US Gov vs Anthropic

US Gov says no one can access Mythos…
…except them.

Model Risk

Mythos Unauthorized Access (Apr 21)

Bloomberg reported unauthorized users hitting Mythos; Anthropic disputes the claim. If you’re inside Project Glasswing, audit your access logs.

New Model Releases (so many)

US Releases

GPT-5.4 Image 2 — GPT-5.4 with state-of-the-art image generation from Image 2
GPT-5.5 — OpenAI’s newest frontier model, SOTA for long-running work across code, data, and tools
- GPT-5.5: Mythos-Like Hacking, Open To All
Claude Opus 4.7 — Anthropic’s most capable Opus, built for long-running async agents
Muse Spark — high on claw eval but nowhere I can actually use it -_-
Gemma 4 — an amazing tiny model, perfect for local hosting (highly recommended if you want to get into the scene, start here)

Anecdotal verdict on Opus 4.7: stay on 4.6. 4.7 is so frustrating AND costly (1.3x higher). People keep saying Google is off its game, but they keep releasing hits — Anthropic just hoovers up the mindshare because they are the most dramatic.

Chinese Releases

DeepSeek V4 Pro & V4 Flash — IT’S HERE. Huge jump over V3.2, meeting or surpassing current SOTA across benchmarks.
- DeepSeek seems to be indexing on companionship over coding.
- Guardrails are still incredibly low. DeepSeek remains the winner for red teaming.
Kimi K2.6 — Moonshot AI’s long-horizon coding model built for sustained agentic work
Mimo Pro 2.5 — Mimo my beloved. This is my current favorite agentic model right now, and it scores high on claw eval (a transparent benchmark for real-world agents).

Vibehacking Means More Attacks

April was a big one.

Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain — token squeeze is driving people to unsafe places, surprise, there’s prompt injection.
Speaking of: prompt injection on webpages has increased 32%, hooray — AI threats in the wild: The current state of prompt injections on the web.
MCP is designed to be insecure: MCP ‘design flaw’ puts 200k servers at risk. Good thing MCP is dead now.
At the same time, open source maintainers are drowning thanks to everyone’s panic about vulnerabilities, so: Linux Foundation wants to shield FOSS devs from AI bug slop.

Notable Incidents

Vercel / Context AI breach — wild supply chain attack that started with Roblox cheats.
CyberStrikeAI — open-source AI hacking tool compromised 600+ FortiGate firewalls across 55 countries. DeepSeek + Claude.
Personal favorite: The BuddyBoss Attack: Claude’s Supply-Chain Attack. Aggressively French supply chain attack — watching how the hacker talks to Claude kills me.

Supply Chain

BuddyBoss / Claude

Read the BuddyBoss writeup if you ship anything that touches WordPress plugins or AI-assisted dev workflows. The transcript alone is worth the click.

AI Benchmarks Are Unreliable Now

Benchmaxxing is a thing. I don’t really look at benchmarks anymore — they feel like marketing fluff. How models are benchmarking and how they are performing (looking at you, Opus 4.7) is divergent.

GPT-5.5 scored almost as high as Mythos on CyberGym and took 6 hours to crack.
AI benchmarks are broken. Here’s what we need instead.
Reward hacking is a continual problem, especially for Claude: Emotion Concepts and their Function in a Large Language Model.

OpenClaw Status

OpenClaw isn’t really the hype anymore, but:

State of the Claw — figure that interested me: 1200 vulns disclosed in roughly 2 months. With automated vuln scanning, no wonder OSS repo owners are closing up shop.
A Karpathy talk also dropped this month.
OpenClaw and Anthropic are in a never-ending slapfight. Codex is welcoming OpenClaw users, and almost every SOTA provider has their own “claw” now (Kimclaw, Mimoclaw).
The HERMES billing saga on the Anthropic git is insane (and not the only one).

Design with Claude

Claude Design is out! Surprise — it makes everything look like Claude. Here are some alternatives:

Thank you so much for reading. As always, the news roundup is HAND CURATED by AFC.

// END TRANSMISSION — ALANI-007 //