AI Daily — Apr 26, 2026
Sunday, April 26, 2026
The Big Picture
The week's two biggest model launches, GPT-5.5 and DeepSeek V4, are still reverberating, but today the story shifts from what shipped to what it means. Anthropic's Mythos breach is now in its second week with unauthorized users still inside, Palantir is having a public identity crisis loud enough that its own engineers are calling it out on internal calls, and on Friday afternoon Trump fired all 24 members of the National Science Board, the body that oversees America's $9 billion research funding pipeline. The race is no longer just about parameter counts. It's about who controls the stack, who gets the electrons, and what happens when "controlled release" turns out to be a fiction.
What Just Shipped
- Mistral 3 family (Mistral): Three dense Mistral models (14B/8B/3B) plus Mistral Large 3, a 675B-total/41B-active sparse MoE, all Apache 2.0, trained on 3,000 H200s per Mistral's announcement.
- GeneBench (OpenAI): A new evaluation set for AI agents performing multi-stage biological inference — chaining evidence, selecting tools, handling domain-specific reasoning.
- DeepEP V2 and TileKernels (DeepSeek): Open-sourced inference infrastructure pieces released alongside V4, with community reports of near-linear parallelization scaling.
- Codex workspace upgrades (OpenAI): Native browser control, OS-wide dictation, Sheets/Slides/PDFs manipulation, and an "Auto-review" guardian agent bundled with the GPT-5.5 rollout.
Today's Stories
The Model Anthropic Locked Away Is Already Being Used — By People Who Weren't Supposed to Have It
The most dangerous AI model Anthropic has ever built was supposed to stay in a vault. It didn't, and the people who got in are still in.
Bloomberg reported that a small group of unauthorized users accessed Anthropic's Mythos model — which Anthropic itself describes as capable of enabling dangerous cyberattacks — on the same day the company first announced its plan to release it to a limited number of partners for testing. The group gained access partly because one member is a third-party Anthropic contractor, and they guessed where the model was located based on previously leaked knowledge about Anthropic's infrastructure practices, according to Fortune. CBS News reported Anthropic is investigating a possible breach through one of its third-party vendor environments.
Bloomberg also reported that Anthropic used Mythos to identify thousands of zero-day vulnerabilities across every major operating system and browser. Cold comfort, given what Mythos can do. The UK AI Safety Institute's evaluation found that Mythos Preview succeeds 73% of the time on expert-level capture-the-flag tasks, challenges no model could complete before April 2025.
If the controlled release holds, frontier labs get a template for distributing dangerous capabilities to vetted partners. If it fails, and the unauthorized users still using Mythos suggest it already has, the entire premise of "limited preview for trusted enterprises" collapses, and the policy conversation shifts toward mandatory government pre-release review. The signal to watch: whether Anthropic announces revoked access or new controls in the next 48 hours. Silence means the breach is structural, not procedural.
The Trump administration is now trying to gauge Mythos's risks, per the Washington Post. And CoinDesk reported overnight that the DeFi sector is now rethinking security assumptions because Mythos can chain small weaknesses across interconnected protocols into systemic, cascading failures.
Palantir's Identity Crisis Is Now an AI Industry Problem
The most revealing thing about Palantir this week isn't the manifesto. It's what's happening inside the building because of it.
A 22-point ideological manifesto posted to X by CEO Alex Karp racked up over 30 million views, was followed by a share price slide, and ignited an internal reckoning. The manifesto condenses Karp's book The Technological Republic into bullets denouncing "regressive" cultures, arguing certain cultures are inherently superior, and calling for an AI-based weapons arsenal to replace nuclear deterrence, per Startup Fortune.
Wired, after reviewing Slack messages and interviewing current and former employees, reported that staff are openly describing the company's trajectory as a "descent into fascism" — a phrase that originated as an employee greeting on an internal call, not a protest sign outside the building. That's what makes it unusual.
This matters beyond Palantir because Palantir isn't a fringe player. Its Foundry product is connective tissue for federal data, embedded in at least four federal agencies, including DHS and HHS. The Intercept reported that Palantir has been paid more than $130 million by the IRS to analyze sensitive federal databases, with the IRS Criminal Investigation division using Palantir's platform to aggregate sprawling sensitive data sets.
If the manifesto deepens alignment with current US government clients, Palantir wins more contracts and loses Europe — where it has roughly 950 UK employees and where politicians in Germany, Ireland, and the European Parliament are already raising concerns. If European regulators move to restrict Palantir, every other US AI vendor with public-sector ambitions has to answer the same question Palantir's engineers are asking on Slack: what exactly are we building, and for whom? Watch EU procurement guidance over the next two weeks.
DeepSeek V4 Is a Pricing Attack, a Hardware Strategy, and a Geopolitical Statement — All at Once
The model dropped Friday. The implications are still landing.
V4 Flash costs $0.14 input and $0.28 output per million tokens. GPT-5.5 costs $5 and $30. That's not a pricing gap — it's a different universe. Both V4 Pro (1.6T total / 49B active) and V4 Flash (284B / 13B active) are MIT-licensed, both ship with 1M-token context windows, and both were released with base and instruct checkpoints — unusual transparency that lets other labs build on the work.
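To put numbers on that gap, here is a minimal Python sketch that uses only the per-million-token prices quoted above. The 20,000-input / 2,000-output request size is a hypothetical workload chosen for illustration, and the function names are ours, not any vendor's API.

```python
# Prices in USD per million tokens, as quoted in the paragraph above.
PRICES = {
    "DeepSeek V4 Flash": {"input": 0.14, "output": 0.28},
    "GPT-5.5": {"input": 5.00, "output": 30.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the quoted per-million-token prices."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


def price_ratio(kind: str) -> float:
    """How many times more GPT-5.5 charges than V4 Flash for `kind` tokens."""
    return PRICES["GPT-5.5"][kind] / PRICES["DeepSeek V4 Flash"][kind]


if __name__ == "__main__":
    # Hypothetical workload: 20k input tokens and 2k output tokens per request.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 20_000, 2_000):.5f} per request")
    print(f"input ratio:  {price_ratio('input'):.1f}x")
    print(f"output ratio: {price_ratio('output'):.1f}x")
```

At the quoted prices the ratio works out to roughly 36x on input tokens and 107x on output tokens, so the gap widens as a workload becomes more output-heavy.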
The benchmark picture is honest. Independent evaluators at Artificial Analysis place V4 Pro as the #2 open-weight model, behind Kimi K2.6, with particularly strong agentic and long-context performance. DeepSeek itself says V4 trails the best closed models by three to six months in general reasoning — but in code, agents, and math, the gap is much smaller.
The hardware story is the one most people are underweighting. DeepSeek verified V4 runs on Huawei Ascend NPU platforms with acceleration ratios between 1.50× and 1.73× in general inference workloads. US export controls on Nvidia chips were supposed to slow Chinese AI progress. Instead, they pushed Chinese labs to build models that run on domestic silicon — and DeepSeek explicitly says V4 Pro pricing could fall sharply once Huawei Ascend 950 supernodes ship at scale in H2 2026.
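For a sense of scale, an acceleration ratio can be read as a plain speedup: a ratio of s means the same work finishes in 1/s of the baseline time. The sketch below assumes that reading. The source does not say what baseline DeepSeek measured against, and the 100 ms baseline latency is a made-up figure for illustration.

```python
def time_fraction(speedup: float) -> float:
    """Fraction of the baseline time remaining after a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


def accelerated_latency_ms(baseline_ms: float, speedup: float) -> float:
    """Latency after applying the speedup to a baseline latency."""
    return baseline_ms * time_fraction(speedup)


if __name__ == "__main__":
    baseline_ms = 100.0  # hypothetical baseline latency, not from the source
    for ratio in (1.50, 1.73):  # the range quoted in the paragraph above
        latency = accelerated_latency_ms(baseline_ms, ratio)
        print(f"{ratio:.2f}x -> {latency:.1f} ms ({time_fraction(ratio):.0%} of baseline)")
```

Under that reading, the quoted range corresponds to finishing the same work in roughly 58% to 67% of the baseline time.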
If this succeeds, the cost floor for closed-model APIs collapses, on-prem and edge deployments become the default for anyone outside hyperscalers, and China gets a parallel AI infrastructure stack that doesn't depend on CUDA. If it fails — if quality issues surface in production or hardware constraints throttle availability — closed labs keep their pricing power. Watch for OpenAI or Anthropic API price cuts in the next 30 days; that's the signal V4 is forcing the issue.
OpenAI Quietly Turns Codex Into a Background Superapp
While the AI world debated GPT-5.5's benchmark scores, OpenAI quietly changed how its models interact with your computer.
Bundled with the GPT-5.5 rollout, Codex now ships with native browser control, OS-wide dictation, and the ability to manipulate spreadsheets and PDFs without bouncing to third-party tools. Most notable is "Auto-review," a secondary guardian agent that monitors the primary model's multi-step execution to reduce human approvals on long-running tasks. Codex isn't a coding assistant anymore. It's a workspace agent that audits its own work.
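The guardian-agent idea can be sketched in a few lines of Python. This is not OpenAI's implementation, which has not been published in the reporting above. It is a hypothetical illustration of the general pattern: a reviewer checks each step, and a human is asked only when the reviewer declines to approve. Every name here (Step, run_with_reviewer, the risky flag) is invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Step:
    description: str
    risky: bool  # hypothetical flag; a real reviewer would have to infer risk


def run_with_reviewer(
    steps: Iterable[Step],
    reviewer: Callable[[Step], bool],
    ask_human: Callable[[Step], bool],
) -> List[str]:
    """Run steps in order, asking a human only when the reviewer declines."""
    log: List[str] = []
    for step in steps:
        if reviewer(step):
            log.append(f"auto-approved: {step.description}")
        elif ask_human(step):
            log.append(f"human-approved: {step.description}")
        else:
            log.append(f"blocked: {step.description}")
            break  # stop the task at the first rejected step
    return log


if __name__ == "__main__":
    plan = [
        Step("open spreadsheet", risky=False),
        Step("delete local files", risky=True),
        Step("send summary email", risky=False),
    ]
    # Toy reviewer: approve anything not flagged risky. Toy human: always says no.
    for line in run_with_reviewer(plan, lambda s: not s.risky, lambda s: False):
        print(line)
```

In this toy run the human is consulted once instead of three times, which is the claimed benefit. The open question is how often a real reviewer wrongly approves a step that should have been escalated.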
If this succeeds, OpenAI owns the agent layer that sits between the model and the rest of your software, and ChatGPT becomes the OS shell rather than just the chat window. If it fails, the failure mode is loud: an agent with browser control and OS-wide access has a much larger blast radius than one that only writes text. The signal to watch is enterprise IT policy — if Microsoft or Apple ship MDM controls for agent-OS access in the next quarter, the platform fight has begun.
Trump Fires All 24 National Science Board Members
This landed Friday afternoon and is only now getting full attention. President Trump fired all 24 members of the National Science Board, which oversees the $9 billion National Science Foundation, per Science. The NSB has statutory authority to approve large NSF expenditures and set policy; it is not advisory window dressing.
The NSF was already operating at historically low funding levels with grant disbursements running months behind schedule. Researchers across universities have been freezing hires and postponing experiments. This removes scientific oversight at the exact moment OpenAI publishes a benchmark for agentic biological reasoning and US officials publicly accuse China of "industrial-scale" AI distillation theft.
If this succeeds politically, federal AI research funding becomes directly steerable by the White House without independent review — a structural change to how American science gets done. If courts intervene, the firings could become a constitutional fight over executive control of science; a federal judge previously ruled that a similar firing of CDC vaccine advisors was improper, per Benzinga. The signal to watch: whether any fired board member files suit by midweek.
SpaceX Eyes $60B Cursor Buyout to Own the Coding Stack
Reports surfaced this weekend that SpaceX offered $60 billion for Cursor, the AI code editor that's quietly become the developer-tool standard for agentic coding. The price tag is unconfirmed, but the strategic logic is real enough that it deserves attention.
Cursor's edge is full-codebase reasoning: it doesn't autocomplete snippets, it understands architectures and executes changes across files like a senior engineer. If SpaceX acquires it, the play is vertical integration of model development, GPU clusters, and production engineering into one closed loop — Musk's robotics and vehicle software teams running on tooling no competitor can use. If the deal doesn't close, Cursor stays the neutral standard and OpenAI's Codex superapp push becomes the dominant force. Watch for confirmation or denial by May 1.
OpenAI Closed the Chapter on Sora as a Standalone App
OpenAI's Help Center confirms the Sora web and app experiences were discontinued April 26, 2026, with the API following September 24, 2026. This sounds like product housekeeping. It isn't.
The bigger pattern: AI companies are collapsing standalone tools back into bigger platforms. A separate video app made sense when each modality was a magic trick. Now that users expect one assistant to write, reason, generate images, and make video in one workflow, separate apps become friction. If this succeeds, the winners aren't the companies with the most AI apps; they're the ones that turn AI into an operating layer across everything. If it fails, OpenAI loses creator mindshare to Runway, Pika, and whatever Google ships at I/O. Watch what gets folded into ChatGPT next.
⚡ What Most People Missed
- Practitioners say GPT-5.5 image generation contains a persistent watermark. A rising r/ChatGPT thread documents strange cross-hatched textures appearing in photorealistic outputs that become visible after upscaling or compression — community-replicated, not yet vendor-confirmed, and some community members interpret it as evidence OpenAI deployed a mandatory cryptographic tracking layer. The implications for commercial image licensing would be immediate and complex.
- GPT-5.5 may have achieved persistent spatial simulation. Practitioners on r/ChatGPT demonstrated picking a random background character from a generated image and simulating their entire day in one-hour increments, with the model maintaining exact spatial awareness and object permanence across dozens of sequential prompts. The agent memory layer appears to function as a persistent physics-and-narrative engine — which intersects directly with OpenAI's new GeneBench biology benchmark.
- Maine Governor Janet Mills vetoed an 18-month AI data center moratorium on Friday. No major tech outlet has picked it up. Mills is a Democrat, which makes this hard to read as partisan — and it suggests governors are becoming the swing vote in the AI infrastructure land rush, even ones skeptical of big tech.
- Utilities are now modeling AI data centers as grid-planning problems. Utility Dive cites EPRI projecting data centers could hit 9–17% of US electricity demand by 2030. Power is becoming the gating function for AI strategy, and this could make model launches and site selection subject to state public-utility commission oversight and long-term grid upgrade timetables.
- Hugging Face shipped ML Intern, an open-source CLI agent that autonomously researches papers, runs experiments, and writes code for up to 300 sequential steps — a free, deployable AI researcher for any lab that wants one.
📅 What to Watch
- If Anthropic announces new Mythos access controls in the next 48 hours, it will suggest the breach stems from process flaws rather than a single actor — and frontier labs may be forced to pause limited previews and adopt air-gapped vetting and vendor-isolation measures.
- If OpenAI or Anthropic cut API prices in the next 30 days, DeepSeek V4 has set the ceiling on closed-model pricing, not just the floor on open.
- If a fired NSB member files suit by midweek, this becomes a constitutional fight over executive control of federal science, not a personnel story.
- If European regulators issue new guidance on Palantir in the next two weeks, procurement rules could force contract rebids and delay deployments, pushing vendors to prioritize compliant edge solutions or withdraw from bids this quarter.
- If SpaceX confirms the Cursor deal by May 1, expect a wave of acquihire activity targeting agentic dev tools — and a sharp consolidation in who can ship robotics software.
- If Apple or Microsoft ship MDM controls restricting agent OS-level access this quarter, the platform-policy fight over Codex-style superapps has officially begun.
The Closer
A guarded model still being used by people who guessed where it lived; a 22-point manifesto greeted on internal calls with "welcome to the descent into fascism"; a science board fired on a Friday afternoon while OpenAI publishes a benchmark for AI biologists. Somewhere in Maine, a governor just signed off on the data centers, and somewhere in Shenzhen, a Huawei chip just ran inference on a trillion-parameter model for the price of a coffee.
Sleep well.
Forward this to the friend who keeps asking "wait, what's actually happening with AI right now?" — this one's the answer.