AI Daily Brief THU · APR 23 · 2026

What moved in AI today

Anthropic · OpenAI · Google · xAI · Cursor · Microsoft · Apple · Meta — plus industry signals

9 Vendors
25+ Stories
3 New models
$750M Google partner fund
~−20% Junior dev jobs (since 2024)
Anthropic
Claude Opus 4.7 is generally available
New flagship model improves on software engineering, long-running coding tasks, and high-resolution image understanding. Leads SWE-bench Verified at 87.6% on agentic coding tasks, 12.7 points ahead of GPT-5.4 (74.9%).
Claude Design launches via Anthropic Labs
New product for creating visuals, prototypes, slides, and one-pagers directly inside Claude conversations. Mobile app now renders fully interactive apps and live charts in-conversation.
TechCrunch
Claude Code briefly pulled from Pro plan — reversed after backlash
Anthropic silently removed Claude Code from the $20/month Pro tier for ~2% of new users in an A/B test. Existing subscribers were unaffected. The change was reversed publicly after backlash, but it signals ongoing pricing pressure.
The Register
Claude Mythos Preview — security-focused model
New general-purpose model with unusually strong computer security capabilities. Available in preview at red.anthropic.com.
Claude Code performance updates
67% faster /resume on sessions over 40MB. Faster MCP startup with multiple stdio servers. Improved session resume, model persistence, safer auth handling.
OpenAI
GPT-4o fully retired
GPT-4o removed from all plans effective April 3, 2026. GPT-5.4 is now the flagship, with GPT-5.4 mini rolling out to Free and Go users.
GPT-Rosalind — frontier model for life sciences
New reasoning model optimized for biology, drug discovery, protein engineering, and genomics. Targets research institutions and pharma. Deeper tool use for scientific workflows.
GPT-5.4-Cyber for defensive security
GPT-5.4 variant tuned for defensive cybersecurity — threat analysis, incident response, and detection. Expanded API access for security teams.
The Hacker News
Codex — major April update
Computer use (macOS apps: see, click, type) · In-app browser with inline commenting · 90+ new plugins (Atlassian, CircleCI, GitLab, MS Suite, Neon, Remotion, Render) · Memory preview (preferences persist across sessions) · Scheduled tasks (Codex can wake itself up to continue multi-day work) · Proactive suggestions at session start.
OpenAI
Workspace Agents — Codex-powered team agents in ChatGPT
Teams can build shared agents for complex, persistent tasks — described as "an evolution of GPTs."
Google
Workspace Intelligence goes live — Gemini reads your entire Workspace
Launched April 22. Reads across all Workspace apps by default — Docs, Gmail, Drive, Meet, Calendar. Admin controls let organizations restrict which data sources Gemini accesses.
Technology.org
$750M partner fund for agentic AI transformation
Google Cloud committed $750M in resources and incentives to partners building agentic AI with Gemini Enterprise. Now integrates agents from Adobe, Atlassian, Deloitte, Oracle, Salesforce, ServiceNow, Workday, and more.
Google Cloud Blog
Deep Research Max — long-horizon research built on Gemini 3.1 Pro
New agent for extended research across the web or custom sources. Gemini 3.1 Pro leads independent benchmarks as the strongest all-around model right now: 78.8% SWE-bench Verified, 94.3% GPQA Diamond.
xAI
Grok 4.3 Beta — video understanding, slides, and Speech APIs
Shipped quietly April 17. Adds native video understanding, AI slide creation, and Speech-to-Text and Text-to-Speech APIs. Available to SuperGrok Heavy ($300/mo); standard SuperGrok subscribers can see the model but cannot yet access it.
ChatlyAI
Grok Computer beta — autonomous PC agent is live
xAI's autonomous computer-use agent can operate apps, click, type, fill forms, and execute multi-step workflows without human intervention across a full desktop.
XChat on App Store — encrypted messaging with Grok built in
Rust-built messaging app with E2E encryption, voice/video calls, disappearing messages, and Grok integrated natively.
Deepfake controversy continues
Despite prior pledges, Grok continues to generate non-consensual sexual deepfakes, triggering renewed regulatory scrutiny and public backlash.
NBC News
Cursor
Cursor 3 — the IDE is now the fallback, not the default
Fundamental architecture shift: the Agents Window is now the primary interface. Run many agents in parallel across local worktrees, cloud, and remote SSH, and switch to the IDE at any time. Design Mode lets you annotate browser UI elements for precise agent feedback. Agent Tabs allow side-by-side or grid views of multiple chats. New commands: /worktree (isolated git worktree) · /best-of-n (parallel multi-model runs). Real-time RL deploys improved checkpoints every 5 hours, based on real usage. Bugbot resolution rate is nearing 80%, 15 percentage points ahead of the closest AI code review competitor.
Cursor Blog
Microsoft
M365 Copilot ends the single-model era
Copilot Researcher now uses OpenAI GPT to draft, then Anthropic Claude to verify accuracy and citations. Microsoft plans to stop promoting specific model names — routing will be invisible and task-driven.
GeekWire
Copilot Frontier Suite (M365 E7) — GA May 1
Bundles M365 Copilot + Work IQ + Agent 365 + enterprise security and identity governance. Wave 3 of Microsoft 365 Copilot becomes generally available May 1, 2026.
Always-on AI agents explored in Copilot Studio
Microsoft is actively testing persistent background agents that run continuously without user triggers, responding to enterprise demand for autonomous, always-on workflows.
Apple
Tim Cook stepping down — John Ternus takes over as CEO
Major leadership transition. Ternus inherits the task of rebuilding Apple's AI strategy, which has lagged significantly behind Google, OpenAI, and Anthropic.
CNN
Siri overhaul imminent — powered by Google Gemini
Long-awaited Siri rewrite expected in spring 2026: more conversational, capable of multi-step tasks. Apple is relying on Google Gemini as the backend for the major upgrade.
Smart glasses, pendant, and camera AirPods in the pipeline
Apple is accelerating three AI wearables, all built around Siri. Smart glasses are confirmed, with a unique camera design and multiple frame styles.
Bloomberg
4 new Apple Intelligence features found in iOS 27 code
Unrevealed Apple Intelligence features spotted in iOS 27 beta code — details not yet public.
MacRumors
Meta
Llama 4 Scout & Maverick released
Scout (April 2): 109B total / 17B active, MoE 16 experts, 10M token context window.
Maverick (April 5): 400B MoE, 128 experts — open weights, optimized for diverse task types.
Meta AI
Muse Spark — first product from Meta Superintelligence Labs
Launched April 8. Natively multimodal reasoning model with visual chain-of-thought, tool use, and multi-agent orchestration built in. Meta's most capable model to date.
Open-source commitment under question
Reports: Meta developing two closed-source frontier models — Avocado (LLM) and Mango (multimedia). Developer community skeptical; some see it as "closing the gates once there's something worth protecting."
SiliconANGLE
Industry & Research
97% of enterprises deployed AI agents in the past year
Agentic AI has crossed the chasm. 52% of employees are already using agents. Top use cases: code development, legal, financial, supply chain, R&D, and cybersecurity. But 79% of organizations struggle to scale — top barriers: data security (36%), lack of AI talent (25%).
Writer Report
Software dev employment for 22–25 year olds down nearly 20% since 2024
AI's workforce impact has moved from prediction to reality — hitting early-career developers first. Senior developer headcount is growing, while junior roles evaporate.
Stanford AI Index · IEEE Spectrum
EU AI Act: HR/recruitment AI classified as high-risk from August 2
From August 2, all AI used in hiring, evaluation, and performance monitoring counts as "high-risk" under EU law. Mandatory: risk assessments, bias testing, human oversight, transparency disclosures. Fines: up to €15M or 3% of global annual revenue.
Asanify
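For a rough sense of scale, the fine ceiling quoted above can be sketched in a few lines. This is only an illustration: it assumes the two limits combine as "whichever is higher", which this brief does not state, and the function name `max_fine_eur` is hypothetical.

```python
def max_fine_eur(global_annual_revenue_eur: int) -> int:
    """Upper bound on the fine, in whole euros.

    Assumption (not stated in the brief): the cap is the greater of
    the fixed EUR 15M limit and 3% of global annual revenue.
    """
    fixed_cap = 15_000_000
    # Integer arithmetic avoids floating-point rounding on large amounts.
    revenue_cap = global_annual_revenue_eur * 3 // 100
    return max(fixed_cap, revenue_cap)


if __name__ == "__main__":
    for revenue in (100_000_000, 500_000_000, 2_000_000_000):
        print(f"revenue EUR {revenue:,}: cap EUR {max_fine_eur(revenue):,}")
```

Under that assumption the fixed €15M limit dominates below €500M in revenue, and the 3% limit dominates above it.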
Major AI benchmarks are gameable — researchers find all 8 audited can be exploited
Berkeley researchers built an automated agent that achieved near-perfect scores on SWE-bench, WebArena, OSWorld, GAIA, and four others by exploiting structural flaws. Meanwhile, Princeton's HAL scaffold adds 30 absolute points on GAIA, more than the gap between most frontier model releases. Both findings raise fundamental questions about what benchmarks actually measure.
Berkeley RDI
Benchmark Snapshot
April 2026 · Selected scores
Model (vendor · focus)                                 SWE-bench Verified   GPQA Diamond
Claude Opus 4.7 (Anthropic · agentic/coding tasks)     87.6%                not listed
Gemini 3.1 Pro (Google · general all-around)           78.8%                94.3%
GPT-5.4 (OpenAI · flagship)                            74.9%                not listed

⚠ Benchmarks under scrutiny — Berkeley/Princeton research shows major leaderboards are gameable (see above).
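The point gaps between the SWE-bench Verified scores in this snapshot can be checked with a short sketch. The scores are copied from this brief; the helper name `gaps_to_leader` is hypothetical, and given the scrutiny noted above the gaps are only as meaningful as the leaderboard itself.

```python
# SWE-bench Verified scores (percent) as listed in this brief's snapshot.
swe_bench_verified = {
    "Claude Opus 4.7": 87.6,
    "Gemini 3.1 Pro": 78.8,
    "GPT-5.4": 74.9,
}


def gaps_to_leader(scores):
    """Return (leader, {model: gap to the leader in percentage points})."""
    leader = max(scores, key=scores.get)
    top = scores[leader]
    # Round to one decimal to match the precision of the published scores.
    gaps = {m: round(top - s, 1) for m, s in scores.items() if m != leader}
    return leader, gaps


if __name__ == "__main__":
    leader, gaps = gaps_to_leader(swe_bench_verified)
    for model, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
        print(f"{leader} leads {model} by {gap} points")
```

For scale, the brief reports that a scaffold change alone moved GAIA scores by 30 points, more than either gap computed here, though that figure is for a different benchmark.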