📰 AI News Daily — 28 Nov 2025
TL;DR (Top 5 Highlights)
- Google launches Gemini 3, pushing agentic automation and multimodal reasoning back to the forefront of the AI race.
- OpenAI confirms a vendor-side data breach via Mixpanel, severs ties, and expands enterprise data residency to 10 regions to rebuild trust.
- Major publishers file a $10B copyright suit against OpenAI and Microsoft, intensifying legal battles over training data and AI outputs.
- DeepSeek-Math-V2 achieves IMO gold-level performance with open Apache-2.0 weights, raising the bar for open reasoning models.
- Real-world constraints bite: an MIT study estimates 11–12% of U.S. jobs are automatable by agentic AI, while CEOs warn of looming power and data center shortages.
🛠️ New Tools
- LangChain Deep Agents launch secure sandboxes for remote, reproducible code and bash execution, giving teams safer, auditable agent workflows for real production use.
- n8n + MCP integrations let ChatGPT and Claude search, trigger, and manage workflows directly, tightening the loop between LLMs and real business automations.
- fal FLUX.2 [dev]/[pro] ships to production with free daily comparisons; Z-Image-Turbo now runs on Hugging Face via fal, boosting fast, aesthetic image generation.
- vLLM Ray APIs streamline high-throughput MoE inference, simplifying distributed serving and cutting latency for large, sparse reasoning systems at scale.
- Google Antigravity IDE debuts with Gemini 3 for smarter coding assistance and real-time debugging, aiming to accelerate developer productivity across experience levels.
- OpenAI Sora (Android) rolls out globally, bringing AI video creation to mainstream mobile users and expanding short-form creative workflows beyond text and images.
🤖 LLM Updates
- Google Gemini 3 arrives with stronger language, vision, and reasoning, enabling richer enterprise automations and sharpening competition across multimodal and agentic tasks.
- DeepSeek-Math-V2 posts IMO gold and near-Putnam perfection with verifier-driven training; Apache-2.0 weights on Hugging Face make elite math reasoning broadly accessible.
- INTELLECT-3 (PrimeIntellect) scales RL to 100B+ MoE, reporting SOTA in math, coding, and reasoning while open-sourcing recipes for training large sparse models.
- Microsoft Fara-7B targets computer-use with an agentic small language model, signaling momentum toward lightweight, specialized assistants for on-device tasks.
- Anthropic Claude Opus 4.5 closes the design loop (generate, critique, iterate) and shows early PC-control demos plus a “think inside files” mode for deeper reasoning traces.
- Benchmarks stay tight: Claude Opus 4.x edges Gemini 3 Pro on scientific tasks; community code arenas show narrow spreads among top-tier models.
📑 Research & Papers
- JECS GWAS dataset (Japan) releases childhood genetics data across 1,148 traits, creating a landmark public resource for developmental and epidemiological research.
- Sparse attention wins recognition via Qwen and DeepSeek papers, underscoring practical efficiency gains for long-context and high-throughput transformers.
- Mean-pooling emerges as a simple, strong context compression method, improving reliability and reducing complexity compared to heavier-weight summarization techniques.
- Beluga (CXL memory) cuts LLM latency by ~90% and boosts throughput 7.35x, pointing to near-term performance wins via composable memory architectures.
- P1 physics model achieves gold-level performance at the International Physics Olympiad using reinforcement learning, showcasing progress in non-language reasoning.
- MIT labor study estimates agent-based AI could automate tasks tied to ~11–12% of U.S. jobs, informing urgency around reskilling and policy planning.
🏢 Industry & Policy
- OpenAI confirms a Mixpanel-linked breach exposing some API user names, emails, and locations; ends the partnership, warns on phishing, and tightens vendor security reviews.
- A coalition of news outlets sues OpenAI and Microsoft for $10B over alleged copyright misuse, escalating pressure to clarify AI training and fair-use boundaries.
- The UK’s AISI launches propensity evaluations and allocates over £15M for alignment research, strengthening institutional capacity for safety and evaluation.
- Satya Nadella and Sam Altman warn of an energy and data center crunch, signaling that power availability may cap AI growth more than chips or capital.
- OpenAI adds enterprise and API data residency across 10 regions, helping clients meet compliance obligations and strengthening trust for regulated deployments.
- Researchers flag critical security gaps in popular stacks (e.g., Llama, TensorRT), urging stronger defenses around protocols like MCP and model-serving layers.
📚 Tutorials & Guides
- Anthropic shares operational guidance for long-running agents, emphasizing memory design, context hygiene, and failure handling for dependable production behavior.
- Hugging Face explains modern inference—continuous batching, KV caching, chunked prefill, and smart decoding—showing how each technique lifts throughput in practice.
- Deep dives into multi-vector retrieval (e.g., mgrep) show how token-efficient indexing improves recall and reduces context budgets for large-scale RAG systems.
- Builder summit notes cover quantization, attention optimization, and multi-node deployment, offering practical playbooks for cutting costs without sacrificing quality.
- A top course on Deep Representation Learning releases full slides and videos, giving practitioners a rigorous foundation in modern representation techniques.
- Weekly research roundups distill advances in reasoning, simulation, small multimodal models, and agent design, saving time for practitioners tracking fast-moving work.
🎬 Showcases & Demos
- A recreation of Anthropic’s “Eiffel Tower Llama” demo uses sparse autoencoders to steer model activations, with a live walkthrough of interpretability in action.
- Seven LLMs roleplay a coordinated Mafia game with synthetic voice acting, highlighting progress in multi-agent planning, deception, and conversational dynamics.
- Early demos show Claude Opus 4.5 controlling a PC to handle complex tasks, hinting at near-term viability for robust, general-purpose computer-use agents.
- Flux 2 launches an AI art contest spotlighting the latest models’ aesthetics and control, fueling community benchmarks for creative quality.
- Colombian students win the NASA Space Apps Challenge with “Exoplanet Hunter AI,” turning real data into discovery tools and demonstrating global AI talent.
💡 Discussions & Ideas
- Ilya Sutskever and others argue breakthroughs require original research, not just 100x scaling; NVIDIA findings echo diminishing returns on naïve scale-ups.
- Governance debates heat up: critics challenge x-risk framings, while some accuse Anthropic of leveraging security narratives to discourage open-source competition.
- Enterprises increasingly prefer open and sovereign AI for privacy and control; Cohere and SAP deepen their partnership to operationalize regionally compliant deployments.
- Studies suggest idea diversity improves research agents, activation probing can reduce sycophancy, and some hallucinations mirror biases in human training data.
- Builders report messy reality for agents: durable memory, feedback loops, monitoring, and context engineering—not prompts alone—decide reliability and ROI.
Source Credits
Curated from 250+ RSS feeds, Twitter expert lists, Reddit, and Hacker News.