📰 AI News Daily — 15 Feb 2026

TL;DR (Top 5 Highlights)

Microsoft pivots toward in-house “superintelligent” models, signaling a post-OpenAI era of AI self-sufficiency and diversified suppliers.
OpenAI retires GPT-4o and pilots ads, triggering user backlash and fresh debate over safety, monetization, and AI companionship.
Google’s WebMCP makes the web agent-friendly, replacing brittle scraping with structured APIs and unlocking smoother automation.
MIT’s Recursive Language Models push long-context limits past 10M tokens, promising faster, cheaper reasoning over massive documents and codebases.
Disney and Hollywood unions confront ByteDance’s Seedance 2.0, escalating IP and likeness rights battles in AI-generated media.

🛠️ New Tools

Google WebMCP (Chrome) — Introduces structured, site-level APIs for agents, replacing fragile pixel parsing. Enables reliable automation across shopping, support, and workflows, while improving safety and developer control.
Amazon Bedrock AgentCore Browser — Adds proxies, custom profiles, and extensions for enterprise-safe browsing. Preserves sessions and compliance, making regulated web tasks viable for agents in finance and healthcare.
LangChain Agent Skills + LangGraph/LangSmith — Turns prompts into production-grade agent apps with telemetry. Shortens the path from prototype to deployment, improving reliability and observability for complex workflows.
Ollama 0.16 — One-click access to top coding assistants locally. Simplifies model switching and accelerates dev onboarding, especially for teams experimenting with open and specialized models.
Code Arena (image-to-React) — Converts images directly into multi-file React apps. Cuts front-end prototyping time from hours to minutes, aiding marketing sites, MVPs, and rapid iterations.
MCP → Excalidraw — Generates editable Excalidraw diagrams from text prompts. Speeds documentation, architecture sketches, and whiteboard-style planning for product and engineering teams.

🤖 LLM Updates

OpenAI GPT-5.3 Codex-Spark (on Cerebras) — Debuts Codex on non-Nvidia hardware, promising faster code generation and diversified compute. Signals a maturing, multi-vendor AI hardware ecosystem.
Google Gemini 3 Deep Think — Scores a record 48.4% on Humanity’s Last Exam, strengthening Google’s leadership in reasoning and user experience for consumer and enterprise applications.
ByteDance Doubao 2.0 — Positions as an “agent era” model focused on real-world task execution. Targets cost-effective, tool-using assistants for everyday productivity and operations.
MiniMax M2.5 — Emphasizes agent-native RL, strong tool use, and competitive coding at lower cost. Closes performance gaps with closed models, though limits on very long contexts remain.
Alibaba Ovis2.6-30B-A3B — Advances open multimodal reasoning with larger context and high-resolution vision. Offers a compelling option for labs needing controllable, inspectable capabilities.

📑 Research & Papers

MIT Recursive Language Models (RLMs) — Extends context windows beyond 10M tokens while improving efficiency. Enables reasoning over sprawling codebases, legal corpora, and scientific literature at lower cost.
KeplerAgent (UC researchers) — Uses language models to discover governing equations from data, outperforming traditional methods. Could accelerate breakthroughs in physics, climate modeling, and complex systems.
Diffusion Tokenizers — Introduce richer latent representations that improve reconstruction sharpness and downstream generation quality. Benefits high-fidelity image and video pipelines.
HumanLM + Benchmark — Simulates realistic user behavior with a companion benchmark. Offers more faithful evaluations of assistant usefulness, grounding models in real-world interaction patterns.
SkillRater — Curates multimodal datasets aligned to target capabilities. Improves data quality for training and evaluation, reducing noise and wasted compute.
Self-Distillation & Evaluation Cautions — On-policy self-distillation yields notable accuracy gains; new work warns of ARC overfitting and “shallow exploration” traps. Next-gen ARC-AGI-3 (2026) will test continual adaptation.

🏢 Industry & Policy

Microsoft (Mustafa Suleyman) — Pursues AI self-sufficiency with proprietary, safety-focused frontier models and diversified suppliers. A strategic pivot that could reshape partnerships and accelerate vertical integration.
OpenAI (GPT-4o retirement + ads) — Sunsets a beloved model and rolls out ad experiments. Triggers user grief and scrutiny around safety tradeoffs, monetization, and AI’s emerging role as an emotional companion.
Disney + Hollywood Unions vs. ByteDance Seedance 2.0 — Cease-and-desist actions allege copyright and likeness misuse. Foreshadows tougher IP enforcement and new licensing norms for AI-generated media.
Google Threat Intelligence (Gemini misuse) — Reports state-backed and criminal abuse of Gemini for cyber ops; Super Bowl week saw attack spikes. Underscores urgent need for AI-aware defenses and governance.
Anthropic + Palantir in U.S. Operation (report) — WSJ reports Claude supported a classified mission tied to Venezuela. Raises ethical questions and contract risks, spotlighting AI’s growing role in geopolitics.
Hugging Face — Rejects a reported $500M Nvidia investment to preserve open-source neutrality. Reinforces vendor-agnostic trust for developers and enterprises building on shared AI infrastructure.

📚 Tutorials & Guides

Model Families Primer (13 types) — Clear, practical breakdown of core AI architectures. Helps teams choose the right approach for retrieval, generation, planning, or multimodal tasks.
Udemy: LangChain in Production — Hands-on course covering agents, orchestration, and reliability. Bridges the gap between demos and robust, monitored deployments.
Minimalist LLM Implementations — Implementations of only a few hundred lines demystify modern LLM components. Great for learners seeking intuition without heavyweight frameworks.
Faceted Vector Search Deep Dive — Practical guide to filtering and exploration beyond cosine similarity. Improves discovery, personalization, and analytics in vector databases.
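
The faceted search idea above (filtering by metadata and counting facet values alongside similarity ranking) can be sketched in a few lines of Python. This is a minimal illustration, not taken from the guide: the item layout, the `faceted_search` helper, and its parameters are all hypothetical, and a real vector database would use an index rather than a linear scan.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def faceted_search(query, items, filters=None, facet_fields=(), top_k=3):
    """Hypothetical helper: filter by exact metadata match, rank the
    survivors by cosine similarity, and count values per facet field."""
    filters = filters or {}
    matches = [
        it for it in items
        if all(it["meta"].get(k) == v for k, v in filters.items())
    ]
    ranked = sorted(
        matches, key=lambda it: cosine(query, it["vector"]), reverse=True
    )
    # Facet counts cover every filtered match, not just the top_k hits.
    facets = {
        f: Counter(it["meta"].get(f) for it in matches) for f in facet_fields
    }
    return ranked[:top_k], facets


# Toy corpus; vectors and metadata are made up for illustration.
ITEMS = [
    {"id": "a", "vector": [1.0, 0.0], "meta": {"lang": "en", "type": "blog"}},
    {"id": "b", "vector": [0.9, 0.1], "meta": {"lang": "de", "type": "blog"}},
    {"id": "c", "vector": [0.0, 1.0], "meta": {"lang": "en", "type": "paper"}},
    {"id": "d", "vector": [0.7, 0.7], "meta": {"lang": "en", "type": "paper"}},
]

results, facets = faceted_search(
    [1.0, 0.0], ITEMS, filters={"lang": "en"}, facet_fields=("type",)
)
print([it["id"] for it in results])
print(dict(facets["type"]))
```

Returning facet counts next to the ranked hits is what lets a UI offer "narrow by type" style exploration without issuing a second query.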

🎬 Showcases & Demos

“Infinite Frames” (AI short film) — End-to-end AI production pays homage to cinematic masters. Demonstrates rapidly improving creative pipelines and accessible storytelling tools.
ByteDance Seedance 2.0 (video gen) — “Prompt-to-Hollywood” quality demos show consumer-grade, high-fidelity video creation. Raises both creative possibilities and major IP questions.
LangChain Agent Skills (live builds) — Instant multi-agent app creation from prompts. Highlights how prompt-defined capabilities can ship as production-grade graphs with observability.
Code Arena (image→live React) — Real-time demos turn screenshots into functional sites. Speeds landing pages, prototypes, and design iteration.
Klarna AI Assistant — LangGraph/LangSmith-powered agent now serves 85M users, cutting resolution time by 80%. A concrete case study for enterprise-scale, measurable agent ROI.
LLM-Council Debates — Panel-style model comparisons surface strengths and weaknesses interactively, helping teams select the right model per task.

💡 Discussions & Ideas

From competence to creativity — Observers see progress from grade-school math to research-level insights, including early AI-assisted advances in theoretical physics, hinting at genuine new knowledge generation.
Hybrid AI is pragmatic — Splitting workloads across device and cloud is emerging best practice. Enterprises report shifting inference to in-house GPUs with open models to cut costs and vendor risk.
Agent rigor and security — Calls grow for reproducible agent evaluations and identity-layer safeguards. Rapid shipping exposes session hijacking, prompt injection, and approval-loop gaps.
Governance and geopolitics — Debates weigh U.S. stewardship, open-model contributions, and China’s fast industrial adoption. ICML’s stealth prompt injections spotlight tensions in research integrity and reviewer tooling.
Rethinking “AGI” — Yann LeCun advocates world models over AGI slogans; others frame AGI as a learning process. Jensen Huang urges seriousness about real-world AI risk as memes dissect mission drift.

Source Credits

Curated from 250+ RSS feeds, Twitter expert lists, Reddit, and Hacker News.