📰 AI News Daily — 15 Feb 2026
TL;DR (Top 5 Highlights)
- Microsoft pivots toward in-house “superintelligent” models, signaling a post-OpenAI era of AI self-sufficiency and diversified suppliers.
- OpenAI retires GPT-4o and pilots ads, triggering user backlash and fresh debate over safety, monetization, and AI companionship.
- Google’s WebMCP makes the web agent-friendly, replacing brittle scraping with structured APIs and unlocking smoother automation.
- MIT’s Recursive Language Models smash long-context limits to 10M+ tokens, promising faster, cheaper reasoning over massive documents and codebases.
- Disney and Hollywood unions confront ByteDance’s Seedance 2.0, escalating IP and likeness rights battles in AI-generated media.
🛠️ New Tools
- Google WebMCP (Chrome) — Introduces structured, site-level APIs for agents, replacing fragile pixel parsing. Enables reliable automation across shopping, support, and workflows, while improving safety and developer control.
- Amazon Bedrock AgentCore Browser — Adds proxies, custom profiles, and extensions for enterprise-safe browsing. Preserves sessions and compliance, making regulated web tasks viable for agents in finance and healthcare.
- LangChain Agent Skills + LangGraph/LangSmith — Turns prompts into production-grade agent apps with telemetry. Shortens the path from prototype to deployment, improving reliability and observability for complex workflows.
- Ollama 0.16 — One-click access to top coding assistants locally. Simplifies model switching and accelerates dev onboarding, especially for teams experimenting with open and specialized models.
- Code Arena (image-to-React) — Converts images directly into multi-file React apps. Cuts front-end prototyping time from hours to minutes, aiding marketing sites, MVPs, and rapid iterations.
- MCP → Excalidraw — Generates editable Excalidraw diagrams from text prompts. Speeds documentation, architecture sketches, and whiteboard-style planning for product and engineering teams.
🤖 LLM Updates
- OpenAI GPT-5.3 Codex-Spark (on Cerebras) — Debuts Codex on non-Nvidia hardware, promising faster code generation and diversified compute. Signals a maturing, multi-vendor AI hardware ecosystem.
- Google Gemini 3 DeepThink — Scores a record 48.4% on Human-Like Evaluation, strengthening Google’s leadership in reasoning and user experience for consumer and enterprise applications.
- ByteDance Doubao 2.0 — Positions as an “agent era” model focused on real-world task execution. Targets cost-effective, tool-using assistants for everyday productivity and operations.
- MiniMax M2.5 — Emphasizes agent-native RL, strong tool use, and competitive coding at lower cost. Closes performance gaps with closed models, though very long-context limits remain.
- Alibaba Ovis2.6-30B-A3B — Advances open multimodal reasoning with larger context and high-resolution vision. Offers a compelling option for labs needing controllable, inspectable capabilities.
📑 Research & Papers
- MIT Recursive Language Models (RLMs) — Extends context windows beyond 10M tokens while improving efficiency. Enables reasoning over sprawling codebases, legal corpora, and scientific literature at lower cost.
- KeplerAgent (UC researchers) — Uses language models to discover governing equations from data, outperforming traditional methods. Could accelerate breakthroughs in physics, climate modeling, and complex systems.
- Diffusion Tokenizers — Introduce richer latent representations that improve reconstruction sharpness and downstream generation quality. Benefits high-fidelity image and video pipelines.
- HumanLM + Benchmark — Simulates realistic user behavior with a companion benchmark. Offers more faithful evaluations of assistant usefulness, grounding models in real-world interaction patterns.
- SkillRater — Curates multimodal datasets aligned to target capabilities. Improves data quality for training and evaluation, reducing noise and wasted compute.
- Self-Distillation & Evaluation Cautions — On-policy self-distillation yields notable accuracy gains; new work warns of ARC overfitting and “shallow exploration” traps. Next-gen ARC-AGI-3 (2026) will test continual adaptation.
🏢 Industry & Policy
- Microsoft (Mustafa Suleyman) — Pursues AI self-sufficiency with proprietary, safety-focused frontier models and diversified suppliers. A strategic pivot that could reshape partnerships and accelerate vertical integration.
- OpenAI (GPT-4o retirement + ads) — Sunsets a beloved model and rolls out ad experiments. Triggers user grief and scrutiny around safety tradeoffs, monetization, and AI’s emerging role as an emotional companion.
- Disney + Hollywood Unions vs. ByteDance Seedance 2.0 — Cease-and-desist actions allege copyright and likeness misuse. Foreshadows tougher IP enforcement and new licensing norms for AI-generated media.
- Google Threat Intelligence (Gemini misuse) — Reports state-backed and criminal abuse of Gemini for cyber ops; Super Bowl week saw attack spikes. Underscores urgent need for AI-aware defenses and governance.
- Anthropic + Palantir in U.S. Operation (report) — WSJ reports Claude supported a classified mission tied to Venezuela. Raises ethical questions and contract risks, spotlighting AI’s growing role in geopolitics.
- Hugging Face — Rejects a reported $500M Nvidia investment to preserve open-source neutrality. Reinforces vendor-agnostic trust for developers and enterprises building on shared AI infrastructure.
📚 Tutorials & Guides
- Model Families Primer (13 types) — Clear, practical breakdown of core AI architectures. Helps teams choose the right approach for retrieval, generation, planning, or multimodal tasks.
- Udemy: LangChain in Production — Hands-on course covering agents, orchestration, and reliability. Bridges the gap between demos and robust, monitored deployments.
- Minimalist LLM Implementations — Hundreds of lines demystify modern LLM components. Great for learners seeking intuition without heavyweight frameworks.
- Faceted Vector Search Deep Dive — Practical guide to filtering and exploration beyond cosine similarity. Improves discovery, personalization, and analytics in vector databases.
🎬 Showcases & Demos
- “Infinite Frames” (AI short film) — End-to-end AI production pays homage to cinematic masters. Demonstrates rapidly improving creative pipelines and accessible storytelling tools.
- ByteDance Seedance 2.0 (video gen) — “Prompt-to-Hollywood” quality demos show consumer-grade, high-fidelity video creation. Raises both creative possibilities and major IP questions.
- LangChain Agent Skills (live builds) — Instant multi-agent app creation from prompts. Highlights how promptable capabilities can ship to production-grade graphs with observability.
- Code Arena (image→live React) — Real-time demos turn screenshots into functional sites. Speeds landing pages, prototypes, and design iteration.
- Klarna AI Assistant — LangGraph/LangSmith-powered agent now serves 85M users, cutting resolution time 80%. A concrete case study for enterprise-scale, measurable agent ROI.
- LLM-Council Debates — Panel-style model comparisons surface strengths and weaknesses interactively, helping teams select the right model per task.
💡 Discussions & Ideas
- From competence to creativity — Observers see progress from grade-school math to research-level insights, including early AI-assisted advances in theoretical physics—hinting at genuine new knowledge generation.
- Hybrid AI is pragmatic — Splitting workloads across device and cloud is emerging best practice. Enterprises report shifting inference to in-house GPUs with open models to cut costs and vendor risk.
- Agent rigor and security — Calls grow for reproducible agent evaluations and identity-layer safeguards. Rapid shipping exposes session hijacking, prompt injection, and approval-loop gaps.
- Governance and geopolitics — Debates weigh U.S. stewardship, open-model contributions, and China’s fast industrial adoption. ICML’s stealth prompt injections spotlight tensions in research integrity and reviewer tooling.
- Rethinking “AGI” — Yann LeCun advocates world models over AGI slogans; others frame AGI as a learning process. Jensen Huang urges seriousness about real-world AI risk as memes dissect mission drift.
Source Credits
Curated from 250+ RSS feeds, Twitter expert lists, Reddit, and Hacker News.