📰 AI News Daily — 03 Oct 2025

TL;DR (Top 5 Highlights)

OpenAI hit a $500B valuation, inked mega chip alliances, and planned new South Korea data centers—supercharging the AI infrastructure race and roiling tech markets.
OpenAI launched Sora 2 and an invite-only social app, igniting viral AI video creation while raising deepfake, copyright, and platform policy concerns.
IBM returned to open models with Granite 4.0, a hybrid Mamba/Transformer family focused on efficient, controllable local deployment across 3B–32B parameters.
Google scaled up: it now processes nearly a quadrillion tokens per month, launched Live Search on mobile, and brought Gemini 2.5 Flash image generation to general availability.
Meta will target ads using users’ AI interactions, sharpening campaign relevance while intensifying privacy, transparency, and UX debates.

Airweave launched an open-source, bi-temporal knowledge base for agents, letting developers reason over live data from 30+ sources. It enables timely, auditable context for dynamic decision-making.
Microsoft released the open-source Agent Framework for Python and .NET, simplifying AI agent creation with built-in interoperability and observability. Enterprises gain a clearer path to production agents.
Perplexity rolled out the free global Comet Browser and introduced Comet Plus with premium news. It brings credible, curated sources into AI answers, improving trust and depth.
Thinking Machines’ Tinker offers a simple API for distributed LoRA fine-tuning of open-weight LLMs. Teams can customize models quickly without heavyweight infrastructure.
NVIDIA unveiled open models and simulation tools on Omniverse for robotics, accelerating development of adaptable, physically grounded robots across industries.
Mesh introduced an AI Wallet enabling autonomous, secure crypto purchases via stablecoins across 300+ services, foreshadowing agent-driven finance and automated commerce.
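
For readers unfamiliar with the term "bi-temporal" used in the Airweave item above: each record carries two timestamps, one for when the fact was true in the world and one for when the system learned it. The sketch below is a minimal illustration of that idea only. It is not Airweave's API; the `Fact` type, the `as_of` helper, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Fact:
    """Hypothetical bi-temporal record (not Airweave's schema)."""
    key: str
    value: str
    valid_from: int            # when the fact became true in the world
    valid_to: Optional[int]    # when it stopped being true (None = still true)
    recorded_at: int           # when the system learned about it


def as_of(facts: List[Fact], key: str, valid_time: int, known_at: int) -> Optional[str]:
    """Return the value for `key` that held at `valid_time`,
    using only what had been recorded by `known_at`."""
    candidates = [
        f for f in facts
        if f.key == key
        and f.recorded_at <= known_at
        and f.valid_from <= valid_time
        and (f.valid_to is None or valid_time < f.valid_to)
    ]
    if not candidates:
        return None
    # Among matching records, the most recently recorded one wins.
    return max(candidates, key=lambda f: f.recorded_at).value


# Illustrative data: a plan change took effect at t=3 but was only synced at t=5.
facts = [
    Fact("acct-42/plan", "free", valid_from=1, valid_to=None, recorded_at=1),
    Fact("acct-42/plan", "pro", valid_from=3, valid_to=None, recorded_at=5),
]

before_sync = as_of(facts, "acct-42/plan", valid_time=4, known_at=4)
after_sync = as_of(facts, "acct-42/plan", valid_time=4, known_at=6)
```

Keeping both timestamps is what makes the context auditable: an agent's decision at t=4 can be replayed against exactly what was known at t=4, even after later corrections arrive.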

IBM returned to open models with Granite 4.0, whose hybrid Mamba/Transformer design cuts memory use while improving instruction following and tool use across 3B–32B sizes, including for local deployment.
Qwen3 VL 235B delivered strong vision-language results at lower cost, offering enterprise-friendly multimodal capabilities for visual understanding and grounded reasoning.
Claude 4.5 showed broad industry traction with improved reliability and safety, reinforcing Anthropic’s positioning in regulated sectors like finance and healthcare.
vLLM 0.10.2 added encoder-only support via Transformers backend, expanded cudagraph coverage, and integrations for Qwen3-Next and InternVL 3.5—boosting serving speed and model breadth.
Gemini 2.5 Flash (image) reached GA with 10 aspect ratios, image-only output, and multi-image blending via AI Studio and API, improving creative control and production readiness.
Sora 2 expanded access and video fidelity, plus a social creation app for sharing/remixes. It advances creative tooling but heightens platform moderation and provenance challenges.

OpenMoE 2 demonstrated expert-choice sparse diffusion LMs with perfect load balancing, delivering ~20% throughput gains. It points to more efficient scaling and cheaper inference.
Low-rank tuning advances: rank-1 LoRA matched full fine-tuning on select tasks with far less VRAM, enabling affordable domain adaptation on commodity hardware.
Training strategy matters: adding reasoning data early in pretraining produced durable gains that were hard to recover later—guiding future data curricula.
Recurrent transformers improved brain-representation fidelity and downstream NLP performance, hinting at biologically inspired architectures that also deliver practical gains.
Vision–generative bridges: visual encoders can act as tokenizers for diffusion models, and the MingTok tokenizer unified vision and language without vector quantization—simplifying multimodal stacks.
Healthcare frontiers: an AI model predicted illnesses decades before symptoms using genetics and health data, spotlighting preventative medicine’s potential and the need for rigorous validation.
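
The "perfect load balancing" in the OpenMoE 2 item above follows from how expert-choice routing works in general: instead of each token picking its favorite experts, each expert picks a fixed number of tokens. The sketch below illustrates that generic idea in plain Python. It is not OpenMoE 2's implementation; the function name, the random scores, and the capacity value are assumptions for illustration.

```python
import random
from typing import Dict, List


def expert_choice_route(scores: List[List[float]], capacity: int) -> Dict[int, List[int]]:
    """scores[t][e] is the router affinity of token t for expert e.
    Each expert selects its `capacity` highest-scoring tokens."""
    num_tokens = len(scores)
    num_experts = len(scores[0])
    assignment: Dict[int, List[int]] = {}
    for e in range(num_experts):
        # Rank all tokens by their affinity for this expert, best first.
        ranked = sorted(range(num_tokens), key=lambda t: scores[t][e], reverse=True)
        assignment[e] = ranked[:capacity]
    return assignment


# Illustrative setup: 16 tokens, 4 experts, random affinities.
random.seed(0)
scores = [[random.random() for _ in range(4)] for _ in range(16)]
routing = expert_choice_route(scores, capacity=4)

# Every expert processes exactly `capacity` tokens, by construction.
loads = [len(tokens) for tokens in routing.values()]
```

The trade-off is on the token side: a given token may be picked by several experts or by none, which is why this scheme balances expert load but not per-token compute.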

OpenAI reached a $500B valuation, surpassing SpaceX, sparking market rallies and SaaS sell-offs. The milestone reinforces a compute super-cycle and intensifies competition across software categories.
OpenAI, Samsung, and SK Hynix partnered on the $500B Stargate AI data center plan and announced new South Korea data centers, positioning the country as a strategic AI infrastructure hub.
Japan’s Digital Agency adopted OpenAI tools to modernize public services, pairing productivity gains with security standards—an influential model for government AI adoption.
Meta will target ads using users’ AI interactions, promising higher relevance while raising fresh privacy, consent, and transparency questions in digital marketing.
Google reported that its infrastructure processes nearly a quadrillion tokens per month and launched Live Search on mobile, underscoring unprecedented scale in real-time information access.
GoDaddy introduced a cryptographically verified identity system for AI agents, laying groundwork for global standards in agent trust, provenance, and safety.

A concise guide showed how to write high-performance Blackwell multi-GPU matmul kernels in ~150 lines, detailing memory movement, tiling, and scheduling for near-peak throughput.
Curated research roundups covered SimpleFold for proteins, zero-shot video learners, MetaEmbed, multimodal reasoning (MMR1), and black-box amplification—useful shortcuts for staying current.
A practical primer mapped the fast-growing ecosystem of AI data tools, helping teams choose reliable pipelines for labeling, curation, and governance.
A LlamaIndex podcast explored open-source agent frameworks and enterprise architectures, translating research advances into production-grade patterns.

Sora remixes exploded across social media within days, blending viral formats with provenance debates—a preview of AI-native video culture.
Developers turned World Labs single-image scene generation into a playable FPS prototype, hinting at rapid world-building for games and virtual production.
Claude Code autonomously built an MCP server in minutes, illustrating accelerating software scaffolding and agentic development workflows.
Google DeepMind and designer Ross Lovegrove used Gemini and image generation to translate artistic vision into concept collections, bridging design and AI exploration.
OmniRetarget delivered interaction-preserving humanoid motion retargeting, simplifying RL tracking and enabling more lifelike, transferable robot behaviors.

Analysts say OpenAI’s consumer playbook—turning frontier models into viral products—has outpaced rivals, pushing platforms to reassess AI video bans and creator monetization.
Anthropic’s warm, human-centered branding reframed assistant identity, potentially boosting trust and long-term adoption versus purely utilitarian bot personas.
Training debates highlighted durable gains from early reasoning data and strong math improvements from RL, while scientific reasoning remains a tougher frontier.
A compute super-cycle looms: record token processing, faster software delivery, and surging AI-generated code volumes, offset by shadow AI risks and governance gaps.
Privacy tensions rose as Meta moved toward ad targeting based on AI interactions; experts urged clearer consent, user controls, and independent audits.
Debates over Sora’s “substance vs. spectacle,” the merits of specialized mini-agents, and the enduring “Bitter Lesson” framed the question of how best to harness fast-moving multimedia models.

Curated from 250+ RSS feeds, Twitter expert lists, Reddit, and Hacker News.