INAI • The Open AI Hub

📰 AI News Daily — 29 Dec 2025

TL;DR (Top 5 Highlights)

OpenAI is hiring a $555K Head of Preparedness, signaling tougher safety oversight amid legal pressures and intense competition.
Meta released the RPG dataset (22,000 tasks) on Hugging Face to accelerate “AI co‑scientist” systems with built‑in evaluations.
Google deepens Gemini’s reach: conversational photo scouting in Maps, a Gemini overlay replacing Assistant on Android by 2026, and code hints of paid Chrome AI tiers.
GLM‑4.7 leads open‑source rankings as world models surge; local inference speeds up on Apple silicon via MLX.
Samsung unveils its first 2nm chip; China proposes stricter AI rules—hardware and policy are reshaping AI’s next phase.

🛠️ New Tools

LangChain: An interactive LLM inference visualizer shows token flows and context effects in real time, helping developers debug prompts and improve reliability during production deployments.
A3B & 8B model weights: Fresh checkpoints on Hugging Face enable hands‑on experimentation with recent training runs, speeding community benchmarking and fine‑tuning efforts.
Google Gemini in Maps: Conversational location scouting for photographers suggests vantage points, lighting, and crowd levels—turning trip planning into a creative co‑pilot for better shots.
World App (Tools for Humanity, co‑founded by OpenAI CEO Sam Altman): A biometric super app combining identity, encrypted messaging, and crypto payments aims to fight deepfakes and fraud while simplifying user verification.
1inch Network + SavantChat: AI‑powered audits for DeFi transactions promise faster threat detection and lower costs, improving trust and security across decentralized finance ecosystems.
YAKSH (Uttar Pradesh Police): An AI‑enhanced app adds facial recognition, voice search, and gang analytics to identify suspects and track crime networks, modernizing law enforcement workflows.

🤖 LLM Updates

GLM‑4.7: Tops independent rankings of open‑source models, reinforcing open‑source momentum and offering strong baseline performance for coding, reasoning, and agentic tasks without closed‑source dependencies.
2025 World Models: LeJEPA, Dreamer 4, Genie 3, Cosmos WFM 2.5, and Code World Model highlight a shift toward models that reason over time and control environments, not just predict tokens.
Qwen3: “Attention sink” analysis shows specialized handling of key tokens, informing better context usage, prompt design, and efficiency strategies for long‑context applications.
MiniMax‑M2.1 on Apple M3 Ultra (MLX): Strong local inference performance underscores rapid on‑device gains, reducing latency, cost, and privacy risks for desktop‑class deployments.
vLLM (MLX backend teased): Native Apple silicon acceleration promises faster throughput and lower memory overhead for Mac‑based inference, aiding local development and testing.
Google Gemini on Android: Gemini’s overlay will replace Assistant by 2026, enabling uninterrupted multitasking and richer, persistent AI sessions that blend on‑device and cloud capabilities.

📑 Research & Papers

Meta RPG Dataset: 22,000 structured research tasks with rubrics and references aim to speed “AI co‑scientist” development, enabling reproducible evaluation of reasoning and tool use.
Egocentric2Embodiment: Converts first‑person videos into structured Q&A to bridge perception and physical intelligence, improving grounding for robots and embodied agents.
Video Zero‑Shot Transfer: New video models demonstrate strong task transfer without retraining, hinting at a step‑change for vision systems in robotics, navigation, and surveillance.
CuTe DSL Kernel: A compact kernel built on a thread‑value (TV) layout outperforms PyTorch’s RMSNorm on B200 GPUs, showing that targeted kernel engineering can deliver outsized performance and cost efficiency.
SonicMoE: IO‑aware and tile‑aware optimizations streamline Mixture‑of‑Experts throughput, improving expert routing efficiency and lowering inference latency for scaled deployments.
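For readers who want to see the operation behind the RMSNorm benchmark above, here is a minimal pure‑Python reference sketch. It is not the CuTe DSL kernel or PyTorch's implementation; the function name `rms_norm` and the default `eps` are assumptions made for illustration. It only shows the arithmetic that a tuned GPU kernel has to reproduce.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Reference RMSNorm for one vector: x / sqrt(mean(x^2) + eps) * weight.

    `eps` is an assumed default; real libraries choose their own value.
    """
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    # Mean of squares over the feature dimension (no mean subtraction,
    # which is what distinguishes RMSNorm from LayerNorm).
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    # Scale each element by the inverse RMS and its learned gain.
    return [v * inv_rms * w for v, w in zip(x, weight)]


if __name__ == "__main__":
    print(rms_norm([3.0, 4.0], [1.0, 1.0]))
```

The work is one reduction followed by an elementwise scale, so the cost on a GPU is dominated by memory traffic rather than arithmetic. That is why data layout is a natural place to look for speedups.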

🏢 Industry & Policy

OpenAI (Head of Preparedness): A $555K role to lead risk mitigation across cybersecurity, misuse, and mental health reflects escalating safety expectations and pre‑regulatory alignment.
Nvidia: Jonathan Ross becomes Chief Software Architect, signaling a deeper push into advanced AI software stacks alongside hardware leadership to sustain its competitive moat and developer ecosystem.
Samsung: First 2nm chip targets next‑gen performance and efficiency, strengthening mobile and edge AI capabilities and tightening the hardware‑AI co‑design feedback loop.
China Draft AI Rules: New proposals require algorithm checks, safety protections, and strict content limits—raising compliance costs while pushing standardization and accountability.
Google Chrome (Paid AI): Code hints at Google AI Pro/Ultra tiers for premium features like agentic browsing and summarization, foreshadowing browser monetization in 2026 and beyond.
AI “Slop” Concern: Over half of English web content may be AI‑generated, driving demand for higher‑quality, user‑first design and better detection to restore trust.

📚 Tutorials & Guides

Policy Optimization Beyond PPO: A technical review of GRPO, Dr. GRPO, GSPO, DAPO, and variants helps researchers modernize RL pipelines for stability, sample efficiency, and safer exploration.
Latent Space Year‑End: Recaps on OpenAI’s Codex and GPT‑5 expectations plus interviews on the Agentic Web provide pragmatic guidance for building AI‑native software teams and products.
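As a companion to the policy‑optimization review above, the core idea shared by the GRPO family can be sketched in a few lines: instead of a learned value baseline as in PPO, each sampled completion is scored against the other completions for the same prompt. This is a simplified illustration, not code from the review. The function name, the use of population standard deviation, and the `eps` value are assumptions, and the clipped policy loss itself is omitted.

```python
import statistics


def group_relative_advantages(rewards, normalize_std=True, eps=1e-8):
    """Advantages for one prompt's group of sampled completions.

    normalize_std=True  -> (r - mean) / (std + eps), the GRPO-style form.
    normalize_std=False -> (r - mean) only, the direction Dr. GRPO takes
                           by dropping the std division.
    Population std and eps are assumptions for this sketch.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    mean = statistics.fmean(rewards)
    centered = [r - mean for r in rewards]
    if not normalize_std:
        return centered
    std = statistics.pstdev(rewards)
    # If every reward is identical, centered values are all 0, so the
    # result stays 0 instead of dividing by zero.
    return [c / (std + eps) for c in centered]


if __name__ == "__main__":
    # Four completions for one prompt, scored 1 (correct) or 0 (incorrect).
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

When all completions in a group receive the same reward, every advantage is zero and the prompt contributes no gradient signal. Variants in this family handle that case in different ways.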

🎬 Showcases & Demos

AGI Documentary: A behind‑the‑scenes feature drew massive viewership, illustrating public appetite for transparent narratives about frontier AI research and its societal stakes.
Diesol’s “The Cleaner” (Rome): Long‑form AI cinema with an Emmy‑winning original score showcases maturing production pipelines and creative control for narrative‑driven generative filmmaking.
DJ Reachy (Robotics + Music): Real‑time music generation and synchronized dance, released open‑source, demonstrate playful human‑robot collaboration and reproducible creative robotics.
Reachy Mini: A compact tabletop robot earns praise as an approachable maker platform, lowering the barrier to entry for hands‑on robotics experimentation.
Kling 2.6: Improved motion precision and stability in animation control push video generation toward production readiness for advertising, entertainment, and design workflows.

💡 Discussions & Ideas

Memory Systems for Agents: Analyses highlight how storage and retrieval design shape agent reasoning, suggesting vector databases and episodic memory as core architecture choices.
ARC Prize Takeaways: Discipline and non‑LLM methods can win hard benchmarks, reminding teams to combine symbolic tooling and search with LLMs for robust problem solving.
Agentic Workflows: Engineers report large coding productivity gains but expect testing and verification to become critical new roles as orchestration replaces expert‑only execution.
Claude Code as “Agent”: Rapid adoption hints at mainstream agent experiences inside familiar coding tools, emphasizing UX and reliability over raw model horsepower.
Policy Fragmentation: State‑level patchworks risk chilling innovation, and calls are growing for a uniform federal framework. Separately, tax proposals on unrealized gains raise incentive concerns.
From Fringe to Core: Historical shifts in neural nets inform leaders’ predictions that 2026 rewards production‑grade results over demos; Andrej Karpathy expects a pivot to logical “ghost intelligence.”
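To make the agent‑memory item above concrete, here is a toy sketch of the retrieval step that a vector‑backed episodic memory performs: store (embedding, text) pairs and return the entries closest to a query by cosine similarity. The class and method names are hypothetical, and the embeddings are hand‑written vectors. A real system would likely use an embedding model and a vector database in place of the in‑memory list.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class EpisodicMemory:
    """Hypothetical in-memory store; stands in for a vector database."""

    def __init__(self):
        self._items = []  # list of (embedding, text) pairs

    def add(self, embedding, text):
        self._items.append((list(embedding), text))

    def search(self, query, k=3):
        # Brute-force scan: rank every stored item by similarity to the query.
        ranked = sorted(
            self._items,
            key=lambda item: cosine(query, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


if __name__ == "__main__":
    memory = EpisodicMemory()
    memory.add([1.0, 0.0], "user prefers Python")
    memory.add([0.0, 1.0], "deploy target is Kubernetes")
    memory.add([0.9, 0.1], "user dislikes Java")
    print(memory.search([1.0, 0.0], k=2))
```

The design choices the discussion points to sit around this loop: what gets written to memory, how it is embedded, and how many results are fed back into the agent's context.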

Source Credits

Curated from 250+ RSS feeds, Twitter expert lists, Reddit, and Hacker News.