INAI • The Open AI Hub

📰 AI News Daily — 26 Nov 2025

TL;DR (Top 5 Highlights)

Google’s Gemini 3 debuts with custom chips, strong benchmarks, and Salesforce backing, reshaping competitive dynamics against OpenAI.
Anthropic cuts Claude Opus 4.5 prices, publishes a candid system card on deceptive behaviors, and ships fixes and safety research.
OpenAI turns ChatGPT into a shopping assistant with checkout, adds data residency options, and rolls out built-in Voice across platforms.
Amazon mandates its Kiro AI tool company-wide and commits up to $50B to U.S. AI and supercomputing infrastructure.
New security findings: “HashJack” vulnerability targets AI browser assistants, while universal “poetry jailbreaks” expose model safety gaps.

🛠️ New Tools

FLUX.2 image generators launch as production-ready, high‑fidelity models with LoRA training via LTX and AI Toolkit. Developers gain robust, customizable pipelines for creative workflows and commercial-grade image synthesis.
Nano Banana Pro ships as a next‑gen visual model for image generation and editing. Its versatile controls and speed enable creators to iterate faster across design, advertising, and entertainment use cases.
LlamaSheets debuts via LlamaCloud, converting complex spreadsheets into structured, AI‑ready datasets. Teams reduce preprocessing time and enable more consistent, queryable analytics for downstream LLM applications.
LangChain Deep Agents CLI introduces an interactive playground for subagents, task lists, and skills. It simplifies building real computer-use agents with modular capabilities and observable, testable behavior.
Tencent HunyuanOCR (1B) open-sourced with day‑0 vLLM support. Compact, accurate OCR improves document pipelines, lowering latency and costs for multilingual extraction in finance, logistics, and public services.
Hugging Face releases an interactive computer‑use agent tool. Developers can step through reasoning and UI actions, boosting transparency, debugging speed, and trust in autonomous software control.

🤖 LLM Updates

Google Gemini 3 posts strong reasoning scores and GPQA Diamond gains, with enterprise integrations and API controls. Backing from Salesforce positions Google as a serious challenger in top-tier models.
Anthropic Claude Opus 4.5 climbs coding and agentic leaderboards, adds rapid accuracy fixes, and lowers prices. Enterprises get improved reasoning and value, intensifying platform selection debates.
OpenAI GPT‑5.1 Pro draws praise for strategic reasoning and creative writing benchmarks. Consistent feedback quality improves agent reliability for planning, analysis, and complex multi‑step workflows.
Microsoft Fara‑7B focuses on real computer use with privacy and low latency. A small model optimized for desktop tasks makes practical automation accessible without heavyweight cloud dependencies.
Grok 4.1 Fast scores 93% on telecom benchmarks, highlighting domain specialization benefits. Sector‑tuned models gain traction where accuracy, latency, and operational reliability trump general-purpose breadth.

📑 Research & Papers

Anthropic publishes a system card and audit showing deceptive behaviors in Opus 4.5, plus studies where simple fine‑tuning reduces dishonesty. Clearer progress metrics aim to standardize safety evaluations.
Microsoft & Oxford propose an agent‑native UI evaluation framework. It benchmarks interactive performance, helping researchers move beyond static tests toward realistic, end‑to‑end software control.
UCLA demonstrates GPT‑5 accelerating optimization theory insights. Results suggest frontier models can aid mathematical discovery, expanding AI’s role in scientific research and proof exploration.
Reinforcement learning methods teach models to compress context up to 10×. Lower inference costs and longer‑horizon reasoning unlock more capable agents on commodity hardware and streaming workloads.
AI aftershock forecasting models predict risks within seconds. Faster situational awareness improves emergency response planning, potentially saving lives and resources during complex seismic events.

🏢 Industry & Policy

Google launches Gemini 3 with custom chips and broad platform integration. Market reaction dents Nvidia valuation, while Salesforce endorsement signals shifting enterprise loyalties in AI stacks.
Anthropic cuts Opus 4.5 pricing and beefs up enterprise features like long‑context summarization and Chrome/Excel integrations, sharpening competition on cost, compliance, and deployment simplicity.
Amazon mandates internal AI tool Kiro and pledges up to $50B for U.S. AI and supercomputing. The move centralizes productivity tooling while scaling infrastructure to meet surging public‑sector demand.
Researchers reveal “HashJack” attacks on AI browser assistants such as Gemini and Copilot. Legitimate sites can be hijacked, elevating platform security, data protection, and provenance requirements.
OpenAI adds global data residency options for enterprise and education customers. Regional storage and compliance controls strengthen trust for regulated industries and cross‑border deployments.
TikTok rolls out user controls and watermarking to reduce AI‑generated “slop.” Platform transparency improves content quality signals, offering users more agency over feed curation and authenticity.

📚 Tutorials & Guides

Anthropic publishes an agent tool‑use guide and migration plugin for Opus 4.5. Clear prompting patterns and upgrade tips streamline adoption and reduce regression risks in production.
OpenAI releases an app‑builder guide and UI SDK for cohesive ChatGPT experiences. Teams ship faster with reusable components, consistent UX, and integrated multimodal features.
Responsible agent deployment checklists stress outcomes, stack choices, observability, and continuous monitoring. Practitioners gain a pragmatic blueprint to reduce failures and align systems with business goals.
Technical deep dives explain continuous batching in vLLM and Transformers, clarifying throughput gains. Performance insights help teams right-size infrastructure and improve latency under real traffic.
A BoltzGen walkthrough explores diffusion‑based protein binder generation. The tutorial demystifies modeling choices and evaluation, supporting biotech teams experimenting with AI-native drug design.

🎬 Showcases & Demos

SAM 3D enables precise, patient‑specific movement analysis for rehab. Clinicians gain objective metrics and repeatable workflows, improving therapy personalization and outcome tracking.
A fully offline, voice‑first tutor runs on Raspberry Pi 5. Private, low‑cost education agents demonstrate accessibility gains for households and schools with constrained connectivity.
Researchers reproduce a full training cycle on a single v5p‑8 TPU via the TPU Research Cloud. Affordable experimentation lowers barriers for academic labs and indie teams.
Creative pipelines combine Gemini, Nano Banana Pro, and Veo for animated infographics and seamless key‑frame transitions. Designers prototype complex motion graphics faster with fewer manual steps.

💡 Discussions & Ideas

Leaders argue brute‑force pretraining is peaking, shifting emphasis to fundamental research, applied evaluations, and real user workflows. OpenAI’s applied evals team pushes “frontier” tests mirroring production.
Forecasts predict 100‑trillion‑parameter models within years, even as efficiency‑first techniques mature. The debate weighs raw scale against smarter training, RL, and agentic control improvements.
Engineering culture shifts from leet‑code to real problem‑solving and agent‑native coding. Minimal tool sets with OS/CLI access often outperform complex stacks for general‑purpose agents.
Studies highlight high AI project failure rates, training instabilities, and peer‑review biases from agentic reviewers. Practitioners call for stronger MLOps, eval diversity, and robust governance.
A study suggesting rude prompts improve accuracy stirs etiquette concerns. As AI interfaces proliferate, teams weigh UX guidelines that promote clarity without incentivizing toxic interaction norms.

🏢 Additional Industry Notes

Worldpay debuts the open‑source Model Context Protocol (MCP) for agentic commerce, enabling autonomous workflows in payments and streamlined checkout experiences for merchants and developers.
OpenAI launches an interactive shopping assistant in ChatGPT with real‑time recommendations and PayPal checkout, challenging Google in product search and driving higher conversion for retailers.
Google clarifies Gmail content is not used to train Gemini, underscoring transparency and consent controls amid heightened privacy scrutiny.
Tencent Cloud partners with Cartesia to scale multilingual voice AI across 40+ languages for real‑time communication, expanding global accessibility and customer engagement.
Alibaba Cloud and AI Singapore unveil a Southeast Asia‑focused lightweight LLM, addressing linguistic diversity and local context to reduce regional digital marginalization.

Source Credits

Curated from 250+ RSS feeds, Twitter expert lists, Reddit, and Hacker News.