INAI • The Open AI Hub

📰 AI News Daily — 15 Oct 2025

TL;DR (Top 5 Highlights)

OpenAI launched a cheaper GPT-5 web search API with domain filtering, intensifying AI search competition and enabling safer, high-precision vertical search experiences.
Google announced a gigawatt-scale AI hub in Visakhapatnam, positioning Vizag as India’s first “AI city” and accelerating cloud, jobs, and subsea connectivity.
OpenAI partnered with Broadcom on custom AI chips and a projected 10 GW infrastructure buildout, signaling a strategic shift away from sole dependence on NVIDIA.
Walmart rolled out instant AI-powered checkout nationwide and integrated ChatGPT shopping, raising the bar for retail personalization and frictionless reorders.
Alibaba’s Qwen3‑VL expanded across model sizes and platforms while AI video rivalry heated up, with Veo 3 and Sora 2 trading top spots for realism and versatility.

🛠️ New Tools

OpenAI Search API (GPT‑5) launched cheaper web search with domain filtering, enabling safer, verticalized retrieval for apps. Developers gain controllable sourcing and lower costs for production-grade RAG and search assistants.
NVIDIA DGX Spark (desktop) brings powerful local LLM inference to workstations, shrinking latency and cloud bills. It enables on-prem privacy while supporting cutting-edge multimodal and agent workloads.
Amazon Bedrock AgentCore (AWS) lets businesses deploy monitored AI agents across operations. Early adopters like Sony and Ericsson report real-world automation gains, de-risking agent rollouts at enterprise scale.
Microsoft MarkItDown converts PDFs, slides, and more into clean Markdown for LLM pipelines. It removes brittle parsing steps, improving retrieval quality and developer productivity in production RAG systems.
Nanonets OCR2 (3B VLM) upgrades OCR with visual reasoning, LaTeX, and multilingual support. Better document understanding reduces manual data entry and unlocks automation for invoices, forms, and scientific content.
Flint emerged with an autonomous, real-time website builder backed by $5M from Accel. It adapts content and layout on the fly, cutting design cycles for marketing and e-commerce teams.
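
For readers wondering what the domain filtering in the OpenAI Search API item amounts to in practice, here is a minimal client-side sketch. It is not the OpenAI API: the allowlist, the function names, and the result shape (a list of dicts with a "url" key) are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical allowlist for a vertical (health) search app.
ALLOWED_DOMAINS = frozenset({"nih.gov", "who.int"})


def is_allowed(url, allowed=ALLOWED_DOMAINS):
    """Return True if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    # Match on whole labels so "evilnih.gov" or "nih.gov.evil.com" do not pass.
    return any(host == d or host.endswith("." + d) for d in allowed)


def filter_results(results, allowed=ALLOWED_DOMAINS):
    """Keep only search results whose 'url' field points at an allowed domain."""
    return [r for r in results if is_allowed(r.get("url", ""), allowed)]


if __name__ == "__main__":
    sample = [
        {"url": "https://www.nih.gov/health"},
        {"url": "https://nih.gov.evil.com/phish"},
    ]
    print(filter_results(sample))
```

Matching on the parsed hostname rather than on a substring of the URL is the design choice that matters here, since substring checks are easy to spoof.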

🤖 LLM Updates

Qwen3‑VL now spans 4B to 235B parameters with Instruct and Thinking variants, with day‑one support in MLX‑VLM and heavy vLLM usage. It gives vision-language stacks a stronger open alternative.
Video models intensified competition: community benchmarks show Veo 3 and Sora 2 trading wins on realism, control, and fidelity. Creators gain more choice; production teams weigh licensing and brand safety.
ServiceNow 15B Multimodal (Together) arrives as an enterprise‑friendly model focused on structured tool use. It promises robust grounding for workflows across IT, HR, and customer operations.
KAIST KORMo‑10B launched a fully transparent Korean–English model, advancing non‑English AI research. Open training artifacts support reproducibility and localization beyond English‑centric benchmarks.
Google Gemini 3.0 (leak) hints at a near-term upgrade following 2.5 Pro. If confirmed, expect better reasoning and tooling—important for devs migrating assistants and agent workflows.
ChatGPT App Ecosystem added integrations with Spotify, Canva, and Slack, plus an in‑app store. It turns ChatGPT into a command hub, streamlining content, collaboration, and automation.

📑 Research & Papers

DiT360 advances panoramic image generation, improving global consistency and detail. It enables immersive content for VR, real estate, and robotics without stitching artifacts common in older methods.
Phalanx Attention proposes a faster alternative to sliding‑window attention, boosting throughput on long contexts. It reduces memory pressure, enabling extended reasoning and document understanding on modest hardware.
Representation Autoencoders for diffusion transformers aim to supersede VAEs, improving latent quality and training stability. Expect crisper generations and more controllable edits in image and video pipelines.
Targeted Model Retraining shows updating small parameter subsets preserves knowledge and cuts costs. Teams can ship domain improvements faster without catastrophic forgetting or full model retrains.
IIT Delhi study finds LLMs excel on tasks yet falter in scientific reasoning and safety inferences. Results underscore the need for human oversight and benchmark diversity in research settings.
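
As background for the Phalanx Attention item, the sketch below shows the sliding-window baseline it is compared against, not Phalanx Attention itself. It builds a causal sliding-window mask in plain Python and counts how many query-key pairs remain; the function names and the toy sizes are illustrative assumptions.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j.

    Each query sees itself and at most (window - 1) preceding positions.
    """
    return [[0 <= i - j < window for j in range(seq_len)] for i in range(seq_len)]


def attended_pairs(mask):
    """Count the query-key pairs that the mask allows."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    seq_len, window = 5, 2
    local = sliding_window_mask(seq_len, window)
    # A window as long as the sequence reduces to ordinary causal attention.
    full_causal = sliding_window_mask(seq_len, seq_len)
    print(attended_pairs(local), attended_pairs(full_causal))
```

The pair count grows roughly linearly with sequence length for a fixed window, versus quadratically for full causal attention, which is why windowed schemes ease memory pressure on long contexts.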

🏢 Industry & Policy

Google Vizag AI City: a 1‑GW data center and subsea gateway establish India’s largest AI hub, catalyzing cloud capacity, local jobs, and regional innovation ecosystems.
OpenAI x Broadcom: custom AI chips and a planned 10‑GW build push compute scale and efficiency, reducing reliance on NVIDIA and reshaping the AI hardware landscape.
Walmart x OpenAI: nationwide AI checkout and ChatGPT‑based shopping connect accounts for personalized reorders. It sets a new retail standard for convenience and cross‑channel engagement.
Japan vs. OpenAI (Sora 2): government requests compliance after anime‑style outputs trigger IP concerns. The dispute foreshadows tighter rules on cultural copyrights and model training.
Sweden’s free AI access gives millions generative AI tools, boosting digital literacy and equity. It offers a public–private blueprint that others can emulate for inclusive AI readiness.
Microsoft (UK) “Shadow AI” report finds that 71% of workers use unapproved AI. The finding highlights urgent needs for governance, audits, and sanctioned, secure enterprise alternatives.

📚 Tutorials & Guides

Embedding Model Selection Guide compares trade‑offs for stronger RAG, helping teams choose models by domain, latency, and multilingual needs to improve retrieval accuracy and user satisfaction.
Qwen3‑VL Cookbook (Alibaba) walks through OCR, object grounding, and vision‑language tasks. Practical examples accelerate adoption and benchmarking for teams exploring multimodal workflows.
“Thinking Tokens” Explainer demystifies allocated reasoning tokens and when extra compute pays off. It guides developers on cost–quality trade-offs for complex tasks and tool use.
Agent Security Walkthrough shows how to authenticate, authorize, and harden data‑fetching agents. Concrete patterns reduce prompt injection and overreach in production environments.
AI Video Roundup compares Sora 2, Veo 3, Runway, Pika, and Synthesia. Creators get clarity on quality, control, and licensing for commercial campaigns.

🎬 Showcases & Demos

Sora 2 workflows enable instant cloning and editable remixes for TikTok and Instagram, accelerating content iteration while raising attribution and disclosure expectations for brands.
“Baby dino” AI captivated viewers, suggesting that friendly design can shift public sentiment toward AI and offering cues for UX teams building approachable agents.
Community contests: prompt battles and Kling AI’s global challenge drew thousands of submissions, highlighting rapid skill growth and emergent norms in the AI creator economy.

💡 Discussions & Ideas

Prompt‑security reality check: researchers bypassed OpenAI Guardrails with simple injections, reinforcing the need for layered defenses, independent validators, and strict tool permissions.
Reward models miss 25%+ of preferences, suggesting evaluation gaps. Proposals like Spectrum Tuning aim to preserve capability diversity and align outputs with nuanced user intent.
Synthetic data risks: overuse can cause model collapse; careful mixing and audits are vital. RL and high‑quality datasets let small models beat larger peers on targeted tasks.
Agents need tool‑heavy, multi‑turn fine‑tuning, not just long chains of thought. RL improves tool use reliability and safety calibration under real constraints.
Resource allocation insights: balancing weights, KV cache, and compute can improve reasoning efficiency. Info‑theoretic tests probe genuine multi‑agent coordination beyond surface metrics.
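
To make the weights versus KV cache trade-off in the resource allocation item concrete, here is a back-of-the-envelope sketch using the standard KV cache size estimate for a transformer decoder. The model shape is a hypothetical example, not a figure from any of the discussions above.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Estimate KV cache size in bytes.

    The leading 2 accounts for storing both keys and values at every layer;
    bytes_per_elem=2 corresponds to fp16/bf16 storage.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


def weight_bytes(num_params, bytes_per_param=2):
    """Estimate the memory taken by model weights at a given precision."""
    return num_params * bytes_per_param


if __name__ == "__main__":
    # Hypothetical model: 32 layers, 8 KV heads of dimension 128, 8192-token context.
    cache = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=8192)
    print(cache / 2**30, "GiB of KV cache per sequence")
```

Because the cache scales linearly with both context length and batch size, long contexts or large batches can rival the weights themselves in memory use.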

Source Credits

Curated from 250+ RSS feeds, Twitter expert lists, Reddit, and Hacker News.