Latest in AI

NVIDIA Blackwell Leads First Agentic AI Infrastructure Benchmark ★ 72
NVIDIA Blog · yesterday · Benchmark
NVIDIA reports that its GB300 NVL72 platform leads the first published AgentPerf results from Artificial Analysis, a benchmark designed for agentic AI infrastructure. The benchmark uses DeepSeek V4 Pro and coding-agent-style workloads with long sequences, simulated tool delays, and concurrency targets. NVIDIA attributes the gains to rack-scale Blackwell design, CUDA optimizations, and TensorRT LLM, claiming up to 20x more agents per megawatt than HGX H200.
How the UK Is Turning Sovereign AI Ambition Into Action With NVIDIA Technologies ★ 72
NVIDIA Blog · 6 days ago · Business
NVIDIA says the UK’s “AI maker” strategy is moving into deployment through domestic AI cloud infrastructure, Isambard-AI, and the Sovereign AI Fund. UK startups are using NVIDIA technologies for coding agents, self-improving AI, inference optimization, and biological foundation models. The post also covers NVIDIA’s UK startup investment, developer training, 6G collaboration, and enterprise AI projects moving from pilots into production.
Launch HN: General Instinct (YC P26) - Frontier models on edge devices
Hacker News (AI keywords) · 9 days ago · New Tool
General Instinct is a YC P26 company introduced through a Launch HN post. Its headline positioning is bringing frontier models to edge devices, suggesting local or embedded AI deployment rather than purely cloud-based inference. Since no article body is available, details such as supported models, hardware, benchmarks, pricing, and developer tooling cannot be verified from the provided source.
Qualcomm Unveils Dragonfly Data Center Brand for the Agentic AI Era
INSIDE 硬塞 AI · 13 days ago · Hardware
At Computex 2026, Qualcomm described AI agents as a major driver of cross-device hardware upgrades. The company unveiled Dragonfly, a new data center brand focused on inference computing. The announcement outlines a broader strategy spanning endpoint devices and cloud infrastructure, although the source does not provide specifications, performance figures, or deployment timelines.
After Nvidia’s $20B not-acqui-hire, AI chip startup Groq reportedly raising $650M
TechCrunch AI · 16 days ago · Hardware
TechCrunch cites Axios reporting that AI chipmaker Groq is seeking $650 million in internal funding. The company is reportedly pivoting from hardware toward AI inference, the stage at which trained models respond to prompts. The report follows Nvidia’s $20 billion not-acqui-hire and underscores continued investor attention on AI compute and inference infrastructure.
Xcena raises $135M betting AI’s bottleneck is memory, not compute
TechCrunch AI · 16 days ago · Hardware
South Korean chip startup Xcena raised a $135 million Series B at a $570 million valuation, bringing total funding to $185 million. The company argues AI inference is increasingly constrained by memory movement, not just GPU compute. Its prototype MX1 chip uses CXL to process data closer to DRAM, with Samsung foundry mass production planned by late 2026 and revenue targeted for 2027.
Protecting against inference theft
Vercel Changelog · 16 days ago · Commentary
Only the title is available, so specific Vercel product changes or implementation steps cannot be confirmed. The topic appears to focus on protecting AI inference resources from unauthorized access, abuse, or cost-draining traffic. For teams deploying AI apps, the practical takeaway is to treat inference endpoints as high-value backend assets requiring access control, monitoring, and abuse prevention.
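As an illustration of the access-control and abuse-prevention takeaway above, here is a minimal sketch of an API-key check combined with a per-key token-bucket rate limit. It is not Vercel's implementation, which the source does not describe. The names `TokenBucket`, `check_request`, and `VALID_KEYS`, and the capacity and refill values, are hypothetical and chosen only for the example.

```python
import time
from typing import Dict, Optional


class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float) -> None:
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.updated: Optional[float] = None  # timestamp of the last call

    def allow(self, now: Optional[float] = None, cost: float = 1.0) -> bool:
        now = time.monotonic() if now is None else now
        if self.updated is not None:
            # Refill in proportion to elapsed time, capped at capacity.
            elapsed = max(0.0, now - self.updated)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Hypothetical key store; a real deployment would look keys up in a database or secret manager.
VALID_KEYS = {"demo-key"}
_buckets: Dict[str, TokenBucket] = {}


def check_request(api_key: str, now: Optional[float] = None) -> int:
    """Return an HTTP-style status: 401 unknown key, 429 rate limited, 200 allowed."""
    if api_key not in VALID_KEYS:
        return 401
    if api_key not in _buckets:
        # Assumed limits: burst of 5 requests, then 1 request per second.
        _buckets[api_key] = TokenBucket(capacity=5, rate=1.0)
    return 200 if _buckets[api_key].allow(now=now) else 429
```

In this sketch the check would run before any model call, so rejected traffic never consumes inference capacity. The `now` parameter exists only to make the limiter testable with fixed timestamps.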
Has the hunt for AI compute uncovered the next Cerebras?
TechCrunch AI · 17 days ago · Hardware
TechCrunch reports that General Compute has raised a $15 million seed round at a $60 million post-money valuation to build an AI inference neocloud. The company is ordering $300 million of SambaNova SN50 chips, betting they can outperform GPUs and rival specialized chips for inference. The story frames inference speed, deployment flexibility, and lower power needs as key battlegrounds in AI infrastructure.
New AI Infra Decacorns: Fireworks, Baseten, and OpenRouter ★ 78
Latent Space · 18 days ago · Business
AI infrastructure startups Fireworks and Baseten have reportedly reached decacorn-level valuations (conventionally $10 billion or more), reflecting intense investor interest in developer-focused inference and deployment platforms. OpenRouter, the LLM API aggregator, is also described as growing rapidly. The funding wave points to capital shifting toward cost-effective, developer-friendly API and hosting platforms.
OpenRouter more than doubles valuation to $1.3B in a year
TechCrunch AI · 19 days ago · Business
OpenRouter, an AI gateway startup founded in 2023, raised a $113 million Series B led by CapitalG. The round reportedly values the company at about $1.3 billion post-money, more than doubling from its estimated $547 million valuation after its June 2025 Series A. The company says it now offers access to over 400 models, has 8 million global users, and processes 100 trillion tokens per month.
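A quick arithmetic check of the reported figures, as a sketch only: the inputs are the numbers quoted in the summary above, and the per-user average assumes usage is spread evenly across all 8 million users, which the source does not state.

```python
# Figures as reported in the summary above (USD, users, and tokens per month).
series_a_valuation = 547e6   # estimated valuation after the June 2025 Series A
series_b_valuation = 1.3e9   # reported post-money valuation after the Series B
users = 8e6
tokens_per_month = 100e12

# Ratio of the new valuation to the earlier one.
multiple = series_b_valuation / series_a_valuation

# Average monthly tokens per user, assuming an even spread (an assumption).
tokens_per_user = tokens_per_month / users

print(f"Valuation multiple: {multiple:.2f}x")
print(f"Average tokens per user per month: {tokens_per_user:,.0f}")
```

A multiple of roughly 2.4x is consistent with the "more than doubling" claim.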