Latent Space’s roundup frames image composition as a major barrier now being tackled by layout-aware image models. Reve 2.0 emphasizes precise generation and editing with layouts, while Ideogram 4.0 uses bounding boxes tied to region descriptions. The issue also covers MAI-Thinking-1, Gemma 4 12B, open audio models, agent execution layers, and model-routing cost debates.
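To make "bounding boxes tied to region descriptions" concrete, here is a minimal sketch of how such a layout could be represented. It is an illustration only, not Ideogram's or Reve's actual request format: the Region class, its field names, the normalized top-left coordinate convention, and the to_pixels helper are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One layout region: a box plus the text describing what goes in it.

    Hypothetical structure. Coordinates are assumed to be normalized to
    [0, 1] with the origin at the top-left corner of the image.
    """
    x: float
    y: float
    width: float
    height: float
    description: str

    def __post_init__(self) -> None:
        # Reject boxes that are empty or that start outside the canvas.
        if self.x < 0 or self.y < 0 or self.width <= 0 or self.height <= 0:
            raise ValueError("region must have a non-negative origin and positive size")
        # Reject boxes that extend past the right or bottom edge.
        if self.x + self.width > 1 or self.y + self.height > 1:
            raise ValueError("region must fit inside the unit square")


def to_pixels(region: Region, image_width: int, image_height: int) -> tuple[int, int, int, int]:
    """Convert a normalized region to (left, top, right, bottom) pixel coordinates."""
    left = round(region.x * image_width)
    top = round(region.y * image_height)
    right = round((region.x + region.width) * image_width)
    bottom = round((region.y + region.height) * image_height)
    return (left, top, right, bottom)


# Example layout: two described regions on one canvas.
layout = [
    Region(0.0, 0.0, 1.0, 0.25, "headline text across the top"),
    Region(0.5, 0.25, 0.5, 0.5, "a red kite against a clear sky"),
]
```

Pairing each box with its own description is what lets a prompt say where something goes as well as what it is; how a given model actually consumes that pairing depends on its API.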
Vercel’s changelog indicates that Grok Imagine Video 1.5 has become available through AI Gateway. The public model page lists the preview model as xai/grok-imagine-video-1.5-preview and marks it primarily for image-to-video generation. Because the source text is unavailable, no concrete claims about quality, speed, audio, editing, or text-to-video improvements should be inferred.