Google DeepMind unveils Gemini Fusion 2 — a model optimized for multimodal prototyping

Gemini Fusion 2 targets design-centric workflows by combining an image-aware encoder with a lightweight solver that performs layout manipulations in under a second. The model supports multi-frame UI changes, context-aware color swaps, and copy-aware text flows, a combination suited to iterative prototyping.

Google showcased integrations with Material Design tooling and an API that returns both the edited assets and a change-set for design tokens. By caching at the model level, Fusion 2 reduces the cost of repeated inference during rapid iteration, and it supports offline bake-in for on-premises deployments.
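
Google has not published the format of that change-set, so the following is only a rough sketch of how a design team might consume one. It assumes a hypothetical change-set shape (a list of `set` and `remove` operations addressed by dotted token paths) and a hypothetical helper, `apply_change_set`; neither comes from the announcement.

```python
from copy import deepcopy


def apply_change_set(tokens, change_set):
    """Return a new token tree with each change applied in order.

    Both the token tree and the change-set layout are illustrative
    assumptions, not the documented Fusion 2 response format.
    """
    result = deepcopy(tokens)  # leave the caller's tokens untouched
    for change in change_set:
        *parents, leaf = change["path"].split(".")
        node = result
        for key in parents:
            # Create intermediate groups when a path is new.
            node = node.setdefault(key, {})
        if change["op"] == "set":
            node[leaf] = change["value"]
        elif change["op"] == "remove":
            node.pop(leaf, None)
        else:
            raise ValueError(f"unknown op: {change['op']}")
    return result


# Hypothetical tokens and change-set, for illustration only.
tokens = {
    "color": {"primary": "#1a73e8", "accent": "#fbbc04"},
    "spacing": {"md": 16},
}
change_set = [
    {"op": "set", "path": "color.primary", "value": "#0b57d0"},
    {"op": "set", "path": "type.body.size", "value": 14},
    {"op": "remove", "path": "color.accent"},
]

updated = apply_change_set(tokens, change_set)
```

Returning a copy rather than mutating the input keeps the previous token set available, which makes it straightforward to diff or roll back an edit during rapid iteration.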

Designers at the announcement praised the model’s ability to maintain typographic rhythm across edits, but some noted that it initially struggled with complex animations. Google said that training on animation datasets and adding tighter UX controls are its next priorities.

The model will be available to enterprise customers through Vertex AI, with limited public testing later this summer.