Google expands Gemini API with multimodal fine-tuning and latency tiers
AI · 5 min read
Gemini's new multimodal fine-tuning lets teams upload paired datasets of images, audio, and structured text to build models tuned for domain-specific tasks such as visual search or guided design generation. Fine-tuned endpoints can be provisioned privately and secured under organization-level IAM policies.
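To make the idea of a "paired dataset" concrete, here is a minimal Python sketch of how a team might assemble one as a JSONL manifest before upload. The record layout (`image_uri`, `audio_uri`, `text`, `target`) and the bucket paths are illustrative assumptions, not the schema Google documents for Gemini fine-tuning.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, Optional


@dataclass
class PairedExample:
    """One training record. All field names here are hypothetical."""
    text: str                        # structured text or prompt for the example
    target: str                      # desired model output
    image_uri: Optional[str] = None  # e.g. a storage path to an image
    audio_uri: Optional[str] = None  # e.g. a storage path to an audio clip


def to_jsonl(examples: Iterable[PairedExample]) -> str:
    """Serialize examples to JSONL, requiring at least one media input each."""
    lines = []
    for index, example in enumerate(examples):
        if not (example.image_uri or example.audio_uri):
            raise ValueError(f"example {index} has no image or audio input")
        lines.append(json.dumps(asdict(example), sort_keys=True))
    return "\n".join(lines) + "\n"


examples = [
    PairedExample(
        text="Find similar chairs",
        target="mid-century oak armchair",
        image_uri="gs://example-bucket/chair.png",
    ),
    PairedExample(
        text="Transcribe the design brief",
        target="A compact desk lamp with a matte finish",
        audio_uri="gs://example-bucket/brief.wav",
    ),
]
manifest = to_jsonl(examples)
```

Validating each record before upload catches text-only rows early, which matters when the goal is a model that learns from the pairing itself.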
To address production needs, Google introduced latency tiers backed by SLAs, along with replenishable compute reservations, so developers can pick a predictable performance profile for interactive apps. Pricing and quota controls were also updated to keep performance consistent during traffic peaks.
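Choosing a tier comes down to matching an app's latency budget against cost. The sketch below shows one way to encode that decision in Python; the tier names, latency targets, and cost multipliers are invented for illustration and do not reflect Google's actual tiers or pricing.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LatencyTier:
    """A hypothetical latency tier; every value below is a placeholder."""
    name: str
    p95_latency_ms: int   # assumed latency target for the tier
    relative_cost: float  # assumed cost multiplier versus the cheapest tier


# Illustrative tiers only, not published figures.
TIERS = (
    LatencyTier("batch", 5000, 1.0),
    LatencyTier("standard", 1200, 1.5),
    LatencyTier("interactive", 300, 3.0),
)


def pick_tier(budget_ms: int, tiers: Sequence[LatencyTier] = TIERS) -> LatencyTier:
    """Return the cheapest tier whose latency target fits within the budget."""
    eligible = [tier for tier in tiers if tier.p95_latency_ms <= budget_ms]
    if not eligible:
        raise ValueError(f"no tier meets a {budget_ms} ms latency budget")
    return min(eligible, key=lambda tier: tier.relative_cost)


chat_tier = pick_tier(400)      # a tight budget for an interactive UI
report_tier = pick_tier(10000)  # a loose budget for offline processing
```

With these placeholder numbers, the interactive UI lands on the fastest tier while the offline job falls back to the cheapest one, which is the trade-off the tiers are meant to expose.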
Additional SDKs for web and mobile include safety filters and a moderation pipeline that can be customized per endpoint. Google also published best-practice patterns for integrating multimodal outputs into UX while preserving clarity and editability for users.
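Per-endpoint customization of moderation can be pictured as a set of default thresholds with endpoint-specific overrides. The following Python sketch assumes a simple score-and-threshold model; the category names, endpoint name, and threshold values are hypothetical and are not taken from Google's SDKs.

```python
from typing import Dict, List

# Hypothetical defaults: block a category when its score reaches the threshold.
DEFAULT_THRESHOLDS: Dict[str, float] = {"harassment": 0.5, "violence": 0.5}

# Hypothetical per-endpoint overrides, e.g. a design tool that tolerates
# more depictions of violence in concept art.
ENDPOINT_OVERRIDES: Dict[str, Dict[str, float]] = {
    "design-assistant": {"violence": 0.8},
}


def blocked_categories(endpoint: str, scores: Dict[str, float]) -> List[str]:
    """Return the categories whose scores meet or exceed the endpoint's thresholds."""
    thresholds = {**DEFAULT_THRESHOLDS, **ENDPOINT_OVERRIDES.get(endpoint, {})}
    return sorted(
        category
        for category, limit in thresholds.items()
        if scores.get(category, 0.0) >= limit
    )


scores = {"harassment": 0.7, "violence": 0.6}
design_blocked = blocked_categories("design-assistant", scores)
default_blocked = blocked_categories("support-chat", scores)
```

Merging overrides on top of defaults means an endpoint only has to declare where it differs, so new endpoints start from the stricter baseline.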