NovaML unveils Condor: a 3.5B-parameter image-text model optimized for mobile inference

NovaML announced Condor early this morning, positioning the model as a bridge between cloud-grade multimodal intelligence and on-device efficiency. At 3.5 billion parameters, Condor uses what the company calls a novel sparsity-aware transformer, combined with aggressive quantization, to deliver image captioning and other image-text tasks at near-real-time speeds on modern mobile SoCs.
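
To see why quantization matters at this scale, a back-of-the-envelope estimate of weight storage helps. The sketch below uses only the 3.5 billion parameter count from the announcement. The bit widths (16, 8, and 4) are illustrative assumptions, since NovaML has not said which precision Condor ships with, and the helper names are hypothetical.

```python
# Rough weight-memory estimate for a 3.5B-parameter model at several precisions.
# The bit widths are illustrative assumptions, not NovaML's published settings.

PARAMS = 3_500_000_000  # parameter count stated in the announcement


def weight_bytes(num_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store the weights alone (no activations or caches)."""
    return num_params * bits_per_weight // 8


def footprint_table(num_params: int, bit_widths=(16, 8, 4)) -> dict:
    """Map each bit width to weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return {bits: weight_bytes(num_params, bits) / 1e9 for bits in bit_widths}


if __name__ == "__main__":
    for bits, gb in footprint_table(PARAMS).items():
        print(f"{bits:>2}-bit weights: {gb:.2f} GB")
```

Under these assumptions the weights alone come to about 7 GB at 16 bits, 3.5 GB at 8 bits, and 1.75 GB at 4 bits. Only the lower precisions leave realistic headroom in a phone's memory, and the figures exclude activations and runtime overhead.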

The company demonstrated Condor running directly on a Snapdragon 8 Gen 4 device with sub-200 ms latency for image captioning and live scene understanding. NovaML says it achieved that performance by co-designing kernel-level optimizations, a new sparse attention routine, and a compact vision encoder tuned to mobile camera pipelines.

Alongside the launch, NovaML disclosed a $40 million Series B led by Lightspeed Ventures with participation from Sequoia Capital and existing backers. The funding will go toward expanding the SDK, building native integrations for iOS and Android apps, and ramping partnerships with OEMs to preinstall Condor capabilities.

For designers and mobile developers, NovaML released a developer kit with example flows for AR caption overlays, zero-shot semantic search, and on-device accessibility tooling. The company says Condor will be available for enterprise licensing next quarter, and it expects to release a smaller 1B-parameter variant for embedded IoT devices later this year.