NimbleAI launches Onyx, a 3.2B-parameter multimodal model optimized for edge devices

AI · 4 min read

NimbleAI today launched Onyx, a 3.2-billion-parameter multimodal model that supports text, image, and lightweight video understanding and is optimized for on-device inference across ARM and RISC-V platforms. The company emphasizes low-latency inference, model sparsity techniques, and a runtime that uses integer quantization to run within tight power budgets. Onyx arrives with an SDK that abstracts heterogeneous compute and includes prebuilt connectors for Android, Linux edge appliances, and a cloud fallback service.
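To give a sense of why integer quantization matters at this scale, here is a minimal Python sketch of generic symmetric 8-bit quantization, together with back-of-the-envelope arithmetic for the weight storage of a 3.2-billion-parameter model. NimbleAI has not published the details of Onyx's quantization scheme, so the function names (`quantize_int8`, `dequantize`, `weight_footprint_gb`) and the bit widths compared are illustrative assumptions, not the company's implementation.

```python
# Illustrative only: generic symmetric int8 quantization, not Onyx's actual scheme.

def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from integers and the shared scale."""
    return [q * scale for q in quantized]


def weight_footprint_gb(num_params, bits_per_weight):
    """Storage for weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]
    quantized, scale = quantize_int8(weights)
    print("quantized:", quantized)
    print("restored:", dequantize(quantized, scale))

    onyx_params = 3_200_000_000  # parameter count stated in the article
    for bits in (16, 8, 4):  # assumed bit widths, for comparison only
        print(f"{bits}-bit weights: {weight_footprint_gb(onyx_params, bits):.1f} GB")
```

Under these assumptions, the weights alone would take roughly 6.4 GB at 16 bits, 3.2 GB at 8 bits, and 1.6 GB at 4 bits. Activations, caches, and runtime overhead are not counted, so real on-device memory use would be higher.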

Alongside the product reveal, NimbleAI disclosed a $45 million Series B led by Frontier Ventures, with participation from a mix of semiconductor-focused funds and enterprise AI investors. The funds will be used to expand partnerships with silicon makers and to seed a certification program for edge-ready models and appliances. Existing enterprise customers in robotics and smart retail will be early adopters of Onyx’s offline language understanding and computer vision capabilities.

NimbleAI's positioning is clear: while large multimodal models dominate cloud compute, there is a growing market for compact multimodal models that preserve privacy and allow fast on-device inference. The company expects Onyx to be used in verticals where connectivity is limited or privacy is a regulatory concern, such as healthcare kiosks, drones, and industrial inspection.