NimbusOS launches lightweight on-device LLM runtime for IoT and mobile

NimbusOS has unveiled a lightweight runtime for running quantized LLMs on-device across ARM and RISC-V hardware. The runtime offers hardware-aware scheduling, memory-efficient parameter sharding, and a developer CLI that packages models into secure sandboxes for privacy-sensitive applications.

Alongside the launch, NimbusOS closed a $30 million seed round led by EdgeScale Capital to build tooling for model compilation and security certification. The funding will accelerate partnerships with semiconductor makers and expand the company’s support for private inference use cases such as offline assistants and industrial controllers.

The company highlights use cases in retail, automotive, and medical devices, where running inference on the device reduces latency and improves privacy. NimbusOS also announced an open benchmark for measuring model latency and energy consumption on constrained hardware.