CortexCloud raises $120M to build LLM hosting optimized for multimodal agents

CortexCloud, an AI infrastructure provider, has announced a $120 million Series D round to expand its hosted LLM services for enterprise multimodal agents. The platform offers model orchestration, memory management, and latency SLAs designed for chained agent workflows that combine text, image, and external tool calls.

CortexCloud’s pitch is predictable performance under complex call graphs: routing requests between models, caching intermediate reasoning steps, and supporting deterministic replay for auditability. The company says the funding will go toward expanding edge capacity, adding hardware-accelerated vector stores, and building compliance features for regulated industries.
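
CortexCloud has not published implementation details, so the following is only a minimal sketch of how caching intermediate steps and replaying them for an audit could work in principle. The `StepRecorder` class and its `run` and `replay` methods are hypothetical names invented for illustration, not part of any CortexCloud API. The sketch assumes each step's input is JSON-serializable and that the same input should always map to the same output.

```python
import hashlib
import json
from typing import Any, Callable


class StepRecorder:
    """Hypothetical helper: caches step outputs by content hash and logs them."""

    def __init__(self) -> None:
        self.cache: dict[str, Any] = {}
        self.trace: list[dict[str, Any]] = []

    @staticmethod
    def _key(step: str, payload: dict[str, Any]) -> str:
        # Canonical JSON (sorted keys) so equal inputs always hash to the same key.
        blob = json.dumps({"step": step, "payload": payload}, sort_keys=True)
        return hashlib.sha256(blob.encode("utf-8")).hexdigest()

    def run(self, step: str, payload: dict[str, Any],
            fn: Callable[[dict[str, Any]], Any]) -> Any:
        """Run one step, reusing the cached output when the input was seen before."""
        key = self._key(step, payload)
        hit = key in self.cache
        if not hit:
            # Only call the (possibly expensive) model or tool on a cache miss.
            self.cache[key] = fn(payload)
        # Every call is appended to the trace, whether or not it hit the cache.
        self.trace.append(
            {"step": step, "key": key, "cached": hit, "output": self.cache[key]}
        )
        return self.cache[key]

    def replay(self) -> list[Any]:
        """Return the recorded outputs in call order without invoking any model."""
        return [entry["output"] for entry in self.trace]
```

In this sketch, an auditor could call `replay()` to see exactly what each step returned during the original session, since the outputs come from the recorded trace rather than from a fresh model call. A production system would also need to handle cache expiry, non-deterministic tools, and storage of the trace, none of which is covered here.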

The company highlighted customer wins in financial services and clinical decision support, where predictable inference latency and traceable reasoning are non-negotiable. CortexCloud also plans to add developer tools for simulating load across agent topologies and estimating cost per session.