Inference is just another distributed system

We're a team of experienced researchers and software engineers: we've published academic papers, maintained open-source projects, and held the pager for critical services at hyperscalers. We first met at Twitter, where the infrastructure we built saved the company more than $100M over five years.

Operating inference has much in common with operating traditional services: balancing latency against load, understanding the limits of the hardware, and monitoring the system for unexpected changes. We help you run the best inference stack in your own cloud.

Yao Yue

Mihir Nanavati

Yuri Vishnevsky

Xi Yang

Brian Martin

Sean Lynch

Backed by