Inference is just another distributed system
We're a team of experienced researchers and software engineers: we've published academic papers, maintained open-source projects, and held the pager for critical services at hyperscalers. We first met at Twitter, where the infrastructure we built saved the company more than $100M over five years.
Operating inference shares much with operating traditional services: balancing latency against load, understanding the limits of the hardware, and monitoring the system for unexpected changes. We help you run the best inference stack in your own cloud.
Yao Yue
Mihir Nanavati
Yuri Vishnevsky
Xi Yang
Brian Martin
Sean Lynch
Backed by