State of Embedded Finance 2026

Groq

Can purpose-built LPU silicon deliver the cheapest and fastest AI inference at scale, making GroqCloud the default inference API for production AI workloads?

Founded: 2016
HQ: Mountain View, California, USA
Founders: Jonathan Ross, Douglas Wightman
Total raised: $1.1B+
Latest round: Series D, 2025
Industry: Infrastructure / AI Inference

The story

Founded in 2016 by Google TPU alumni, Groq initially focused on custom AI accelerator silicon, the Language Processing Unit (LPU), targeting HPC and ML workloads. The company operated largely in stealth until 2023, when it pivoted to a cloud API model with GroqCloud, which monetizes its LPU hardware as a tokens-as-a-service inference platform. The embedded finance angle is limited: Groq uses billing infrastructure (Lago for usage-based metering) to charge for its developer platform on a per-token basis, but it does not offer financial products itself.

Product timeline
2016: Groq founded by former Google TPU engineers Jonathan Ross and Douglas Wightman to build purpose-built inference silicon.
2018: Raised a Series B led by Social Capital; remained in stealth while developing the LPU (Language Processing Unit) chip.
2023: Launched GroqCloud publicly, offering developer API access to LPU-accelerated inference for large language models.
2024: Expanded GroqCloud with enterprise plans, regional endpoints, and batch processing, all billed on usage at per-token prices.
2025: Raised a $750M Series D as inference demand surged; added support for OpenAI's open models and broadened mixture-of-experts (MoE) model support.
The stack
Accounting gap: none