If you’re pricing a real-time AI avatar API, the sticker price on the pricing page is the wrong number to anchor on. What you actually pay is cost per minute of conversation, multiplied by how many minutes your users spend talking to the avatar — and at any real scale, that’s where the bill lives.
This guide breaks down what drives real-time avatar cost, what the going per-minute rates look like in 2026, and how the cheapest option on the market gets there. Where it matters, the answer comes back to one thing: where the rendering happens.
Why real-time AI avatars are expensive in the first place
Most real-time avatar platforms render the avatar in the cloud and stream the resulting video to the user. That architecture has two cost drivers baked in:
- GPU rendering in the cloud, per session, in real time. Rendering video frames continuously is GPU-intensive, and you’re paying for that compute for every minute every user is connected.
- Bandwidth. Streaming video to the device needs a sustained 1–2 MB/s. That’s egress cost for the provider and, ultimately, a line item in your rate.
Add it up and the industry average for real-time avatars lands around $0.15/min. That’s the number to beat. For the architectural background, see on-device AI avatar vs cloud streaming.
How on-device rendering changes the math
Spatius uses a different architecture, and the cost follows directly from it. Instead of rendering video in the cloud, the cloud Motion Server does only lightweight driving inference and sends down compact Motion data (driving parameters) — about 10–20 KB/s. The client SDK renders the avatar locally.
Two things get cheaper at once:
- Bandwidth drops from ~1–2 MB/s to ~10–20 KB/s — roughly two orders of magnitude less data to move.
- GPU cost is minimized. The heavy rendering moves off the cloud and onto the user’s device, where it runs as rendering only, with no inference. There’s still lightweight driving inference in the cloud, so cloud GPU work is dramatically reduced compared with rendering full video frames server-side. As the Spatius docs put it, the approach is “replacing heavy cloud rendering with a lightweight stream and rendering on edge.”
That’s why an interactive avatar can run on hardware with no dedicated GPU at all — the device only renders. And it’s why the per-minute rate can come down to a fraction of the cloud-streaming average.
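To make the bandwidth gap concrete, here is a rough sketch that converts the two sustained rates quoted above into data moved per hour of conversation. It assumes decimal units (1 MB = 1,000 KB) and a constant rate for the whole hour; real sessions will vary.

```python
# Back-of-the-envelope data volume per hour of conversation.
# The KB/s figures are the ranges quoted in this article; decimal units
# (1 MB = 1,000 KB) and a constant rate are simplifying assumptions.

SECONDS_PER_HOUR = 3600


def mb_per_hour(rate_kb_per_s: float) -> float:
    """Convert a sustained rate in KB/s to megabytes transferred per hour."""
    return rate_kb_per_s * SECONDS_PER_HOUR / 1000


video_stream = (mb_per_hour(1000), mb_per_hour(2000))  # cloud video: 1-2 MB/s
motion_data = (mb_per_hour(10), mb_per_hour(20))       # Motion data: 10-20 KB/s

print(f"Cloud video stream: {video_stream[0]:,.0f}-{video_stream[1]:,.0f} MB/hour")
print(f"Motion data:        {motion_data[0]:,.0f}-{motion_data[1]:,.0f} MB/hour")
print(f"Reduction:          {video_stream[0] / motion_data[0]:.0f}x")
```

Under those assumptions, an hour of cloud-streamed video moves gigabytes, while an hour of Motion data moves tens of megabytes.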
The numbers: per-minute and per-hour
On the Spatius Scale plan, the effective rate is $0.007/min — about $0.42 per hour of conversation. (Worth flagging precisely: that $0.42/hour figure is the Scale-plan rate, i.e. $0.007/min × 60. The Free and Starter plans have different effective rates.)
Put against the ~$0.15/min industry average, the gap is easiest to feel with a fixed budget. Spend $5,000 on conversation minutes:
| | Effective rate | Hours of conversation for $5,000 |
|---|---|---|
| Spatius (Scale) | ~$0.007/min | ~11,905 hours |
| Industry average | ~$0.15/min | ~556 hours |
Same budget, roughly 20× more talk time. For a high-frequency use case (a language tutor, a customer-service agent, a companion app where sessions run long), that ratio is the difference between viable unit economics and a runaway cloud bill.
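The fixed-budget comparison is simple enough to check yourself. The sketch below divides a budget by a per-minute rate and converts to hours; the function name is ours, and the two rates are the approximate figures quoted in this article.

```python
def hours_for_budget(budget_usd: float, rate_per_min_usd: float) -> float:
    """Hours of conversation a budget buys at a flat per-minute rate."""
    minutes = budget_usd / rate_per_min_usd
    return minutes / 60


BUDGET_USD = 5_000
SCALE_RATE = 0.007    # Spatius Scale plan, USD per minute (approximate)
AVERAGE_RATE = 0.15   # industry average, USD per minute (approximate)

scale_hours = hours_for_budget(BUDGET_USD, SCALE_RATE)
average_hours = hours_for_budget(BUDGET_USD, AVERAGE_RATE)

print(f"Scale plan:       {scale_hours:,.0f} hours")
print(f"Industry average: {average_hours:,.0f} hours")
print(f"Ratio:            {scale_hours / average_hours:.1f}x")
```

Swap in your own budget and the rates you have been quoted. The ratio depends only on the two rates, not on the size of the budget.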
Full Spatius pricing breakdown
Spatius prices in credits, where 10 credits ≈ 1 minute of conversation. Monthly plans (as of 2026-05-13):
| Plan | Price | Credits/mo | ≈ Minutes | Concurrency | Max session |
|---|---|---|---|---|---|
| Free | $0 | 500 | ~50 min | 2 | 10 min |
| Starter | $19/mo | 22,000 | ~2,200 min | 5 | 30 min |
| Scale | $299/mo | 400,000 | ~40,000 min | 40 | Unlimited |
| Enterprise | Custom | Unlimited | Unlimited | Custom | Custom |
Annual billing saves 20% and lowers the effective per-minute rate further: Starter to $15/mo (about $0.0068/min) and Scale to $239/mo (about $0.0060/min), based on each plan's monthly minutes.
A few things to know so the budgeting is accurate:
- Credits don’t roll over on subscription plans — they reset each cycle. Credits you purchase or that are gifted to your team are permanent.
- Avatar generation (building a custom avatar from a photo) is a separate quota and doesn’t consume your conversation credits. It’s currently in Beta, granted manually by the team, and failed generations are automatically refunded.
- There’s a genuinely usable permanent free tier (500 credits/~50 min/month) for prototyping. See the full pricing page.
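For budgeting, the plan table can be turned into a small lookup. This is a minimal sketch, not an official Spatius tool: the `Plan` class and `cheapest_plan` function are hypothetical names, the figures are copied from the monthly table above, and Enterprise is left out because its pricing is custom.

```python
from dataclasses import dataclass
from typing import Optional

CREDITS_PER_MINUTE = 10  # "10 credits ≈ 1 minute", per the pricing section


@dataclass(frozen=True)
class Plan:
    name: str
    price_usd: float                 # monthly price
    credits: int                     # credits per month
    concurrency: int                 # simultaneous sessions
    max_session_min: Optional[int]   # None means unlimited

    @property
    def minutes(self) -> float:
        """Approximate conversation minutes included per month."""
        return self.credits / CREDITS_PER_MINUTE


# Monthly plans as listed in the table above (Enterprise omitted: custom).
PLANS = [
    Plan("Free", 0, 500, 2, 10),
    Plan("Starter", 19, 22_000, 5, 30),
    Plan("Scale", 299, 400_000, 40, None),
]


def cheapest_plan(minutes_needed: float, peak_concurrency: int,
                  longest_session_min: float) -> Optional[Plan]:
    """Return the lowest-priced listed plan that covers the workload, or None."""
    for plan in sorted(PLANS, key=lambda p: p.price_usd):
        fits_session = (plan.max_session_min is None
                        or longest_session_min <= plan.max_session_min)
        if (plan.minutes >= minutes_needed
                and plan.concurrency >= peak_concurrency
                and fits_session):
            return plan
    return None  # nothing listed fits; that is Enterprise territory


choice = cheapest_plan(minutes_needed=2_000, peak_concurrency=3,
                       longest_session_min=45)
print(choice.name if choice else "Enterprise (custom)")
```

Note that the session cap can decide the plan before the minute budget does: a workload that fits Starter on minutes and concurrency still needs Scale if a single session can exceed 30 minutes.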
“Cheapest” should never mean “no AI included” by surprise
One honest caveat that applies to every platform in this category: a real-time avatar API is the avatar + driving/rendering layer. The AI agent — speech recognition (ASR), the LLM, and text-to-speech (TTS) — is a separate stack. With Spatius, you bring your own AI (or wire up providers you already use); Spatius does not provide ASR/LLM/TTS. That keeps the avatar layer cheap and swappable, but it means your total cost includes whatever you spend on those AI services. When you compare “cheapest API” claims across vendors, check what’s bundled — you’re not always comparing the same scope. More on the three-layer split in our interactive avatar complete guide.
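One way to keep scope honest when comparing vendors is to add up every layer on the same per-minute basis. The sketch below does that. The `StackRates` class is a hypothetical helper, and the ASR, LLM, and TTS figures are made-up placeholders, not quotes from any provider. It also assumes you have already converted token- or character-based pricing into an approximate per-minute cost for your own traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StackRates:
    """Per-minute rates in USD for each layer of a conversational avatar stack."""
    avatar: float  # avatar driving/rendering layer
    asr: float     # speech recognition
    llm: float     # language model
    tts: float     # text-to-speech

    def total(self) -> float:
        """All-in cost per minute of conversation."""
        return self.avatar + self.asr + self.llm + self.tts

    def avatar_share(self) -> float:
        """Fraction of the all-in cost that goes to the avatar layer."""
        return self.avatar / self.total()


# PLACEHOLDER numbers for illustration only. Only the avatar rate comes from
# this article; replace the rest with what your own providers charge.
example = StackRates(avatar=0.007, asr=0.01, llm=0.02, tts=0.03)

print(f"All-in cost:  ${example.total():.3f}/min")
print(f"Avatar share: {example.avatar_share():.0%}")
```

If a competing vendor bundles some of these layers, set the bundled ones to zero for that vendor so both totals cover the same scope.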
How the per-minute cost compares to other platforms
The cost advantage shows up clearly in head-to-head comparisons, because it’s structural, not promotional:
- Spatius vs Synthesia — on-device rendering at roughly 99% lower cost per minute than Synthesia’s video-generation plans.
- Spatius vs Tavus — ~98% lower cost per minute than cloud video streaming.
- Spatius vs LiveAvatar — ~95% lower cost per minute.
- Spatius vs Anam.ai — on-device rendering at a fraction of cloud cost.
For the full landscape, see best on-device AI avatar platforms in 2026.
The takeaway
The cheapest real-time AI avatar API isn’t cheap because of a discount; it’s cheap because of where the work happens. Cloud-streamed video carries GPU-rendering and bandwidth cost into every minute of every session. Move rendering to the device, stream only Motion data, and the per-minute rate drops by roughly 20×, which on a fixed budget turns hundreds of hours of conversation into thousands.
Start on the free tier, check the math on the pricing page, or just talk to a live avatar in the Playground and watch your network usage while you do.
Recommended reading
- On-Device AI Avatar vs Cloud Streaming: Architecture, Bandwidth, and Cost
- Best On-Device AI Avatar Platforms in 2026 (Ranked & Compared)
- Comparing AI Avatar Platforms for Speed