What Does AI Video Generation Actually Cost? Real Numbers From a Production Pipeline
A full cost breakdown across image generation, video clips, and 3D models — with a per-scene budget estimate for an 8-shot story. The numbers will surprise you.
When I started building an AI video-story pipeline I had rough intuitions about cost but no real numbers. After running the full stack — image generation, image-to-video, 3D models — I have actual data. Here's what things cost in May 2026, and the decisions I made based on those numbers.
Image Generation
Image costs are low enough that they're rarely the bottleneck.
| Service | Model | Cost |
|---|---|---|
| Gemini 2.5 Flash (via API) | Image generation | $0.039 / image (1024×1024) |
| Gemini 2.5 Flash (batch) | Image generation | $0.020 / image (50% off) |
| fal.ai | FLUX.1/dev | $0.025 / megapixel → $0.025 at 1024×1024 |
| fal.ai | FLUX.1/schnell | ~$0.003 / image (fast, lower quality) |
| fal.ai | FLUX Kontext Pro | $0.040 / image |
For character portraits I use Gemini 2.5 Flash: it handles reference images well and the quality at $0.039 is excellent. For scene stills (backgrounds, wide shots) FLUX.1/dev at $0.025 is my default. FLUX Schnell at $0.003 is worth trying for rough storyboards, but the lower quality shows on detailed scenes.
The batch discount on Gemini is significant if you're generating many variants. At $0.020 per image, running 50 character portraits to find the right 5 costs $1.00. Hard to complain.
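The variant-search arithmetic can be sketched in a few lines of Python. This is a minimal illustration, assuming the per-image prices from the table above; `variant_search_cost` is a hypothetical helper name, not part of any SDK.

```python
# Per-image prices in USD, taken from the table above (May 2026 figures).
GEMINI_FLASH_STANDARD = 0.039
GEMINI_FLASH_BATCH = 0.020


def variant_search_cost(candidates: int, keepers: int, price_per_image: float) -> tuple[float, float]:
    """Return (total cost, effective cost per kept image), rounded to cents."""
    if candidates <= 0 or keepers <= 0 or keepers > candidates:
        raise ValueError("need 0 < keepers <= candidates")
    total = candidates * price_per_image
    return round(total, 2), round(total / keepers, 2)


# 50 candidate portraits, 5 keepers, at standard and batch pricing.
for label, price in [("standard", GEMINI_FLASH_STANDARD), ("batch", GEMINI_FLASH_BATCH)]:
    total, per_keeper = variant_search_cost(50, 5, price)
    print(f"{label}: ${total:.2f} total, ${per_keeper:.2f} per kept portrait")
```

The per-keeper figure is the more useful number when deciding how many variants to generate: the real price of a portrait is the price of everything you threw away to find it.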
Video Generation (Image-to-Video)
This is where costs get serious.
| Service | Model | Cost/sec | 5s clip | 10s clip |
|---|---|---|---|---|
| Seedance (Atlas CLI) | Seedance 2.0 Fast 720p | $0.2419 / sec | ~$1.21 | ~$2.42 |
| Seedance (Atlas CLI) | Seedance 2.0 Standard 720p | $0.3024 / sec | ~$1.51 | ~$3.02 |
| fal.ai | WAN 2.5 | $0.05 / sec | ~$0.25 | ~$0.50 |
WAN 2.5 is roughly 5× cheaper per second than Seedance 2.0 Fast ($0.05 vs. $0.2419, about 4.8×). That's a meaningful gap. After testing both I settled on a tiered approach: WAN 2.5 for rough cuts and non-hero shots, Seedance 2.0 Fast for key story moments, Seedance 2.0 Standard only for hero/title shots where quality is non-negotiable.
The quality difference is real — Seedance handles character motion and facial consistency better, especially over 5+ seconds. WAN 2.5 has occasional drift on complex motion but is perfectly fine for environment shots, establishing pans, and anything where a character isn't the focal point.
For a typical scene, 6 background shots at WAN 2.5 ($0.25 × 6 = $1.50) plus 2 character shots at Seedance Fast ($1.21 × 2 = $2.42) come to about $3.92 of video. That split puts quality where it matters and keeps costs reasonable.
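Here is a rough Python sketch of the tiered pricing, assuming the per-second rates from the table above. The model keys and the shot-kind labels (`background`, `character`, `hero`) are names I made up for this example, not identifiers from fal.ai or the Atlas CLI.

```python
# USD per second of generated video, from the table above (May 2026 figures).
RATE_PER_SECOND = {
    "wan-2.5": 0.05,
    "seedance-2.0-fast": 0.2419,
    "seedance-2.0-standard": 0.3024,
}

# Hypothetical mapping from shot kind to model tier, mirroring the tiered approach.
TIER_FOR_SHOT = {
    "background": "wan-2.5",
    "character": "seedance-2.0-fast",
    "hero": "seedance-2.0-standard",
}


def clip_cost(model: str, seconds: float) -> float:
    """Cost of one clip, billed linearly per second."""
    return RATE_PER_SECOND[model] * seconds


def scene_video_cost(shots: list[tuple[str, float]]) -> float:
    """Sum clip costs for (shot_kind, seconds) pairs using the tier mapping."""
    return sum(clip_cost(TIER_FOR_SHOT[kind], seconds) for kind, seconds in shots)


# The typical scene: 6 background shots and 2 character shots, 5 seconds each.
typical_scene = [("background", 5)] * 6 + [("character", 5)] * 2
print(f"tiered scene video: ${scene_video_cost(typical_scene):.2f}")
```

Keeping the tier mapping in one dictionary makes it cheap to ask "what if this shot were promoted to Seedance?" before spending anything.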
3D Generation
3D is the most complex cost structure — and has the biggest traps.
Pay-Per-Use (fal.ai)
| Model | Input | Base | With PBR | Max |
|---|---|---|---|---|
| fal-ai/trellis | Image → 3D | $0.02 | — | $0.02 |
| fal-ai/hunyuan3d-v3/image-to-3d | Image → 3D | $0.375 | $0.525 | ~$0.825 |
| fal-ai/hunyuan3d-v3/text-to-3d | Text → 3D | $0.375 | $0.525 | ~$0.825 |
| Replicate Trellis2 | Image → 3D | ~$0.82 | — | ~$0.82 |
The Replicate number is the one that caught me off guard. Trellis on fal.ai costs $0.02. The same model on Replicate costs $0.82 — 41× more expensive for identical output. Platform matters a lot for 3D.
The recommended combo: Trellis draft at $0.02 for shape validation, Hunyuan3D + PBR at $0.525 for approved finals. You're paying $0.545 total for a production-ready GLB with Base Color, Metallic/Roughness, Normal Map, and AO. For game assets that's competitive with stock model pricing.
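The draft-then-final arithmetic is simple enough to sketch in Python. This assumes the prices quoted above; the `asset_cost` helper and the idea of counting rejected drafts separately are my own illustration, not something the platforms expose.

```python
# USD per generation, from the tables above (May 2026 figures).
TRELLIS_DRAFT = 0.02        # fal-ai/trellis
HUNYUAN3D_PBR = 0.525       # fal-ai/hunyuan3d-v3 with PBR textures
REPLICATE_TRELLIS = 0.82    # approximate Trellis price on Replicate


def asset_cost(drafts: int = 1, finals: int = 1) -> float:
    """Cost of one asset: cheap drafts for shape validation, then PBR finals."""
    return drafts * TRELLIS_DRAFT + finals * HUNYUAN3D_PBR


print(f"one draft + one final:    ${asset_cost():.3f}")
print(f"four drafts + one final:  ${asset_cost(drafts=4):.3f}")
print(f"Replicate vs fal.ai draft: {round(REPLICATE_TRELLIS / TRELLIS_DRAFT)}x")
```

Even with three rejected drafts per asset the total stays close to the price of the final, which is the argument for validating shape on the cheap model first.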
Meshy (Subscription — Game Pipeline)
| Plan | Monthly | Est. gens | Cost/gen |
|---|---|---|---|
| Pro | $12 | ~100 | ~$0.12–0.36 |
| Max | $45 | ~1,000 | ~$0.045 |
| Team | $55/seat | ~1,800 | ~$0.03 |
Meshy is the right choice when you need rigging and animation — fal.ai models output static geometry only. Meshy's full pipeline (image → 3D → remesh → auto-rig → animate) stacks credits:
- Generation: ~$0.24–0.36
- Remesh: ~$0.06–0.12
- Auto-rig: ~$0.06
- Animation (per clip): ~$0.036
On the Pro plan a fully rigged character asset with one animation clip lands around $0.40–0.58 total (the sum of the four steps above). At Max scale it's closer to $0.18. For game production that's remarkably cheap.
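Summing the per-step figures listed above gives roughly $0.40 at the low end and $0.58 at the high end for one animation clip. A small Python sketch makes the stacking explicit, assuming those Pro-plan estimates; the step names and the `pipeline_range` helper are illustrative only.

```python
# Estimated USD cost per step on the Pro plan as (low, high), from the list above.
MESHY_PRO_STEPS = {
    "generation": (0.24, 0.36),
    "remesh": (0.06, 0.12),
    "auto_rig": (0.06, 0.06),
    "animation_clip": (0.036, 0.036),
}


def pipeline_range(steps: dict[str, tuple[float, float]], animation_clips: int = 1) -> tuple[float, float]:
    """Return the (low, high) cost of one asset; only the animation step repeats."""
    low = high = 0.0
    for name, (step_low, step_high) in steps.items():
        count = animation_clips if name == "animation_clip" else 1
        low += step_low * count
        high += step_high * count
    return low, high


for clips in (1, 3):
    low, high = pipeline_range(MESHY_PRO_STEPS, animation_clips=clips)
    print(f"{clips} clip(s): ${low:.2f} to ${high:.2f}")
```

Each extra animation clip adds only a few cents, so the generation and remesh steps dominate the per-asset cost.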
Per-Scene Budget Estimate
Let's put it together for a real scene. Assume 8 shots: 6 environment/background and 2 character focus, each 5 seconds. Plus 2 character portraits.
| Item | Qty | Unit | Total |
|---|---|---|---|
| Character portraits (Gemini Flash) | 2 | $0.039 | $0.08 |
| Scene stills — backgrounds (FLUX.1/dev) | 8 | $0.025 | $0.20 |
| Video — background shots (WAN 2.5, 5s) | 6 | $0.25 | $1.50 |
| Video — character shots (Seedance Fast, 5s) | 2 | $1.21 | $2.42 |
| Assembly (Remotion) | — | free (local) | $0.00 |
| Scene total | | | ~$4.20 |
A 5-scene short story at this breakdown runs about $21. If you use Seedance Fast for all 8 shots (no WAN 2.5 split) it jumps to ~$50. The tiered video approach cuts total cost by more than half with minimal quality loss on non-critical shots.
Add 3D assets (3–5 per scene, Trellis draft + Hunyuan3D final) and you're looking at another $1.65–$2.75 per scene.
Full 5-scene story with mixed 3D: roughly $29–35 total generation cost.
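To check these totals, or to rerun them when prices change, the budget table can be expressed as a short Python sketch. It assumes the unit prices from the table above and the Trellis + Hunyuan3D PBR combo for 3D; the `LineItem` class and its labels are hypothetical scaffolding for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    name: str
    qty: int
    unit_cost: float  # USD

    @property
    def total(self) -> float:
        return self.qty * self.unit_cost


def scene_total(items: list[LineItem]) -> float:
    return sum(item.total for item in items)


# Unit prices from the per-scene table (May 2026 figures).
TIERED = [
    LineItem("character portraits (Gemini Flash)", 2, 0.039),
    LineItem("scene stills (FLUX.1/dev)", 8, 0.025),
    LineItem("background video (WAN 2.5, 5s)", 6, 0.25),
    LineItem("character video (Seedance Fast, 5s)", 2, 1.21),
]
# Same images, but every shot rendered with Seedance Fast.
ALL_SEEDANCE = TIERED[:2] + [LineItem("all video (Seedance Fast, 5s)", 8, 1.21)]

ASSET_3D = 0.02 + 0.525  # Trellis draft + Hunyuan3D PBR final
SCENES = 5

tiered_story = SCENES * scene_total(TIERED)
all_seedance_story = SCENES * scene_total(ALL_SEEDANCE)
with_3d_low = tiered_story + SCENES * 3 * ASSET_3D
with_3d_high = tiered_story + SCENES * 5 * ASSET_3D

print(f"tiered scene:       ${scene_total(TIERED):.2f}")
print(f"tiered story:       ${tiered_story:.2f}")
print(f"all-Seedance story: ${all_seedance_story:.2f}")
print(f"story with 3D:      ${with_3d_low:.2f} to ${with_3d_high:.2f}")
```

None of this accounts for retries or rejected clips, so treat the output as a floor rather than a forecast.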
What I Actually Use
After running the numbers and testing quality:
- Image generation: FLUX.1/dev for stills, Gemini 2.5 Flash for character portraits
- Video (hero shots): Seedance 2.0 Fast via Atlas CLI
- Video (environment/background): WAN 2.5 via fal.ai
- 3D draft: Trellis on fal.ai ($0.02 — always)
- 3D final (static): Hunyuan3D v3 + PBR
- 3D final (animated game asset): Meshy Pro pipeline
- Assembly: Remotion (free, local CPU render)
The biggest cost lever by far is video generation. Everything else — image gen, 3D drafts, rendering — is almost noise compared to what you spend on clips. Optimizing the video split (which shots get Seedance, which get WAN 2.5) is where budget control actually lives.
The Atlas CLI Factor
One thing that simplified the whole pipeline: the Atlas CLI (atlas v0.1.10) gives you a unified interface across most of these models:
    atlas generate cost bytedance/seedance-2.0-fast/image-to-video   # estimate before running
    atlas generate video bytedance/seedance-2.0-fast/image-to-video --image @shot.png
    atlas models search "image to video"   # discover models
Running atlas generate cost before every video run has saved me from a few accidentally expensive generations. Make it a habit.