AI API Cost Analysis
Data range: 2026-02-22 to 2026-04-19 (all historical data)
Model: Gemini 3 Flash Preview · Pricing: input $0.50 / M tokens, output $3.00 / M tokens
Users: 8,917 · Tarot readings: 77,462 · Average cards per reading: 3.44
1. Current AI Costs
Features using AI in the app:
| Feature | Description | Monthly calls | Monthly cost |
|---|---|---|---|
| Initial tarot reading | One call generates all content (summary, card interpretations, advice, lucky stone, etc.) | ~39,600 | $170 |
| Follow-up chat | User asks questions after the reading | ~15,200 | $44 |
| Draw extra card | User requests an additional card in the conversation | ~20,300 | $47 |
| Spread suggestion | AI picks the best spread for the user's question | ~38,500 | $42 |
| Card of Day | Daily one-card reading | ~6,400 | $7 |
| Weekly Guidance / Soul Journey | Built but never used in production | 0 | $0 |
| Blog / Admin translation | Sporadic | < 100 | negligible |
| Total | | ~120,000 | ~$310 |
The initial tarot reading accounts for 55% of total cost ($170 of $310), which is why the split discussion focuses on it.
2. Splitting the Initial Reading: Cost Comparison
Current approach: after the user draws cards, 1 AI request generates everything (summary + per-card interpretations + advice + lucky stone + suggested questions).
Split approach: break it into multiple requests (2, 3, or ~9 — one per message segment). All approaches can still use streaming output.
Why splitting costs more
Each AI request must re-send the shared context (character persona, user question, card meanings, etc.), roughly 2,000 tokens. Every additional request repeats that context, so input tokens grow roughly linearly with the number of splits.
Per-reading cost comparison
| Approach | Requests | Input tokens | Output tokens | Cost per reading | Multiplier |
|---|---|---|---|---|---|
| Current (1 request) | 1 | 1,942 | 1,113 | $0.0043 | 1× |
| Split into 2 | 2 | ~4,000 | ~1,200 | $0.0056 | 1.3× |
| Split into 3 | 3 | ~6,000 | ~1,300 | $0.0069 | 1.6× |
| One per message (~9) | ~9 | ~18,000 | ~2,000 | $0.015 | 3.5× |
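Output tokens grow only modestly, from 1,113 to about 2,000 even at ~9 requests, because each split produces simpler output. Most of the cost increase comes from re-sending the input context, which grows from 1,942 to about 18,000 tokens.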
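The per-reading figures in the table can be reproduced from the token counts and the listed prices. The following is a minimal sketch; the function and variable names are illustrative and not taken from the app's codebase.

```python
# Prices from the report header, in USD per million tokens.
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 3.00


def cost_per_reading(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one reading given its total token counts."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# (input tokens, output tokens) per approach, taken from the table above.
approaches = {
    "current": (1_942, 1_113),
    "split_2": (4_000, 1_200),
    "split_3": (6_000, 1_300),
    "per_message": (18_000, 2_000),
}

baseline = cost_per_reading(*approaches["current"])
for name, (inp, out) in approaches.items():
    cost = cost_per_reading(inp, out)
    # Print the cost and its multiple of the single-request baseline.
    print(f"{name}: ${cost:.4f} ({cost / baseline:.1f}x)")
```

In the one-per-message case the input side contributes $0.009 of the $0.015, compared with about $0.001 of $0.0043 today.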
Monthly cost comparison (initial reading only, ~39,600 / month)
| Approach | Monthly cost | Difference |
|---|---|---|
| Current (1 request) | $170 | — |
| Split into 2 | $222 | +$52 |
| Split into 3 | $273 | +$103 |
| One per message | $594 | +$424 |
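The monthly figures follow from multiplying the rounded per-reading cost by the monthly reading volume. A short sketch of that arithmetic, with illustrative names:

```python
MONTHLY_READINGS = 39_600  # approximate monthly initial readings, from the report

# Rounded per-reading costs in USD, from the per-reading comparison table.
per_reading = {
    "current": 0.0043,
    "split_2": 0.0056,
    "split_3": 0.0069,
    "per_message": 0.015,
}


def monthly_cost(cost_per_reading: float, readings: int = MONTHLY_READINGS) -> int:
    """Monthly USD cost, rounded to the nearest dollar."""
    return round(cost_per_reading * readings)


baseline = monthly_cost(per_reading["current"])
for name, cost in per_reading.items():
    total = monthly_cost(cost)
    # Show each approach's monthly cost and its gap to the current approach.
    print(f"{name}: ${total} (+${total - baseline})")
```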
Full app monthly cost comparison
| Approach | Initial reading | Other AI features | Monthly total | Difference |
|---|---|---|---|---|
| Current | $170 | $140 | $310 | — |
| Split into 2 | $222 | $140 | $362 | +$52 |
| Split into 3 | $273 | $140 | $413 | +$103 |
| One per message | $594 | $140 | $734 | +$424 |
3. Growth Sensitivity
| Scenario | Current approach | One per message |
|---|---|---|
| Current user base (8,917) | $310 / month | $734 / month |
| Users × 5 | $1,550 / month | $3,670 / month |
| Users × 10 | $3,100 / month | $7,340 / month |
At 10× users, one-per-message costs $4,240 more per month (~¥30,000) than the current approach.
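The projections above are consistent with cost scaling linearly with user count, which in turn assumes per-user usage stays the same as the base grows. That assumption is inferred from the table rather than stated in the source. A minimal sketch, with illustrative names:

```python
CURRENT_USERS = 8_917

# Full-app monthly totals in USD at the current user base, from the report.
MONTHLY_TOTAL = {"current": 310, "per_message": 734}


def projected_monthly_cost(approach: str, user_multiplier: float) -> float:
    """Project monthly cost, assuming it scales linearly with user count."""
    return MONTHLY_TOTAL[approach] * user_multiplier


for m in (1, 5, 10):
    cur = projected_monthly_cost("current", m)
    per = projected_monthly_cost("per_message", m)
    print(f"{CURRENT_USERS * m:>6,} users: current ${cur:,.0f}, "
          f"per-message ${per:,.0f}, gap ${per - cur:,.0f}")
```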
4. Other Impacts of Splitting
| Dimension | Benefits | Drawbacks |
|---|---|---|
| User interaction | Can trigger "generate next section" on user tap | 1-3 second wait per section breaks flow |
| Stability | Each call is simpler, less likely to produce format errors | Any single failure breaks the experience; overall failure rate rises from 0.51% to ~1-2% across 9 calls |
| Flexibility | Different sections can use different models or parameters | Added complexity |
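The stability estimate depends on how often each smaller call fails. As a rough check, assuming calls fail independently (an assumption made here, not a measured property of the system), the chance that at least one of n calls fails is 1 - (1 - p)^n. The sketch below uses illustrative function names.

```python
def overall_failure_rate(per_call_failure: float, calls: int) -> float:
    """Probability that at least one of `calls` independent requests fails."""
    return 1 - (1 - per_call_failure) ** calls


def per_call_rate_for(overall: float, calls: int) -> float:
    """Per-call failure rate that yields `overall` across `calls` requests."""
    return 1 - (1 - overall) ** (1 / calls)


# If each of the 9 calls failed as often as today's single call (0.51%):
print(f"{overall_failure_rate(0.0051, 9):.2%}")

# Per-call rates implied by the 1-2% overall estimate in the table:
print(f"{per_call_rate_for(0.01, 9):.3%} to {per_call_rate_for(0.02, 9):.3%}")
```

Under these assumptions, nine calls at the current 0.51% rate would fail about 4.5% of the time overall. The table's 1-2% estimate therefore corresponds to each simpler call failing roughly 0.11% to 0.22% of the time, which should be confirmed against production data before relying on it.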
5. Recommendations
Option A (recommended): Keep 1 request + client-side paced reveal
- Backend stays as-is: 1 AI request fires immediately after card draw
- After generation completes, the frontend hides the content and lets users tap to reveal each section
- Reveal reads from already-generated local data — no waiting
- Tap-to-reveal animations, sound effects, and haptics all work with this approach
- Zero additional AI cost
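The reveal logic in Option A is independent of the frontend framework. The following is a language-neutral sketch written in Python; the class name, method name, and section labels are hypothetical and not taken from the app.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PacedReveal:
    """Holds an already-generated reading and reveals one section per tap."""

    # (section name, generated text) pairs, in display order.
    sections: List[Tuple[str, str]]
    revealed: int = 0  # number of sections shown so far

    def tap(self) -> Optional[Tuple[str, str]]:
        """Return the next hidden section, or None when all are shown."""
        if self.revealed >= len(self.sections):
            return None
        section = self.sections[self.revealed]
        self.revealed += 1
        return section


# Usage: the single AI response is parsed into sections once, then each
# tap reads from local data without any further AI request.
reading = PacedReveal([
    ("summary", "..."),
    ("card interpretations", "..."),
    ("advice", "..."),
])
print(reading.tap())
```

Because every tap reads from memory, animations and haptics can be timed freely without affecting the number of AI calls.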
Option B: Split into 2 requests (basic + deep on demand)
- Request 1: summary + brief card interpretations (fast first result)
- Request 2: triggered only when user taps "Deep Reading" (advice, lucky stone, detailed analysis)
- Monthly cost rises by up to $52 (+17% of the full-app total) if every reading triggers request 2; only users who tap "Deep Reading" incur it
- Realistic average cost increase: ~10-15%
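One way to sanity-check the 10-15% range is to assume the extra $52 is incurred in proportion to the share of readings where the user taps "Deep Reading". This linear model is an assumption made here, not something stated in the report, and it ignores any savings from a shorter first request. Names are illustrative.

```python
FULL_APP_MONTHLY = 310     # USD, current full-app total from the report
SPLIT_2_EXTRA_AT_100 = 52  # USD, extra cost if every reading triggers request 2


def cost_increase_pct(opt_in_rate: float) -> float:
    """Percent increase in full-app cost under the linear assumption."""
    return 100 * opt_in_rate * SPLIT_2_EXTRA_AT_100 / FULL_APP_MONTHLY


def opt_in_rate_for(increase_pct: float) -> float:
    """Opt-in rate that would produce the given percent increase."""
    return increase_pct / 100 * FULL_APP_MONTHLY / SPLIT_2_EXTRA_AT_100


for pct in (10, 15):
    print(f"+{pct}% corresponds to an opt-in rate of {opt_in_rate_for(pct):.0%}")
```

Under this simplification, a 10-15% increase corresponds to roughly 60-90% of readings triggering the deep request. A lower opt-in rate would bring the increase below that range.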
Not recommended: One request per message
- Monthly cost +$424 (+137% of the full-app total)
- User waits 1-3 seconds on every tap
- Overall failure rate increases
Data source: production database, all historical data (2026-02-22 to 2026-04-19). Model: Gemini 3 Flash Preview. Pricing: Vertex AI official rates.