
AI API Cost Analysis

Data range: 2026-02-22 to 2026-04-19 (all historical data)
Model: Gemini 3 Flash Preview · Pricing: input $0.50 / M tokens, output $3.00 / M tokens
Users: 8,917 · Tarot readings: 77,462 · Average cards per reading: 3.44


1. Current AI Costs

Features using AI in the app:

| Feature | Description | Monthly calls | Monthly cost |
| --- | --- | --- | --- |
| Initial tarot reading | One call generates all content (summary, card interpretations, advice, lucky stone, etc.) | ~39,600 | $170 |
| Follow-up chat | User asks questions after the reading | ~15,200 | $44 |
| Draw extra card | User requests an additional card in the conversation | ~20,300 | $47 |
| Spread suggestion | AI picks the best spread for the user's question | ~38,500 | $42 |
| Card of Day | Daily one-card reading | ~6,400 | $7 |
| Weekly Guidance / Soul Journey | Built but never used in production | 0 | $0 |
| Blog / Admin translation | Sporadic | < 100 | negligible |
| Total | | | ~$310 / month |

The initial tarot reading accounts for 55% of total cost ($170 of ~$310), so it is the feature the split discussion is about.


2. Splitting the Initial Reading: Cost Comparison

Current approach: after the user draws cards, 1 AI request generates everything (summary + per-card interpretations + advice + lucky stone + suggested questions).

Split approach: break it into multiple requests (2, 3, or ~9 — one per message segment). All approaches can still use streaming output.

Why splitting costs more

Each AI request must re-send the shared context (character persona, user question, card meanings, etc.), roughly 2,000 tokens. Every additional request repeats this context, so input tokens grow roughly in proportion to the number of requests.

Per-reading cost comparison

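| Approach | Requests | Input tokens | Output tokens | Cost per reading | Multiplier |
| --- | --- | --- | --- | --- | --- |
| Current (1 request) | 1 | 1,942 | 1,113 | $0.0043 | 1.0× |
| Split into 2 | 2 | ~4,000 | ~1,200 | $0.0056 | 1.3× |
| Split into 3 | 3 | ~6,000 | ~1,300 | $0.0069 | 1.6× |
| One per message (~9) | ~9 | ~18,000 | ~2,000 | $0.015 | 3.5× |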

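Output tokens grow far more slowly than input tokens because each split produces simpler output: at ~9 requests, output rises from 1,113 to ~2,000 tokens while input rises from 1,942 to ~18,000. Most of the cost increase comes from repeatedly sending the input context.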
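The per-reading figures can be reproduced from the token counts and the pricing stated at the top of this report. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative and not taken from the app's codebase.

```python
# Pricing from the header of this report (USD per million tokens).
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 3.00


def cost_per_reading(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one reading given its total token counts."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# (input tokens, output tokens) per reading, as listed in the comparison table.
approaches = {
    "Current (1 request)": (1_942, 1_113),
    "Split into 2": (4_000, 1_200),
    "Split into 3": (6_000, 1_300),
    "One per message (~9)": (18_000, 2_000),
}

baseline = cost_per_reading(*approaches["Current (1 request)"])
for name, (tokens_in, tokens_out) in approaches.items():
    cost = cost_per_reading(tokens_in, tokens_out)
    print(f"{name:22s} ${cost:.4f}  {cost / baseline:.1f}x")
```

At ~9 requests the input side accounts for $0.009 of the $0.015 per reading, which is why the repeated context dominates the increase.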

Monthly cost comparison (initial reading only, ~39,600 / month)

| Approach | Monthly cost | Difference |
| --- | --- | --- |
| Current (1 request) | $170 | |
| Split into 2 | $222 | +$52 |
| Split into 3 | $273 | +$103 |
| One per message | $594 | +$424 |

Full app monthly cost comparison

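| Approach | Initial reading | Other AI features | Monthly total | Difference |
| --- | --- | --- | --- | --- |
| Current | $170 | $140 | $310 | |
| Split into 2 | $222 | $140 | $362 | +$52 |
| Split into 3 | $273 | $140 | $413 | +$103 |
| One per message | $594 | $140 | $734 | +$424 |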
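The monthly figures appear to follow from multiplying the rounded per-reading cost by ~39,600 initial readings and adding the $140 spent on the other AI features. A short sketch of that calculation, with illustrative names that are not from the app's codebase:

```python
MONTHLY_READINGS = 39_600          # initial readings per month (from this report)
OTHER_FEATURES_MONTHLY_USD = 140   # all other AI features combined

# Rounded per-reading costs in USD, taken from the per-reading comparison.
PER_READING_USD = {
    "Current": 0.0043,
    "Split into 2": 0.0056,
    "Split into 3": 0.0069,
    "One per message": 0.015,
}


def monthly_total(per_reading_usd: float) -> int:
    """Initial-reading cost rounded to whole dollars, plus the other features."""
    return round(MONTHLY_READINGS * per_reading_usd) + OTHER_FEATURES_MONTHLY_USD


current_total = monthly_total(PER_READING_USD["Current"])
for name, per_reading in PER_READING_USD.items():
    total = monthly_total(per_reading)
    print(f"{name:16s} ${total}/month  (+${total - current_total})")
```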

3. Growth Sensitivity

| Scenario | Current approach | One per message |
| --- | --- | --- |
| Current user base (8,917) | $310 / month | $734 / month |
| Users × 5 | $1,550 / month | $3,670 / month |
| Users × 10 | $3,100 / month | $7,340 / month |

At 10× users, one-per-message costs $4,240 more per month (~¥30,000) than the current approach.
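The projection treats cost as scaling linearly with the number of users, which assumes that usage per user stays the same as the user base grows. A minimal sketch of that scaling, with illustrative names:

```python
# Full-app monthly totals at the current user base (from this report).
CURRENT_MONTHLY_USD = 310
ONE_PER_MESSAGE_MONTHLY_USD = 734


def scaled_costs(user_multiplier: int) -> tuple:
    """Return (current, one-per-message, gap) in USD per month.

    Assumes cost scales linearly with users, i.e. usage per user is unchanged.
    """
    current = CURRENT_MONTHLY_USD * user_multiplier
    one_per_message = ONE_PER_MESSAGE_MONTHLY_USD * user_multiplier
    return current, one_per_message, one_per_message - current


for multiplier in (1, 5, 10):
    current, split, gap = scaled_costs(multiplier)
    print(f"users x{multiplier}: ${current:,} vs ${split:,} (gap ${gap:,})")
```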


4. Other Impacts of Splitting

| Dimension | Benefits | Drawbacks |
| --- | --- | --- |
| User interaction | Can trigger "generate next section" on user tap | 1-3 second wait per section breaks flow |
| Stability | Each call is simpler, less likely to produce format errors | Any single failure breaks the experience; overall failure rate rises from 0.51% to ~1-2% across 9 calls |
| Flexibility | Different sections can use different models or parameters | Added complexity |
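How failures compound across several calls can be sketched with a simple independence model. This is an assumption added here for illustration, not a measurement: it supposes that calls fail independently and that one failed call breaks the whole reading. Under that model, a ~1-2% overall rate across 9 calls corresponds to each call failing roughly 0.11-0.22% of the time, which fits the claim that simpler calls fail less often. If each call instead kept the current 0.51% rate, the overall rate would be about 4.5%.

```python
def overall_failure_rate(per_call_rate: float, calls: int) -> float:
    """Probability that at least one of `calls` independent calls fails."""
    return 1 - (1 - per_call_rate) ** calls


def per_call_rate_for(overall_rate: float, calls: int) -> float:
    """Per-call failure rate that produces `overall_rate` across `calls` calls."""
    return 1 - (1 - overall_rate) ** (1 / calls)


CALLS = 9  # one request per message segment

# Case 1: every call keeps today's 0.51% failure rate (pessimistic assumption).
print(f"unchanged per-call rate: {overall_failure_rate(0.0051, CALLS):.2%}")

# Case 2: per-call rates implied by the report's ~1-2% overall estimate.
for target in (0.01, 0.02):
    print(f"overall {target:.0%} implies per-call {per_call_rate_for(target, CALLS):.2%}")
```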

5. Recommendations

Option A: Keep 1 request, tap-to-reveal on the frontend

  • Backend stays as-is: 1 AI request fires immediately after card draw
  • After generation completes, the frontend hides the content and lets users tap to reveal each section
  • Reveal reads from already-generated local data, so there is no waiting
  • Tap-to-reveal animations, sound effects, and haptics all work with this approach
  • Zero additional AI cost
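The generate-once, reveal-locally idea can be sketched as a small piece of state that holds the full response and exposes one section per tap. The example below is a language-neutral illustration written in Python; the real frontend is presumably written in another language, and the class name, method name, and placeholder texts are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RevealState:
    """Holds one fully generated reading and tracks which sections are visible."""
    sections: dict                       # section name -> generated text, in display order
    revealed: list = field(default_factory=list)

    def reveal_next(self) -> Optional[str]:
        """Reveal the next hidden section from local data; no AI request is made."""
        for name, text in self.sections.items():
            if name not in self.revealed:
                self.revealed.append(name)
                return text
        return None  # everything is already visible


# Placeholder texts standing in for the single AI response.
state = RevealState(sections={
    "summary": "summary text",
    "card interpretations": "per-card text",
    "advice": "advice text",
})

print(state.reveal_next())  # first tap shows the summary
print(state.revealed)       # sections shown so far
```

Because every tap reads from data that is already on the device, the tap can drive animations, sound, and haptics without any network delay.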

Option B: Split into 2 requests (basic + deep on demand)

  • Request 1: summary + brief card interpretations (fast first result)
  • Request 2: triggered only when user taps "Deep Reading" (advice, lucky stone, detailed analysis)
  • Monthly cost up to +$52 (+17% of the full-app total) if every user triggers request 2; only users who choose deep reading do
  • Realistic average cost increase: ~10-15%

Option C: One request per message (~9 requests)

  • Monthly cost +$424 (+137%)
  • User waits 1-3 seconds on every tap
  • Overall failure rate increases

Data source: production database, all historical data (2026-02-22 to 2026-04-19). Model: Gemini 3 Flash Preview. Pricing: Vertex AI official rates.

Internal documentation for MysticX team