AI API Cost Analysis

Data range: 2026-02-22 to 2026-04-19 (all historical data)
Model: Gemini 3 Flash Preview · Pricing: input $0.50 / M tokens, output $3.00 / M tokens
Users: 8,917 · Tarot readings: 77,462 · Average cards per reading: 3.44

1. Current AI Costs

Features using AI in the app:

Feature	Description	Monthly calls	Monthly cost
Initial tarot reading	One call generates all content (summary, card interpretations, advice, lucky stone, etc.)	~39,600	$170
Follow-up chat	User asks questions after the reading	~15,200	$44
Draw extra card	User requests an additional card in the conversation	~20,300	$47
Spread suggestion	AI picks the best spread for the user's question	~38,500	$42
Card of Day	Daily one-card reading	~6,400	$7
Weekly Guidance / Soul Journey	Built but never used in production	0	$0
Blog / Admin translation	Sporadic	< 100	negligible
Total			~$310 / month

The initial tarot reading accounts for 55% of total cost ($170 of $310). It is the only call the splitting proposal would change.
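The totals above can be re-derived from the per-feature figures. The snippet below is a small arithmetic check using only the monthly costs from the table; the variable names are illustrative.

```python
# Monthly cost per feature in USD, copied from the table above.
MONTHLY_COST_USD = {
    "Initial tarot reading": 170,
    "Follow-up chat": 44,
    "Draw extra card": 47,
    "Spread suggestion": 42,
    "Card of Day": 7,
}

total = sum(MONTHLY_COST_USD.values())
initial = MONTHLY_COST_USD["Initial tarot reading"]
other = total - initial          # every feature except the initial reading
share = initial / total          # initial reading as a fraction of the total

print(f"total=${total}, other features=${other}, initial reading share={share:.0%}")
```

The "other features" subtotal is the figure that stays constant in the full-app comparison later in the document.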

2. Splitting the Initial Reading: Cost Comparison

Current approach: after the user draws cards, 1 AI request generates everything (summary + per-card interpretations + advice + lucky stone + suggested questions).

Split approach: break it into multiple requests (2, 3, or ~9 — one per message segment). All approaches can still use streaming output.

Why splitting costs more

Each AI request must re-send the "context" (character persona, user question, card meanings, etc.), which is roughly 2,000 tokens. More splits mean more repetitions of this context.
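As a rough sketch of that effect, assuming every request re-sends the full context of about 2,000 tokens and nothing else on the input side (a simplification; real requests also carry section-specific instructions):

```python
CONTEXT_TOKENS = 2_000  # approximate shared context per request, from the text above


def total_input_tokens(num_requests: int, context_tokens: int = CONTEXT_TOKENS) -> int:
    """Input tokens billed when each request re-sends the whole context."""
    return num_requests * context_tokens


for n in (1, 2, 3, 9):
    print(f"{n} request(s): ~{total_input_tokens(n):,} input tokens")
```

Input volume grows linearly with the number of requests, while the output side stays close to what a single request would produce.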

Per-reading cost comparison

Approach	Requests	Input tokens	Output tokens	Cost per reading	Multiplier
Current (1 request)	1	1,942	1,113	$0.0043	1×
Split into 2	2	~4,000	~1,200	$0.0056	1.3×
Split into 3	3	~6,000	~1,300	$0.0069	1.6×
One per message (~9)	~9	~18,000	~2,000	$0.015	3.5×

Output tokens grow only modestly because each split produces simpler output. Most of the cost increase comes from re-sending the input context with every request.
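The per-reading figures in the table follow directly from the token counts and the stated pricing. A minimal check, using the table's token estimates as given (function and variable names are illustrative):

```python
INPUT_USD_PER_M = 0.50   # USD per million input tokens (stated pricing)
OUTPUT_USD_PER_M = 3.00  # USD per million output tokens (stated pricing)

# (input tokens, output tokens) per reading, from the table above.
APPROACHES = {
    "Current (1 request)": (1_942, 1_113),
    "Split into 2": (4_000, 1_200),
    "Split into 3": (6_000, 1_300),
    "One per message (~9)": (18_000, 2_000),
}


def cost_per_reading(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one reading at the stated per-million-token prices."""
    return (input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


baseline = cost_per_reading(*APPROACHES["Current (1 request)"])
for name, (tokens_in, tokens_out) in APPROACHES.items():
    cost = cost_per_reading(tokens_in, tokens_out)
    print(f"{name}: ${cost:.4f} per reading ({cost / baseline:.1f}x)")
```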

Monthly cost comparison (initial reading only, ~39,600 / month)

Approach	Monthly cost	Difference
Current (1 request)	$170	—
Split into 2	$222	+$52
Split into 3	$273	+$103
One per message	$594	+$424
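These monthly figures are the per-reading cost multiplied by roughly 39,600 readings per month. A short sketch of that multiplication, using the rounded per-reading costs from the previous table (names are illustrative):

```python
MONTHLY_READINGS = 39_600  # approximate initial readings per month

# Rounded USD cost per reading, from the per-reading comparison table.
COST_PER_READING_USD = {
    "Current (1 request)": 0.0043,
    "Split into 2": 0.0056,
    "Split into 3": 0.0069,
    "One per message": 0.015,
}


def monthly_cost(cost_per_reading: float, readings: int = MONTHLY_READINGS) -> int:
    """Monthly USD cost, rounded to the nearest dollar."""
    return round(cost_per_reading * readings)


current = monthly_cost(COST_PER_READING_USD["Current (1 request)"])
for name, unit_cost in COST_PER_READING_USD.items():
    cost = monthly_cost(unit_cost)
    print(f"{name}: ${cost} / month (difference ${cost - current:+d})")
```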

Full app monthly cost comparison

Approach	Initial reading	Other AI features	Monthly total	Difference
Current	$170	$140	$310	—
Split into 2	$222	$140	$362	+$52
Split into 3	$273	$140	$413	+$103
One per message	$594	$140	$734	+$424

3. Growth Sensitivity

Scenario	Current approach	One per message
Current user base (8,917)	$310 / month	$734 / month
Users × 5	$1,550 / month	$3,670 / month
Users × 10	$3,100 / month	$7,340 / month

At 10× users, one-per-message costs $4,240 more per month (~¥30,000) than the current approach.
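The table appears to assume that cost scales linearly with the number of users, i.e. that usage per user stays constant. Under that assumption the projection is a single multiplication:

```python
# Full-app monthly totals at the current user base, in USD (from the tables above).
BASELINE_MONTHLY_USD = {"Current approach": 310, "One per message": 734}


def scaled_costs(user_multiplier: int) -> dict[str, int]:
    """Project monthly totals, assuming cost grows linearly with user count."""
    return {name: usd * user_multiplier for name, usd in BASELINE_MONTHLY_USD.items()}


for multiplier in (1, 5, 10):
    costs = scaled_costs(multiplier)
    gap = costs["One per message"] - costs["Current approach"]
    print(f"Users x{multiplier}: {costs}, gap ${gap:,} / month")
```

If heavier or lighter usage per user is expected as the app grows, the multiplier would need to be applied to call volume rather than to user count.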

4. Other Impacts of Splitting

Dimension	Benefits	Drawbacks
User interaction	Can trigger "generate next section" on user tap	1-3 second wait per section breaks flow
Stability	Each call is simpler, less likely to produce format errors	Any single failure breaks the experience; overall failure rate rises from 0.51% to ~1-2% across 9 calls
Flexibility	Different sections can use different models or parameters	Added complexity
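The overall failure estimate depends on the per-call failure rate after splitting, which is not stated here. As a sketch, assuming calls fail independently (an assumption, not a measured property), the compound rate and the per-call rate implied by a target overall rate can be computed as follows:

```python
def compound_failure_rate(per_call_failure: float, num_calls: int) -> float:
    """Probability that at least one of num_calls independent calls fails."""
    return 1 - (1 - per_call_failure) ** num_calls


def implied_per_call_rate(overall_failure: float, num_calls: int) -> float:
    """Per-call failure rate that yields the given overall rate across num_calls."""
    return 1 - (1 - overall_failure) ** (1 / num_calls)


# If each of 9 calls kept the current 0.51% failure rate:
print(f"{compound_failure_rate(0.0051, 9):.2%}")

# Per-call rates implied by an overall rate of 1% and 2% across 9 calls:
print(f"{implied_per_call_rate(0.01, 9):.3%}", f"{implied_per_call_rate(0.02, 9):.3%}")
```

Under this model, nine calls at the unchanged 0.51% rate would compound to about 4.5%. The ~1-2% estimate corresponds to a per-call rate of roughly 0.11-0.22%, which fits the expectation that simpler calls fail less often.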

5. Recommendations

Option A (recommended): Keep 1 request + client-side paced reveal

Backend stays as-is: 1 AI request fires immediately after card draw
After generation completes, the frontend hides the content and lets users tap to reveal each section
Reveal reads from already-generated local data — no waiting
Tap-to-reveal animations, sound effects, and haptics all work with this approach
Zero additional AI cost
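The reveal logic in Option A needs no network access once the single response has arrived. The sketch below illustrates only the state handling, in Python for consistency with the other examples; a real client would implement it in the app's own frontend language. The class name `PacedReveal`, its method, and the section labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PacedReveal:
    """Holds an already-generated reading and reveals one section per tap."""

    sections: list[str]   # full content from the single AI response
    revealed: int = 0     # number of sections shown so far

    def tap(self) -> Optional[str]:
        """Return the next hidden section, or None when everything is shown."""
        if self.revealed >= len(self.sections):
            return None
        section = self.sections[self.revealed]
        self.revealed += 1
        return section


# Hypothetical section labels standing in for the generated content.
reading = PacedReveal(["summary", "card interpretations", "advice", "lucky stone"])
print(reading.tap())  # first tap shows the first section from local data
print(reading.tap())  # second tap shows the next one, with no AI call
```

Animations, sound effects, and haptics would hook into each `tap` on the client; none of them trigger an additional AI request.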

Option B: Split into 2 requests (basic + deep on demand)

Request 1: summary + brief card interpretations (fast first result)
Request 2: triggered only when user taps "Deep Reading" (advice, lucky stone, detailed analysis)
Worst-case monthly cost: +$52 (+17% of the $310 app total), reached only if every reading triggers request 2
Only users who tap "Deep Reading" incur request 2, so the realistic average increase is ~10-15%

Not recommended: One request per message

Monthly cost +$424 (+137% of the $310 app total)
User waits 1-3 seconds on every tap
Overall failure rate increases

Data source: production database, all historical data (2026-02-22 to 2026-04-19). Model: Gemini 3 Flash Preview. Pricing: Vertex AI official rates.
