Done: Full-character face-swap pipeline with FLUX + PuLID + style LoRA, validated on 5 ethnicities × 34 pages of book T16.
Quality: Asian and Blonde are production-ready. Indian is medium. Black needs a structural fix.
Main blocker: Style LoRA v4 was trained mostly on light-skinned children, so it fights dark skin tones. A more diverse dataset is needed.
Needed from the artist: (1) accurate per-page person-region masks, (2) additional dataset images with diverse skin tones, (3) confirmation of the template variants strategy.
👉 View all 6 diagrams here → (Mermaid live render, zoomable, version-controlled)
Full detailed pipeline doc: 📄 AI Pipeline Iter7 (HTML)
Note: the iter7 flow differs substantially from iter6/i5. The old diagrams are outdated; use the link above to see the current state.
Processing flow for one book page for one customer:
| Component | Tool | Status |
|---|---|---|
| Base model | FLUX.1-dev (23GB) | ✓ Production |
| Identity transfer | PuLID-Flux v0.9.1 | ✓ Production |
| Style adapter | FoxeTales LoRA v4 (1000 steps) | ⚠ Needs retraining with diverse data |
| Workflow engine | ComfyUI | ✓ Production |
| GPU | RTX A6000 48GB (Vast.ai) | ✓ $0.43/h, ~30s/page |
| Repo | github.com/Foxetales/ai-infra | ✓ Private, all artifacts |
136 renders completed (34 pages × 4 ethnicities). Below are 6 representative pages × 4 ethnicities (skin-shift post-processing already applied to the Black row):
Total batch cost: ~$0.50 GPU compute, ~35 min runtime.
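As a rough cross-check of the batch figures against the A6000 numbers in the component table, the sketch below estimates GPU time and cost from the render count, seconds per render, and hourly rate. The function name is hypothetical, and it assumes renders run back to back on a single GPU with no idle time, which the source does not state.

```python
def batch_gpu_estimate(renders: int, seconds_per_render: float, usd_per_hour: float):
    """Return (gpu_minutes, usd_cost) for renders run sequentially on one GPU.

    Assumption: no cold start, idle time, or retries are included.
    """
    gpu_seconds = renders * seconds_per_render
    gpu_minutes = gpu_seconds / 60
    cost = gpu_seconds / 3600 * usd_per_hour
    return round(gpu_minutes, 1), round(cost, 2)


# 34 pages x 4 ethnicities, ~30 s/page, $0.43/h (values from the tables above)
minutes, cost = batch_gpu_estimate(renders=34 * 4, seconds_per_render=30, usd_per_hour=0.43)
print(f"{minutes} GPU-minutes, ${cost}")
```

At ~30 s per render this comes to about 68 GPU-minutes and about $0.49, close to the ~$0.50 quoted; the ~35 min runtime figure would correspond to roughly 15 s per render.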
| Approach | Pro | Con | Recommendation |
|---|---|---|---|
| Face-only mask | Safe, does not break pose | Personalization is shallow on full-page art | ❌ Reject (per Chi) |
| Full-character mask | Identity + body skin match | Breaks pose on risky pages | ✓ Use for 80% of pages |
| Template variants per gender/hair | UX clear, no AI guess | 4× art cost per page | ❌ Reject (per Chi: already being done, does not reduce workload) |
| LoRA fine-tune diverse skin tones | Structural fix for Black/Indian quality | 1-2h training + needs diverse dataset | ⚠ Phase 2 |
Detailed cost model: 📄 Full cost analysis (VI)
| Item | Cost | Note |
|---|---|---|
| GPU compute / 1 order (35 pages + 50 previews + regen) | ~$1.61 | RunPod A100 80GB serverless, ~51 min GPU time |
| + 25% buffer (cold start, idle, retry) | $0.40 | Production safety margin |
| + Storage + DB + CDN | $0.06 | R2 + Workers KV + Sentry |
| Total infra cost / order | ~$2.07 (~53,000 VND) | Stable across scale |
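The per-order total can be reproduced from the three line items. The sketch below is a minimal check using only the table's values; the function and constant names are hypothetical, and the hourly rate it prints is implied by the table (cost divided by GPU time), not a quoted RunPod price.

```python
GPU_COST_PER_ORDER = 1.61     # USD, from the table
GPU_MINUTES_PER_ORDER = 51    # minutes of GPU time, from the table
BUFFER_RATE = 0.25            # cold start, idle, retry
STORAGE_DB_CDN = 0.06         # USD, R2 + Workers KV + Sentry


def order_cost(gpu_cost: float, buffer_rate: float, fixed_services: float) -> float:
    """Total infra cost per order: GPU + percentage buffer + fixed services."""
    buffer = round(gpu_cost * buffer_rate, 2)
    return round(gpu_cost + buffer + fixed_services, 2)


def implied_hourly_rate(gpu_cost: float, gpu_minutes: float) -> float:
    """Hourly GPU rate implied by a cost and a duration in minutes."""
    return round(gpu_cost / (gpu_minutes / 60), 2)


total = order_cost(GPU_COST_PER_ORDER, BUFFER_RATE, STORAGE_DB_CDN)
rate = implied_hourly_rate(GPU_COST_PER_ORDER, GPU_MINUTES_PER_ORDER)
print(f"total per order: ${total}, implied GPU rate: ${rate}/h")
```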
| Orders / month | 1,000 | 2,500 | 5,000 | 7,500 | 10,000 |
|---|---|---|---|---|---|
| GPU hours (estimate) | 854 | 2,134 | 4,268 | 6,402 | 8,536 |
| GPU cost (raw) | $1,614 | $4,034 | $8,067 | $12,101 | $16,134 |
| + buffer + storage + Cloudflare + monitoring | $503 | $1,185 | $2,291 | $3,426 | $4,552 |
| Total infra / month | ~$2,117 | ~$5,219 | ~$10,358 | ~$15,527 | ~$20,686 |
| Cost / order | $2.12 | $2.09 | $2.07 | $2.07 | $2.07 |
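The scaling table can be checked row by row. The sketch below takes the GPU hours, raw GPU cost, and overhead rows as given inputs (the overhead row is copied from the table, not derived) and recomputes the monthly total, the cost per order, and the implied GPU time and rate. The names are hypothetical.

```python
SCENARIOS = [
    # (orders, gpu_hours, gpu_cost_usd, overhead_usd), copied from the table
    (1_000, 854, 1_614, 503),
    (2_500, 2_134, 4_034, 1_185),
    (5_000, 4_268, 8_067, 2_291),
    (7_500, 6_402, 12_101, 3_426),
    (10_000, 8_536, 16_134, 4_552),
]


def summarize(orders: int, gpu_hours: int, gpu_cost: int, overhead: int) -> dict:
    """Recompute the derived rows of the scaling table for one scenario."""
    total = gpu_cost + overhead
    return {
        "orders": orders,
        "total_usd": total,
        "cost_per_order": round(total / orders, 2),
        "gpu_minutes_per_order": round(gpu_hours * 60 / orders, 1),
        "gpu_usd_per_hour": round(gpu_cost / gpu_hours, 2),
    }


summaries = [summarize(*row) for row in SCENARIOS]
for s in summaries:
    print(s)
```

Under these inputs every scenario works out to about 51 GPU-minutes per order at about $1.89 per GPU-hour, which is why the cost per order stays nearly flat as volume grows.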
| Optimization | Saving | Implementation effort |
|---|---|---|
| FLUX fp8 quantization (50% memory, 2x speed) | ~40% GPU cost ↓ ($2.07 → $1.30) | 1 day |
| Step distillation (25 → 8 steps) | ~30% additional ↓ | 2-3 days |
| Batch processing (multi-page parallel) | ~15% ↓ (cold start amortization) | 1 day |
| Cache warm models (avoid 30s reload) | ~10% ↓ | 0.5 day |
Realistic Phase 2 target: cost per order of ~$1.20-1.40 (roughly a 32-42% reduction from $2.07) within 1 week of optimization work.
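To relate the target range to the current baseline, the sketch below computes the percentage reduction from $2.07 for a few target costs. The helper name is hypothetical; the $1.30 case is the figure the table gives for fp8 quantization alone.

```python
BASELINE = 2.07  # current infra cost per order, USD


def reduction_pct(baseline: float, target: float) -> float:
    """Percentage reduction going from baseline to target, one decimal place."""
    return round((baseline - target) / baseline * 100, 1)


for target in (1.20, 1.30, 1.40):
    print(f"${target:.2f}/order -> {reduction_pct(BASELINE, target)}% below ${BASELINE}")
```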
Industry benchmark: Wonderbly sells personalized books at $40-60. Cost of goods (printing + shipping + AI) is typically 30-40% of the price, i.e. $12-24.
FoxeTales infra cost: ~$2/order, roughly 3-5% of revenue at a comparable $40-60 price (a very healthy margin for the AI component).
Breakeven scale: Even at 100 orders/month (~$210 infra), the business model is viable. There is significant room for optimization at scale.
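The revenue share and the low-volume infra bill follow from simple arithmetic. The sketch below assumes FoxeTales prices in the same $40-60 range as the Wonderbly benchmark, which is an assumption for illustration; the function names are hypothetical.

```python
INFRA_COST_PER_ORDER = 2.07  # USD, total infra cost per order


def infra_share_pct(price_usd: float) -> float:
    """Infra cost as a percentage of the book's selling price."""
    return INFRA_COST_PER_ORDER / price_usd * 100


def monthly_infra_usd(orders: int) -> int:
    """Monthly infra bill at a flat per-order cost, rounded to whole dollars."""
    return round(orders * INFRA_COST_PER_ORDER)


for price in (40, 60):
    print(f"${price} book: infra is {infra_share_pct(price):.1f}% of revenue")
print(f"100 orders/month: ${monthly_infra_usd(100)} infra")
```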