🎨 FoxeTales AI Pipeline: Iter 7 (Current State)

Updated: 2026-05-28 | Latest from production code at github.com/Foxetales/ai-infra/workflows/faceswap_fullbody_v1.json

📥 Inputs (per render job)

1. Customer photo: child's face reference (JPG/PNG)
2. Template assets (artist-prepared, per page)
3. Ethnicity metadata: text prompt anchor (e.g. "Vietnamese Asian")

⚙️ Processing pipeline

1. PuLID identity extraction
Customer photo → InsightFace AntelopeV2 face detection → EVA-CLIP embedding → PuLID-Flux v0.9.1 identity vector
Output: 1085MB PuLID model loaded, identity embedding ~512-dim
↓
2. Mask preparation
Person mask → GrowMask(+2) → FeatherMask(12px) → smooth person-region mask for inpaint
Iter7 change: switched from face-only mask to person-region mask (Chi's feedback fix)
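
The grow-then-feather idea in step 2 can be sketched in pure Python. This is only an illustration of the concept, not the ComfyUI GrowMask / FeatherMask implementation: the function names are hypothetical, growth is approximated with a 3x3 max filter per pixel of growth, and feathering with a box blur. The demo uses a feather radius of 1 on a tiny 9x9 plate rather than the 12px used in production.

```python
# Illustrative stand-ins for the grow and feather steps (hypothetical helpers,
# not the ComfyUI node code). Masks are lists of rows.

def grow_mask(mask, pixels):
    """Dilate a binary mask (0/1) by `pixels`, one 3x3 max-filter pass per pixel."""
    h, w = len(mask), len(mask[0])
    for _ in range(pixels):
        grown = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                grown[y][x] = max(
                    mask[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))
                )
        mask = grown
    return mask


def feather_mask(mask, radius):
    """Soften mask edges with a (2*radius+1) box blur; returns floats in [0, 1]."""
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[ny][nx]
                for ny in range(max(0, y - radius), min(h, y + radius + 1))
                for nx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = sum(window) / len(window)
    return out


# Tiny demo: a single "person" pixel in a 9x9 plate.
mask = [[0] * 9 for _ in range(9)]
mask[4][4] = 1
grown = grow_mask(mask, 2)     # the pixel grows into a 5x5 block
soft = feather_mask(grown, 1)  # interior stays 1.0, edges ramp down toward 0.0
```

Growing before feathering keeps the fully opaque core at least as large as the original person region, so the soft ramp falls mostly on the background side of the edge.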
↓
3. Conditioning
Text encoders: CLIP-L + T5-xxl-fp16 (9.2GB)
Positive prompt: "fxtl style watercolor, {ethnicity} child, full body, skin tone consistent across face arms and legs, FLAT 2D, Foxetales storybook"
Negative: "photograph, realistic, 3d, semi-realistic, beauty filter"
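
A minimal sketch of how the ethnicity metadata could be substituted into the positive prompt. The prompt strings are copied from the text above; the constant names and the `build_prompts` helper are hypothetical and not taken from the production workflow.

```python
# Prompt strings come from the step above; names are illustrative only.
POSITIVE_TEMPLATE = (
    "fxtl style watercolor, {ethnicity} child, full body, "
    "skin tone consistent across face arms and legs, "
    "FLAT 2D, Foxetales storybook"
)
NEGATIVE_PROMPT = "photograph, realistic, 3d, semi-realistic, beauty filter"


def build_prompts(ethnicity):
    """Return (positive, negative) prompts for one render job."""
    ethnicity = ethnicity.strip()
    if not ethnicity:
        raise ValueError("ethnicity metadata is required as the prompt anchor")
    return POSITIVE_TEMPLATE.format(ethnicity=ethnicity), NEGATIVE_PROMPT


positive, negative = build_prompts("Vietnamese Asian")
```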
↓
4. FLUX KSampler
FLUX.1-dev (23GB) + FoxeTales LoRA v4 (strength 0.7) + PuLID identity injection
Iter7 locked config: sampler euler, scheduler beta, steps 25, denoise 0.55 (preserves pose; the face-only Iter6 setup used 0.85)
SetLatentNoiseMask = person-region mask
Output: 1024×1024 latent regenerated in person region only
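
One way to keep the locked values from drifting is to hold them in an immutable config object. The sketch below assumes a hypothetical `SamplerConfig` class whose field names simply mirror the labels in this step; it is not the structure used in the workflow JSON.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerConfig:
    """Hypothetical container for the locked sampler settings."""
    sampler: str
    scheduler: str
    steps: int
    denoise: float
    lora_strength: float

    def __post_init__(self):
        # Basic sanity checks so a typo cannot silently change the render.
        if self.steps <= 0:
            raise ValueError("steps must be positive")
        if not 0.0 < self.denoise <= 1.0:
            raise ValueError("denoise must be in (0, 1]")


# Values taken from the Iter7 locked config above.
ITER7 = SamplerConfig(
    sampler="euler", scheduler="beta", steps=25, denoise=0.55, lora_strength=0.7
)
```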
↓
5. VAE decode
FLUX VAE (ae.safetensors, 320MB) → decode latent to RGB image (1024×1024)
↓
6. Composite
ImageCompositeMasked with feathered person mask → blend regenerated character into original background plate
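
The blend in step 6 amounts to a per-pixel linear interpolation, out = mask × regenerated + (1 − mask) × background. The pure-Python sketch below illustrates that formula only; `composite_masked` is a hypothetical helper, not the ComfyUI node, and it assumes same-sized images stored as rows of (r, g, b) tuples with a soft mask in [0, 1].

```python
def composite_masked(background, regenerated, mask):
    """Blend two same-sized RGB images using a soft mask in [0, 1]."""
    out = []
    for bg_row, fg_row, m_row in zip(background, regenerated, mask):
        out.append([
            # mask 1.0 keeps the regenerated pixel, 0.0 keeps the background.
            tuple(round(m * f + (1.0 - m) * b) for f, b in zip(fg, bg))
            for bg, fg, m in zip(bg_row, fg_row, m_row)
        ])
    return out


# One-row demo: background only, 50/50 blend, regenerated only.
background = [[(200, 200, 200)] * 3]
regenerated = [[(100, 50, 0)] * 3]
mask = [[0.0, 0.5, 1.0]]
result = composite_masked(background, regenerated, mask)
```

The feathered mask matters here: intermediate mask values produce a gradual transition instead of a hard seam around the character.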
↓
7. Skin shift postproc (Black/dark ethnicity only)
pipeline/skin_shift.py: YCbCr skin detection × person_mask → multiply skin pixels by a per-channel RGB scale
Presets (R/G/B scale): asian/blonde (no-op), indian (.85/.75/.65), black (.65/.50/.42), black_deep (.55/.42/.35)
WHY: Style LoRA trained on light-skin children → fights dark-skin signal. Postproc bypasses LoRA bias.
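
A rough per-pixel sketch of the skin shift idea, written without access to pipeline/skin_shift.py. The preset values are copied from the text; everything else is an assumption: the function names are hypothetical, the Cb/Cr skin box is a commonly used heuristic rather than the script's actual thresholds, and weighting the scale by the person-mask value is one plausible reading of "skin detection × person_mask".

```python
# RGB scale presets copied from the text; "asian" and "blonde" are no-ops.
PRESETS = {
    "asian": (1.0, 1.0, 1.0),
    "blonde": (1.0, 1.0, 1.0),
    "indian": (0.85, 0.75, 0.65),
    "black": (0.65, 0.50, 0.42),
    "black_deep": (0.55, 0.42, 0.35),
}


def rgb_to_cbcr(r, g, b):
    """Full-range BT.601 (JPEG-style) chroma components for 8-bit RGB."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def is_skin(r, g, b):
    """ASSUMPTION: a generic Cb/Cr skin box, not the thresholds in skin_shift.py."""
    cb, cr = rgb_to_cbcr(r, g, b)
    return 77 <= cb <= 127 and 133 <= cr <= 173


def shift_pixel(rgb, mask_value, preset):
    """Scale a skin pixel by the preset, weighted by the person mask (0..1)."""
    if not is_skin(*rgb):
        return rgb  # non-skin pixels (clothes, background) are left untouched
    scale = PRESETS[preset]
    return tuple(
        round(c * (1.0 - mask_value + mask_value * s)) for c, s in zip(rgb, scale)
    )


light_skin = (240, 180, 140)
shifted = shift_pixel(light_skin, 1.0, "black")
```

Because the shift runs on the decoded image after compositing, it does not depend on the LoRA producing the right tone, which is the point of the WHY note above.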
↓
📤 Output: Final personalized page (1024×1024 PNG, ~1MB) ready for book layout / preview

📊 Performance characteristics

🔄 Iter7 changes vs previous

| Component | Iter6 (face-only) | Iter7 (current) |
| --- | --- | --- |
| Mask type | Face bounding box | Person region (alpha-derived) |
| KSampler denoise | 0.85 | 0.55 (preserve pose) |
| Grow / Feather | None | +2 / 12px (smooth body edges) |
| Prompt skin anchor | Just "{ethnicity}" | "{ethnicity}" + "skin tone consistent across face arms and legs" |
| Postproc | None | YCbCr skin shift for dark ethnicities |
| ControlNet | Planned | ⚠ Not yet, Phase 2 (needed for pose-sensitive pages) |

🚧 Known limitations (need Phase 2 work)