Projects

What I've built.

Each project below is something I built myself. These aren't tutorials or demos: they're working systems, most of them running in production or deployed to real infrastructure, and each entry notes its current status.

F5-TTS Voice Cloning

Custom voice synthesis on serverless GPU

Fine-tuned the F5-TTS model on ~2 hours of custom voice recordings for production-quality voice cloning. Deployed on serverless GPU infrastructure with a custom Docker container, CUDA 12.1, and an API supporting multiple output formats. Discovered and documented critical, previously undocumented reference audio requirements that eliminate common synthesis artifacts.

PyTorch · F5-TTS · CUDA 12.1 · Docker · RunPod Serverless · FFmpeg
Production · Deployed on RunPod
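
To make the "multiple output formats" part concrete, here is a minimal sketch of how a synthesized WAV could be handed to FFmpeg for conversion. It is not the deployed code: the format table, the `build_ffmpeg_command` helper, and the `_out` naming are assumptions for illustration, and the formats the real API supports are not listed here.

```python
from pathlib import PurePosixPath

# Hypothetical format table (assumption): maps a requested format to FFmpeg
# audio-codec arguments. The real endpoint's supported formats may differ.
FFMPEG_CODEC_ARGS = {
    "wav": ["-c:a", "pcm_s16le"],
    "mp3": ["-c:a", "libmp3lame", "-b:a", "192k"],
    "flac": ["-c:a", "flac"],
}


def build_ffmpeg_command(src: str, fmt: str) -> list[str]:
    """Return the FFmpeg argument list that converts `src` to `fmt`."""
    fmt = fmt.lower()
    if fmt not in FFMPEG_CODEC_ARGS:
        raise ValueError(f"unsupported output format: {fmt!r}")
    source = PurePosixPath(src)
    # Write next to the source under a new name so FFmpeg never has to
    # overwrite its own input file.
    dst = source.with_name(f"{source.stem}_out.{fmt}")
    return ["ffmpeg", "-y", "-i", src, *FFMPEG_CODEC_ARGS[fmt], str(dst)]


if __name__ == "__main__":
    print(build_ffmpeg_command("/tmp/clip.wav", "mp3"))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is installed.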

Qwen3-TTS Voice Cloning

1.7B-parameter model on serverless GPU

Deployed Qwen3-TTS (1.7 billion parameters) on serverless GPU as an additional voice cloning endpoint. Uses flash attention for efficient inference and stores generated audio in S3.

Qwen3-TTS · PyTorch · Flash Attention · Docker · RunPod Serverless · S3
Production · Deployed on RunPod

AI Video Generation

Text-to-video & image-to-video at 720p

Built and deployed the Wan2.2-TI2V-5B diffusion model (5 billion parameters) on serverless GPU for text-to-video and image-to-video generation. Produces 720p video at 24 fps in landscape or portrait, with configurable duration (2–5 seconds), guidance scale, and seed. Supports batch processing and optional S3 storage.

Wan2.2 · Diffusion Models · Docker · RunPod Serverless · CUDA · S3
Built · Previously deployed on RunPod
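
The configurable parameters above map naturally onto a validated request object. The sketch below is one way that could look, not the deployed handler: the `VideoRequest` class, its field names, the default guidance scale, and the exact 1280x720 / 720x1280 pixel sizes are all assumptions (the model itself may need slightly different dimensions or frame counts).

```python
from dataclasses import dataclass
from typing import Optional

FPS = 24  # frame rate stated in the project description

# Assumed pixel sizes for "720p"; the model may require other dimensions.
RESOLUTIONS = {"landscape": (1280, 720), "portrait": (720, 1280)}


@dataclass(frozen=True)
class VideoRequest:
    """Hypothetical request shape for a text-to-video job."""

    prompt: str
    orientation: str = "landscape"
    duration_s: float = 5.0
    guidance_scale: float = 5.0  # assumed default, not taken from the project
    seed: Optional[int] = None   # None means "pick a random seed"

    def __post_init__(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.orientation not in RESOLUTIONS:
            raise ValueError(f"unknown orientation: {self.orientation!r}")
        if not 2 <= self.duration_s <= 5:
            raise ValueError("duration_s must be between 2 and 5 seconds")
        if self.guidance_scale <= 0:
            raise ValueError("guidance_scale must be positive")

    @property
    def size(self) -> tuple[int, int]:
        """(width, height) in pixels for the requested orientation."""
        return RESOLUTIONS[self.orientation]

    @property
    def num_frames(self) -> int:
        """Number of frames to generate at the fixed frame rate."""
        return round(self.duration_s * FPS)


if __name__ == "__main__":
    req = VideoRequest("a lighthouse at dusk", orientation="portrait", duration_s=3)
    print(req.size, req.num_frames)
```

Validating up front keeps malformed jobs from ever reaching the GPU worker, where a failed run is far more expensive than a rejected request.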

BookForge

AI-powered book creation for Amazon KDP

End-to-end non-fiction book creation pipeline. Two-phase workflow: AI-driven market research and niche discovery, then chapter-by-chapter generation with citations and KDP-formatted DOCX export. Uses Claude for writing and Perplexity for real-time research, with a structured service layer pattern for each stage.

Python · Django · Claude API · Perplexity API · python-docx · PostgreSQL
Built · Not yet deployed

SlideFlow

AI presentation generation from articles

Converts long-form articles into structured slide decks using a strict 11-type JSON schema. Generates content slides, scripture layouts, key points, two-column comparisons, and image placements, each with speaker notes. Image frequency, detail level, and target audience are configurable. Outputs to PowerPoint via python-pptx.

Python · Django · Claude API · JSON Schema · python-pptx
Production · Integrated in AssetFlow
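
As a rough illustration of what "strict" means here, the sketch below checks a generated deck before it is rendered. It is an assumption-laden stand-in for the real schema: the type identifiers, the `slides` and `speaker_notes` keys, and the `validate_deck` helper are hypothetical, and only the five slide types named above are included (the remaining six are not listed on this page).

```python
# Hypothetical identifiers for the five slide types named in the description.
NAMED_SLIDE_TYPES = {"content", "scripture", "key_points", "two_column", "image"}


def validate_deck(deck: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    slides = deck.get("slides")
    if not isinstance(slides, list) or not slides:
        return ["deck must contain a non-empty 'slides' list"]

    errors = []
    for i, slide in enumerate(slides):
        if not isinstance(slide, dict):
            errors.append(f"slide {i}: must be an object")
            continue
        slide_type = slide.get("type")
        if slide_type not in NAMED_SLIDE_TYPES:
            errors.append(f"slide {i}: unknown type {slide_type!r}")
        notes = slide.get("speaker_notes")
        # Every slide is expected to carry speaker notes.
        if not isinstance(notes, str) or not notes.strip():
            errors.append(f"slide {i}: missing speaker_notes")
    return errors


if __name__ == "__main__":
    deck = {
        "slides": [
            {"type": "key_points", "speaker_notes": "Summarise the argument."},
            {"type": "chart", "speaker_notes": ""},
        ]
    }
    for problem in validate_deck(deck):
        print(problem)
```

Rejecting a deck at this stage means the rendering code only ever sees slide types it knows how to lay out.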

NLP Translation Pipeline

30,000+ verses with morphological analysis

AI-assisted word-by-word translation pipeline processing 30,286+ verses across 66 Biblical books (Hebrew OT and Greek NT) with full morphological analysis. Processed 444,785+ individual words with morphological tagging, etymology, Strong's cross-references, and interlinear formatting. Built a structured lexicon database with 475K+ SEO-friendly pages and etymological comparisons across 15 Niger-Congo Bantu languages.

Python · Django · Perplexity API · PostgreSQL · NLP
Production · 66 books complete
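
For a sense of what a word-level record with morphological tagging and interlinear output might look like, here is a small sketch. The `WordEntry` class, its field names, and the column layout are assumptions, not the pipeline's actual data model; the two sample words (the opening of John 1:1) are included purely as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordEntry:
    """Hypothetical per-word record; field names are illustrative only."""

    surface: str     # word as it appears in the source text
    strongs: str     # Strong's number used as a cross-reference
    morphology: str  # morphological tag
    gloss: str       # English gloss


def format_interlinear(words: list[WordEntry]) -> str:
    """Render one line per word in aligned columns."""
    rows = [(w.surface, w.strongs, w.morphology, w.gloss) for w in words]
    # Pad each column to its widest cell so the columns line up.
    widths = [max(len(row[col]) for row in rows) for col in range(4)]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    )


if __name__ == "__main__":
    sample = [
        WordEntry("Ἐν", "G1722", "PREP", "in"),
        WordEntry("ἀρχῇ", "G746", "N-DSF", "beginning"),
    ]
    print(format_interlinear(sample))
```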

Pocket TTS Integration

CPU-based TTS at 6x real-time

Integrated Kyutai Labs' 100M-parameter Pocket TTS model as a lightweight, CPU-only text-to-speech option for AssetFlow. Runs 6x faster than real time on consumer hardware using only 2 CPU cores, with ~200ms latency to the first audio chunk. Supports voice cloning and handles text input of any length.

PyTorch · Pocket TTS · CPU Inference · Voice Cloning
Production · Integrated in AssetFlow
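
The "6x real-time" figure is easiest to read as arithmetic: synthesis time is audio duration divided by the speedup. The helper below is only a back-of-the-envelope sketch with a hypothetical function name, and it treats the 6x figure as a constant, which real workloads will only approximate.

```python
def synthesis_seconds(audio_seconds: float, speedup: float = 6.0) -> float:
    """Estimate wall-clock synthesis time for a clip of the given length."""
    if audio_seconds < 0 or speedup <= 0:
        raise ValueError("audio_seconds must be >= 0 and speedup must be > 0")
    return audio_seconds / speedup


if __name__ == "__main__":
    # One minute of speech at 6x real time takes about ten seconds to generate.
    print(synthesis_seconds(60.0))  # 10.0
```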

See the code.

Browse my public repositories on GitHub or get in touch to discuss a project.