Case Study 04 /

KAIBER LABS X YAEJI

Role

Creative Technologist

While the larger Kaiber team was in the early days of developing Superstudio, Labs deployed on the ground through a client partnership with a clear goal: ship a professional visual set built entirely out of generative media and post-processing. This wasn't a proof of concept or a demo reel. It was a live show projected across the Brooklyn Mirage's massive video wall, with a real audience and a real artist's creative vision on the line.

The challenge was that the techniques we needed barely existed yet. Video models and keyframe-driven animation flows were still nascent, and character consistency, motion interpolation, and aesthetic control across a full hour of content were all unsolved problems.

We built custom workflows from scratch, combining multiple AI models and traditional post-processing tools to solve each production challenge individually.

Keyframe-driven character animation was the core problem. We needed Woofa to move consistently across dozens of scenes while preserving her signature face and body constraints. The solution was a pipeline that generated stylized keyframes using custom reference images in Midjourney and SDXL, then synthesized coherent motion between start and end keyframes using Luma DreamMachine.
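The keyframe step can be sketched as a simple orchestration problem: pair consecutive stylized stills, then hand each pair to a start/end-frame video model. The sketch below is hypothetical; none of these names come from Luma's or Midjourney's real APIs, and it only models the job-building logic, not the generation itself.

```python
from dataclasses import dataclass

@dataclass
class Keyframe:
    scene: str
    path: str  # path to a stylized still (e.g. a Midjourney/SDXL output)

def pair_keyframes(frames):
    """Group an ordered list of keyframes into (start, end) pairs,
    one pair per clip the video model will interpolate."""
    return [(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]

def build_clip_jobs(frames, seconds=5):
    """Turn keyframe pairs into job specs for a start/end-frame video
    model (a hypothetical stand-in for the Luma DreamMachine step).
    Pairs within one scene are marked loopable."""
    return [
        {"start": a.path, "end": b.path,
         "duration_s": seconds, "loop": a.scene == b.scene}
        for a, b in pair_keyframes(frames)
    ]

frames = [
    Keyframe("intro", "kf_001.png"),
    Keyframe("intro", "kf_002.png"),
    Keyframe("drop", "kf_003.png"),
]
jobs = build_clip_jobs(frames)
# 3 keyframes -> 2 five-second clip jobs; the first pair stays in
# one scene, so it is flagged as loopable
```

In production this mapping was many-to-many (115 core keyframes yielding 355 clips), since single keyframes were reused across extended and looped clips; the sketch shows only the simplest one-pair-per-clip case.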

We tested four different video model pipelines before landing on this approach: ToonCrafter, AnimateDiff via ComfyUI, Viggle, and Luma DreamMachine. Each had tradeoffs. ToonCrafter handled tweening for sequential keyframes but had major aesthetic limitations. ComfyUI offered longer outputs but degraded over time. Viggle had innovative motion input but produced inherently memetic results. Luma emerged as the clear winner with state-of-the-art video logic, consistency, and 5-second outputs that could extend or loop.

Woofamoji was a secondary deliverable: a full custom set of iOS-styled emojis built around Woofa's likeness. We developed a pipeline using Midjourney for stencil generation, Photoshop for face compositing and cleanup, ComfyUI with an SDXL emoji LoRA for stylization, and Real-ESRGAN for upscaling. This pipeline was significantly faster and more efficient at scale than traditional 3D modeling or illustration approaches.
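Conceptually, the Woofamoji workflow was a fixed chain of stages, each standing in for a manual or tool-specific step. A minimal sketch, with entirely hypothetical function names (the real stages ran in Midjourney, Photoshop, ComfyUI, and Real-ESRGAN, not in Python):

```python
def run_pipeline(image, stages):
    """Thread an image record (here just a dict of metadata)
    through each stage in order."""
    for stage in stages:
        image = stage(image)
    return image

def stencil(img):    # Midjourney: rough emoji stencil from a prompt
    return {**img, "steps": img["steps"] + ["stencil"]}

def composite(img):  # Photoshop: Woofa face compositing and cleanup
    return {**img, "steps": img["steps"] + ["composite"]}

def stylize(img):    # ComfyUI + SDXL emoji LoRA: iOS-emoji look
    return {**img, "steps": img["steps"] + ["stylize"]}

def upscale(img):    # Real-ESRGAN: 4x upscale to final resolution
    return {**img, "steps": img["steps"] + ["upscale"],
            "scale": img.get("scale", 1) * 4}

emoji = run_pipeline({"name": "woofa_wave", "steps": []},
                     [stencil, composite, stylize, upscale])
# emoji["steps"] == ["stencil", "composite", "stylize", "upscale"]
```

Treating the pipeline as an ordered list of stages is what made it fast at scale: each new emoji reran the same chain with a different input, rather than starting a fresh illustration or 3D model.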


Weirdcore handled additional visual processing through Max MSP, layering the generative outputs into the final show visuals.

The final production totaled 115 core keyframes, 133 secondary keyframes, and 355 five-second clips generated from those keyframes, roughly 26 minutes of generated video that was composited and processed into the full hour-long set.

This project was formative. The problems we solved here (character consistency across generative video, motion control between keyframes, scaling aesthetic quality across long-form content) directly informed how Superstudio was designed. Working at the intersection of real creative production and emerging AI tools made it clear that the gap between what these models could do and what creators actually needed was a product problem, not just a research problem. That realization shaped the trajectory of my work at Kaiber and my eventual move into product.