Job Description
We are seeking a ComfyUI specialist to build a high-precision, sequential video generation workflow. The objective is to create a 15-second video generated in three distinct 5-second segments.

The Vision: Unlike standard AI video generators that "guess" the motion, this workflow must allow me to provide a Start Frame and an End Frame for every 5-second block. This ensures the video doesn't just wander; it follows a precise path from Point A to Point B. The aesthetic must mimic the "Zack D. Films" style: clean 3D character models, clinical/educational lighting, and smooth, snappy animations. By generating the keyframes (T0, T5, T10, T15) before the video, we ensure total character consistency and professional-grade storytelling across the full 15 seconds.

2. Core Logic & Workflow Structure
The developer must build the pipeline to follow these four specific stages:

Stage 1: Keyframe Storyboarding
- A module to generate 4 primary images: 0s, 5s, 10s, and 15s.
- Must use IP-Adapter or Wan-StandIn logic to ensure the character, clothing, and environment are identical in all 4 images.
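To illustrate how the keyframe module could be driven, here is a rough Python sketch that queues the four keyframe renders against a locally running ComfyUI instance over its /prompt HTTP endpoint. The workflow file name, node IDs, and prompt texts are placeholders and assumptions; the real graph would also carry the shared IP-Adapter reference image.

```python
# Rough sketch: queue the four keyframe renders (T0, T5, T10, T15) against a
# locally running ComfyUI instance. "keyframe_workflow_api.json" is an assumed
# workflow exported from ComfyUI in API format; node IDs below are placeholders.
import copy
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188/prompt"   # default ComfyUI address
PROMPT_NODE_ID = "6"    # assumed ID of the positive-prompt CLIPTextEncode node
SAMPLER_NODE_ID = "3"   # assumed ID of the KSampler node

KEYFRAME_PROMPTS = {
    "T0":  "clean 3D character at a desk, clinical lighting, octane render",
    "T5":  "same character turning toward a monitor, clinical lighting, octane render",
    "T10": "same character pointing at a diagram, clinical lighting, octane render",
    "T15": "same character facing camera, arms crossed, clinical lighting, octane render",
}

with open("keyframe_workflow_api.json", "r", encoding="utf-8") as f:
    base_graph = json.load(f)

for tag, text in KEYFRAME_PROMPTS.items():
    graph = copy.deepcopy(base_graph)
    graph[PROMPT_NODE_ID]["inputs"]["text"] = text
    graph[SAMPLER_NODE_ID]["inputs"]["seed"] = 1234  # fixed seed aids consistency
    payload = json.dumps({"prompt": graph}).encode("utf-8")
    req = urllib.request.Request(COMFY_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        print(tag, resp.read().decode("utf-8"))
```

Keeping a fixed seed and a shared reference image across the four queued renders is what drives the identity lock between keyframes.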
Stage 2: Sequential Rendering (The "Sandwich" Method)
- Segment 1 (0-5s): Uses Image 0 as the Start and Image 5 as the End.
- Segment 2 (5-10s): Uses Image 5 as the Start and Image 10 as the End.
- Segment 3 (10-15s): Uses Image 10 as the Start and Image 15 as the End (this pairing is illustrated in the segment-plan sketch after the stage list).

Stage 3: Seamless Transitions & Smoothing
- Implement Color Match nodes to prevent "flicker" between segments (see the color-matching sketch after the stage list).
- Use VFI (Video Frame Interpolation) to bring the native 16fps output up to a "snappy" 60fps.

Stage 4: Automated Assembly
- Automatically stitch the three clips into a single high-bitrate .mp4 file.
3. Required Technical Stack (Models & Nodes)

Primary Video Model:
- Wan2.1 / Wan2.2 (FLF2V version): specifically the First-Last-Frame 14B or 1.3B models. This is non-negotiable, as it is the only open-source model capable of dual-image conditioning (Start and End frames).

Essential Custom Nodes:
- ComfyUI-WanVideoStartEndFrames: for the WanVideoStartEndFramesSampler.
- ComfyUI-WanVideoWrapper (Kijai): for model loading and VRAM optimization.
- ComfyUI-VideoHelperSuite (VHS): for video concatenation and saving.
- ComfyUI-KJNodes: for ColorMatch and frame interpolation.
- IP-Adapter-Plus: to lock character identity across the segments.
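If the final stitch ever needs to happen outside ComfyUI rather than through VHS, a lossless concatenation of the three segment files can be sketched with ffmpeg's concat demuxer (file names below are assumptions; the clips must share codec, resolution, and frame rate for stream copy to work):

```python
# Rough sketch: stitch the three rendered segments into one mp4 with ffmpeg's
# concat demuxer. Assumes ffmpeg is on PATH and the segments share encoding settings.
import subprocess

segment_files = ["segment_1.mp4", "segment_2.mp4", "segment_3.mp4"]  # assumed names

with open("concat_list.txt", "w", encoding="utf-8") as f:
    for path in segment_files:
        f.write(f"file '{path}'\n")

subprocess.run(
    ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
     "-i", "concat_list.txt", "-c", "copy", "final_15s.mp4"],
    check=True,
)
```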
4. Technical Requirements & Performance

- VRAM Efficiency: The workflow should be optimized for a 24GB VRAM environment (using FP8 or GGUF quantization where necessary).
- Zack D. Aesthetic: The workflow must include a prompt-engineering block (or LoRA loader) pre-configured for the "3D Medical Animation / Octane Render" look.
- Modularity: Each 5-second segment should be able to be "frozen" or "muted" so I can re-roll one segment without re-rendering the whole 15 seconds.
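The modularity requirement can be met with ComfyUI's native mute/bypass on each segment group, or with a small caching layer around whatever renders a segment. The sketch below is only an assumption about how that orchestration could look; render_segment is a hypothetical stand-in for the actual ComfyUI call.

```python
# Rough sketch of "freezing" segments: a segment is only re-rendered when the
# hash of its inputs (start/end keyframes, prompt, seed) changes.
import hashlib
import json
import os

CACHE_DIR = "segment_cache"
os.makedirs(CACHE_DIR, exist_ok=True)

def render_segment(start_image: str, end_image: str, prompt: str, seed: int, out_path: str) -> None:
    """Hypothetical stand-in for the actual ComfyUI segment render call."""
    raise NotImplementedError("wire this up to the Wan FLF2V workflow")

def segment_fingerprint(start_image: str, end_image: str, prompt: str, seed: int) -> str:
    """Stable hash of everything that influences a segment's output."""
    blob = json.dumps(
        {"start": start_image, "end": end_image, "prompt": prompt, "seed": seed},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:16]

def get_or_render(index: int, start_image: str, end_image: str, prompt: str, seed: int) -> str:
    key = segment_fingerprint(start_image, end_image, prompt, seed)
    out_path = os.path.join(CACHE_DIR, f"segment_{index}_{key}.mp4")
    if os.path.exists(out_path):
        print(f"Segment {index} is frozen (cache hit), skipping render.")
        return out_path
    render_segment(start_image, end_image, prompt, seed, out_path)  # hypothetical call
    return out_path
```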
5. Deliverables

1. A .json or .png workflow file that is color-coded and organized into clear groups.
2. A simple "Setup Guide" listing the specific models and LoRAs to download.
3. A Test Render: a 15-second demonstration video showing a character moving through the three segments with zero identity drift.