InfiniteTalk

High-fidelity talking head video generation with precise lip-sync and natural head motion.

GPU Min: 12GB

GPU Rec: 24GB

Disk Min: 20GB

Disk Rec: 60GB

ComfyUI: 1.37.11

Last Updated: 1/28/2026

Max frames (kjnodes): value = 500
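For a sense of what the 500-frame cap means in practice: at the 25 fps set in the VHS_VideoCombine node of this workflow, 500 frames is 20 seconds of video. The Python sketch below estimates how many frames a given audio clip needs and clamps the result to the cap. The helper name `frames_for_audio` is hypothetical and not part of the workflow, and the sketch assumes the clip is simply cut off at the cap, which the workflow itself may handle differently.

```python
import math

MAX_FRAMES = 500   # "Max frames" value from this workflow
FRAME_RATE = 25    # frame_rate from the VHS_VideoCombine node


def frames_for_audio(duration_s: float,
                     frame_rate: int = FRAME_RATE,
                     max_frames: int = MAX_FRAMES) -> int:
    """Hypothetical helper: frames needed to cover the audio, clamped to the cap."""
    if duration_s < 0:
        raise ValueError("duration must be non-negative")
    needed = math.ceil(duration_s * frame_rate)
    return min(needed, max_frames)


if __name__ == "__main__":
    print(MAX_FRAMES / FRAME_RATE)   # longest clip, in seconds
    print(frames_for_audio(7.5))     # a short clip, below the cap
    print(frames_for_audio(60.0))    # a long clip, clamped to the cap
```

Audio longer than 20 seconds would need either a higher "Max frames" value or splitting into several runs.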

negative_prompt (STRING): bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards

positive_prompt (STRING): The man speaking

LoadAudio (-core): audio file hello.mp3, output AUDIO

Width (kjnodes): value = 480

Height (kjnodes): value = 848

LoadImage (-core): image file useflow (1).png, outputs IMAGE and MASK

Create Video - InfiniteTalk (-core)

image, width, height, num_frames, audio, positive_prompt, negative_prompt

images, width, height, num_frames, positive_prompt, negative_prompt, model, lora, model_name, clip_name

VHS_VideoCombine (videohelpersuite)

Inputs: images, audio, meta_batch, vae

Output: Filenames

frame_rate: 25

loop_count: 0

filename_prefix: useflow/v

format: video/h264-mp4

pix_fmt: yuv420p

crf: 19

save_metadata: true

trim_to_audio: false

pingpong: false

save_output: true
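These output settings map fairly directly onto an ordinary H.264 encode. As a rough illustration only (VideoHelperSuite builds its own ffmpeg invocation, and this is not it), the Python sketch below assembles a comparable command line from the values above. The function name `build_ffmpeg_cmd`, the frame and file paths, and the AAC audio codec are assumptions made for the example. Settings such as loop_count, pingpong and save_metadata are not represented.

```python
import shlex
from typing import List


def build_ffmpeg_cmd(frames_pattern: str, audio_path: str, out_path: str) -> List[str]:
    """Hypothetical helper: an ffmpeg command similar in spirit to the node settings."""
    return [
        "ffmpeg",
        "-framerate", "25",       # frame_rate: 25
        "-i", frames_pattern,     # numbered image sequence (assumed input layout)
        "-i", audio_path,         # the audio track to mux in
        "-c:v", "libx264",        # format: video/h264-mp4
        "-pix_fmt", "yuv420p",    # pix_fmt: yuv420p
        "-crf", "19",             # crf: 19
        "-c:a", "aac",            # assumption: audio codec is not given on the page
        out_path,
    ]


if __name__ == "__main__":
    cmd = build_ffmpeg_cmd("frames/%05d.png", "hello.mp3", "useflow_v.mp4")
    print(shlex.join(cmd))        # printed only; nothing is executed here
```

A lower crf value gives higher quality and larger files; yuv420p is the pixel format most players expect for H.264.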

Readme

Useflow - Simple workflows that actually work

Model links

GGUF: Wan2.1-i2v 480p or 720p

InfiniteTalk: Wan2_1-InfiniteTalk_Single_xxxx.gguf

LORA: lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors

clip_vision: clip_vision_h.safetensors

text_encoders: umt5-xxl-enc-bf16.safetensors

vae: Wan2_1_VAE_bf16.safetensors

wav2vec2: wav2vec2-chinese-base_fp16.safetensors

MelBandRoFormer: MelBandRoformer_fp16.safetensors

Model Storage Location

📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │      ├── wan2.1-i2v-14b-480p_xxxx.gguf || wan2.1-i2v-14b-720p_xxxx.gguf
│   │      ├── Wan2_1-InfiniteTalk_Single_xxxx.gguf
│   │      └── MelBandRoformer_fp16.safetensors
│   ├── 📂 loras/
│   │      └── lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors
│   ├── 📂 clip_vision/
│   │      └── clip_vision_h.safetensors
│   ├── 📂 text_encoders/
│   │      └── umt5-xxl-enc-bf16.safetensors
│   ├── 📂 vae/
│   │      └── Wan2_1_VAE_bf16.safetensors
│   └── 📂 wav2vec2/
│          └── wav2vec2-chinese-base_fp16.safetensors
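Before loading the workflow it can help to confirm that every file sits in the folder shown above. The following Python sketch walks that layout and reports anything missing. The function name `missing_models` and the `EXPECTED` table are hypothetical; the `xxxx` placeholders from the listing are treated as wildcards, since the page does not name a specific quantization.

```python
from pathlib import Path
from typing import Dict, List

# Sub-folder of ComfyUI/models -> glob patterns taken from the tree above.
# "xxxx" in the listing is a placeholder, so it becomes "*" here.
EXPECTED: Dict[str, List[str]] = {
    "diffusion_models": [
        "wan2.1-i2v-14b-*.gguf",
        "Wan2_1-InfiniteTalk_Single_*.gguf",
        "MelBandRoformer_fp16.safetensors",
    ],
    "loras": ["lightx2v_I2V_14B_480p_cfg_step_distill_rank64_bf16.safetensors"],
    "clip_vision": ["clip_vision_h.safetensors"],
    "text_encoders": ["umt5-xxl-enc-bf16.safetensors"],
    "vae": ["Wan2_1_VAE_bf16.safetensors"],
    "wav2vec2": ["wav2vec2-chinese-base_fp16.safetensors"],
}


def missing_models(comfy_root: str) -> List[str]:
    """Return 'folder/pattern' entries that have no matching file on disk."""
    models_dir = Path(comfy_root) / "models"
    missing = []
    for folder, patterns in EXPECTED.items():
        folder_path = models_dir / folder
        for pattern in patterns:
            found = folder_path.is_dir() and any(folder_path.glob(pattern))
            if not found:
                missing.append(f"{folder}/{pattern}")
    return missing


if __name__ == "__main__":
    for entry in missing_models("ComfyUI"):
        print("missing:", entry)
```

Run it from the directory that contains `ComfyUI/`, or pass a different root path to `missing_models`.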
