The Kling Lip Sync Text to Video node synchronizes the mouth movements in a video with speech generated from a text prompt. It takes an input video, synthesizes speech from the provided text using the selected voice, and generates a new video in which the character's lip movements are aligned with that speech.

## Inputs

| Parameter | Description | Data Type | Required | Range |
| --- | --- | --- | --- | --- |
| `video` | Input video file for lip synchronization. Video must be between 720px and 1920px in height/width, between 2s and 10s in duration, and no larger than 100MB. | VIDEO | Yes | - |
| `text` | Text content for lip-sync video generation. Required when mode is text2video. Maximum length is 120 characters. | STRING | Yes | Up to 120 characters |
| `voice` | Voice selection for the lip-sync audio (default: "Melody"). Includes both English and Chinese voice options. | COMBO | No | "Melody"<br>"Sunny"<br>"Sage"<br>"Ace"<br>"Blossom"<br>"Peppy"<br>"Dove"<br>"Shine"<br>"Anchor"<br>"Lyric"<br>"Tender"<br>"Siren"<br>"Zippy"<br>"Bud"<br>"Sprite"<br>"Candy"<br>"Beacon"<br>"Rock"<br>"Titan"<br>"Grace"<br>"Helen"<br>"Lore"<br>"Crag"<br>"Prattle"<br>"Hearth"<br>"The Reader"<br>"Commercial Lady"<br>"阳光少年"<br>"懂事小弟"<br>"运动少年"<br>"青春少女"<br>"温柔小妹"<br>"元气少女"<br>"阳光男生"<br>"幽默小哥"<br>"文艺小哥"<br>"甜美邻家"<br>"温柔姐姐"<br>"职场女青"<br>"活泼男童"<br>"俏皮女童"<br>"稳重老爸"<br>"温柔妈妈"<br>"严肃上司"<br>"优雅贵妇"<br>"慈祥爷爷"<br>"唠叨爷爷"<br>"唠叨奶奶"<br>"和蔼奶奶"<br>"东北老铁"<br>"重庆小伙"<br>"四川妹子"<br>"潮汕大叔"<br>"台湾男生"<br>"西安掌柜"<br>"天津姐姐"<br>"新闻播报男"<br>"译制片男"<br>"撒娇女友"<br>"刀片烟嗓"<br>"乖巧正太" |
| `voice_speed` | Speech rate, accurate to one decimal place (default: 1.0). | FLOAT | No | 0.8 to 2.0 |
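
The limits on `text` and `voice_speed` in the table above can be checked before a request is queued. The following is a minimal sketch, not part of the node itself: the function name `validate_lip_sync_text_inputs` and the constants are hypothetical, and it assumes that the 120-character limit corresponds to Python's `len()` of the string, which the source does not confirm.

```python
# Hypothetical pre-check for the `text` and `voice_speed` inputs.
# Limits are taken from the Inputs table; the helper itself is illustrative only.
MAX_TEXT_LENGTH = 120   # assumption: counted with len(), i.e. Unicode code points
VOICE_SPEED_MIN = 0.8
VOICE_SPEED_MAX = 2.0


def validate_lip_sync_text_inputs(text: str, voice_speed: float = 1.0) -> float:
    """Raise ValueError on invalid input; return voice_speed rounded to one decimal."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(
            f"text has {len(text)} characters; the maximum is {MAX_TEXT_LENGTH}"
        )
    if not VOICE_SPEED_MIN <= voice_speed <= VOICE_SPEED_MAX:
        raise ValueError(
            f"voice_speed must be between {VOICE_SPEED_MIN} and {VOICE_SPEED_MAX}"
        )
    # The table states the value is accurate to one decimal place.
    return round(voice_speed, 1)
```

Calling `validate_lip_sync_text_inputs("Hello there", 1.26)` would return a speed rounded to one decimal place, while a 121-character string or a speed of `2.5` would raise `ValueError`.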

**Video Requirements:**

- The video file must not be larger than 100MB
- Height and width must each be between 720px and 1920px
- Duration must be between 2s and 10s
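
The requirements above can be expressed as a simple check over a video's metadata. This is a sketch under stated assumptions: `VideoInfo` and `check_video_requirements` are hypothetical names, the metadata is assumed to have been read elsewhere (for example with a media probing tool), and "100MB" is interpreted as 100 × 1024 × 1024 bytes, which the source does not specify.

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the Video Requirements list.
MIN_SIDE_PX = 720
MAX_SIDE_PX = 1920
MIN_DURATION_S = 2.0
MAX_DURATION_S = 10.0
MAX_SIZE_BYTES = 100 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


@dataclass
class VideoInfo:
    """Hypothetical container for metadata obtained by the caller."""
    width: int
    height: int
    duration_s: float
    size_bytes: int


def check_video_requirements(info: VideoInfo) -> List[str]:
    """Return a list of human-readable problems; an empty list means the video passes."""
    problems = []
    if not MIN_SIDE_PX <= info.width <= MAX_SIDE_PX:
        problems.append(f"width {info.width}px is outside 720-1920px")
    if not MIN_SIDE_PX <= info.height <= MAX_SIDE_PX:
        problems.append(f"height {info.height}px is outside 720-1920px")
    if not MIN_DURATION_S <= info.duration_s <= MAX_DURATION_S:
        problems.append(f"duration {info.duration_s}s is outside 2-10s")
    if info.size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file size {info.size_bytes} bytes exceeds 100MB")
    return problems
```

Returning a list of problems rather than raising on the first failure lets a caller report every violated requirement at once.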

## Outputs

| Output Name | Description | Data Type |
| --- | --- | --- |
| `output` | Generated video with lip-synchronized audio | VIDEO |
| `video_id` | Unique identifier for the generated video | STRING |
| `duration` | Duration information for the generated video | STRING |


