Produce a Full Music Video with LTX-2.3 Lip-Sync — Segment by Segment
Learning Scenario
A brand has a full-length music track (shanty style) and a cinematic base image — a photorealistic 17th-century pirate captain on a wide ship deck. The goal is a continuous ~1:45 LTX-2.3 lip-sync music video built from consecutive 15-second segments: generate the first frame, extend segment-by-segment by grabbing the last frame of each clip, and stitch everything into one polished final cut.
What you'll learn in this tutorial
From brief to finished production
Generate the Hero Image
Start by generating your base composition. Use Nanobanana with the Zorg logo PNG as the reference image and a 16:9 aspect ratio to create an "ultra-photorealistic wide cinematic daytime" pirate ship scene: captain at the helm, large crew visible, beautiful weather, and a flag showing the brand logo. This image becomes the first-frame anchor for the video: it starts segment 1, and each later segment starts from the last frame of the clip before it.

Hero Image — Nanobanana 16:9 · Ultra-photorealistic wide daytime pirate ship scene with Zorg-logo flag
Slice the Audio into 15-Second Segments
Before generating any video, slice the full audio track into 15-second chunks, one per segment: 0:00–0:15, 0:15–0:30, 0:30–0:45, and so on through to the end of the track. Label each slice clearly. Having every slice ready up front prevents delays mid-production and ensures each video generation uses the correct time window.
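The slicing windows can be worked out ahead of time. The following is a minimal sketch, assuming the track is a local file and that ffmpeg is the slicing tool (the tutorial does not name one). The file names `track.mp3` and `slice_NN.mp3` are hypothetical. The code only builds the commands; it does not run them.

```python
from typing import List, Tuple


def slice_windows(track_seconds: float, segment_seconds: float = 15.0) -> List[Tuple[float, float]]:
    """Return (start, end) windows covering the track; the last one may be shorter."""
    windows = []
    start = 0.0
    while start < track_seconds:
        end = min(start + segment_seconds, track_seconds)
        windows.append((start, end))
        start = end
    return windows


def label(seconds: float) -> str:
    """Format whole seconds as M:SS for naming and logging slices."""
    total = int(seconds)
    return f"{total // 60}:{total % 60:02d}"


def ffmpeg_slice_command(source: str, index: int, start: float, end: float) -> List[str]:
    """Build (but do not run) an ffmpeg command that cuts one audio slice.

    Assumption: ffmpeg is installed; the output name pattern is illustrative only.
    """
    output = f"slice_{index:02d}.mp3"
    return ["ffmpeg", "-y", "-ss", str(start), "-t", str(end - start), "-i", source, output]


if __name__ == "__main__":
    # A ~1:45 track is 105 seconds, which gives seven 15-second windows.
    for number, (start, end) in enumerate(slice_windows(105), start=1):
        print(f"{label(start)}-{label(end)}", " ".join(ffmpeg_slice_command("track.mp3", number, start, end)))
```

For a track of about 1:45 this yields seven windows, the last one ending at 1:45.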
Generate Segment 1 with LTX-2.3 Lip-Sync
Use LTX-2.3 with the hero image as the start frame and the 0:00–0:15 audio slice. Set resolution to 1080p, duration 15s. Prompt: "The rugged pirate captain is singing with high energy to camera, joyful crew softly visible behind him." LTX-2.3 will animate the captain's face and lip movements to match the audio, producing a photoreal lip-synced performance.
Segment 1 (0:00–0:15) — LTX-2.3 Lip-Sync · Captain singing to camera · 15s · 16:9
Grab the Last Frame & Generate Each Subsequent Segment
After each segment is complete: (1) Extract the last frame of the generated video as a PNG — this becomes the start image for the next segment, ensuring visual continuity. (2) Submit the new LTX-2.3 job using the last-frame PNG and the next 15s audio slice. Repeat for segments 2, 3, 4… until you have covered the full track duration. If a job returns a 429 rate-limit error, wait 5–10 minutes before retrying.
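The chaining loop above can be sketched as follows. This is an illustration under stated assumptions, not the platform's API: `generate` and `grab_last_frame` are hypothetical callables you would supply yourself, and `RateLimited` is a made-up exception standing in for an HTTP 429 response. The default wait of 300 seconds reflects the lower end of the 5–10 minute guidance.

```python
import time
from typing import Callable, List


class RateLimited(Exception):
    """Hypothetical signal that the job service answered with HTTP 429."""


def with_retry(job: Callable[[], str], wait_seconds: float = 300.0, max_attempts: int = 4,
               sleep: Callable[[float], None] = time.sleep) -> str:
    """Run job(); on RateLimited, wait and try again up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except RateLimited:
            if attempt == max_attempts:
                raise
            sleep(wait_seconds)
    raise RuntimeError("max_attempts must be at least 1")


def run_chain(hero_image: str, audio_slices: List[str],
              generate: Callable[[str, str], str],
              grab_last_frame: Callable[[str], str],
              **retry_options) -> List[str]:
    """Generate clips in order; each clip starts from the previous clip's last frame."""
    clips = []
    start_image = hero_image  # segment 1 starts from the hero image
    for audio in audio_slices:
        clip = with_retry(lambda: generate(start_image, audio), **retry_options)
        clips.append(clip)
        # The last frame of this clip becomes the start image of the next one.
        start_image = grab_last_frame(clip)
    return clips
```

Passing the two steps in as callables keeps the loop independent of whichever tool actually submits jobs or extracts frames.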
Stitch All Segments into the Final Cut
Once all segments are generated, stitch them in chronological order. For efficiency, stitch in batches: first 3 clips → stitched_v1; then stitched_v1 + next 2 clips → stitched_v2; continue until all segments are joined. The final output is a single continuous lip-synced music video. Verify the audio is correctly synced throughout the full clip before publishing.
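The batch order described above can be planned before any stitching starts. This is a small sketch that only computes which inputs go into each intermediate file. The clip names are hypothetical, and the stitching tool itself is left out because the tutorial does not specify one.

```python
from typing import List, Tuple


def plan_stitches(clips: List[str], first_batch: int = 3,
                  next_batch: int = 2) -> List[Tuple[List[str], str]]:
    """Return (inputs, output_name) steps: first 3 clips, then previous result + next 2."""
    if not clips:
        return []
    version = 1
    current = f"stitched_v{version}"
    plan = [(clips[:first_batch], current)]
    position = first_batch
    while position < len(clips):
        version += 1
        output = f"stitched_v{version}"
        # Each later step prepends the previous stitched file to the next clips.
        plan.append(([current] + clips[position:position + next_batch], output))
        current = output
        position += next_batch
    return plan


if __name__ == "__main__":
    segments = [f"segment_{n:02d}.mp4" for n in range(1, 8)]  # seven 15-second clips
    for inputs, output in plan_stitches(segments):
        print(output, "<-", inputs)
```

With seven clips the plan has three steps, and the last output, `stitched_v3`, is the full cut.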
Final Stitched Cut — All segments combined · ~1:45 · 16:9 · Full lip-synced music video
Review, Replace & Re-upload Failed Segments
Occasionally a segment fails with an HTTP 404 (missing input URL) or comes back with a quality mismatch. To fix it: (1) re-slice the audio for that segment; (2) re-grab the last frame from the previous clip; (3) resubmit the LTX-2.3 job with the fresh URLs. Never reuse old sliced-audio URLs, which may have expired in cold storage. Once every segment passes quality review, commit the final stitch.
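When redoing a segment, it helps to work out exactly which inputs must be regenerated. The helper below is a hedged sketch: the function name, the `segment_NN.mp4` naming, and `hero.png` are all hypothetical. It returns the audio window to re-slice and the file to take the start frame from.

```python
from typing import Dict, Tuple, Union


def redo_inputs(segment_number: int, segment_seconds: int = 15,
                hero_image: str = "hero.png") -> Dict[str, Union[Tuple[int, int], str]]:
    """Describe the fresh inputs needed to regenerate one failed segment."""
    if segment_number < 1:
        raise ValueError("segment numbers start at 1")
    start = (segment_number - 1) * segment_seconds
    if segment_number == 1:
        # The first segment starts from the hero image, not from a previous clip.
        start_frame_source = hero_image
    else:
        start_frame_source = f"segment_{segment_number - 1:02d}.mp4"
    return {
        "audio_window": (start, start + segment_seconds),
        "start_frame_source": start_frame_source,
    }
```

For example, `redo_inputs(4)` points at the 45–60 second audio window and at the last frame of `segment_03.mp4`.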
Next Step
Apply what you learned — inside ZorgSocial
Open Easy Zorg and start using the same tools you saw in this tutorial — free.