16 — Common Workflows

This section provides step-by-step recipes for the most common use cases in HF AI Video Studio.

Workflow 1: Quick Video Edit

Goal: Import footage, clean it up, and export — no AI generation needed.

Import video. In the Video Segments panel, click Add Video and import your file(s).
Trim. Select each segment on the timeline. In the Segment Sidebar, adjust Trim Start and Trim End to cut unwanted footage.
Reorder. Drag segments on the timeline to arrange them in the desired order.
Add transitions (optional). In the Segment Sidebar, set a transition style and duration for each cut.
Add audio (optional). In Audio Clips, import background music. Adjust volume so it doesn’t overpower the video audio.
Export. Click Export → Export Video. The MP4 is saved to your export folder.
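
To sanity-check the length of the export before rendering, the trim and transition settings above can be added up by hand. The following is a minimal sketch only: the `Segment` class and `timeline_duration` function are hypothetical, not part of HF AI Video Studio. It assumes that Trim Start and Trim End are timestamps in the source clip and that each transition overlaps the two clips it joins.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One clip on the timeline (hypothetical model, not the app's own format)."""
    trim_start: float          # assumed: source timestamp where the segment begins (s)
    trim_end: float            # assumed: source timestamp where the segment ends (s)
    transition: float = 0.0    # duration of the transition into the NEXT segment (s)


def timeline_duration(segments: list[Segment]) -> float:
    """Estimate exported length, assuming each transition overlaps two clips."""
    total = sum(s.trim_end - s.trim_start for s in segments)
    # A transition set on the last segment has no following clip to overlap.
    overlap = sum(s.transition for s in segments[:-1])
    return total - overlap


clips = [
    Segment(2.0, 12.0, transition=0.5),
    Segment(0.0, 8.0, transition=0.5),
    Segment(1.0, 6.0),
]
print(timeline_duration(clips))  # 10 + 8 + 5 - 1.0 = 22.0
```

If transitions in your version do not overlap clips (for example, a fade through black), drop the `overlap` term.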

Workflow 2: AI-Generated Video

Goal: Build a polished video entirely from AI-generated assets.

Generate images. In AI Generate → Image, use a text prompt to generate a thumbnail, background, or title card.
Generate B-roll video. In AI Generate → Video, generate short clips (5–10 seconds each) as scene filler.
Generate or record a voiceover. In Voice Studio, either record a voiceover directly or skip recording and synthesize one from a script in the next step.
Synthesize voice (optional). In Voice Studio, type your script, add pause markers, and synthesize using your preferred voice model.
Generate a talking head (optional). In AI Generate → Talking Head, upload your portrait and pair it with the synthesized audio to create a presenter clip.
Arrange on timeline. Add generated images and videos as video segments. Add voiceover or talking head audio to the audio track.
Add captions. If you used Voice Studio, transcription data is ready. Open Captions, pick a style, and adjust offset if needed.
Export. Click Export → Export Video.
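
When planning how much B-roll to generate, a rough count helps. The sketch below is illustrative only: `broll_clips_needed` is a hypothetical helper, and it assumes the B-roll alone must cover the full voiceover, so it gives an upper bound if images or a talking-head clip fill part of the runtime.

```python
import math


def broll_clips_needed(voiceover_seconds: float, clip_seconds: float = 8.0) -> int:
    """Number of B-roll clips required to cover a voiceover of the given length."""
    # The workflow above suggests clips of 5 to 10 seconds each.
    if not 5.0 <= clip_seconds <= 10.0:
        raise ValueError("clip_seconds should be between 5 and 10")
    if voiceover_seconds < 0:
        raise ValueError("voiceover_seconds must not be negative")
    return math.ceil(voiceover_seconds / clip_seconds)


print(broll_clips_needed(90))        # 90 / 8 = 11.25, rounded up to 12
print(broll_clips_needed(90, 10.0))  # 9
```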

Workflow 3: AI Voice Replacement

Goal: Replace the original voice in a video with a new AI-synthesized voice.

Open Voice Studio. Click + Voice Studio in the toolbar.
Extract audio. In Step 1 (Extract), upload your existing video. The original audio is extracted.
Transcribe. In Step 2 (Transcribe), select a transcription engine and click Transcribe.
Edit and polish. In Step 3 (Edit/Polish), correct any errors. Use Polish with AI to improve the tone if desired. Add pause markers (<#0.8#>) at natural breathing points.
Synthesize. In Step 4 (Synthesize), pick a voice, set pitch/speed/emotion, and click Synthesize.
Preview and iterate. Listen to the result. Change settings and regenerate until satisfied.
Add to timeline. Click Add to Timeline to insert the new audio. Mute or remove the original video audio if it was embedded in the video segment.
Sync captions. Since you have the transcript, open Captions and generate captions from it. Use the offset slider to sync to the new audio timing.
Export. Click Export → Export Video.
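
For long scripts, pause markers can be prepared before pasting the text into Voice Studio. The following is a sketch under stated assumptions: `add_pause_markers` is a hypothetical helper, the number inside `<#0.8#>` is assumed to be a pause length in seconds, and surrounding the marker with spaces is assumed to be acceptable to the synthesizer. It marks every sentence boundary, which only approximates "natural breathing points", so review the result by ear.

```python
import re


def add_pause_markers(script: str, pause_seconds: float = 0.8) -> str:
    """Insert a pause marker after each sentence that is followed by more text."""
    marker = f"<#{pause_seconds}#>"
    # Match sentence-ending punctuation followed by whitespace; the final
    # sentence has no trailing whitespace, so it receives no marker.
    return re.sub(
        r"([.!?])\s+",
        lambda m: f"{m.group(1)} {marker} ",
        script,
    )


text = "Welcome back. Today we cover exports! Ready?"
print(add_pause_markers(text))
# Welcome back. <#0.8#> Today we cover exports! <#0.8#> Ready?
```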

Workflow 4: Long-Form Talking-Head Video

Goal: Turn a long recorded audio session into a fully produced talking-head video series.

Open Long-Form Pipeline. Click + Long-Form in the toolbar.
Upload audio (Step 1). Upload the full podcast or recording.
Segment (Step 2). Use auto-segmentation with a target segment duration of 3–5 minutes. Use silence detection for natural split points. Adjust boundaries manually if needed.
Generate (Step 3). Upload your reference portrait image, select a talking-head model (OmniHuman or MultiTalk recommended), and write a scene prompt such as “Professional presenter, warm studio lighting, looking at camera”. Enable Frame continuity for seamless clip-to-clip transitions, then click Generate All.
Monitor progress. Watch the generation status per segment. Most will complete in 2–5 minutes each.
Merge (Step 4). Review completed clips. Regenerate any failed ones. Click Merge to export the final combined video with audio synced.
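
The segmentation step combines a target duration with silence detection. How the app does this internally is not documented here; the sketch below shows one plausible greedy approach for building intuition when adjusting boundaries manually. The function name, parameters, and the list of silence timestamps are all hypothetical.

```python
def choose_split_points(
    duration: float,
    silences: list[float],
    min_len: float = 180.0,   # 3 minutes
    max_len: float = 300.0,   # 5 minutes
) -> list[float]:
    """Pick split timestamps so that each segment lasts min_len to max_len seconds.

    For each segment, prefer the detected silence closest to the middle of the
    allowed window; if no silence falls in the window, cut at max_len.
    """
    splits: list[float] = []
    start = 0.0
    while duration - start > max_len:
        window = [t for t in silences if start + min_len <= t <= start + max_len]
        target = start + (min_len + max_len) / 2
        if window:
            cut = min(window, key=lambda t: abs(t - target))
        else:
            cut = start + max_len  # no natural pause available: hard cut
        splits.append(cut)
        start = cut
    return splits


# A 15-minute recording with silences detected at these timestamps (seconds).
silences = [170.0, 245.0, 310.0, 480.0, 505.0, 700.0, 760.0]
print(choose_split_points(900.0, silences))  # [245.0, 480.0, 700.0]
```

Note that the final segment is whatever remains after the last cut, so it can be shorter than the 3-minute minimum.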

Workflow 5: Animated Captions

Goal: Add professional-grade animated captions to any video.

Import your video in the Video Segments panel.
Open Voice Studio. Use Extract to pull the audio from the video, then Transcribe using Whisper for the best word-level accuracy.
Open Captions panel. The transcription data populates automatically.
Choose a style. Pick Pop for social media clips, Karaoke for music content, or Typewriter for a more dramatic effect.
Style the text. Set font, size, weight, and colors. Active word color should contrast strongly against the inactive word color.
Preview. Press play and watch the word-by-word highlighting in real time.
Adjust offset. If captions are early or late, use the offset slider or nudge buttons to shift them into sync.
Fine-tune phrases (optional). On the captions track in the timeline, drag individual phrase boundaries to correct any phrases that are timed slightly wrong.
Export. Click Export → Export Video. Captions are burned into the video.
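
Conceptually, the offset adjustment shifts every word timestamp by the same amount. The following sketch illustrates the idea; the `Word` class and `shift_captions` function are hypothetical and do not reflect the app's internal caption format. It assumes timestamps are in seconds and clamps them so that nothing starts before the beginning of the video.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Word:
    """One transcribed word with its timing in seconds (hypothetical model)."""
    text: str
    start: float
    end: float


def shift_captions(words: list[Word], offset: float) -> list[Word]:
    """Return new words shifted by offset seconds (negative = earlier)."""
    shifted = []
    for w in words:
        start = max(0.0, w.start + offset)   # never before the video begins
        end = max(start, w.end + offset)     # keep end at or after start
        shifted.append(replace(w, start=start, end=end))
    return shifted


words = [Word("Hello", 0.25, 0.75), Word("world", 0.75, 1.5)]
for w in shift_captions(words, -0.5):
    print(w.text, w.start, w.end)
# Hello 0.0 0.25
# world 0.25 1.0
```

A uniform shift fixes captions that are consistently early or late; phrases that drift individually still need the manual boundary adjustment described above.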
