Kling V3 Omni vs Wan 2.6: 5-Prompt Test (9:16, 720p)

This post runs the same 5 short ad-style prompts on two text-to-video models: Kling V3 Omni and Alibaba Wan 2.6. Each test uses 9:16, 720p, 5 seconds, audio off, and one simple camera move.
Quick specs
| Item | Kling V3 Omni | Wan 2.6 |
|---|---|---|
| Provider | Kling | Alibaba |
| Type | Text-to-video (also supports image + reference inputs) | Text-to-video (also supports image + reference inputs) |
| Resolution options | std (720p), pro (1080p) | 720P, 1080P |
| Aspect ratios | 16:9, 9:16, 1:1 | 16:9, 9:16, 1:1, 4:3, 3:4 |
| Duration tested here | 5 seconds | 5 seconds |
| Ratio tested here | 9:16 (vertical) | 9:16 (vertical) |
| Audio tested here | Off | Off |
Test setup (same for both models)
- Goal: fast vertical clips that could work as product ads or UGC-style demos
- Duration: 5 seconds per run
- Ratio: 9:16
- Resolution: 720p
- Audio: off
- Prompt style: 2-4 short sentences, one camera move
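As a rough sketch, the shared setup can be kept in one place and translated into each model's settings. The output keys and values mirror the Settings lines in this post (`mode`, `sound`, `scale`, `audioEnabled`, and so on). The dict layout, the function names, and the `"on"` value for sound are illustrative assumptions, not either provider's documented API.

```python
# Shared test setup from this post. The dict layout is an illustrative assumption.
SHARED = {
    "duration_s": 5,
    "ratio": "9:16",
    "resolution": "720p",
    "audio": False,
}


def kling_settings(shared):
    """Map the shared setup to the Kling keys shown in this post's Settings lines."""
    return {
        # Per the specs table: std = 720p, pro = 1080p.
        "mode": "std" if shared["resolution"] == "720p" else "pro",
        "duration": f"{shared['duration_s']}s",
        "ratio": shared["ratio"],
        # The post only shows sound=off; "on" is an assumed counterpart.
        "sound": "on" if shared["audio"] else "off",
        "scale": 0.5,  # copied as-is from the Settings lines
    }


def wan_settings(shared):
    """Map the shared setup to the Wan keys shown in this post's Settings lines."""
    return {
        "mode": "std",  # copied as-is from the Settings lines
        "duration": f"{shared['duration_s']}s",
        "ratio": shared["ratio"],
        "resolution": shared["resolution"].upper(),  # "720p" -> "720P"
        "audioEnabled": shared["audio"],
    }


print(kling_settings(SHARED))
print(wan_settings(SHARED))
```

Keeping one shared dict makes it harder for the two runs to drift apart by accident.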
5 prompt results (Kling vs Wan)
1) Perfume bottle hero shot (reflections)
Prompt: 9:16 commercial product video. A premium matte-black perfume bottle on dark wet slate. Soft rim light, realistic reflections. Slow camera push-in with a gentle turntable rotation. Clean background, no text.
Settings: Kling mode=std, duration=5s, ratio=9:16, sound=off, scale=0.5. Wan mode=std, duration=5s, ratio=9:16, resolution=720P, audioEnabled=false.
[Video comparison: Kling V3 Omni | Wan 2.6]
- Kling keeps a matte cylindrical bottle consistent across frames, with stable lighting and reflections.
- Wan renders a glossier rectangular glass bottle look with strong highlights. Framing stays steady.
- Both clips look ad-usable for a clean product hero shot.
2) UGC hand demo (small object handling)
Prompt: 9:16 UGC phone video in a bright kitchen. A hand opens a wireless earbuds case and takes one earbud out. Slight handheld shake, natural skin texture. Simple background, no text.
Settings: Kling mode=std, duration=5s, ratio=9:16, sound=off, scale=0.5. Wan mode=std, duration=5s, ratio=9:16, resolution=720P, audioEnabled=false.
[Video comparison: Kling V3 Omni | Wan 2.6]
- Kling stays stable across the open-and-grab sequence, with normal-looking hands in the sampled frames.
- Wan looks coherent, but a couple of frames show small geometry changes on the earbud and case.
- For UGC hands, keeping the action list short helps both models.
3) Running shoes turntable (geometry consistency)
Prompt: 9:16 studio product ad video of a pair of running shoes on a turntable. One smooth orbit around the shoes. Sharp fabric texture, clean highlights, soft shadow. Minimal background, no text.
Settings: Kling mode=std, duration=5s, ratio=9:16, sound=off, scale=0.5. Wan mode=std, duration=5s, ratio=9:16, resolution=720P, audioEnabled=false.
[Video comparison: Kling V3 Omni | Wan 2.6]
- Kling keeps the shoe form consistent across frames and looks safe for a generic product spin.
- Wan looks more stylized and detailed, but the midsole/shape shifts across frames and a brand-like side mark appears.
- If the product must stay exact, watch for shape drift and accidental branding on footwear prompts.
4) Stop-motion wrapper reveal (style lock)
Prompt: 9:16 stop-motion paper cutout ad scene. A chocolate bar wrapper flips open and a paper chocolate square pops out. Handcrafted paper texture, simple loop-like motion. Clean composition, no text.
Settings: Kling mode=std, duration=5s, ratio=9:16, sound=off, scale=0.5. Wan mode=std, duration=5s, ratio=9:16, resolution=720P, audioEnabled=false.
[Video comparison: Kling V3 Omni | Wan 2.6]
- Kling keeps the wrapper and chocolate piece coherent across frames, with a clean minimal look.
- Wan shows readable wrapper text (“Chocolate”) even though the prompt asked for no text.
- If you need brand safety, add a stronger negative prompt for text/logos and keep packaging generic.
5) Busy neon subway (crowd + signage)
Prompt: 9:16 cinematic handheld shot on a crowded subway platform at night. Neon lights reflect on a wet floor. People walk past the camera. One forward tracking move, realistic motion blur, no readable text.
Settings: Kling mode=std, duration=5s, ratio=9:16, sound=off, scale=0.5. Wan mode=std, duration=5s, ratio=9:16, resolution=720P, audioEnabled=false.
[Video comparison: Kling V3 Omni | Wan 2.6]
- Kling lands a busy crowd scene, but the sampled frames show more chaos: heavier blur/ghosting and more visible signage.
- Wan looks more composed and cinematic across frames, with steadier framing and fewer obviously readable signs.
- For public scenes, always watch for readable signage and recognizable faces if the clip goes into a real ad.
Verdict (based on these 5 tests)
- If the goal is clean product shots and simple hand demos, Kling V3 Omni looks steadier and more “safe” across frames.
- If the goal is a more cinematic vibe for environments (like the subway test), Wan 2.6 looks more composed in this set.
- Both can surprise you with accidental text or brand-like marks. Negative prompts help, but reviewing frames before shipping is mandatory.
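For the frame review step, one simple approach is to check a handful of evenly spaced frames per clip. The helper below is a minimal sketch that only computes which frame indices to inspect. The function name is hypothetical, and the 24 fps figure is an assumption, since this post does not state either model's output frame rate.

```python
def sample_frame_indices(duration_s, fps, count):
    """Return up to `count` evenly spaced frame indices, including first and last."""
    total = int(duration_s * fps)
    if count <= 0 or total <= 0:
        return []
    if count >= total:
        return list(range(total))
    if count == 1:
        return [0]
    step = (total - 1) / (count - 1)
    return [round(i * step) for i in range(count)]


# 5-second clip, assumed 24 fps, 5 frames to review.
print(sample_frame_indices(duration_s=5, fps=24, count=5))
```

Evenly spaced samples can still miss a logo that flashes for a few frames, so scrub the full clip before anything ships.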
Prompt tips that improved stability
- Write 2-4 short sentences. One subject, one camera move.
- Say “no text” in the prompt and also add a negative prompt covering text, watermarks, and logos.
- For hands: keep the action list to one clear action (open, grab, place). Avoid multi-step instructions.
- For product spins: keep the background minimal and avoid brand names.
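The tips above can be turned into a small pre-flight check. This is a sketch under stated assumptions: `lint_prompt`, `count_sentences`, and the `negative_prompt` key are hypothetical names, the sentence counter is a naive split on end punctuation, and neither provider's request format is implied.

```python
import re

# Negative-prompt terms taken from the tips above.
NEGATIVE_PROMPT = "text, watermark, logos"


def count_sentences(prompt):
    """Naive count: split on ., ! or ? followed by whitespace or end of string."""
    parts = re.split(r"[.!?]+(?:\s+|$)", prompt.strip())
    return len([p for p in parts if p])


def lint_prompt(prompt, min_sentences=2, max_sentences=4):
    """Return a list of warnings; an empty list means the prompt follows the tips."""
    warnings = []
    n = count_sentences(prompt)
    if not min_sentences <= n <= max_sentences:
        warnings.append(f"{n} sentences; aim for {min_sentences}-{max_sentences}")
    if "no text" not in prompt.lower():
        warnings.append('missing "no text"')
    return warnings


shoes = (
    "9:16 studio product ad video of a pair of running shoes on a turntable. "
    "One smooth orbit around the shoes. "
    "Sharp fabric texture, clean highlights, soft shadow. "
    "Minimal background, no text."
)
request = {"prompt": shoes, "negative_prompt": NEGATIVE_PROMPT}  # hypothetical shape
print(lint_prompt(shoes))
```

The check returns warnings instead of raising, so a longer prompt can still run if that is intentional.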
Try the prompts
Copy the prompts above, keep the settings the same (9:16, 5s, 720p, audio off), and swap only one variable at a time (ratio, duration, or camera move). That makes it easy to see what actually changes.
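One way to enforce the one-variable-at-a-time rule is to generate the variants from a baseline instead of editing settings by hand. The sketch below is illustrative: the key names, the function name, and the alternative values are assumptions. The 10-second duration in particular is a hypothetical value that this post did not test.

```python
BASELINE = {"ratio": "9:16", "duration_s": 5, "camera_move": "slow push-in"}

# Values to try for each variable. Ratios come from the specs table;
# the 10 s duration is a hypothetical value, not something tested here.
ALTERNATIVES = {
    "ratio": ["16:9", "1:1"],
    "duration_s": [10],
    "camera_move": ["smooth orbit"],
}


def one_at_a_time(baseline, alternatives):
    """Yield (changed_key, settings) pairs that differ from baseline in exactly one key."""
    for key, values in alternatives.items():
        for value in values:
            if value == baseline[key]:
                continue  # skip no-op variants
            variant = dict(baseline)
            variant[key] = value
            yield key, variant


for changed, settings in one_at_a_time(BASELINE, ALTERNATIVES):
    print(changed, settings)
```

Each variant differs from the baseline in a single key, so any visible change in the output can be traced to that key.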