Video generation
1. Use cases
Video generation models use text or image descriptions to generate dynamic video content. As the technology advances, its applications are becoming increasingly widespread. Potential application areas include:
- Dynamic content generation: Video generation models can create dynamic visual content to describe and explain information.
- Multimodal intelligent interaction: Combining image and text inputs, video generation models can be used for more intelligent and interactive applications.
- Replacing or enhancing traditional visual technologies: Video generation models can replace or enhance traditional machine vision technologies to solve more complex multimodal problems.

As the technology progresses, the multimodal capabilities of video generation models will converge with those of visual language models, driving their comprehensive application in intelligent interaction, automated content generation, and complex scenario simulation. Video generation models can also be combined with image generation models (image-to-video) to further expand their range of applications, enabling more diverse and richer visual content generation.
2. Usage recommendations
When writing prompts, describe actions and scenes in detail and in chronological order. Include specific actions, appearance, camera angles, and environmental details. Write everything as a single paragraph, starting directly with the main action, and keep the description specific and precise. Imagine yourself as a director describing a shot in a script. Keep the prompt within 200 words.
To achieve the best results, structure your prompt as follows:
- Start with a sentence describing the main action
- Example: A woman with light skin, wearing a blue jacket and a black hat with a veil, first looks down and to her right, then raises her head back up as she speaks.
- Add specific details about actions and gestures
- Example: She first looks down and to her right, then raises her head back up as she speaks.
- Precisely describe the appearance of the character/object
- Example: She has brown hair styled in an updo, light brown eyebrows, and is wearing a white collared shirt under her blue jacket.
- Include details about the background and environment
- Example: The background is out of focus, but shows trees and people in period clothing.
- Specify the camera angle and movement
- Example: The camera remains stationary on her face as she speaks.
- Describe lighting and color effects
- Example: The scene is captured in real-life footage, with natural lighting and true-to-life colors.
- Note any changes or sudden events
- Example: A gust of wind blows through the trees, causing the woman's veil to flutter slightly.
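The structure above can be sketched as a small helper that joins the recommended components into a single-paragraph prompt and checks the 200-word limit. The function and parameter names here are illustrative only, not part of any official SDK.

```python
def build_video_prompt(main_action, details="", appearance="",
                       background="", camera="", lighting="", changes=""):
    """Join the recommended prompt components into one single paragraph."""
    parts = [main_action, details, appearance, background, camera, lighting, changes]
    prompt = " ".join(p.strip() for p in parts if p.strip())
    # The guidelines recommend keeping the prompt within 200 words.
    if len(prompt.split()) > 200:
        raise ValueError("prompt exceeds the recommended 200-word limit")
    return prompt

prompt = build_video_prompt(
    main_action="A woman with light skin, wearing a blue jacket and a black hat "
                "with a veil, first looks down and to her right, then raises "
                "her head back up as she speaks.",
    appearance="She has brown hair styled in an updo, light brown eyebrows, and "
               "is wearing a white collared shirt under her blue jacket.",
    background="The background is out of focus, but shows trees and people in "
               "period clothing.",
    camera="The camera remains stationary on her face as she speaks.",
    lighting="The scene is captured in real-life footage, with natural lighting "
             "and true-to-life colors.",
    changes="A gust of wind blows through the trees, causing the woman's veil "
            "to flutter slightly.",
)
```

Starting with the main action and appending the remaining components in the order listed above keeps the prompt chronological and camera-script-like, as the guidelines suggest.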
Example of a video generated from the above prompt:
3. Try it out
You can try these models by visiting the playground.
4. Supported models
4.1 Text-to-video models
Currently supported text-to-video models:
- Lightricks/LTX-Video
For a limited time, this model offers free video generation via the text-to-video API. It is available in the playground and supports API calls.
- tencent/HunyuanVideo
This model costs ¥0.7 per video and supports API calls.
- genmo/mochi-1-preview
This model costs ¥2.8 per video and supports API calls.
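A text-to-video API call typically sends the model name and prompt in a JSON POST request. The following sketch shows this pattern with Python's standard library; the base URL, endpoint path, and payload field names are assumptions for illustration, so consult the provider's API reference for the real ones.

```python
import json
import urllib.request

def build_request(model, prompt, api_key,
                  base_url="https://api.example.com/v1"):  # assumed base URL
    """Build a text-to-video request. The endpoint path and JSON field
    names are assumed for illustration, not taken from an official spec."""
    payload = {"model": model, "prompt": prompt}
    return urllib.request.Request(
        f"{base_url}/video/generations",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_request(
    model="Lightricks/LTX-Video",
    prompt="A woman with light skin, wearing a blue jacket and a black hat "
           "with a veil, first looks down and to her right, then raises her "
           "head back up as she speaks.",
    api_key="YOUR_API_KEY",
)
# Sending would be: urllib.request.urlopen(req)
```

Building the request separately from sending it makes the payload easy to inspect or log before any network traffic occurs.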
4.2 Image-to-video models
- Lightricks/LTX-Video
This model costs ¥0.14 per video when calling the image-to-video API. It currently supports API calls only.
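An image-to-video request differs from a text-to-video one mainly in that it carries an input image alongside the prompt, commonly base64-encoded in the JSON body. The payload shape and field names below are illustrative assumptions; the actual schema is defined by the provider's API reference.

```python
import base64

def build_image_to_video_payload(model, prompt, image_path):
    """Assumed payload shape for an image-to-video request: the input
    image is base64-encoded and sent alongside the text prompt. The
    field names here are illustrative, not from an official spec."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return {"model": model, "prompt": prompt, "image": image_b64}
```

This payload can then be POSTed to the image-to-video endpoint in the same way as a text-to-video request.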