Gemini Omni Flash is now part of Envato's AI video generator, bringing multimodal creation, conversational editing, and richer video workflows.
The next evolution of AI-powered video creation has arrived.
We’re excited to announce that Gemini Omni Flash is now part of the AI models powering Envato’s AI video generator.
While you won’t need to choose between individual models, every advancement helps improve what’s possible with AI-powered video creation. We handle the technology behind the scenes so you can focus on creating, while benefiting from the latest innovations as they become available.
Gemini Omni Flash introduces powerful multimodal capabilities, more intuitive editing workflows, and new ways to work with references, context, and creative direction.
Video creation is becoming more flexible, iterative, and connected across different media.
With Gemini Omni Flash now helping power AI video generation on Envato, you can work with text, images, and video together, refine content through natural language instructions, and build on existing ideas without constantly starting over. The result is a more intuitive creative process that gives you greater control over how videos are created and refined.
Gemini Omni Flash is the first model in Google’s Gemini Omni family, combining Gemini’s reasoning capabilities with advanced multimodal creation and editing.
Designed to work across text, images, and video, Gemini Omni Flash supports creative workflows that go beyond simple generation. It helps creatives build, edit, refine, and evolve content using a wider range of inputs and more natural forms of interaction.
Gemini Omni Flash brings a deeper understanding of locations, environments, cultural references, and historical settings, helping generate content that feels more grounded and believable. It can also render clearer, more realistic text within scenes while supporting more natural interactions between characters, objects, and environments.
Whether you’re creating a modern city street, a historical setting, or a location inspired by a specific culture, Gemini Omni Flash is designed to better understand the context behind the scene and reflect it more accurately in the final output.
Gemini Omni Flash is built to work with text, images, and video, making it easier to bring different creative inputs together in a single workflow. Creatives can use references to guide generation, maintain character consistency across scenes, transfer styles and motion between assets, and even transform sketches or rough concepts into video sequences.
By combining multiple forms of input, Gemini Omni Flash opens up new ways to turn inspiration, references, and existing creative assets into richer video outputs.
Gemini Omni Flash introduces more advanced editing workflows powered by natural language instructions. Instead of starting over, creatives can refine existing videos by changing actions, replacing objects, updating scenes, adjusting camera perspectives, modifying characters, and evolving creative direction through an ongoing editing process.
This conversational approach makes it easier to experiment, iterate, and develop ideas while maintaining continuity across edits and revisions.
Create and refine videos with greater flexibility by making changes through natural language instructions instead of repeatedly generating new versions from scratch. Small adjustments, creative experimentation, and ongoing refinement become faster and more intuitive.
Guide video generation using combinations of images, videos, and text prompts within a single workflow. Multiple creative inputs can work together to shape a more cohesive, intentional final result.
Maintain continuity across scenes, edits, and creative iterations with support for reference-based workflows, character consistency, and more context-aware generation. This makes it easier to build projects that feel connected from beginning to end.
Gemini Omni Flash demonstrates a stronger understanding of locations, historical settings, cultural references, real-world environments, motion, and physical interactions. The result is content that can feel more realistic, relevant, and aligned with your creative intent.
Gemini Omni Flash is now helping power video generation on Envato, giving creatives new ways to generate, edit, and refine video content through multimodal creation and more intuitive editing workflows.
Whether you’re building from a prompt, working from references, refining an existing concept, or exploring new creative directions, Gemini Omni Flash expands what’s possible with AI-powered video creation.
Ready to see what you can create?
Start generating videos today with Envato’s AI video generator.
Gemini Omni Flash is the first model in Google’s new Gemini Omni family. It is a multimodal AI model designed to understand and work across text, images, and video while supporting more advanced creative workflows such as conversational editing and reference-based creation.
No. Envato integrates AI technologies behind the scenes, so creatives can focus on creating rather than managing individual models. As new capabilities become available, they help improve the overall AI video generation experience.
Yes. Gemini Omni Flash is designed to work across multiple forms of media, including text, images, and video. These multimodal capabilities can support more flexible creative workflows and reference-based creation.
Google has stated that content generated with Gemini Omni Flash includes SynthID watermarking and C2PA Content Credentials. These technologies are designed to support responsible AI use and greater transparency around AI-generated content.
Gemini Omni Flash is designed around three core capabilities: Native Multimodality, Conversational Editing, and World Knowledge. Together, these capabilities help support more flexible creation workflows, natural editing experiences, and richer context-aware outputs.
Conversational editing is a workflow that allows creators to refine and edit content through natural language instructions. Rather than starting over with every change, edits can build on previous prompts while maintaining context and continuity.
Gemini Omni Flash supports video generation, but Google positions it as a multimodal model designed for both creation and editing workflows. It is built to work across text, images, and video rather than focusing on a single input type.