Google Vids AI video generation just entered a new phase. Google is rolling out Lyria 3 for music creation and Veo 3.1 for video generation and editing into Google Vids, fundamentally changing what non-professionals can create without leaving the platform. These are not incremental upgrades—they represent a significant leap in what AI video tools can do.
Key Takeaways
- Veo 3.1 generates videos with native audio, including sound effects, ambient noise, and dialogue
- Portrait (9:16) and landscape (16:9) aspect ratios now supported, with video extension and frame-specific generation
- Video lengths range from 4 to 8 seconds; requests at 1080p, 4K, or with reference images must use the full 8 seconds
- Lyria 3 handles music generation while Veo 3.1 excels at visual consistency, physics accuracy, and prompt adherence
- Available via Gemini API and Vertex AI for developers and creators
What Makes Google Vids AI Video Generation Different Now
The integration of Veo 3.1 into Google Vids eliminates a critical friction point: creators no longer need to generate video and audio separately, then sync them manually. Veo 3.1 natively generates audio alongside video, producing sound effects, ambient noise, and dialogue in a single pass. This is not a minor convenience—it is the difference between a tool that feels like work and one that feels intuitive.
The model supports three video generation approaches: text-to-video, image-to-video, and video-to-video. For creators working with reference materials, you can upload up to three reference images to guide the style and direction of generated footage. The system also handles video extension, letting you expand existing clips without reshooting, and frame-specific generation that lets you lock down the first and last frames to maintain narrative continuity.
Portrait videos (9:16) arrive as a major addition. Vertical video dominates social platforms, yet most AI video tools default to landscape. Google’s decision to prioritize both aspect ratios signals that the company is building for where creators actually publish, not just where traditional film happens.
How Veo 3.1 Performs Against Other AI Video Models
Google DeepMind tested Veo 3.1 across multiple benchmarks to measure real-world performance. On MovieGenBench, a test of 1,003 prompts, Veo 3.1 won overall preference among evaluators. In the audio-specific subset—527 prompts with sound—Veo 3.1 was chosen as the best for both audio synchronization and visual realism. On VBench I2V, a benchmark using 355 image-text pairs, Veo 3.1 was preferred for visual quality.
These benchmarks matter because they measure what users actually care about: does the video look real? Does the audio match the visuals? Does the model understand physics? Veo 3.1 excels at cinematic styles, real-world physics, realism, prompt adherence, and consistency across frames. Compared to the earlier Veo 2, which generated only silent video and maxed out at shorter durations, Veo 3.1 adds native audio, portrait aspect support, and extended creative controls.
The model comes in three variants (Veo 3.1, Veo 3.1 Fast, and Veo 3.1 Lite), all currently in preview status. Video output lengths range from 4 to 8 seconds, though 1080p, 4K, and reference-image workflows require the full 8-second duration. You can generate up to four output videos per prompt.
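To make those limits concrete, here is a minimal Python sketch of a pre-flight check a team might run before submitting a request. It is not part of any Google SDK: the `VideoRequest` class, the `validate` function, and the `"720p"` default resolution are hypothetical names and values chosen for illustration. It reads the 8-second rule as "requests at 1080p, 4K, or with reference images must use the 8-second duration", an interpretation worth confirming against Google's documentation.

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the article; the resolution labels are illustrative assumptions.
ASPECT_RATIOS = {"16:9", "9:16"}
FULL_LENGTH_RESOLUTIONS = {"1080p", "4k"}
MAX_REFERENCE_IMAGES = 3
MAX_OUTPUTS = 4


@dataclass
class VideoRequest:
    """Hypothetical description of one generation request."""
    prompt: str
    duration_seconds: int = 8
    aspect_ratio: str = "16:9"
    resolution: str = "720p"  # assumed default, not stated in the article
    reference_images: int = 0
    outputs: int = 1


def validate(req: VideoRequest) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not 4 <= req.duration_seconds <= 8:
        problems.append("duration must be between 4 and 8 seconds")
    if req.aspect_ratio not in ASPECT_RATIOS:
        problems.append("aspect ratio must be 16:9 or 9:16")
    if req.reference_images > MAX_REFERENCE_IMAGES:
        problems.append("at most 3 reference images are allowed")
    if not 1 <= req.outputs <= MAX_OUTPUTS:
        problems.append("between 1 and 4 output videos per prompt")
    # Assumed reading: high resolutions and reference images need 8 seconds.
    needs_full_length = (
        req.resolution in FULL_LENGTH_RESOLUTIONS or req.reference_images > 0
    )
    if needs_full_length and req.duration_seconds != 8:
        problems.append("1080p, 4K, and reference images require 8 seconds")
    return problems
```

Calling `validate(VideoRequest("a cat surfing", duration_seconds=4, resolution="1080p"))` returns a single problem about the 8-second requirement, while the default request passes cleanly.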
Lyria 3 and the Music Side of Creation
While Veo 3.1 handles visuals, Lyria 3 manages music generation. The integration means creators can build a complete scene, with video and original music, within Google Vids without bouncing between tools. Lyria 3 is positioned as a music generation model that complements the video capabilities, though the source report gives little detail on how it compares with music-only alternatives.
The real value is ecosystem integration. A creator can write a script, generate video with Veo 3.1, create music with Lyria 3, and edit everything in Google Vids without leaving the platform. That workflow simplicity is a competitive advantage.
Where This Fits in Google’s Broader AI Strategy
These models are available via Gemini API and Vertex AI, meaning developers can integrate them into custom applications. Google DeepMind partnered with Primordial Soup to create short films using Veo 3.1, demonstrating professional-grade output. The fact that Google is pushing these tools to both casual creators and professional filmmakers suggests the company believes the quality gap has closed enough for both audiences.
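For developers curious what that integration might look like, the following is a rough sketch using Google's `google-genai` Python SDK. Treat the details as assumptions: the model ID `veo-3.1-generate-preview` is a guess at the preview name and may differ, the SDK must be installed separately, and an API key is expected in the environment. The `wait_until_done` helper is a hypothetical convenience, not an SDK function.

```python
import time

# Assumed preview model name; check Google's model list before relying on it.
MODEL_ID = "veo-3.1-generate-preview"


def wait_until_done(operation, refresh, poll_seconds=10.0):
    """Poll a long-running operation until its `done` flag is set.

    `refresh` is any callable that takes the operation and returns
    an updated copy of it.
    """
    while not operation.done:
        time.sleep(poll_seconds)
        operation = refresh(operation)
    return operation


def generate_clip(prompt, out_path="clip.mp4"):
    """Request one portrait clip and save it locally (sketch only)."""
    # Imported lazily so this module loads even without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client()  # picks up the API key from the environment
    operation = client.models.generate_videos(
        model=MODEL_ID,
        prompt=prompt,
        config=types.GenerateVideosConfig(aspect_ratio="9:16"),
    )
    operation = wait_until_done(operation, client.operations.get)

    video = operation.response.generated_videos[0]
    client.files.download(file=video.video)
    video.video.save(out_path)
    return out_path
```

Video generation runs as a long-running operation, which is why the sketch polls rather than waiting on a single blocking call. Something like `generate_clip("A barista pours latte art, with cafe ambience")` would then write the result to `clip.mp4`, assuming the account has preview access.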
Content Credentials (C2PA) support is also built in, allowing creators to tag AI-generated content with provenance metadata. This addresses a real concern: as AI video becomes indistinguishable from filmed video, transparency about origins matters for trust and legal clarity.
What Creators Actually Need to Know
If you use Google Vids and have wanted native audio generation, portrait videos, or the ability to extend existing footage, this update removes those blockers. The 4-to-8-second length constraint is real—you are not building feature films here—but for social media, ads, and short-form content, it is adequate. The reference image capability gives you directional control without requiring you to shoot anything yourself.
The catch: Veo 3.1 is still in preview via Gemini API and Vertex AI. Full rollout into Google Vids is announced but not yet universal. If you do not have access yet, you likely will soon, but adoption will be phased.
Can I use Veo 3.1 for commercial projects?
The source report does not specify licensing or commercial use restrictions for Veo 3.1 within Google Vids. Verify the terms of service and content policies before publishing generated videos commercially, particularly for advertising or brand work.
How long does it take to generate a video with Veo 3.1?
The source report does not provide generation speed or processing time estimates. Speed will likely depend on video length, resolution, and whether you are using the standard, Fast, or Lite variant.
Is there a free tier for Google Vids with Veo 3.1?
The source report confirms Veo 3.1 is available via Gemini API and Vertex AI but does not specify pricing or free tier availability. Check Google’s official documentation for current access and cost details.
Google Vids just became a serious contender for creators who cannot afford traditional video production. Native audio, portrait support, and physics-accurate generation close the gap between what amateurs and professionals can produce. The preview status is temporary—expect full rollout to reshape how non-experts think about video creation.
This article was written with AI assistance and editorially reviewed.
Source: Android Central


