Models & Labs 5.0 · Must-read 2026-04-16 · X

Gemini 3.1 Flash TTS arrives in Google Vids, turning scripts into professional narration in one click

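Google AI announced that Gemini 3.1 Flash TTS is rolling out in Google Vids and is also available in preview through the Gemini API and Google AI Studio. The feature converts a script into studio-quality narration in one click and targets use cases such as pitch decks and passion projects. This is the first time the Google ecosystem covers the "script → professional voiceover" path end to end: video creators no longer need a third-party speech synthesis tool, which considerably simplifies the workflow.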


Google Gemini 3.1 Flash TTS: controllable text-to-speech with audio tags

Source: X/Twitter · @GoogleAI · 2026-04-17

Announcement

Today we launched Gemini 3.1 Flash TTS, our most expressive and controllable text-to-speech model yet.

This launch includes audio tags! 🗣🏷

Audio tags are a seamless way to guide vocal style, pace, and delivery using natural language commands embedded directly in your text. Want a different tempo or tone? Just tag the audio to steer the AI-speech output!

The model supports 70+ languages (24 of which are high-quality evaluated languages, including: Japanese, Hindi, and Arabic).
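
To make the audio-tag idea in the post concrete, here is a minimal Python sketch of how a script with inline natural-language tags might be assembled before it is sent to a TTS model. The post does not specify the tag syntax, so the square-bracket delimiters, the helper name `build_tagged_script`, and the example tag wording are all hypothetical; consult the official Gemini API documentation for the actual format.

```python
from typing import Iterable, Optional, Tuple

# ASSUMPTION: square brackets as tag delimiters. The announcement only says
# tags are natural-language commands embedded in the text, not how they look.
TAG_OPEN, TAG_CLOSE = "[", "]"


def build_tagged_script(segments: Iterable[Tuple[Optional[str], str]]) -> str:
    """Join (tag, text) pairs into one script with inline audio tags.

    A tag of None or "" leaves that segment without any steering tag.
    """
    parts = []
    for tag, text in segments:
        text = text.strip()
        if not text:
            continue  # skip empty segments entirely
        if tag:
            parts.append(f"{TAG_OPEN}{tag.strip()}{TAG_CLOSE} {text}")
        else:
            parts.append(text)
    return " ".join(parts)


# Hypothetical pitch-deck narration with a tone change per segment.
script = build_tagged_script([
    ("warm, unhurried", "Welcome to our quarterly update."),
    (None, "Revenue grew in every region."),
    ("faster, excited", "And we have one more thing to share!"),
])
print(script)
```

Keeping the tag and the spoken text as separate fields until the last step makes it easy to swap the delimiters once the real syntax is known, without touching the script content itself.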

Availability

Gemini 3.1 Flash TTS is rolling out in Google Vids and is available today in preview via the Gemini API and in @GoogleAIStudio.

Whether you're creating a pitch deck or recording a passion project, transform your scripts into studio-quality narration: https://t.co/MG2YIQwKb6

Community reactions

  • @travasites: "Nice to see TTS getting the 'Flash' treatment — low latency matters for real time narration. Curious how the voice quality compares to ElevenLabs or Azure TTS, and whether the API supports streaming or SSML."
  • @dan_in_robots: "Audio tags for steering speech output, exactly the practical control developers need. Already testing in AI Studio."
  • @PromptSlinger: "Wait so the pitch deck thing, does it handle different speaker voices in the same script? Or is it one narrator per video?"

Key specifications

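| Feature | Details |
|------|------|
| Model | Gemini 3.1 Flash TTS |
| Language support | 70+ languages (24 high-quality evaluated languages) |
| Signature feature | Audio tags (voice style control via natural-language commands embedded in the text) |
| Availability | Google Vids (rolling out); Gemini API and Google AI Studio (preview) |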