Google Launches Gemini 3.1 Flash TTS, Robotics-ER 1.6, and Native Gemini App for macOS

AI3 min

April 16, 2026

Google Launches Gemini 3.1 Flash TTS, Robotics-ER 1.6, and Native Gemini App for macOS

Google released a wave of AI products: Gemini 3.1 Flash TTS for text-to-speech with 70+ languages, Gemini Robotics-ER 1.6 developed with Boston Dynamics, and a native Gemini desktop app for macOS.

📝

CoinJP Editorial

CoinJP Editorial · 0 articles

Gemini 3.1 Flash TTS: Next-Generation Speech Synthesis

Google has unveiled a series of major product launches, headlined by Gemini 3.1 Flash TTS — a text-to-speech model built on the Gemini 3 architecture. The updated model delivers enhanced audio quality, greater expressiveness, and granular control over voice generation. Supporting more than 70 languages, it enables developers, enterprises, and everyday users to build applications with AI-powered voice interfaces.

Gemini 3.1 Flash TTS Artificial Analysis ranking — Gemini 3.1 Flash TTS performance in the Artificial Analysis TTS ranking

Access to Gemini 3.1 Flash TTS is already available through multiple channels:

Developers can use it in early access via Gemini API and Google AI Studio;
Enterprise customers gain access through Vertex AI;
Workspace users can leverage it via Google Vids.

The model scored 1,211 points on the Artificial Analysis TTS leaderboard, a rating based on blind testing preferences from thousands of respondents evaluating audio quality. Artificial Analysis placed Gemini 3.1 Flash TTS among the most attractive solutions on the market, citing the combination of high-quality speech synthesis and low cost. The model is also noted for its ability to generate natural multi-speaker dialogues.

Audio Tags for Precise Voice Control

A key innovation in version 3.1 Flash TTS is the introduction of audio tags — tools that let developers control the style, tempo, and delivery of generated speech. According to Google's official blog, early testers and enterprise users have praised the model's "impressive controllability and expressiveness." They reported that audio tags provide a new level of creative precision, transforming plain text into high-quality voice performances.

Why This Matters

Google's latest batch of releases spans three strategic areas: voice AI, robotics, and desktop applications. The speech synthesis model could reshape how voice interfaces, audio content, and customer service automation are built. The robotics model, developed in partnership with Boston Dynamics, pushes AI closer to widespread adoption in industrial settings. Meanwhile, the native macOS app extends Google's AI ecosystem to Apple's desktop user base.

Gemini Robotics-ER 1.6: AI for Real-World Robots

Alongside the speech model, Google introduced Gemini Robotics-ER 1.6 — a neural network designed to empower robots to handle complex tasks in real-world environments. The model focuses on improving machines' cognitive abilities through "embodied" reasoning: spatial perception, action planning, and outcome evaluation.

Compared to its predecessor and Gemini 3.0 Flash, the new model shows significant improvements in tasks involving spatial and physical reasoning. Gemini Robotics-ER 1.6 can interpret readings from complex measurement instruments and monitor gauges through viewing windows — a capability that Google DeepMind developed jointly with Boston Dynamics for industrial applications.

Marco da Silva, Vice President of the Spot project at Boston Dynamics, stated that these capabilities allow robots to autonomously see, understand, and respond to real-world challenges.

In safety hazard detection tests, Gemini Robotics-ER 1.6 outperformed Gemini 3.0 Flash by 6% in text-based scenarios and by 10% in video analysis. Boston Dynamics has already integrated the model along with Gemini into its Orbit AIVI-Learning platform.

Gemini Comes to macOS

The third major announcement was a native Gemini application for macOS. The app is activated with the Option + Space shortcut and includes a window-sharing feature for instant context transfer. Users can generate images using Nano Banana, create videos with Veo, and access other familiar tools from Google's AI ecosystem.

Earlier in April, Google also announced Gemma 4 — a new family of open AI models designed for advanced reasoning and agentic workflows.

artificial-intelligenceboston-dynamicsgeminigooglemacosroboticstext-to-speech

Frequently Asked Questions

What is Gemini 3.1 Flash TTS?

Gemini 3.1 Flash TTS is Google's text-to-speech model built on the Gemini 3 architecture. It supports over 70 languages and features improved audio quality, expressiveness, and precise voice control through audio tags.

How can I access Gemini 3.1 Flash TTS?

Developers can access it in early preview through Gemini API and Google AI Studio. Enterprise customers can use it via Vertex AI, while Workspace users get access through Google Vids.

What is Gemini Robotics-ER 1.6 used for?

Gemini Robotics-ER 1.6 is an AI model designed for robotics applications, specializing in spatial perception, action planning, and outcome assessment. It was co-developed with Boston Dynamics for industrial use cases and outperformed Gemini 3.0 Flash by 6% in text-based safety tests and 10% in video analysis.

How do I launch Gemini on macOS?

The native Gemini app for macOS is activated using the Option + Space keyboard shortcut. It supports window sharing for instant context transfer, image generation with Nano Banana, and video creation with Veo.

How does Gemini 3.1 Flash TTS rank compared to competitors?

The model scored 1,211 points on the Artificial Analysis TTS leaderboard, based on blind testing by thousands of respondents. Artificial Analysis ranked it among the most attractive solutions for its combination of high-quality synthesis and low cost.

Read also

Alphabet Posts $94.7B Q1 Revenue Beating Estimates Amid AI-Driven Growth

Google's parent company Alphabet reported Q1 2026 revenue of $94.7 billion, surpassing Wall Street forecasts, with its cloud division and AI integration fueling a strong beat across all metrics.

3 min·🔥 0

Google Launches Nano Banana 2 Image Model and Redesigned Flow Creative Studio

Google released Nano Banana 2, a new visual generation model delivering Pro-level quality at Gemini Flash speed, alongside a major overhaul of its Flow creative platform.

3 min·🔥 1

Innovations

Google Enhances Opal AI Platform with New Autonomous Agents

Google has upgraded its visual AI workflow builder Opal with agent functionality that automatically analyzes tasks and selects appropriate tools for completion.

3 min·🔥 1

AI Audit Uncovers Critical Liveness Bug in Ethereum's Nethermind Client

Octane Security's AI discovered a high-severity vulnerability in the Nethermind execution client that could have halted block production for 38% of Ethereum mainnet validators. The Ethereum Foundation awarded a maximum $50,000 bounty.

3 min·🔥 1

Analytics

Gemini Exchange Faces Crisis After IPO Amid Crypto Market Turmoil

Gemini exchange lost 85% of its market cap post-IPO, cut 25% of staff, and exited three regions. What went wrong for the Winklevoss empire?

4 min·🔥 0

OpenAI Secures Record $110 Billion Round at $730 Billion Valuation

OpenAI closed the largest startup funding round in history at $110 billion, backed by Amazon, SoftBank, and Nvidia, with a $730 billion valuation.

4 min·🔥 1