
At its Google I/O conference last week, Google announced that its Gemini chatbot, which now has 900 million monthly active users, was shifting from a daily prompt limit to compute-based usage limits that will refresh every 5 hours until users reach their weekly limits. Usage limits now depend on the complexity of the prompts, with premium models and features using more resources.
Google probably should have explained these new Gemini usage limits better, however, as many users reached their quotas too quickly. Josh Woodward, VP of Google Labs, Gemini, and AI Studio, explained yesterday that the company has implemented some changes in response to user feedback.
We’ve heard your feedback about hitting limits too quickly on @GeminiApp. We're rolling out several fixes to make your quota stretch further and feel more predictable… 🧵
— Josh Woodward (@joshwoodward) May 29, 2026
First of all, Woodward clarified that Gemini prompts that use Google’s Flash-Lite model “should be free and won’t count against your quota.” Gemini users also won’t see their quota impacted when a request fails. “Our system mistakes are on us, not you. Your quota is used only for successful completions,” the exec explained.
When using the more intensive Deep Research mode, Gemini will now display “more detailed usage breakdowns and notifications” to help users avoid reaching their limits too quickly. For complex prompts that use the Gemini 3.1 Pro model, Google is also capping the amount of quota a single prompt can use. This should be noticeable on prompts with large files attached, which can use a lot of resources.
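For readers who want a concrete picture, the accounting rules Woodward described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Google has not published the units, costs, or cap values it uses, so the `QuotaWindow` class, the model labels, and every number below are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical values: Google has not published real units or caps.
FREE_MODELS = {"flash-lite"}   # assumed label for the Flash-Lite model
PER_PROMPT_CAP = 50.0          # assumed maximum units one prompt can consume


@dataclass
class QuotaWindow:
    """Toy model of one refresh window of compute-based quota."""
    limit: float       # assumed units available in this window
    used: float = 0.0

    def charge(self, model: str, cost: float, succeeded: bool) -> float:
        """Return the units actually deducted for one request."""
        if not succeeded:
            return 0.0                       # failed requests are not charged
        if model in FREE_MODELS:
            return 0.0                       # Flash-Lite prompts are free
        charged = min(cost, PER_PROMPT_CAP)  # a single prompt's cost is capped
        self.used += charged
        return charged

    @property
    def remaining(self) -> float:
        return max(self.limit - self.used, 0.0)


window = QuotaWindow(limit=100.0)
window.charge("flash-lite", 5.0, succeeded=True)        # free model
window.charge("gemini-3.1-pro", 30.0, succeeded=False)  # failed request
window.charge("gemini-3.1-pro", 80.0, succeeded=True)   # capped at 50.0
print(window.remaining)  # 50.0
```

In this toy model, only the third request draws down the window, and the per-prompt cap keeps it from consuming more than half of the assumed allowance.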
Woodward also said that Google fixed a bug that was causing videos generated with Omni, Google’s new world model that can create anything from any input, to drain quotas for some users. The company also doubled the number of Omni video generations for Google AI Ultra subscribers.
Lastly, Gemini will now remember the last model you used across all future sessions. “It will only change if you manually adjust it or hit a cap that triggers an automatic fallback to a lighter model,” Woodward explained.