On April 2, Google released Gemma 4, a family of four multimodal AI models built on the same architecture as Gemini 3. The models process text, images, video, and audio natively. They range from 2 billion to 31 billion parameters. And they ship under Apache 2.0, one of the most permissive licenses in common use for AI models. No usage restrictions. No commercial limitations. The only obligations are the standard ones: keep the license text and attribution notices when you redistribute.

Four Models, Four Use Cases

The lineup is designed to cover everything from a phone to a data center. The E2B model (2 billion parameters) is built for on-device inference through Android AICore, giving mobile developers a genuinely capable multimodal model that runs locally without API calls. The E4B model (4 billion) hits a sweet spot for edge deployments and lightweight servers.

The real firepower sits in the larger variants. The 26B Mixture-of-Experts model ranks #6 among all open models on the Arena AI leaderboard while using significantly less compute than dense models of similar capability. The 31B Dense model ranks #3 globally among open models, putting it in striking distance of commercial offerings that cost real money to access.
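To see why a Mixture-of-Experts model can use less compute than a dense model of similar size, here is a minimal sketch of top-k expert routing. It is a toy illustration, not Gemma 4's actual implementation: the expert count, the value of k, the scalar "experts", and the function names `route_top_k` and `moe_layer` are all assumptions made for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of router scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights to sum to 1."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_layer(x, experts, router_scores, k=2):
    """Run only the selected experts on x; the other experts are never called."""
    chosen = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in chosen)
    return output, [i for i, _ in chosen]

# Hypothetical toy setup: 8 "experts", each just scales its input.
EXPERTS = [lambda x, scale=i + 1: x * scale for i in range(8)]
SCORES = [0.1, 2.0, -1.0, 1.5, 0.0, -0.5, 0.3, 0.2]  # made-up router scores for one token

if __name__ == "__main__":
    out, used = moe_layer(1.0, EXPERTS, SCORES, k=2)
    print(f"experts used: {used} of {len(EXPERTS)}")
    print(f"fraction of expert compute spent on this token: {len(used) / len(EXPERTS):.2f}")
```

In this toy, each token pays for two experts out of eight. All eight still have to exist in memory, which is why MoE savings show up in compute and latency rather than in model size.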

All four models handle vision, audio, and text in a single architecture. No separate pipelines. No stitching together speech-to-text with a language model with an image classifier. One model, multiple modalities, out of the box.

Why Apache 2.0 Changes the Game

The license matters more than the benchmarks. Meta's Llama models ship under a custom license that requires companies with more than 700 million monthly active users to request a separate license and attaches an acceptable-use policy to every deployment. Mistral has moved toward commercial licensing for its most capable models. Some Chinese open-weight models carry restrictions on military or surveillance applications.

Google chose Apache 2.0 for Gemma 4. That means any company, of any size, can use these models commercially without permission, modify them without publishing the changes, and deploy them without royalties. The conditions are light: keep the license and attribution notices, and mark modified files as changed if you redistribute them. For startups building AI-powered products, this removes most of the legal overhead that comes with custom model licenses.

The strategic logic is clear. Google makes money from cloud infrastructure, not model licensing. Every developer who builds on Gemma is a potential Google Cloud customer. By removing license friction, Google is betting that ecosystem growth will drive more compute revenue than model subscriptions ever could.

What This Means for Teams Building AI Products

If you're currently paying for API access to multimodal capabilities, Gemma 4 deserves a serious evaluation. The 26B MoE model delivers near-frontier performance at a fraction of the inference cost of dense models. For teams processing images, audio, or video at scale, self-hosting a Gemma model could cut your AI bill dramatically.
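Whether self-hosting actually cuts the bill depends on volume, so it is worth running the arithmetic before committing. The sketch below compares a per-token API price with a flat GPU rental cost and finds the break-even request volume. Every number in it is a placeholder, not a real price for any provider or GPU, and the function names are made up for this example.

```python
import math

def api_monthly_cost(requests_per_month, tokens_per_request, usd_per_million_tokens):
    """Monthly spend when paying per token through a hosted API."""
    total_tokens = requests_per_month * tokens_per_request
    return total_tokens * usd_per_million_tokens / 1_000_000

def self_hosted_monthly_cost(gpu_count, usd_per_gpu_hour, hours_per_month=730, fixed_overhead_usd=0.0):
    """Monthly spend for always-on rented GPUs plus any fixed overhead (ops, storage)."""
    return gpu_count * usd_per_gpu_hour * hours_per_month + fixed_overhead_usd

def break_even_requests(tokens_per_request, usd_per_million_tokens, self_hosted_usd):
    """Smallest monthly request count at which the API bill reaches the self-hosted bill."""
    usd_per_million_requests = tokens_per_request * usd_per_million_tokens
    return math.ceil(self_hosted_usd * 1_000_000 / usd_per_million_requests)

if __name__ == "__main__":
    # Placeholder inputs: replace with your own measured traffic and quoted prices.
    hosted = self_hosted_monthly_cost(gpu_count=2, usd_per_gpu_hour=2.50, fixed_overhead_usd=500.0)
    api = api_monthly_cost(requests_per_month=1_000_000, tokens_per_request=2_000, usd_per_million_tokens=5.00)
    threshold = break_even_requests(tokens_per_request=2_000, usd_per_million_tokens=5.00, self_hosted_usd=hosted)
    print(f"API: ${api:,.0f}/month, self-hosted: ${hosted:,.0f}/month")
    print(f"Self-hosting breaks even at about {threshold:,} requests/month")
```

The model is deliberately crude. It assumes the rented GPUs can absorb the full request volume and ignores engineering time, so treat the break-even point as a first filter rather than a budget.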

The E2B on-device model opens a category that barely existed six months ago: genuinely capable AI that runs entirely on a user's phone. No network latency. No API costs. No data leaving the device. For healthcare, finance, or any industry with privacy requirements, this is a significant unlock.

What To Do About It

1. Benchmark Gemma 4 against your current stack. If you're using GPT-4o or Claude for multimodal tasks, run the 31B Dense model on the same test set. The quality gap may be smaller than you expect, and at high volume the cost gap can be large.

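2. Evaluate the 26B MoE for production workloads. Mixture-of-Experts models activate only a subset of their parameters for each token they process. For high-throughput applications, this translates to less compute per token and faster response times at comparable quality. Memory is a separate question: all of the experts still have to be loaded, so size your GPUs for the full 26B parameters.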

3. Prototype on-device AI with E2B. If your product involves mobile and you've been waiting for models capable enough to run locally, this is the starting line. Test it through Android AICore or run it on any device with 4GB+ RAM.

4. Read the Apache 2.0 license. Seriously. If you've been navigating Llama's restrictions or Mistral's commercial terms, switching to Gemma eliminates an entire category of legal review from your deployment pipeline.
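Step 1 above, running two models on the same test set, can be sketched as a small harness. This is an outline under assumptions, not a real integration: the two "models" are stand-in functions with canned answers, the test set is invented, and exact-match scoring on text prompts is a simplification of what a multimodal evaluation would need. In practice you would replace the stubs with calls to your hosted API client and your local Gemma runtime.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]          # takes a prompt, returns an answer
TestSet = List[Tuple[str, str]]       # (prompt, reference answer) pairs

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences don't count as errors."""
    return " ".join(text.lower().split())

def accuracy(model: Model, test_set: TestSet) -> float:
    """Fraction of prompts whose normalized answer exactly matches the reference."""
    if not test_set:
        raise ValueError("test_set must not be empty")
    correct = sum(normalize(model(prompt)) == normalize(ref) for prompt, ref in test_set)
    return correct / len(test_set)

def compare(models: Dict[str, Model], test_set: TestSet) -> Dict[str, float]:
    """Score every model on the identical test set."""
    return {name: accuracy(fn, test_set) for name, fn in models.items()}

# Invented test set and canned-answer stubs, for illustration only.
TEST_SET: TestSet = [
    ("2+2?", "4"),
    ("Capital of France?", "Paris"),
    ("Color of the sky?", "blue"),
]

def hosted_api_stub(prompt: str) -> str:
    return {"2+2?": "4", "Capital of France?": "paris", "Color of the sky?": "Blue"}.get(prompt, "")

def local_model_stub(prompt: str) -> str:
    return {"2+2?": "4", "Capital of France?": "Paris", "Color of the sky?": "grey"}.get(prompt, "")

if __name__ == "__main__":
    scores = compare({"hosted_api": hosted_api_stub, "local_model": local_model_stub}, TEST_SET)
    for name, score in scores.items():
        print(f"{name}: {score:.0%}")
```

Keeping the test set and the scoring function fixed while swapping only the model is the point of the exercise: any difference in the scores then comes from the models, not from the evaluation.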

HRIM's Take

Google is playing a different game than everyone else in open-source AI. While Meta negotiates usage caps and Mistral builds commercial tiers, Google is giving away models that rank as high as #3 among open models under a license with no usage restrictions. The bet is that the ecosystem effects will be worth more than licensing revenue. For builders, the calculus is simple: if Gemma 4 meets your quality bar, it is hard to find a cheaper or legally cleaner option today. We're already evaluating the 26B MoE for internal tooling, and we'd recommend that any team doing multimodal work do the same.