Google DeepMind Launches Gemma 4: The Most Capable Open-Source AI Models Ever Released

Google DeepMind has officially unveiled Gemma 4, a groundbreaking family of open models that the company describes as “byte for byte, the most capable open models to date.” Built on the same cutting-edge research and technology as the proprietary Gemini 3 models, Gemma 4 is purpose-built for advanced reasoning, agentic workflows, and on-device AI deployment. Crucially, the entire family is released under the fully permissive Apache 2.0 license, making it free for commercial use, modification, and redistribution.

This release marks a major leap forward for open-source AI. Unlike previous Gemma versions, Gemma 4 delivers frontier-level multimodal intelligence that can run locally on everything from smartphones and edge devices to consumer GPUs — no cloud required.

The Gemma 4 Model Family

Gemma 4 comes in four carefully optimized variants, split into two deployment tiers:

| Model | Architecture | Parameters | Target Hardware | Context Window | Key Strengths |
|---|---|---|---|---|---|
| E2B | Dense | 2.3B effective (5.1B total) | Phones, IoT, browsers | 128K tokens | Ultra-light, real-time edge AI |
| E4B | Dense | 4.5B effective (8B total) | Mobile & edge devices | 128K tokens | Audio + vision on-device |
| 26B A4B | Mixture-of-Experts | 25.2B total / 3.8B active | Consumer GPUs & workstations | 256K tokens | High efficiency + reasoning |
| 31B Dense | Dense | 30.7B | Workstations & local servers | 256K tokens | Maximum intelligence per parameter |

All models support text + image input (variable resolution). The smaller E2B and E4B models also handle audio input (up to 30 seconds) and process video as frame sequences (up to 60 seconds at 1 fps).
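
The media limits above can be enforced client-side before anything is sent to the model. A minimal, purely illustrative sketch (these helpers are hypothetical, not part of any Gemma SDK) that clips audio to the 30-second cap and samples video at 1 fps for at most 60 seconds:

```python
def clip_audio_seconds(duration_s: float, max_s: float = 30.0) -> float:
    """Audio input on E2B/E4B is capped at 30 s; return the usable duration."""
    return min(duration_s, max_s)

def video_frame_timestamps(duration_s: float, max_s: float = 60.0,
                           fps: float = 1.0) -> list[float]:
    """Video is processed as a frame sequence (1 fps, up to 60 s).
    Returns the timestamps (in seconds) at which to extract frames."""
    usable = min(duration_s, max_s)
    n_frames = int(usable * fps)
    return [i / fps for i in range(n_frames)]

# A 90 s clip yields 60 sampled frames (0.0 s, 1.0 s, ..., 59.0 s),
# and a 45 s audio track is clipped to 30 s.
```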

Breakthrough Capabilities

Gemma 4 is designed from the ground up for agentic AI — systems that can plan, reason step-by-step, use tools, and take autonomous actions:

  • Advanced Reasoning & Thinking Mode: Native support for structured <|think|> reasoning tokens that let the model show its work before answering.
  • Function Calling & Tool Use: Built-in support for agentic workflows, including multi-step planning and external tool integration.
  • Multimodal Superpowers: Strong performance on vision (MMMU Pro) and audio tasks without any specialized fine-tuning.
  • Long Context Mastery: Up to 256K tokens — perfect for analyzing entire codebases, long documents, or complex conversations.
  • 140+ Languages: True multilingual fluency with cultural context awareness, not just translation.
  • Offline & Private: Runs completely locally with near-zero latency — ideal for privacy-sensitive or disconnected environments.
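
The agentic loop these capabilities enable follows a common pattern: the model emits a tool call, the host executes it, and the result is fed back for the next reasoning step until the model produces a final answer. A self-contained sketch with a stubbed model (the message format and tool schema here are illustrative, not Gemma 4's actual wire format):

```python
import json

# Illustrative tool registry; a real agent exposes functions the model may call.
TOOLS = {"add": lambda a, b: a + b}

def stub_model(messages):
    """Stand-in for a model call: requests a tool once, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "add", "args": {"a": 2, "b": 3}}}
    result = next(m for m in messages if m["role"] == "tool")["content"]
    return {"content": f"The answer is {result}."}

def run_agent(user_prompt, model=stub_model, max_steps=5):
    """Drive the model/tool loop until a final answer (or a step limit)."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_steps):
        reply = model(messages)
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]  # final answer, no further tool use
        result = TOOLS[call["name"]](**call["args"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    raise RuntimeError("agent did not converge within max_steps")

print(run_agent("What is 2 + 3?"))  # → The answer is 5.
```

Swapping `stub_model` for a real local inference call turns this into a working offline agent, which is exactly the deployment story the bullets above describe.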

Performance That Redefines Open-Source AI

Gemma 4 doesn’t just compete — it dominates previous open models and even punches above its weight against much larger systems. Here are key benchmarks for the instruction-tuned (“IT”) versions (as of April 2, 2026):

| Benchmark | Gemma 4 31B | Gemma 4 26B A4B | Gemma 4 E4B | Gemma 4 E2B | Gemma 3 27B |
|---|---|---|---|---|---|
| Arena AI (text) | 1452 | 1441 | n/a | n/a | 1365 |
| MMMLU Multilingual | 85.2% | 82.6% | 69.4% | 60.0% | 67.6% |
| MMMU Pro (Multimodal) | 76.9% | 73.8% | 52.6% | 44.2% | 49.7% |
| AIME 2026 Math (no tools) | 89.2% | 88.3% | 42.5% | 37.5% | 20.8% |
| LiveCodeBench (Coding) | 80.0% | 77.1% | 52.0% | 44.0% | 29.1% |
| GPQA Diamond (Science) | 84.3% | 82.3% | 58.6% | 43.4% | 42.4% |
| τ2-bench (Agentic Tool Use) | 86.4% | 85.5% | 57.5% | 29.4% | 6.6% |

The 31B and 26B models deliver exceptional intelligence-per-parameter, outperforming the previous Gemma 3 27B across the board — sometimes by massive margins.

Where to Get Gemma 4 Right Now

  • Hugging Face: Full collection with transformers, llama.cpp, MLX, WebGPU, and Ollama support.
  • Google AI Studio: Try the 26B and 31B models instantly.
  • Google AI Edge Gallery: E2B and E4B optimized for Android, iOS, web, and embedded devices.
  • Google Cloud / Vertex AI: Fully managed deployment coming soon.
  • Ollama: One-command local install (ollama run gemma4).

All weights are downloadable today under Apache 2.0.
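
For local experimentation, the Ollama route is the quickest: `ollama run gemma4` starts an interactive session, and the same local server can be driven programmatically. The sketch below only builds the request body for Ollama's standard /api/generate endpoint; the `gemma4` model tag comes from the listing above, so verify the exact tag Ollama publishes:

```python
import json

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint

def build_generate_request(prompt: str, model: str = "gemma4") -> bytes:
    """Serialize a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(payload).encode("utf-8")

body = build_generate_request("Summarize the Gemma 4 release in one sentence.")
# POST `body` to OLLAMA_URL with any HTTP client; the JSON reply's
# "response" field contains the generated text.
```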

Why Gemma 4 Matters

For years, open-source models lagged behind closed frontier models in reasoning and multimodal capabilities. Gemma 4 closes that gap dramatically while maintaining full openness and on-device efficiency. Developers can now build powerful agents, coding assistants, vision apps, and voice interfaces that run entirely locally — delivering speed, privacy, and zero cloud costs.

Google DeepMind VP of Research Clement Farabet and Group Product Manager Olivier Lacombe summed it up: “Gemma 4 is our answer to what innovators need next… breakthrough capabilities made widely accessible.”

The Gemma series has already been downloaded over 400 million times. With Gemma 4, that momentum is about to explode.
