Gemma 4 26B A4B (free)
Google · Gemma · Free · Released 2026-04-03

262K context · Free · 25.2B total parameters, 3.8B active

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model developed by Google DeepMind. It features 25.2 billion total parameters, with only 3.8 billion activated per token during inference, enabling high-quality outputs comparable to models with 31 billion parameters while optimizing compute efficiency. The model supports multimodal inputs, including text, images, and video, and offers advanced capabilities such as a 256K token context window, native function calling, configurable reasoning modes, and structured output generation.
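Since the model accepts text, image, and video inputs through a chat interface, a request can be sketched with any OpenAI-compatible client. The model slug, endpoint, and image URL below are hypothetical placeholders, not confirmed identifiers; substitute the real values from your provider's model list.

```python
# Sketch of an OpenAI-compatible chat-completions payload mixing a text part
# and an image part. MODEL_ID is a hypothetical slug, not a documented one.
import json

MODEL_ID = "google/gemma-4-26b-a4b"  # hypothetical placeholder slug

def build_multimodal_request(question: str, image_url: str) -> dict:
    """Assemble a chat payload with one text part and one image part."""
    return {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": 512,
    }

payload = build_multimodal_request(
    "What is shown in this diagram?",
    "https://example.com/diagram.png",
)
print(json.dumps(payload, indent=2))
```

The payload is only assembled here, not sent; in practice it would be POSTed to the provider's chat-completions endpoint with an API key.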

What is Gemma 4 26B A4B (free)?

Gemma 4 26B A4B (free) is an AI model from Google that Agent Mag tracks for pricing, context window, modalities, benchmarks, and API compatibility. Builders can use this page to compare Gemma 4 26B A4B (free) against other models for agent workflows and production deployments.

Architecture & Specifications
Architecture
Mixture of Experts (MoE)
Parameters
25.2B total, 3.8B active
Tokenizer
Gemma
License
Apache 2.0
Released
2026-04-03
Modalities
Input
image, text, video
Output
text
Supported Parameters
include_reasoning, max_tokens, reasoning, response_format, seed, temperature, tool_choice, tools, top_p
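A minimal sketch of how a client might sanitize sampling options against this supported-parameter list before sending a request. The parameter names come from the list above; the option values are illustrative, not documented defaults.

```python
# Filter a request's options down to the parameters this model card lists
# as supported. Anything outside the set is silently dropped.
SUPPORTED_PARAMS = {
    "include_reasoning", "max_tokens", "reasoning", "response_format",
    "seed", "temperature", "tool_choice", "tools", "top_p",
}

def filter_supported(options: dict) -> dict:
    """Keep only options the model advertises support for."""
    return {k: v for k, v in options.items() if k in SUPPORTED_PARAMS}

opts = filter_supported({
    "temperature": 0.7,
    "top_p": 0.95,
    "seed": 42,
    "frequency_penalty": 0.2,  # not in the supported list, so dropped
})
print(sorted(opts))  # → ['seed', 'temperature', 'top_p']
```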
Strengths
  • Supports multimodal inputs including text, images, and video
  • Efficient inference with only 3.8B active parameters per token
  • 256K token context window for handling large inputs
  • Native function calling and structured output generation
  • Configurable reasoning modes for tailored responses
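The function-calling and structured-output strengths above map onto two request fields, `tools` and `response_format`. As a sketch, here is what those fields could look like; the `get_weather` tool and its schema are hypothetical examples, not part of the model's API.

```python
# Sketch: a tool definition plus a JSON response format, the two
# structured-output mechanisms listed among the model's strengths.
import json

weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example tool
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request_extras = {
    "tools": [weather_tool],
    "tool_choice": "auto",          # let the model decide when to call
    "response_format": {"type": "json_object"},
}
print(json.dumps(request_extras, indent=2))
```

These fields would be merged into a chat-completions payload alongside `model` and `messages`.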
Limitations
  • Lower performance on certain benchmarks like CritPt (0.0%)
  • Limited coding capabilities as indicated by Terminal-Bench Hard (13.6%)
  • High hallucination rate in knowledge benchmarks (19.1%)
  • Relatively low accuracy in knowledge-based tasks (18.1%)
  • Performance variability across different benchmarks
Recommended Use Cases
  • Complex reasoning tasks with large context requirements
  • Multimodal applications involving text, images, and video
  • Instruction-following scenarios with structured outputs
  • Agentic coding and terminal use in IDE environments
  • Interactive roleplay and storytelling applications
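For the large-context use cases above, a quick budget check helps: the advertised 262K window is 262,144 tokens, and output plus system-prompt tokens must be reserved out of it. The reserve sizes below are illustrative estimates, not recommended values.

```python
# Sketch: rough token budgeting against the advertised 262,144-token
# context window. Reserve sizes are illustrative, not documented defaults.
CONTEXT_WINDOW = 262_144

def remaining_input_budget(output_reserve: int, system_prompt_tokens: int) -> int:
    """Tokens left for document/user input after the reserves are set aside."""
    return CONTEXT_WINDOW - output_reserve - system_prompt_tokens

budget = remaining_input_budget(output_reserve=4_096, system_prompt_tokens=500)
print(budget)  # → 257548
```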

Data enriched Apr 24, 2026. Pricing from OpenRouter API.