How do you use DeepSeek V4 Flash?

DeepSeek V4 Flash can be called through OpenRouter's OpenAI-compatible API using the model ID deepseek/deepseek-v4-flash. This page includes a ready-to-run curl example and the supported parameter list.

Is DeepSeek V4 Flash free?

DeepSeek V4 Flash is a paid model in the OpenRouter catalog, with separate input and output token pricing shown on this page.

DeepSeek V4 Flash

All models

DeepSeekDeepSeekReleased 2026-04-24

DeepSeek V4 Flash

Q: What is DeepSeek V4 Flash?

DeepSeek V4 Flash is an AI model from DeepSeek in the DeepSeek series. Agent Mag tracks its context window, pricing, modalities, and supported API parameters on this page.

1.0M context

$0.140/M input

$0.280/M output

284B total, 13B activated

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model with 284 billion total parameters and 13 billion activated parameters. It supports a 1 million-token context window and is designed for fast inference and high-throughput workloads. The model features hybrid attention for efficient long-context processing and configurable reasoning modes, making it suitable for coding assistants, chat systems, and agent workflows.

What is DeepSeek V4 Flash?

DeepSeek V4 Flash is an AI model from DeepSeek that Agent Mag tracks for pricing, context window, modalities, benchmarks, and API compatibility. Builders can use this page to compare DeepSeek V4 Flash against other models for agent workflows and production deployments.

Model ID

Architecture & Specifications

Architecture

Mixture of Experts (MoE)

Parameters

284B total, 13B activated

Tokenizer

DeepSeek

Released

2026-04-24

Modalities

Input

text

Output

text

Supported Parameters

frequency_penaltyinclude_reasoninglogprobsmax_tokenspresence_penaltyreasoningresponse_formatstoptemperaturetool_choicetoolstop_logprobstop_p

Strengths

Supports a 1 million-token context window
Optimized for fast inference and high-throughput workloads
Hybrid attention for efficient long-context processing
Configurable reasoning modes
Strong reasoning and coding performance

Limitations

Limited information on training data sources
Hallucination rate of 4.2% in knowledge benchmarks
Low performance in research-level physics reasoning (CritPt: 7.1%)

Recommended Use Cases

Coding assistants

Chat systems

Agent workflows

Long-context processing tasks

High-throughput applications requiring cost efficiency

More from DeepSeek

DeepSeek V4 Pro

1.0M ctx$1.74/M

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding,...

DeepSeek V3.2

131K ctx$0.252/M

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...

DeepSeek V3.2 Speciale

164K ctx$0.400/M

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance. It builds on DeepSeek Sparse Attention (DSA) for efficient long-context processing, then scales post-training reinforcement learning...

DeepSeek V3.2 Exp

164K ctx$0.270/M

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism...

Stay in the know

DeepSeek V4 Flash

What is DeepSeek V4 Flash?

More from DeepSeek

Related content