How do you use Qwen3.5-Flash?

Qwen3.5-Flash can be called through OpenRouter's OpenAI-compatible API using the model ID qwen/qwen3.5-flash-02-23. This page includes a ready-to-run curl example and the supported parameter list.

Is Qwen3.5-Flash free?

Qwen3.5-Flash is a paid model in the OpenRouter catalog, with separate input and output token pricing shown on this page.

Qwen3.5-Flash

All models

AlibabaQwenReleased February 25, 2026

Qwen3.5-Flash

Q: What is Qwen3.5-Flash?

Qwen3.5-Flash is an AI model from Alibaba in the Qwen series. Agent Mag tracks its context window, pricing, modalities, and supported API parameters on this page.

1M context

$0.065/M input

$0.260/M output

Qwen3.5-Flash is a vision-language model built on a hybrid architecture that combines linear attention mechanisms with a sparse mixture-of-experts model. It is designed for efficient inference and excels in both pure text and multimodal tasks, offering fast response times while maintaining a balance between speed and performance. The model represents a significant improvement over the Qwen3 series in terms of capabilities and efficiency.

What is Qwen3.5-Flash?

Qwen3.5-Flash is an AI model from Alibaba that Agent Mag tracks for pricing, context window, modalities, benchmarks, and API compatibility. Builders can use this page to compare Qwen3.5-Flash against other models for agent workflows and production deployments.

Model ID

Architecture & Specifications

Architecture

Hybrid architecture with linear attention and sparse mixture-of-experts

Tokenizer

Qwen3

Released

February 25, 2026

Modalities

Input

textimagevideo

Output

text

Supported Parameters

include_reasoningmax_tokenspresence_penaltyreasoningresponse_formatseedstructured_outputstemperaturetool_choicetoolstop_p

Strengths

Efficient inference with hybrid architecture
Strong performance in both text and multimodal tasks
Fast response times
Improved capabilities over the Qwen3 series

Recommended Use Cases

Academia

Finance

Health

Legal

Marketing

More from Alibaba

Qwen3.6 Plus

1M ctx$0.325/M

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers...

Qwen3.5-9B

262K ctx$0.100/M

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design...

Qwen3.5-35B-A3B

262K ctx$0.163/M

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall...

Qwen3.5-122B-A10B

262K ctx$0.260/M

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. In terms of...

Stay in the know

Qwen3.5-Flash

What is Qwen3.5-Flash?

More from Alibaba

Related content