GLM 4.7 Flash
GLM 4.7 Flash is available through Ollama for local agent workflows, with support for text input. As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.
What is GLM 4.7 Flash?
GLM 4.7 Flash is a local model entry from Ollama that Agent Mag tracks for install commands, available tags, modalities, and agent workflow fit. Builders can install it with the Agent Mag CLI and run it through Ollama on their own machine.
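Assuming a standard Ollama setup, installing and running the model from the terminal looks like this (the tag names match the table below; the default tag maps to the q4_K_M quantization):

```shell
# Pull the default build (~19GB download)
ollama pull glm-4.7-flash

# Start an interactive session
ollama run glm-4.7-flash

# Or pull a specific quantization tag, e.g. the higher-precision q8_0 build
ollama pull glm-4.7-flash:q8_0
```

Which tag to pick depends on available memory: the q4_K_M build fits in roughly 19GB, while bf16 needs around 60GB.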
| Tag | Download size | Context window | Input |
|---|---|---|---|
| glm-4.7-flash:latest | 19GB | 198K | text |
| glm-4.7-flash:q4_K_M | 19GB | 198K | text |
| glm-4.7-flash:q8_0 | 32GB | 198K | text |
| glm-4.7-flash:bf16 | 60GB | 198K | text |
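Once a tag is pulled, the model can be called programmatically through Ollama's local HTTP API (by default `ollama serve` listens on port 11434). A minimal sketch, assuming the q4_K_M tag from the table above; the prompt is illustrative:

```python
import json

# Ollama's default local endpoint for one-shot generation
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(tag: str, prompt: str) -> dict:
    """Build a generate-request payload for a given quantization tag."""
    return {
        "model": tag,       # e.g. "glm-4.7-flash:q8_0" from the table above
        "prompt": prompt,
        "stream": False,    # return the full response as one JSON object
    }

payload = build_request("glm-4.7-flash:q4_K_M", "Summarize this repository.")
print(json.dumps(payload))
# Send it with e.g. requests.post(OLLAMA_URL, json=payload)
# once `ollama serve` is running locally.
```

Setting `"stream": False` trades incremental tokens for a single JSON response, which is simpler to handle in batch-style agent workflows.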
Related local models
GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin.
GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.
A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks.
Related content
Compare pricing, local installs, context windows, and modality filters across the full model catalog.
Find frameworks, SDKs, and infrastructure tools that pair with this model in production workflows.
See Agent Mag coverage of model benchmarks, agent frameworks, and deployment patterns.