# GLM OCR

GLM OCR is available through Ollama for local agent workflows, with support for text and image input. GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture.
## What is GLM OCR?
GLM OCR is a local model entry from Ollama that Agent Mag tracks for install commands, available tags, modalities, and agent workflow fit. Builders can install it with the Agent Mag CLI and run it through Ollama on their own machine.
| Tag | Size | Context | Input |
|---|---|---|---|
| glm-ocr:latest | 2.2GB | 128K | text, image |
| glm-ocr:q8_0 | 1.6GB | 128K | text, image |
| glm-ocr:bf16 | 2.2GB | 128K | text, image |
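Once a tag from the table above has been pulled through Ollama, the model can be called from a script via Ollama's local REST API, which accepts base64-encoded page images in the `images` field of a generate request. The sketch below is a minimal example assuming a default Ollama server on `localhost:11434`; the file name `invoice.png` is a placeholder.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint


def build_ocr_request(image_bytes: bytes, model: str = "glm-ocr:latest",
                      prompt: str = "Extract all text from this document.") -> dict:
    """Build a request body for Ollama's /api/generate endpoint.

    Vision models accept page images as base64 strings in the `images` field.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # return a single JSON object instead of a token stream
    }


def run_ocr(image_path: str) -> str:
    """Send one page image to a locally running Ollama server and return the text."""
    with open(image_path, "rb") as f:
        body = build_ocr_request(f.read())
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Placeholder path; assumes `ollama pull glm-ocr` has already been run.
    print(run_ocr("invoice.png"))
```

Swap the `model` argument for `glm-ocr:q8_0` to trade a little accuracy for the smaller 1.6GB download listed above.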
## Related local models
- **GLM-5.1** is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin.
- **GLM-4.7-Flash** is the strongest model in the 30B class, offering a new option for lightweight deployment that balances performance and efficiency.
- A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks.
## Related content
- Compare pricing, local installs, context windows, and modality filters across the full model catalog.
- Find frameworks, SDKs, and infrastructure tools that pair with this model in production workflows.
- See Agent Mag coverage of model benchmarks, agent frameworks, and deployment patterns.