MiMo-V2-Omni
MiMo-V2-Omni is an AI model from Xiaomi built for agent workflows, with support for text, audio, image, video input and text output. MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...
What is MiMo-V2-Omni?
MiMo-V2-Omni is an AI model from Xiaomi that Agent Mag tracks for pricing, context window, modalities, benchmarks, and API compatibility. Builders can use this page to compare MiMo-V2-Omni against other models for agent workflows and production deployments.
MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...
More from Xiaomi
MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex software engineering, and long-horizon tasks, with top rankings on benchmarks such as ClawEval, GDPVal, and SWE-bench Pro....
MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surpassing MiMo-V2-Omni in multimodal perception across image and video understanding...
MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like...
MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting hybrid attention architecture. MiMo-V2-Flash supports a...
Related content
Compare pricing, local installs, context windows, and modality filters across the full model catalog.
Find frameworks, SDKs, and infrastructure tools that pair with this model in production workflows.
See Agent Mag coverage of model benchmarks, agent frameworks, and deployment patterns.