Llama 3.2 11B Vision Instruct

131K context · $0.245/M input · $0.245/M output
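At these rates, the cost of a single request is simple arithmetic: tokens times the per-million rate. A minimal sketch in Python (the helper name and example token counts are illustrative, not part of any provider API):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float = 0.245, output_rate: float = 0.245) -> float:
    """Estimate request cost in USD, given rates quoted per million tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: a 100K-token prompt with a 1K-token reply.
print(request_cost(100_000, 1_000))  # 0.024745
```

Because input and output are priced identically here, cost depends only on total token volume; for models with asymmetric pricing the two rates would differ.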

Llama 3.2 11B Vision Instruct is an AI model from Meta built for agent workflows, with support for text and image input and text output. It is a multimodal model with 11 billion parameters, designed to handle tasks that combine visual and textual data. It excels in tasks such as image captioning and...

What is Llama 3.2 11B Vision Instruct?

Llama 3.2 11B Vision Instruct is an AI model from Meta that Agent Mag tracks for pricing, context window, modalities, benchmarks, and API compatibility. Builders can use this page to compare Llama 3.2 11B Vision Instruct against other models for agent workflows and production deployments.


Modalities

Input: text, image
Output: text
Supported Parameters
frequency_penalty, max_tokens, min_p, presence_penalty, repetition_penalty, response_format, seed, stop, temperature, top_k, top_p
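The modalities and parameters above map directly onto an OpenAI-compatible chat-completions request: image input goes in as an image_url content part alongside text, and the sampling parameters are top-level fields. A sketch of the request body, assuming such a compatible endpoint (the model ID string and image URL below are illustrative assumptions, not confirmed by this page):

```python
import json

payload = {
    # Assumed model identifier; check your provider's model list for the exact ID.
    "model": "meta-llama/llama-3.2-11b-vision-instruct",
    "messages": [
        {
            "role": "user",
            # Multimodal input: one text part and one image part, per the
            # text + image input modalities listed above.
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
    # Sampling controls drawn from the supported-parameter list above.
    "max_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.9,
    "seed": 42,
    "stop": ["\n\n"],
}

print(json.dumps(payload, indent=2))
```

Output remains text only, so the response is read from the usual message content field; there is no image output to handle.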
