Meta: Llama 3.2 11B Vision Instruct

Fiabilité 20%

meta-llama/llama-3.2-11b-vision-instruct

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and visual question answering, bridging the gap between language generation and visual reasoning. Pre-trained on a massive da...

🧠 Intelligence & Données
Knowledge Cutoff: 2023-12-31
Tokenizer: Llama3
Moderation: ✅ Non
📅 Cycle de vie
Ajouté le: 25/09/2024
Spécifications
  • Provider & Modalité meta-llama text+image->text
  • Fenêtre de contexte 131,072 tokens
  • Max Output Tokens 16,384
  • Support des Outils (Tools) Non supporté
🔍 Modèles similaires
Modèle Provider Input Output Contexte
Inception: Mercury 2
inception/mercury-2
inception $0.2500 $0.7500 128,000
OpenAI: GPT-5.3 Chat
openai/gpt-5.3-chat
openai $1.7500 $14.0000 128,000
Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)
google/gemini-3.1-flash-i...
google $0.5000 $3.0000 65,536
AionLabs: Aion-2.0
aion-labs/aion-2.0
aion-labs $0.8000 $1.6000 131,072
MiniMax: MiniMax M2.5 (free)
minimax/minimax-m2.5:free
minimax $0.0000 $0.0000 196,608