AI Models
The model layer — from frontier foundation models (GPT-4, Claude, Gemini, LLaMA) to fine-tuned specialists, quantized variants, and domain-specific models for code, vision, audio, and more.
Anthropic's 2026 flagship — strongest Claude for agents, long-horizon coding, and tool orchestration.
Google's flagship multimodal family — frontier reasoning, huge context, Search grounding, Flash tiers.
Anthropic's Opus-tier deep-reasoning model — hard coding, research, high-stakes agent steps.
Meta's open-weight flagship family (Scout/Maverick) — MoE, multimodal, huge context, self-hostable.
OpenAI's fastest multimodal flagship model with 128K context.
Open-source image generation — SD3, SDXL, massive ecosystem of LoRAs, ControlNets, runs locally.
AI image generation — artistic high-quality outputs, Discord bot, photorealistic V6 model.
Mistral's 123B flagship model rivaling GPT-4o.
Stability AI's 8B parameter flagship image generation model.
Mistral's 124B multimodal model with vision capabilities.
Mistral's efficient 24B model for production workloads.
Black Forest Labs' flow-matching image model from SD creators.
OpenAI's best speech recognition model for 100+ languages.
Open code model trained on 600+ programming languages.
Widely adopted open image model with massive ecosystem.
Snowflake's 480B MoE model for enterprise data tasks.
Hugging Face's small model family for on-device use.
Meta's foundation model for visual segmentation.
Alibaba's code-specialized model matching GPT-4o on coding.
Microsoft's 14B model rivaling 70B through data quality.
Microsoft's 3.8B model with 128K context for edge deployment.
Google's vision-language model for fine-grained tasks.
OpenAI's most powerful reasoning model for complex problems.
Tiny vision-language model for edge devices.
Open multimodal model for visual reasoning.
Meta's largest open multimodal model at 90B parameters.
Compact 3B model balancing capability with edge deployment.
Meta's multimodal 11B model with text and vision.
xAI's model with real-time X platform data access.
Cost-efficient small model replacing GPT-3.5 Turbo.
Microsoft's unified model for diverse vision tasks.
DeepSeek's 236B MoE model specialized for code.
Cohere's efficient model for high-volume RAG workloads.
Cohere's multilingual embedding model for search and RAG.
Mistral's dedicated 22B code generation model.
Meta's 70B specialized code generation model.
Google's code-specialized Gemma model.
Anthropic's balanced model for production workloads.
Anthropic's fastest model for high-throughput tasks.
Salesforce's efficient vision-language bridge model.
Bilingual Chinese-English language model.
AI21's hybrid Mamba-Transformer model with 256K context.
01.AI's bilingual 34B model with 200K context option.
1.1B model pre-trained on 3T tokens for edge use.
Gradio web UI for local LLMs with multiple backends.
Google's safety content classifiers built on Gemma.
Alibaba's 32B reasoning model with chain-of-thought.
Alibaba's 72B open model trained on 18T tokens.
Allen AI's fully open and transparent language model.
Latest compact reasoning model with native tool use.
Cost-efficient reasoning model with configurable effort levels.
Mistral's mixture-of-experts model with efficient routing.
Mistral's mixture-of-experts model with 176B total parameters.
Mistral's 12B model with 128K context window.
Meta's safety classifier for LLM content moderation.
Meta's LLM safety classifier for content policy enforcement.
Meta's 70B open model matching 405B-class performance.
Meta's largest dense open-weight model at 405B parameters.
Hybrid Transformer-Mamba model with 256K context.
Shanghai AI Lab's multilingual foundation model.
Enhanced GPT-4 with 128K context and improved speed.
Google's open-weight model family from 1B to 27B parameters.
Google's 2B lightweight open model.
Google's efficient open model competitive above its weight class.
Google's most capable model with 1M context and native thinking.
Google's fast multimodal model with 1M context.
State-of-the-art open image model with exceptional prompt adherence.
TII's 180B model trained on curated RefinedWeb data.
671B MoE model matching GPT-4o at fraction of training cost.
Open-source reasoning model matching OpenAI o1.
Databricks' 132B MoE model with fine-grained expert routing.
Anthropic's most intelligent model, best-in-class for coding and agentic tasks.
Tsinghua's bilingual dialogue model.
Snowflake's enterprise MoE model for SQL and code.
01.AI's high-performance reasoning model.
Google's vision-language-action model for robotics.
Meta's prompt injection and jailbreak detection classifier.
Microsoft's compact model for edge deployment.
OpenAI's interactive testing environment for GPT models.
OpenAI's reasoning model with chain-of-thought problem solving.
Ultra-lightweight 1B model for on-device AI.
AI creative platform for production-quality visual assets and game art.
OpenAI's image generator with accurate text rendering and complex compositions.
Automatic speech recognition model. 10,276,778 downloads.
Automatic speech recognition model. 4,928,734 downloads.
Latent diffusion model for generating music and sound effects from text.
OpenAI's photorealistic text-to-video model with world simulation.
Sentence similarity model. 36,153,768 downloads.
Sentence similarity model. 233,518,673 downloads.
Text-to-image model. 2,041,667 downloads.
Sentence similarity model. 43,947,771 downloads.
Automatic speech recognition model. 7,544,359 downloads.
Sentence similarity model. 15,016,753 downloads.
Text generation model. 9,566,721 downloads.
Text generation model. 10,018,533 downloads.
Text generation model. 10,691,206 downloads.
Text generation model. 19,369,646 downloads.
Text generation model. 13,784,608 downloads.
Text generation model. 9,335,502 downloads.
Text generation model. 16,037,172 downloads.
What are AI Models?
AI models are the foundation layer — the neural networks that generate text, code, images, audio, and video. The landscape ranges from frontier foundation models (GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro, LLaMA 3) to specialized models for code, vision, embeddings, and speech. Key decisions: closed-source APIs vs. open-weight models, cloud vs. local inference, and general-purpose vs. domain-specific.
How to Choose
Start with the task, not the model. For text generation and reasoning, benchmark on YOUR use case (not general benchmarks). For embeddings, test retrieval quality on your domain. For code, test on your language and codebase complexity. Key trade-offs: capability vs. cost vs. latency vs. privacy. Open models give you control and privacy; API models give you convenience and frontier performance.
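As a rough illustration of benchmarking on your own use case, the sketch below scores candidate models against a small task-specific test set. Everything in it is an assumption made for illustration: the `score_model` and `rank_models` helpers, the substring-match metric, and the two stub functions standing in for real model calls. In practice you would replace the stubs with calls to the models you are comparing and the metric with one that fits your task.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical task-specific test set: (prompt, expected substring) pairs.
Example = Tuple[str, str]


def score_model(generate: Callable[[str], str], examples: List[Example]) -> float:
    """Return the fraction of examples whose output contains the expected text."""
    if not examples:
        raise ValueError("examples must not be empty")
    hits = sum(
        1
        for prompt, expected in examples
        if expected.lower() in generate(prompt).lower()
    )
    return hits / len(examples)


def rank_models(
    candidates: Dict[str, Callable[[str], str]], examples: List[Example]
) -> List[Tuple[str, float]]:
    """Score every candidate on the same examples and sort best first."""
    scores = [(name, score_model(fn, examples)) for name, fn in candidates.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Stand-in "models" so the sketch runs without an API key or a GPU.
def stub_model_a(prompt: str) -> str:
    return "Paris" if "France" in prompt else "I am not sure."


def stub_model_b(prompt: str) -> str:
    return "I am not sure."


EXAMPLES: List[Example] = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

if __name__ == "__main__":
    candidates = {"model-a": stub_model_a, "model-b": stub_model_b}
    for name, score in rank_models(candidates, EXAMPLES):
        print(f"{name}: {score:.2f}")
```

Substring matching is only a placeholder metric. For retrieval or code tasks, the same loop could use recall at k or a unit-test pass rate instead.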
Key Capabilities to Evaluate
Common Patterns
Call a hosted model via API. Simplest path, highest capability, but data leaves your infrastructure.
Run models on your hardware via Ollama, llama.cpp, or vLLM. Full privacy, but requires GPU resources.
Adapt a base model to your domain with custom training data. Higher quality for specific tasks, but requires data and compute.
Route requests to different models based on task complexity. Use a smaller model for simple tasks, a larger one for complex reasoning.
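The routing pattern above can be sketched in a few lines. This is a minimal illustration, not a production router: the `small-model` and `large-model` labels, the keyword list, and the 200-word threshold are all hypothetical choices, and the backends are plain functions standing in for a local model and a hosted API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Route:
    model: str   # hypothetical label, not a real model ID
    reason: str  # why this route was chosen, useful for logging


# Assumed heuristic: these phrases suggest the task needs multi-step reasoning.
REASONING_HINTS = ("prove", "step by step", "debug", "refactor", "analyze")


def choose_route(prompt: str, max_small_words: int = 200) -> Route:
    """Pick a model tier from prompt length and a few keyword hints."""
    text = prompt.lower()
    if len(text.split()) > max_small_words:
        return Route("large-model", "long prompt")
    for hint in REASONING_HINTS:
        if hint in text:
            return Route("large-model", f"keyword: {hint}")
    return Route("small-model", "short, simple prompt")


def dispatch(prompt: str, backends: Dict[str, Callable[[str], str]]) -> str:
    """Send the prompt to whichever backend the router selects."""
    route = choose_route(prompt)
    return backends[route.model](prompt)


if __name__ == "__main__":
    # Stub backends; real ones would call a local runtime or a hosted API.
    backends = {
        "small-model": lambda p: f"[small] {p}",
        "large-model": lambda p: f"[large] {p}",
    }
    print(dispatch("Translate 'hello' to French", backends))
    print(dispatch("Debug this stack trace step by step", backends))
```

Keyword heuristics are the simplest option. Teams that outgrow them often route with a small classifier, or retry on the larger model when the smaller one returns a low-confidence answer.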
What to Watch Out For
Top Capabilities
Analyzes selected code or entire files and generates natural language explanations of what the code does, how it works, and why certain patterns were chosen. The feature can produce documentation in multiple formats (docstrings, comments, markdown) and supports various documentation styles (JSDoc, Sphinx, etc.). Developers can request explanations at different levels of detail (high-level overview, line-by-line breakdown, architectural context) through the chat interface, with responses appearing as formatted text or code comments.
Cody utilizes a context-aware engine that analyzes the current file and project structure to provide relevant code completions. It integrates with the Visual Studio Code API to access the Abstract Syntax Tree (AST) of the code, allowing it to suggest completions that are semantically relevant to the context, rather than relying solely on keyword matching. This approach ensures that the suggestions are not only syntactically correct but also contextually appropriate, enhancing developer productivity.
Converts natural language prompts into executable full-stack web applications by invoking an AI agent that generates React/Next.js frontend code, Node.js backend logic, and database schemas. The agent runs code in-browser via WebContainers to validate syntax and functionality before deployment, iterating on the generated code based on execution feedback. Token consumption scales with project complexity (larger codebases consume more tokens per iteration), and the agent supports design system imports from Figma and GitHub to accelerate UI generation.
Provides six model variants (tiny, base, small, medium, large, turbo) with parameter counts ranging from 39M to 1550M, enabling developers to choose optimal speed-accuracy tradeoffs. Tiny model runs at ~10x speed with 1GB VRAM; large model runs at 1x speed with 10GB VRAM. English-only variants (tiny.en, base.en, small.en) provide higher English accuracy by removing multilingual capacity. Turbo model (809M params) offers 8x speedup over large with minimal accuracy loss but lacks translation support.
Translates non-English speech directly to English text by using a task-specific token in the TextDecoder that signals translation mode, bypassing the need for intermediate transcription-then-translation pipelines. The AudioEncoder processes mel spectrograms identically to transcription, but the decoder generates English tokens directly from audio embeddings, reducing latency and error propagation compared to cascaded systems.
Transcribes audio in 98 languages to text in the original language using a unified Transformer sequence-to-sequence architecture with a shared AudioEncoder that processes mel spectrograms into language-agnostic embeddings, then a TextDecoder that generates tokens autoregressively. The system handles variable-length audio by padding or trimming to 30-second segments and uses task-specific tokens to signal transcription mode, enabling a single model to handle multiple languages without language-specific branches.
Detects the spoken language in audio by processing mel spectrograms through the AudioEncoder and using a language classification head that outputs probability distributions over 98 supported languages. The model leverages 680K hours of multilingual training data to recognize language characteristics from acoustic features alone, without requiring transcription. Language detection occurs as a preliminary step in the transcription pipeline and can be called independently via the language detection task token.
W&B Personal tier (free) and Enterprise tier support self-hosted deployment via Docker, enabling on-premise installation for teams with data residency or security requirements. Self-hosted instances run independently from W&B cloud, with optional integration to W&B cloud for cross-instance features. Supports custom domain configuration, HTTPS, and integration with corporate identity providers (LDAP, SAML, OAuth).
Browse Other Types
Autonomous AI systems that act on your behalf
MCP Servers: Model Context Protocol tools and integrations
Repositories: Open-source AI projects on GitHub
APIs: Programmatic endpoints for AI capabilities
Extensions: Browser and IDE extensions powered by AI
Workflows: Automation sequences and AI pipelines
Frequently Asked Questions
What is the best AI model in 2026?
There is no single best model — it depends on your task, budget, and constraints. For general reasoning, Claude and GPT-4o lead. For open-source, LLaMA 3 and Mistral are top choices. For code, Claude and DeepSeek-Coder excel. For images, Midjourney and DALL-E 3 lead. Always benchmark on your specific use case.
Should I use open-source or closed-source AI models?
Open-source models give you control, privacy, and no per-token costs — but require infrastructure and may lag frontier capabilities. Closed-source APIs give you the best models with zero infrastructure — but create vendor dependency and data privacy concerns. Many teams use both: closed-source for complex tasks, open-source for simple/high-volume tasks.