Top AI Models List
Updated on June 26, 2025
Find the top AI models you can use today for text generation, image generation, summarization, speech, chatbots, document parsing, querying, and more.
There are thousands of AI models available today. Each one is built for different tasks like chatting, generating images, writing code, or recognizing speech. With so many options, it can be confusing to pick the right one for your AI app.
To help you out, I’ve put together a list of the best AI models from top providers like OpenAI, Google, Meta, Anthropic, and others. This list shows what each model can do, its size, how much text it can handle, its license type, pricing, and when it was released.
Here are the official pricing pages from leading AI providers:
List of AI Models
AI Model Name | AI Provider | Use Cases | Size & Parameters | Context Length | License Type | Pricing (USD/1M Tokens) | Release Date |
o3 | OpenAI | Complex code generation, research analysis | Not disclosed | 200K | Proprietary | Input: $10 / Output: $40 | April, 2025 |
o4-mini (high) | OpenAI | Lightweight AI agents, document Q&A | Not disclosed | 200K | Proprietary | Input: $1.10 / Output: $4.40 | Apr, 2025 |
o3-mini (high) | OpenAI | Customer support bots, summarization | Not disclosed | 200K | Proprietary | Input: $1.10 / Output: $4.40 | Jan, 2025 |
o3-pro | OpenAI | Enterprise-level automation, technical writing | Not disclosed | 200K | Proprietary | Input: $20 / Output: $80 | June, 2025 |
o1 | OpenAI | Legal reasoning, advanced chatbot tasks | Not disclosed | 200K | Proprietary | Input: $15 / Output: $60 | Dec, 2024 |
o1-pro | OpenAI | Strategic decision-making, data interpretation | Not disclosed | 200K | Proprietary | Input: $150 / Output: $600 | Dec, 2024 |
o1-mini | OpenAI | Query resolution, light document processing | Not disclosed | 128K | Proprietary | Input: $1.10 / Output: $4.40 | Sept, 2024 |
GPT-4.5 (Preview) | OpenAI | Code-heavy workflows, logic-based writing | Not disclosed | 128K | Proprietary | Input: $75 / Output: $150 | Feb, 2025 |
GPT-4.1 mini | OpenAI | Mobile AI apps, concise content creation | Not disclosed | 1M | Proprietary | Input: $0.40 / Output: $1.60 | April, 2025 |
GPT-4.1 | OpenAI | Long-context reasoning, technical documentation | Not disclosed | 1M | Proprietary | Input: $2 / Output: $8 | April, 2025 |
GPT-4o | OpenAI | Multimodal assistants, audio-visual chatbots | Not disclosed | 128K | Proprietary | Input: $2.50 / Output: $10 | August 2024 |
GPT-4 Turbo | OpenAI | Fast coding help, real-time chat | Not disclosed | 128K | Proprietary | Input: $10 / Output: $30 | March, 2024 |
GPT-4.1 nano | OpenAI | IoT devices, quick-response tasks | Not disclosed | 1M | Proprietary | Input: $0.10 / Output: $0.40 | April, 2025 |
GPT-4o mini | OpenAI | Entry-level chatbots, basic summarization | Not disclosed | 128K | Proprietary | Input: $0.15 / Output: $0.60 | July, 2024 |
GPT-4 | OpenAI | General AI chat, academic writing | Not disclosed | 8K | Proprietary | Input: $30 / Output: $60 | April, 2023 |
GPT-3.5 Turbo | OpenAI | Customer interactions, casual content tasks | Not disclosed | 4K | Proprietary | Input: $0.50 / Output: $1.50 | July, 2023 |
Gemini 2.5 Pro Preview | Google | Advanced reasoning, multimodal, coding, datasets | Not disclosed | 1M | Proprietary | Input: $1.25 / Output: $10.00 | May 2025 |
Gemini 2.5 Flash Preview | Google | Hybrid reasoning, cost efficiency, multimodal | Not disclosed | 1M | Proprietary | Input: $0.15 / Output: $0.60 | May 2025 |
Gemini 2.5 Flash Native Audio | Google | Conversational audio, TTS, voice dialogue | Not disclosed | 128k | Proprietary | Input: $3.00 (Audio) / Output: $12.00 | May 2025 |
Gemini 2.5 Flash Preview TTS | Google | Text-to-speech audio generation | Not disclosed | 8k | Proprietary | Input: $0.50 / Output: $10.00 | May 2025 |
Gemini 2.5 Pro Preview TTS | Google | High-quality text-to-speech audio | Not disclosed | 8k | Proprietary | Input: $1.00 / Output: $20.00 | May 2025 |
Gemini 2.0 Flash | Google | Multimodal, speed, streaming, tool use | Not disclosed | 1M | Proprietary | Input: $0.10 / Output: $0.40 | February 2025 |
Gemini 2.0 Flash Preview Image Gen | Google | Conversational image generation/editing | Not disclosed | 32k | Proprietary | $0.039 per image | May 2025 |
Gemini 2.0 Flash-Lite | Google | Cost efficiency, low latency | Not disclosed | 1M | Proprietary | Input: $0.075 / Output: $0.30 | February 2025 |
Gemini 2.0 Flash Live | Google | Bidirectional voice/video interactions | Not disclosed | 1M | Proprietary | Input: $2.10 / Output: $8.50 | April 2025 |
Gemini 1.5 Pro | Google | Complex reasoning, large context, multimodal | Not disclosed | 2M | Proprietary | Input: $1.25 / Output: $5.00 (≤128k tokens) | September 2024 |
Gemini 1.5 Flash | Google | Fast, versatile, multimodal | Not disclosed | 1M | Proprietary | Input: $0.075 / Output: $0.30 (≤128k tokens) | September 2024 |
Gemini 1.5 Flash-8B | Google | High volume, lower intelligence tasks | Not disclosed | 1M | Proprietary | Input: $0.0375 / Output: $0.15 (≤128k tokens) | October 2024 |
Gemini Embedding Experimental | Google | Text embeddings | Not disclosed | 8k | Proprietary | Not disclosed | March 2025 |
Text Embedding (text-embedding-004) | Google | Text embeddings | Not disclosed | 2k | Proprietary | Not disclosed | Apr 1, 2024 |
Embedding (embedding-001) | Google | Text embeddings | Not disclosed | 2k | Proprietary | Not disclosed | Dec 1, 2023 |
AQA (models/aqa) | Google | Attributed Question Answering | Not disclosed | 7k | Proprietary | Not disclosed | Dec 1, 2023 |
Imagen 3 | Google | Text-to-image generation | Not disclosed | Not disclosed | Proprietary | $0.03 per image | Feb 1, 2025 |
Veo 2 | Google | Text/image to video generation | Not disclosed | Not disclosed | Proprietary | $0.35 per second (video) | Apr 1, 2025 |
Gemma 2 | Google | Open LLM for general use | 2B, 7B | Not disclosed | Apache 2.0 | Free | February 2024 |
Gemma 3 | Google | Open, multimodal LLM (text, image, etc.) | 1B, 4B, 12B, 27B | 128k | Apache 2.0 | Free | March 2025 |
Gemma 3n | Google | Open, efficient, mobile-first multimodal | Not disclosed | Not disclosed | Apache 2.0 | Free | May 2025 |
ShieldGemma 2 | Google | Safety-tuned Gemma variant | Not disclosed | Not disclosed | Apache 2.0 | Free | March 2025 |
MedGemma | Google | Medical, multimodal (text & image) | Not disclosed | Not disclosed | Apache 2.0 | Free | May, 2025 |
Claude 3.5 Haiku | Anthropic | Fast, cost-effective, high-throughput tasks | Not disclosed | 200k | Proprietary | Input: $0.80 / Output: $4.00 | Oct, 2024 |
Claude 3.7 Sonnet | Anthropic | Hybrid AI reasoning, speed/accuracy control | Not disclosed | 200k | Proprietary | Input: $3.00 / Output: $15.00 | Feb, 2025 |
Claude 3 Haiku | Anthropic | Fastest, basic tasks, high speed | Not disclosed | 200k | Proprietary | Input: $0.25 / Output: $1.25 | March, 2024 |
Claude 3 Sonnet | Anthropic | Balanced performance, general use | Not disclosed | 200k | Proprietary | Input: $3.00 / Output: $15.00 | March, 2024 |
Claude 3 Opus | Anthropic | Advanced reasoning, complex tasks | Not disclosed | 200k | Proprietary | Input: $15.00 / Output: $75.00 | March, 2024 |
Qwen3-235B-A22B | Alibaba | Large language model, MoE, reasoning, multilingual | 235B | 32k | Apache 2.0 | Free (open source) | May, 2025 |
Qwen3-32B | Alibaba | LLM, reasoning, chat, agent, multilingual | 32B | 32k | Apache 2.0 | Free | May, 2025 |
Qwen3-14B | Alibaba | LLM, reasoning, chat, agent, multilingual | 14B | 32k | Apache 2.0 | Free | May, 2025 |
Qwen3-8B | Alibaba | LLM, reasoning, chat, agent, multilingual | 8B | 32k | Apache 2.0 | Free | May, 2025 |
Qwen3-4B | Alibaba | LLM, reasoning, chat, agent, multilingual | 4B | 32k | Apache 2.0 | Free | May, 2025 |
Qwen3-1.7B | Alibaba | LLM, reasoning, chat, agent, multilingual | 1.7B | 32k | Apache 2.0 | Free | May 2025 |
Qwen3-0.6B | Alibaba | LLM, reasoning, chat, agent, multilingual | 0.6B | 32k | Apache 2.0 | Free | May, 2025 |
Qwen2.5-72B | Alibaba | LLM, chat, multilingual, code | 72B | 32k | Apache 2.0 | Free | April, 2025 |
Qwen2.5-32B | Alibaba | LLM, chat, multilingual, code | 32B | 32k | Apache 2.0 | Free | April 2025 |
Qwen2.5-14B | Alibaba | LLM, chat, multilingual, code | 14B | 32k | Apache 2.0 | Free | April, 2025 |
Qwen2.5-7B | Alibaba | LLM, chat, multilingual, code | 7B | 32k | Apache 2.0 | Free | April, 2025 |
Qwen2.5-4B | Alibaba | LLM, chat, multilingual, code | 4B | 32k | Apache 2.0 | Free | April, 2025 |
Qwen2.5-1.8B | Alibaba | LLM, chat, multilingual, code | 1.8B | 32k | Apache 2.0 | Free | April, 2025 |
Qwen2.5-0.5B | Alibaba | LLM, chat, multilingual, code | 0.5B | 32k | Apache 2.0 | Free | April, 2025 |
Qwen2-72B | Alibaba | LLM, chat, multilingual, code | 72B | 32k | Apache 2.0 | Free | June, 2024 |
Qwen2-57B-A14B | Alibaba | LLM, MoE, chat, multilingual, code | 57B total (14B activated, MoE) | 32k | Apache 2.0 | Free | June, 2024 |
Qwen2-7B | Alibaba | LLM, chat, multilingual, code | 7B | 32k | Apache 2.0 | Free | June, 2024 |
Qwen2-1.5B | Alibaba | LLM, chat, multilingual, code | 1.5B | 32k | Apache 2.0 | Free | June, 2024 |
Qwen2-0.5B | Alibaba | LLM, chat, multilingual, code | 0.5B | 32k | Apache 2.0 | Free | June, 2024 |
Sonar | Perplexity | General-purpose text generation | Based on LLaMA 3.1 70B | Not disclosed | Proprietary | Input: $1.00 / Output: $1.00 | April, 2024 |
Sonar Large | Perplexity | Pro Search and advanced reasoning | Based on LLaMA 3.1 70B | Not disclosed | Proprietary | Input: $2.00 / Output: $10.00 | May 2024 |
Sonar-Pro | Perplexity | Enhanced comprehension & context | Based on LLaMA 3.1 70B | Not disclosed | Proprietary | Input: $3.00 / Output: $15.00 | June 2024 |
Sonar-Reasoning | Perplexity | Chain-of-thought & logic tasks | Based on LLaMA 3.1 70B | Not disclosed | Proprietary | Input: $1.00 / Output: $5.00 | July 2024 |
Sonar-Deep-Research | Perplexity | Deep multi-step research analysis | Based on LLaMA 3.1 70B | Not disclosed | Proprietary | Input: $2.00 / Output: $8.00 | August 2024 |
Tulu3 405B | Allen Institute for AI | Research AI | 405B | 128K | Open Source | Free | Nov, 2024 |
DeepHermes 3 - Mistral 24B | Nous Research | Text generation, reasoning | 24B | 32K | Llama 3 Community License | Input: $0.85 / Output: $0.85 | April, 2024 |
Hermes 3 - Llama-3.1 70B | Nous Research | Chatbots, code generation | 70B | 128K | Llama 3 Community License | Input: $0.12 / Output: $0.85 | July, 2024 |
Solar Pro | Upstage | Reasoning and Text related tasks | Not Disclosed | 4K | Proprietary | $0.25 | Sept, 2024 |
LFM 40B | Liquid AI | Document analysis, assistant bots | 40B | 32K | Proprietary | Free | Sept, 2024 |
Arctic | Snowflake | SQL generation, instruction following | Not Disclosed | 4K | Open Source | Free | April, 2024 |
Magistral Medium | Mistral AI | Complex reasoning tasks | Not disclosed | 40k | Commercial | Input: $2.00 / Output: $5.00 | June, 2025 |
Magistral Small | Mistral AI | Small reasoning | Not disclosed | 40k | Apache 2.0 (open-source) | Input: $0.50 / Output: $1.50 | June, 2025 |
Mistral Medium | Mistral AI | Visual content understanding, Document analysis | Not disclosed | 128k | Commercial | Input: $0.40 / Output: $2.00 | May, 2025 |
Mistral Large | Mistral AI | Reasoning, high‑complexity tasks | 123B | 128k | Research (commercial available) | Input: $2.00 / Output: $6.00 | Nov, 2024 |
Pixtral Large | Mistral AI | Image understanding, text related tasks | 124B | 128k | Research (commercial available) | Input: $2.00 / Output: $6.00 | Nov, 2024 |
Codestral | Mistral AI | Code completion, test generation | 22.2B | 256k | Commercial | Input: $0.30 / Output: $0.90 | Jan, 2025 |
Codestral Mamba | Mistral AI | Open-source code model | 7.3B | 256k | Apache 2.0 | Not disclosed | Jul, 2024 |
Mistral Saba | Mistral AI | Languages from Middle East & South Asia | Not disclosed | 32k | Commercial | Input: $0.20 / Output: $0.60 | Feb, 2025 |
Ministral 8B | Mistral AI | Content creation | 8B | 128k | Research license | Input: $0.10 / Output: $0.10 | Oct, 2024 |
Ministral 3B | Mistral AI | Best edge model | 3B | 128k | Commercial | Input: $0.04 / Output: $0.04 | Oct, 2024 |
Mistral Embed | Mistral AI | Semantic text embedding | Not disclosed | 8k | Not stated | Input: $0.10 | Dec, 2023 |
Codestral Embed | Mistral AI | Semantic code embedding | Not disclosed | 8k | Not stated | Input: $0.15 | May, 2025 |
Mistral Moderation | Mistral AI | Detect harmful content | Not disclosed | 8k | Not stated | Input: $0.10 | Nov, 2024 |
Mixtral 8x7B | Mistral AI | Sparse mixture expert model | 46.7B | Not disclosed | Apache 2.0 | Not disclosed | Dec, 2023 |
Mixtral 8x22B | Mistral AI | Sparse mixture expert model | 140.6B | Not disclosed | Apache 2.0 | Not disclosed | Apr, 2024 |
Mathstral | Mistral AI | STEM-focused math model | 7.3B | 32k | Apache 2.0 | Not disclosed | Jul, 2024 |
Mistral‑Nemo | Mistral AI | Multilingual small model | 12B | 128k | Apache 2.0 | Not disclosed | Jul, 2024 |
Mistral Small (various) | Mistral AI | Compact text/image model | 24B | 60‑128 k | Apache 2.0 | Not disclosed | 2024‑2025 |
Reka Core | Reka AI | Multimodal (text, media) | Not disclosed | 128k | Proprietary | Input: $2.00 / Output: $6.00 | April, 2024 |
Reka Flash 3 | Reka AI | Fast, efficient AI | 21B | 32k | Apache 2.0 | Input: $0.80 / Output: $2.00 | March, 2025 |
Reka Flash | Reka AI | Content Creation | Not disclosed | 128k | Proprietary | Input: $0.80 / Output: $2.00 | June 2024 |
Reka Spark | Reka AI | Research and text generation | Not disclosed | 128k | Proprietary | Input: $0.05 / Output: $0.05 | June, 2024 |
Cosmos 1.0 Diffusion | NVIDIA | Physics-aware video generation, synthetic data | 4B–14B parameters | Not specified | Open Model License | Free | Jan, 2025 |
Cosmos 1.0 Autoregressive | NVIDIA | Real-time video prediction, physical AI planning | 4B–14B parameters | Not specified | Open Model License | Free | Jan, 2025 |
Cosmos Upsampler | NVIDIA | Text prompt refinement, video enhancement | 12B parameters | Not specified | Open Model License | Free | Jan, 2025 |
Cosmos Video Decoder | NVIDIA | AR/VR applications, video processing | 7B parameters | Not specified | Open Model License | Free | Jan, 2025 |
Cosmos Guardrails | NVIDIA | Safety filtering, content moderation | Various sizes | Not specified | Open Model License | Free | Jan, 2025 |
Cosmos Nemotron VLM | NVIDIA | Vision-language tasks, multimodal AI agents | 14B parameters | Not specified | Open Model License | Via NIM API | Jan, 2025 |
Llama Nemotron 70B | NVIDIA | Agentic AI, reasoning, multi-turn conversations | 70B parameters | Not specified | Open Model License | Via NIM API | Jan, 2025 |
Nemotron 4-Instruct | NVIDIA | Instruction following, chat applications | 340B parameters | Not specified | Apache 2.0 | Input: $5.76 / Output: $5.76 | June, 2024 |
ChatQA-1.5 | NVIDIA | Conversational QA, RAG applications | 8B–72B parameters | Not specified | Apache 2.0 | Free (Open) | May, 2024 |
NIM Foundation Models | NVIDIA | RTX AI PCs, local inference | Optimized for RTX GPUs | Not specified | Commercial | Hardware-based | Jan, 2025 |
Llama 3 8B | Meta | General text generation, chat, instruction following | 8B | 8K | Llama 3 License | Input: $0.15 / Output: $0.15 | April, 2024 |
Llama 3 70B | Meta | Advanced reasoning, complex tasks | 70B | 8K | Llama 3 License | Input: $0.65 / Output: $0.65 | April, 2024 |
Llama 3.1 8B | Meta | Improved multilingual, tool use, long context | 8B | 128K | Llama 3.1 License | Input: $0.18 / Output: $0.18 | July, 2024 |
Llama 3.1 70B | Meta | Advanced reasoning with extended context | 70B | 128K | Llama 3.1 License | Input: $0.68 / Output: $0.68 | July, 2024 |
Llama 3.1 405B | Meta | State-of-the-art reasoning, research, fine-tuning base | 405B | 128K | Llama 3.1 License | Input: $5.32 / Output: $16.00 | July, 2024 |
Llama 3.2 1B | Meta | Edge AI, mobile devices, lightweight applications | 1B | 128K | Llama 3.2 License | Free (Open) | Sept, 2024 |
Llama 3.2 3B | Meta | Edge AI, mobile devices, on-device inference | 3B | 128K | Llama 3.2 License | Free (Open) | Sept, 2024 |
Llama 3.2 11B Vision | Meta | Vision-language tasks, image understanding, visual reasoning | 11B | 128K | Llama 3.2 License | Input: $0.35 / Output: $0.35 | Sept, 2024 |
Llama 3.2 90B Vision | Meta | Advanced vision-language, document analysis, complex visual reasoning | 90B | 128K | Llama 3.2 License | Input: $1.20 / Output: $1.20 | Sept, 2024 |
Llama 3.3 70B | Meta | Enhanced reasoning, improved efficiency, cost-effective vs 405B | 70B | 128K | Llama 3.3 License | Input: $0.10 / Output: $0.40 | Dec, 2024 |
Code Llama 70B | Meta | Code generation, completion, debugging | 70B | 100K | Code Llama License | Input: $0.65 / Output: $0.65 | Jan, 2024 |
Yi‑1.5‑34B/9B/6B | 01 AI | General LLM, chat, coding, reasoning, math | 34B, 9B, 6B | 4K, 16K, 32K | Apache‑2.0 (open‑source) | Free | May, 2024 |
Yi‑Large | 01 AI | Reasoning, coding, math | Not disclosed | Not disclosed | Proprietary | Demo / private access | May, 2024 |
Yi‑Lightning | 01 AI | Real-time, low-latency chat & edge AI (MoE architecture) | Not disclosed | Not disclosed | Open‑weight (access via platform) | $0.14 (inference cost) | Oct, 2024 |
Yi‑VL‑6B/34B | 01 AI | Multimodal text+image conversations | 6B, 34B | Not disclosed | Apache‑2.0 (open-source) | Free/open weights | Jan, 2024 |
Yi‑Coder‑1.5B/9B | 01 AI | Code generation across 52 languages | 1.5B, 9B | 128K | Apache‑2.0 (open-source) | Free/open weights | Sept, 2024 |
Yi‑34B‑Chat | 01 AI | General LLM chat, coding, reasoning, math | 34B | 4K | Apache‑2.0 (open-source) | Free/open weights | Nov, 2023 |
Yi‑6B‑Chat | 01 AI | General LLM chat, coding, reasoning, math | 6B | 4K | Apache‑2.0 (open-source) | Free/open weights | Nov, 2023 |
Command A (command-a-03-2025) | Cohere | Tool use, agents, RAG, multilingual | Not disclosed | 256k | Proprietary | Input: $2.50 / Output: $10.00 | March, 2025 |
Command R7B (command-r7b-12-2024) | Cohere | RAG, tool use, agents, complex reasoning | 7 billion | 128k | Proprietary | Input: $0.0375 / Output: $0.15 | Dec, 2024 |
Command R+ (command-r-plus-04-2024) | Cohere | Complex RAG workflows, multi-step tool use | Not disclosed | 128k | Proprietary | Input: $2.50 / Output: $10.00 | April, 2024 |
Command R (command-r-08-2024) | Cohere | RAG, tool use, code generation, agents | Not disclosed | 128k | Proprietary | Input: $0.15 / Output: $0.60 | August, 2024 |
Command R (command-r-03-2024) | Cohere | Complex workflows, code generation, RAG | Not disclosed | 128k | Proprietary | Input: $0.50 / Output: $1.50 | March, 2024 |
Command | Cohere | General instruction following, chat | Not disclosed | 4k | Proprietary | Input: $1.00 / Output: $2.00 | Jan, 2023 |
Command Light | Cohere | Lightweight, fast text generation | Not disclosed | 4k | Proprietary | Input: $0.30 / Output: $0.60 | Jan, 2023 |
Command Nightly | Cohere | Experimental version of Command | Not disclosed | 128k | Proprietary | Not disclosed | Mar, 2024 |
Aya Expanse 32B (c4ai-aya-expanse-32b) | Cohere | Multilingual (23 languages) | 32B | 128k | Research | Input: $0.50 / Output: $1.50 | April, 2024 |
Aya Vision 8B (c4ai-aya-vision-8b) | Cohere | Multimodal (text + images), 23 languages | 8B | 16k | Research | Input: $0.50 / Output: $1.50 | April, 2024 |
Aya Vision 32B (c4ai-aya-vision-32b) | Cohere | Multimodal (text, images) | 32B | 16k | Research | Input: $0.50 / Output: $1.50 | April, 2024 |
Rerank v3.5 | Cohere | English document reranking, JSON | Not disclosed | 4k | Proprietary | $2.00 per 1K searches | April, 2024 |
Rerank English v3.0 | Cohere | English document reranking | Not disclosed | 4k | Proprietary | Not disclosed | Oct, 2023 |
Rerank Multilingual v3.0 | Cohere | Multilingual document reranking | Not disclosed | 4k | Proprietary | Not disclosed | Oct, 2023 |
MiniMax-Text-01 | MiniMax | Large Language Model, Text Generation, Reasoning | 456B total | 4M tokens | Open Source | Input: $0.20 / Output: $1.10 | Jan, 2025 |
MiniMax-VL-01 | MiniMax | Vision-Language Model, Multimodal | 456B total | 4M tokens | Open Source | Input: $0.20 / Output: $1.10 | Jan, 2025 |
T2A-01-HD | MiniMax | Text-to-Audio Generation | Not disclosed | Not disclosed | Open Source | Not disclosed | Jan, 2025 |
T2V-01-Director | MiniMax | Text-to-Video Generation | Not disclosed | Not disclosed | Commercial API | $0.43 per video | Sept, 2024 |
I2V-01-Director | MiniMax | Image-to-Video Generation | Not disclosed | Not disclosed | Commercial API | $0.43 per video | Oct, 2024 |
Video-01 (Hailuo AI) | MiniMax | Text/Image-to-Video Generation | Not disclosed | Not disclosed | Commercial API | $0.43 per video | Sept, 2024 |
Image-01 | MiniMax | Text-to-Image Generation | Not disclosed | Not disclosed | Commercial API | $0.0035 per image | June, 2024 |
Speech-01-HD | MiniMax | Text-to-Speech, Voice Cloning | Not disclosed | Not disclosed | Commercial API | $0.10 per 1M characters | March, 2024 |
MiniMax Music | MiniMax | Text-to-Music Generation | Not disclosed | Not disclosed | Commercial API | Not disclosed | Aug, 2024 |
Phi-4 | Microsoft Azure | Reasoning and code generation in low-resource settings | 14B | 16K | MIT | Input: $2.00 / Output: $8.00 | Dec, 2024 |
Phi-4 Multimodal | Microsoft Azure | Multimodal AI (Image, text, and speech) | 5.6B | 128K | MIT | Input: $2.00 / Output: $8.00 | Feb, 2025 |
Phi-4 Mini | Microsoft Azure | On-device assistants, chatbots | 3.8B | 128K | MIT | Input: $0.20 / Output: $0.60 | Feb, 2025 |
Phi-3 Medium 14B | Microsoft Azure | Small AI applications | 14B | 128K | Open Source | Input: $1.50 / Output: $6.00 | April, 2024 |
Phi-3 Mini | Microsoft Azure | Multilingual summarization | Not disclosed | 4K | Open Source | Input: $0.20 / Output: $0.60 | April, 2024 |
Amazon Nova Micro | Amazon AWS | Text summarization, Interactive chatbots | Not disclosed | 128K | Proprietary | Input: $0.35 / Output: $1.40 | Dec, 2024 |
Amazon Nova Lite | Amazon AWS | Image analysis, Video content moderation | Not disclosed | 300K (30 min video) | Proprietary | Input: $0.60 / Output: $2.40 | Dec, 2024 |
Amazon Nova Pro | Amazon AWS | Document analysis, Complex reasoning tasks | Not disclosed | 300K (30 min video) | Proprietary | Input: $0.80 / Output: $3.20 | Dec, 2024 |
Amazon Nova Premier | Amazon AWS | Advanced research, Model distillation | Not disclosed | 1M | Proprietary | TBA | Feb, 2025 |
Amazon Nova Canvas | Amazon AWS | Marketing visuals, Professional advertising | Not disclosed | 1024 | Proprietary | Per image generation | Dec, 2024 |
Amazon Nova Reel | Amazon AWS | Social media content, Marketing videos | Not disclosed | 512 | Proprietary | Per video generation | Dec, 2024 |
Amazon Nova Sonic | Amazon AWS | Voice assistants, Real-time conversations | Not disclosed | 300K | Proprietary | Per speech generation | Dec, 2024 |
Jurassic-2 Light | AI21 Labs | Fast text generation, Simple tasks | Not disclosed | 8k | Proprietary | Customised Pricing | March, 2023 |
Jurassic-2 Mid | AI21 Labs | Balanced performance, General applications | Not disclosed | 8k | Proprietary | Customised Pricing | March, 2023 |
Jurassic-2 Ultra | AI21 Labs | High-quality text generation, Complex reasoning | Not disclosed | 8k | Proprietary | Customised Pricing | March, 2023 |
Jamba | AI21 Labs | Long-context processing, Document analysis | 52B parameters (12B active) | 256k | Apache 2.0 | Free | March 2024 |
Jamba-Instruct | AI21 Labs | Instruction following, Task completion | 52B parameters (12B active) | 256k | Proprietary | Customised Pricing | March 2024 |
Jamba 1.5 Mini | AI21 Labs | Efficient processing, Speed optimization | 52B parameters (12B active) | 256k | Proprietary | Input: $0.20 / Output: $0.40 | August 2024 |
Jamba 1.5 Large | AI21 Labs | Complex reasoning, Enterprise tasks | 398B parameters (94B active) | 256k | Proprietary | Input: $2.00 / Output: $8.00 | August 2024 |
Jamba 1.6 | AI21 Labs | Enterprise deployment, Production applications | Not disclosed | 256k | Apache 2.0 | Free | Nov, 2024 |
FLUX.1 [dev] | Black Forest Labs | Text-to-Image | 12B | 512 | Non-Commercial | $0.025/image | Aug, 2024 |
FLUX.1 [schnell] | Black Forest Labs | Text-to-Image (Fast) | 12B | 512 | Apache 2.0 | $0.025/image | Aug, 2024 |
FLUX.1 [pro] | Black Forest Labs | Text-to-Image (Premium) | 12B | 512 | Proprietary | $0.05/image | Aug, 2024 |
FLUX.1.1 [pro] | Black Forest Labs | Text-to-Image (Latest) | 12B | 512 | Proprietary | $0.04/image | Oct, 2024 |
Stable Diffusion 3.5 | Stability AI | Text-to-Image | 8B | 256 | Open Source | $0.03/image | Oct 2024 |
Stable Diffusion XL | Stability AI | Text-to-Image | 3.5B | 77 | CreativeML Open RAIL-M | $0.03/image | Jul, 2023 |
HunyuanVideo | Tencent | Text-to-Video | 13B | 256 | Apache 2.0 | Not disclosed | Dec 2024 |
CogVideoX-5B | THUDM | Text-to-Video | 5B | 256 | Apache 2.0 | Not disclosed | Aug 2024 |
CogVideoX1.5-5B | THUDM | Text/Image-to-Video | 5B | 256 | Apache 2.0 | Not disclosed | Dec, 2024 |
Stable Video Diffusion | Stability AI | Image-to-Video | 1.7B | 77 | Non-Commercial Research | Not disclosed | Nov 2023 |
Juggernaut XL v9 | Community | Photorealistic Images | 6.6B | 77 | CreativeML Open RAIL-M | Not disclosed | Mar, 2024 |
RealVisXL | Community | Hyperrealistic Images | 6.6B | 77 | CreativeML Open RAIL-M | Not disclosed | Mar 2024 |
DreamShaper XL | Community | Artistic Style Images | 6.6B | 77 | CreativeML Open RAIL-M | Not disclosed | Mar, 2024 |
Playground v2.5 | Playground AI | Aesthetic Images | 1.3B | 77 | Playground v2.5 Community | $0.008/image | Feb, 2024 |
PixArt-α | PixArt Team | High-res Generation | 611M | 120 | OpenRAIL++ | Not disclosed | Sep, 2023 |
Kandinsky 3.0 | AI Forever | Multilingual T2I | 3B | 77 | Apache 2.0 | $0.01/image | Nov, 2023 |
DeepFloyd IF | StabilityAI | Cascaded Diffusion | 4.3B | 77 | DeepFloyd IF License | $0.05/image | May, 2023 |
VideoCrafter2 | Tencent | Text-to-Video | 1.7B | 77 | Apache 2.0 | Not disclosed | Dec, 2023 |
LaVie | Vchitect | High-quality Video | 1.2B | 77 | Apache 2.0 | Not disclosed | Sep, 2023 |
I2VGen-XL | Alibaba | Image-to-Video | 1.3B | 77 | Apache 2.0 | Not disclosed | Nov, 2023 |
AnimateDiff | Community | Animation Sequences | Variable | 77 | Apache 2.0 | Not disclosed | Jul, 2023 |
Whisper Large v3 | OpenAI | ASR | 1.55B | 30 seconds | MIT | $0.006/minute | Nov, 2023 |
Whisper Large v3 Turbo | OpenAI | ASR (Faster) | 809M | 30 seconds | MIT | $0.002/minute | Oct, 2024 |
Azure OpenAI Whisper | Microsoft | ASR | 1.55B | 30 seconds | Commercial | $0.006/minute | March, 2024 |
Distil-Whisper v3 | Hugging Face | ASR (Optimized) | 756M | 30 seconds | MIT | Free (Open Source) | March, 2024 |
Suno v4 | Suno AI | Music Generation | Undisclosed | 4 minutes | Commercial | $10/month (Pro) | Nov, 2024 |
Suno v4.5 | Suno AI | Music Generation | Undisclosed | 8 minutes | Commercial | $10/month (Pro) | May, 2025 |
ElevenLabs Music | ElevenLabs | AI Music Generation | Undisclosed | Not Specified | Commercial | Usage-based pricing | May, 2024 |
ElevenLabs Voice v2 | ElevenLabs | TTS / Voice Cloning | Undisclosed | Not Specified | Commercial | $5 – $330/month | Nov, 2024 |
Udio v1.5 | Udio | Music Generation | Undisclosed | 4 minutes | Commercial | $10/month (Standard) | July, 2024 |
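To compare rows by real cost rather than by list price, you can turn a pricing cell into an estimate for a single request. The snippet below is a minimal sketch, not an official calculator: it assumes the cell uses the "Input: $X / Output: $Y" form from the table (USD per 1M tokens), and the function names and the 50,000 / 2,000 token workload are made up for illustration.

```python
import re

# Matches pricing cells written like "Input: $2 / Output: $8" (USD per 1M tokens).
PRICE_CELL = re.compile(r"Input:\s*\$([\d.]+)\s*/\s*Output:\s*\$([\d.]+)")


def parse_price_cell(cell: str) -> tuple[float, float]:
    """Return (input_price, output_price) in USD per 1M tokens."""
    match = PRICE_CELL.search(cell)
    if match is None:
        # Per-image, per-minute, or subscription pricing does not fit this form.
        raise ValueError(f"unrecognized pricing cell: {cell!r}")
    return float(match.group(1)), float(match.group(2))


def estimate_cost(cell: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from a table pricing cell."""
    input_price, output_price = parse_price_cell(cell)
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Pricing cells copied from the GPT-4.1 and GPT-4.1 mini rows above.
gpt_41 = "Input: $2 / Output: $8"
gpt_41_mini = "Input: $0.40 / Output: $1.60"

# Hypothetical workload: 50,000 input tokens and 2,000 output tokens per request.
print(f"GPT-4.1:      ${estimate_cost(gpt_41, 50_000, 2_000):.4f}")
print(f"GPT-4.1 mini: ${estimate_cost(gpt_41_mini, 50_000, 2_000):.4f}")
```

For this workload the estimate comes to about $0.116 per request on GPT-4.1 and about $0.023 on GPT-4.1 mini. Always check the provider's pricing page before budgeting, since listed prices change often.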
Which AI Models Are Similar to ChatGPT?
Gemini, Claude, DeepSeek, Perplexity, and Mistral are similar to ChatGPT for text generation and everyday use.
- Gemini by Google: Great for handling long chats and documents.
- Claude by Anthropic: Focuses on safety and enterprise use.
- Perplexity AI: Answers questions using both AI and real-time web search. Great for finding up-to-date info.
- DeepSeek: Open-source and good for developers.
- Mistral models: Several are released under Apache 2.0, and they are fast for everyday AI tasks.
📝 Check out this list of ChatGPT alternatives for more options.
Which AI Model is Best for Image Generation?
DALL·E 3, Stable Diffusion XL, Midjourney v6, Sora, and Imagen are among the best AI models for image generation. 👇
- DALL·E 3 by OpenAI: Turns text into creative images.
- Stable Diffusion XL (SDXL): Open-source and widely used.
- Midjourney v6: Great for high-quality, artistic images.
- Sora by OpenAI: Can generate both images and video.
You can use them for design, art, marketing, and product visuals.
Which AI Model is Best for Speech Recognition?
Whisper and Gemini are great choices for speech recognition.
- Whisper Large v3 by OpenAI: Open-source under the MIT license and supports many languages.
- Distil-Whisper v3: Lightweight version of Whisper, faster with decent accuracy.
- Azure OpenAI Whisper: Whisper hosted on Azure, scalable with enterprise support.
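The table lists Whisper with a 30-second input window and per-minute pricing, so it helps to know how many windows a recording spans and roughly what it costs to transcribe. Here is a minimal sketch under stated assumptions: the function names are hypothetical, and it assumes billing is strictly proportional to audio length with no rounding or minimum charge, which a provider may handle differently.

```python
import math

WINDOW_SECONDS = 30  # Whisper's input window, as listed in the table


def window_count(audio_seconds: float) -> int:
    """Number of 30-second windows needed to cover the recording."""
    return math.ceil(audio_seconds / WINDOW_SECONDS)


def transcription_cost(audio_seconds: float, usd_per_minute: float) -> float:
    """Estimated USD cost, assuming purely proportional per-minute billing."""
    return audio_seconds / 60 * usd_per_minute


# Hypothetical 10-minute recording.
audio_seconds = 10 * 60

print("Windows:", window_count(audio_seconds))
# Per-minute prices copied from the Whisper Large v3 and v3 Turbo rows.
print(f"Whisper Large v3:       ${transcription_cost(audio_seconds, 0.006):.3f}")
print(f"Whisper Large v3 Turbo: ${transcription_cost(audio_seconds, 0.002):.3f}")
```

Under these assumptions a 10-minute file spans 20 windows and costs about $0.06 with Whisper Large v3 or about $0.02 with the Turbo variant.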
👉 Here’s a helpful list of speech-to-text tools to check out.