Models

AIVAX provides models from different providers to make development even faster, removing the need to set up an account with each provider in order to access its latest models.

The available models and their pricing are listed below. All prices are based on the total number of input and output tokens, cached or not.

All prices are in United States dollars.
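As a rough illustration of how the per-million-token prices below translate into the cost of a single request, here is a minimal sketch. The function name `estimate_cost_usd` and its parameters are hypothetical and not part of any AIVAX API. It also assumes that cached input tokens are billed at the cached input rate in place of the regular input rate, which this page does not state explicitly.

```python
from decimal import Decimal

# Prices on this page are quoted per 1 million tokens.
PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens, output_tokens, input_price, output_price,
                      cached_input_tokens=0, cached_input_price=None):
    """Estimate the cost of one request in USD (hypothetical helper).

    Prices are passed as strings (e.g. "3.00") so Decimal keeps them exact.
    Assumption: cached input tokens are charged at the cached input rate
    instead of the regular input rate.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    if cached_input_price is None:
        # Models without a cached price listed: charge the regular input rate.
        cached_input_price = input_price

    fresh_input_tokens = input_tokens - cached_input_tokens
    total = (
        fresh_input_tokens * Decimal(input_price)
        + cached_input_tokens * Decimal(cached_input_price)
        + output_tokens * Decimal(output_price)
    )
    return total / PER_MILLION


# Example using the listed prices for `@anthropic/claude-4.5-sonnet`:
# 10,000 input tokens (4,000 of them cached) and 2,000 output tokens.
cost = estimate_cost_usd(
    input_tokens=10_000,
    output_tokens=2_000,
    input_price="3.00",
    output_price="15.00",
    cached_input_tokens=4_000,
    cached_input_price="0.30",
)
print(cost)  # 0.0492
```

Under these assumptions the request costs about 5 cents: $0.018 for uncached input, $0.0012 for cached input, and $0.03 for output.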

amazon

Model name	Pricing	Description
`@amazon/nova-pro`	Input: $ 0.80 /1M tokens; Output: $ 3.20 /1M tokens	A highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Input: accepts images, videos. Function calling. Reasoning.
`@amazon/nova-lite`	Input: $ 0.06 /1M tokens; Output: $ 0.24 /1M tokens	A very low cost multimodal model that is lightning fast for processing image, video, and text inputs. Input: accepts images, videos. Function calling. Reasoning.
`@amazon/nova-micro`	Input: $ 0.04 /1M tokens; Output: $ 0.14 /1M tokens	A text-only model that delivers the lowest latency responses at very low cost. Function calling.

anthropic

Model name	Pricing	Description
`@anthropic/claude-4.1-opus`	Input: $ 15.00 /1M tokens; Input (cached): $ 1.50 /1M tokens; Output: $ 75.00 /1M tokens	Claude Opus 4.1 is Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks. Input: accepts images. Function calling. Reasoning.
`@anthropic/claude-4.5-opus`	Input: $ 5.00 /1M tokens; Input (cached): $ 0.50 /1M tokens; Output: $ 25.00 /1M tokens	Claude Opus 4.5 is Anthropic’s latest reasoning model, developed for advanced software engineering, complex agent workflows, and extended computer tasks. Input: accepts images. Function calling. Reasoning.
`@anthropic/claude-4.5-sonnet`	Input: $ 3.00 /1M tokens; Input (cached): $ 0.30 /1M tokens; Output: $ 15.00 /1M tokens	Claude Sonnet 4.5 is the newest model in the Sonnet series, offering improvements and updates over Sonnet 4. Input: accepts images. Function calling. Reasoning.
`@anthropic/claude-4-sonnet`	Input: $ 3.00 /1M tokens; Input (cached): $ 0.30 /1M tokens; Output: $ 15.00 /1M tokens	Anthropic's mid-size model with superior intelligence for high-volume uses in coding, in-depth research, agents, and more. Input: accepts images. Function calling. Reasoning.
`@anthropic/claude-4.5-haiku`	Input: $ 1.00 /1M tokens; Input (cached): $ 0.10 /1M tokens; Output: $ 5.00 /1M tokens	Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, offering near-frontier intelligence with much lower cost and latency than larger Claude models. Input: accepts images. Function calling.
`@anthropic/claude-3.5-haiku`	Input: $ 0.80 /1M tokens; Input (cached): $ 0.08 /1M tokens; Output: $ 4.00 /1M tokens	Claude 3.5 Haiku is the next generation of Anthropic's fastest model. At a similar speed to Claude 3 Haiku, it improves across every skill set and surpasses Claude 3 Opus, the largest model in the previous generation, on many intelligence benchmarks. Input: accepts images. Function calling.
`@anthropic/claude-3-haiku`	Input: $ 0.25 /1M tokens; Input (cached): $ 0.03 /1M tokens; Output: $ 1.25 /1M tokens	Claude 3 Haiku is Anthropic's fastest model yet, designed for enterprise workloads which often involve longer prompts. Input: accepts images. Function calling.

cohere

Model name	Pricing	Description
`@cohere/command-a`	Input: $ 2.50 /1M tokens; Output: $ 10.00 /1M tokens	Command A is Cohere's most performant model to date, excelling at tool use, agents, retrieval augmented generation (RAG), and multilingual use cases. Command A has a context length of 256K, only requires two GPUs to run, and has 150% higher throughput compared to Command R+ 08-2024. Input: accepts images. Function calling.

deepseekai

Model name	Pricing	Description
`@deepseekai/r1`	Input: $ 0.50 /1M tokens; Input (cached): $ 0.40 /1M tokens; Output: $ 2.15 /1M tokens	The DeepSeek R1 model has undergone a minor version upgrade, with the current version being DeepSeek-R1-0528. Function calling. Reasoning.
`@deepseekai/v3.1-terminus`	Input: $ 0.27 /1M tokens; Input (cached): $ 0.22 /1M tokens; Output: $ 1.00 /1M tokens	DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long context extension approach, following the methodology outlined in the original DeepSeek-V3 report. Function calling. Reasoning.
`@deepseekai/v3.2-speciale`	Input: $ 0.28 /1M tokens; Input (cached): $ 0.03 /1M tokens; Output: $ 0.42 /1M tokens	DeepSeek-V3.2-Speciale is a high-compute version of DeepSeek-V3.2, designed for maximum reasoning and agentic performance. Reasoning.
`@deepseekai/v3.2`	Input: $ 0.28 /1M tokens; Input (cached): $ 0.03 /1M tokens; Output: $ 0.42 /1M tokens	DeepSeek-V3.2 is a large language model optimized for high computational efficiency and strong tool-use reasoning. Function calling. Reasoning.

google

Model name	Pricing	Description
`@google/gemini-3-pro`	Input: $ 2.00 /1M tokens; Output: $ 12.00 /1M tokens	Gemini 3 Pro Preview is Google’s most advanced AI model, setting new records on leading benchmarks like LMArena (1501 Elo), GPQA Diamond (91.9%), and MathArena Apex (23.4%). Input: accepts images, videos, audio. Function calling. Reasoning.
`@google/gemini-2.5-pro`	Input: $ 1.25 /1M tokens; Input (cached): $ 0.31 /1M tokens; Output: $ 10.00 /1M tokens	One of the most powerful models today. Input: accepts images, videos, audio. Function calling. Reasoning.
`@google/gemini-3-flash`	Input: $ 0.50 /1M tokens; Input (cached): $ 0.05 /1M tokens; Output: $ 3.00 /1M tokens	Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance. Input: accepts images, videos, audio. Function calling. Reasoning.
`@google/gemini-2.5-flash`	Input: $ 0.30 /1M tokens; Input (cached): $ 0.08 /1M tokens; Output: $ 2.50 /1M tokens	Google's best model in terms of price-performance, offering well-rounded capabilities. 2.5 Flash is best for large-scale processing, low-latency, high-volume tasks that require thinking, and agentic use cases. Input: accepts images, videos, audio. Function calling. Reasoning.
`@google/gemini-2.5-flash-lite`	Input: $ 0.10 /1M tokens; Input (cached): $ 0.03 /1M tokens; Output: $ 0.40 /1M tokens	A Gemini 2.5 Flash model optimized for cost efficiency and low latency. Input: accepts images, videos, audio. Function calling. Reasoning.
`@google/gemini-2.0-flash`	Input: $ 0.10 /1M tokens; Input (cached): $ 0.03 /1M tokens; Output: $ 0.40 /1M tokens	Gemini 2.0 Flash delivers next-gen features and improved capabilities, including superior speed, native tool use, and a 1M token context window. Input: accepts images, videos, audio. Function calling.
`@google/gemini-2.0-flash-lite`	Input: $ 0.08 /1M tokens; Output: $ 0.30 /1M tokens	General-purpose model with image recognition, smart and fast. Great for economical chat. Input: accepts images, videos, audio. Function calling.

inception

Model name	Pricing	Description
`@inception/mercury`	Input: $ 0.25 /1M tokens; Output: $ 1.00 /1M tokens	Extremely fast model based on generative diffusion. Function calling.

metaai

Model name	Pricing	Description
`@metaai/llama-3.3-70b`	Input: $ 0.59 /1M tokens; Output: $ 0.79 /1M tokens	Previous generation model with many parameters and surprisingly fast speed. Function calling.
`@metaai/llama-4-maverick-17b-128e`	Input: $ 0.20 /1M tokens; Output: $ 0.60 /1M tokens	Fast model, with 17 billion activated parameters and 128 experts. Input: accepts images. Function calling.
`@metaai/llama-4-scout-17b-16e`	Input: $ 0.11 /1M tokens; Output: $ 0.34 /1M tokens	Smaller version of the Llama 4 family with 17 billion activated parameters and 16 experts. Input: accepts images. Function calling.
`@metaai/llama-3.1-8b`	Input: $ 0.05 /1M tokens; Output: $ 0.08 /1M tokens	Cheap and fast model for less demanding tasks. Function calling.

minimax

Model name	Pricing	Description
`@minimax/m2.1`	Input: $ 0.30 /1M tokens; Output: $ 1.20 /1M tokens	MiniMax-M2.1 is a cutting-edge, lightweight large language model designed for coding, agentic workflows, and modern application development. Function calling. Reasoning.
`@minimax/m2`	Input: $ 0.30 /1M tokens; Output: $ 1.20 /1M tokens	MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. Function calling. Reasoning.

mistral

Model name	Pricing	Description
`@mistral/pixtral-large`	Input: $ 2.00 /1M tokens; Output: $ 6.00 /1M tokens	Pixtral Large is the second model in Mistral's multimodal family and demonstrates frontier-level image understanding. In particular, the model is able to understand documents, charts, and natural images, while maintaining the leading text-only understanding of Mistral Large 2. Input: accepts images. Function calling.
`@mistral/large-2512`	Input: $ 0.50 /1M tokens; Output: $ 1.50 /1M tokens	Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-experts architecture with 41B active parameters (675B total). Function calling.
`@mistral/nemo-12b-it-2407`	Input: $ 0.02 /1M tokens; Output: $ 0.04 /1M tokens	12B model trained jointly by Mistral AI and NVIDIA; it significantly outperforms existing models of smaller or similar size. Function calling.

model-router

Model name	Pricing	Description
`@model-router/complexity`	Input: $ 0.10 /1M tokens; Output: $ 0.50 /1M tokens	Model Router: chooses the best models according to the complexity of the conversation.

moonshotai

Model name	Pricing	Description
`@moonshotai/kimi-k2-dynamic`	Input: $ 1.00 /1M tokens; Input (cached): $ 0.50 /1M tokens; Output: $ 3.00 /1M tokens	Kimi K2 optimized for chat, where thinking is dynamic and applied only in necessary and demanding situations. Input: accepts images, audio. Function calling. Reasoning.
`@moonshotai/kimi-k2`	Input: $ 1.00 /1M tokens; Input (cached): $ 0.50 /1M tokens; Output: $ 3.00 /1M tokens	Model with 1T total parameters and 32B activated parameters, optimized for agentic intelligence. Function calling.
`@moonshotai/kimi-k2-thinking`	Input: $ 0.60 /1M tokens; Input (cached): $ 0.15 /1M tokens; Output: $ 2.50 /1M tokens	Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Function calling. Reasoning.

openai

Model name	Pricing	Description
`@openai/gpt-5.2-pro`	Input: $ 21.00 /1M tokens; Output: $ 168.00 /1M tokens	GPT-5.2 Pro is OpenAI's most advanced model, featuring significant upgrades in agentic coding and long-context capabilities compared to GPT-5 Pro. Input: accepts images. Function calling. Reasoning.
`@openai/gpt-5.2-chat`	Input: $ 1.75 /1M tokens; Input (cached): $ 0.18 /1M tokens; Output: $ 14.00 /1M tokens	GPT-5.2 Chat (also known as Instant) is the fast and lightweight version of the 5.2 family, built for low-latency chatting while maintaining strong general intelligence. Input: accepts images. Function calling. Reasoning.
`@openai/gpt-5.2`	Input: $ 1.75 /1M tokens; Input (cached): $ 0.18 /1M tokens; Output: $ 14.00 /1M tokens	GPT-5.2 is the newest frontier-level model in the GPT-5 line, providing enhanced agentic abilities and better long-context performance than GPT-5.1. Input: accepts images. Function calling. Reasoning.
`@openai/gpt-4o`	Input: $ 2.50 /1M tokens; Input (cached): $ 1.25 /1M tokens; Output: $ 10.00 /1M tokens	Dedicated to tasks requiring reasoning for mathematical and logical problem solving. Input: accepts images. Function calling.
`@openai/gpt-5.1`	Input: $ 1.25 /1M tokens; Input (cached): $ 0.13 /1M tokens; Output: $ 10.00 /1M tokens	GPT-5.1 is the newest top-tier model in the GPT-5 series, featuring enhanced general reasoning, better instruction following, and a more natural conversational tone compared to GPT-5. Input: accepts images. Function calling. Reasoning.
`@openai/gpt-5.1-chat`	Input: $ 1.25 /1M tokens; Input (cached): $ 0.13 /1M tokens; Output: $ 10.00 /1M tokens	GPT-5.1 Chat (also known as Instant) is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. Input: accepts images. Function calling. Reasoning.
`@openai/gpt-5.1-codex`	Input: $ 1.25 /1M tokens; Input (cached): $ 0.13 /1M tokens; Output: $ 10.00 /1M tokens	GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. Input: accepts images. Function calling. Reasoning.
`@openai/gpt-5.1-codex-max`	Input: $ 1.25 /1M tokens; Input (cached): $ 0.13 /1M tokens; Output: $ 10.00 /1M tokens	GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. Input: accepts images. Function calling. Reasoning.
`@openai/gpt-5-chat`	Input: $ 1.25 /1M tokens; Input (cached): $ 0.13 /1M tokens; Output: $ 10.00 /1M tokens	GPT-5 snapshot currently used by OpenAI's ChatGPT. Input: accepts images. Function calling.
`@openai/gpt-5-codex`	Input: $ 1.25 /1M tokens; Input (cached): $ 0.13 /1M tokens; Output: $ 10.00 /1M tokens	GPT-5-Codex is a specialized version of GPT-5 tailored for software engineering and coding tasks. Input: accepts images. Function calling. Reasoning.
`@openai/gpt-5`	Input: $ 1.25 /1M tokens; Input (cached): $ 0.13 /1M tokens; Output: $ 10.00 /1M tokens	OpenAI's newest flagship model for coding, reasoning, and agentic tasks across domains. Input: accepts images. Function calling. Reasoning.
`@openai/gpt-4.1`	Input: $ 2.00 /1M tokens; Input (cached): $ 0.50 /1M tokens; Output: $ 8.00 /1M tokens	Versatile, highly intelligent, and top-of-the-line. One of the most capable models currently available. Input: accepts images. Function calling.
`@openai/o3`	Input: $ 2.00 /1M tokens; Input (cached): $ 0.50 /1M tokens; Output: $ 8.00 /1M tokens	A well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. Input: accepts images. Function calling. Reasoning.
`@openai/o4-mini`	Input: $ 1.10 /1M tokens; Input (cached): $ 0.28 /1M tokens; Output: $ 4.40 /1M tokens	Optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks. Input: accepts images. Function calling. Reasoning.
`@openai/o3-mini`	Input: $ 1.10 /1M tokens; Input (cached): $ 0.55 /1M tokens; Output: $ 4.40 /1M tokens	o3-mini provides high intelligence at the same cost and latency targets as previous versions of the o-mini series. Function calling. Reasoning.
`@openai/gpt-5.1-codex-mini`	Input: $ 0.25 /1M tokens; Input (cached): $ 0.03 /1M tokens; Output: $ 2.00 /1M tokens	GPT-5.1-Codex-Mini is a more compact and faster variant of GPT-5.1-Codex. Input: accepts images. Function calling. Reasoning.
`@openai/gpt-5-mini`	Input: $ 0.25 /1M tokens; Input (cached): $ 0.03 /1M tokens; Output: $ 2.00 /1M tokens	GPT-5 mini is a faster, more cost-efficient version of GPT-5. Input: accepts images. Function calling.
`@openai/gpt-4.1-mini`	Input: $ 0.40 /1M tokens; Input (cached): $ 0.10 /1M tokens; Output: $ 1.60 /1M tokens	Fast and cheap for focused tasks. Input: accepts images. Function calling.
`@openai/gpt-oss-120b`	Input: $ 0.15 /1M tokens; Output: $ 0.75 /1M tokens	OpenAI's flagship open source model, built on a Mixture-of-Experts (MoE) architecture with 120 billion parameters and 128 experts. Function calling. Reasoning.
`@openai/gpt-4o-mini`	Input: $ 0.15 /1M tokens; Input (cached): $ 0.08 /1M tokens; Output: $ 0.60 /1M tokens	Smaller version of 4o, optimized for everyday tasks. Input: accepts images. Function calling.
`@openai/gpt-oss-20b`	Input: $ 0.10 /1M tokens; Output: $ 0.50 /1M tokens	OpenAI's flagship open source model, built on a Mixture-of-Experts (MoE) architecture with 20 billion parameters and 128 experts. Function calling. Reasoning.
`@openai/gpt-4.1-nano`	Input: $ 0.10 /1M tokens; Input (cached): $ 0.03 /1M tokens; Output: $ 0.40 /1M tokens	The fastest and cheapest GPT 4.1 model. Input: accepts images. Function calling.
`@openai/gpt-5-nano`	Input: $ 0.05 /1M tokens; Input (cached): $ 0.01 /1M tokens; Output: $ 0.40 /1M tokens	OpenAI's fastest, cheapest version of GPT-5. Input: accepts images. Function calling.

qwen

Model name	Pricing	Description
`@qwen/qwen3-max`	Input: $ 1.20 /1M tokens; Input (cached): $ 0.24 /1M tokens; Output: $ 6.00 /1M tokens	Qwen3-Max improves instruction following, multilingual ability, and tool use, with reduced hallucinations. Function calling. Reasoning.
`@qwen/qwen3-coder-plus`	Input: $ 1.00 /1M tokens; Output: $ 5.00 /1M tokens	Powered by Qwen3, this is a powerful coding agent that excels in tool calling and environment interaction to achieve autonomous programming. Function calling.
`@qwen/qwen3-next-80b-a3b-it`	Input: $ 0.14 /1M tokens; Output: $ 1.40 /1M tokens	An 80B-parameter instruction model with hybrid attention and Mixture-of-Experts, optimized for ultra-long contexts up to 262K tokens. Function calling.
`@qwen/qwen3-next-80b-a3b-think`	Input: $ 0.14 /1M tokens; Output: $ 1.40 /1M tokens	An 80B-parameter “thinking-only” model with hybrid attention and high-sparsity MoE, designed for deep reasoning over ultra-long contexts. Function calling. Reasoning.
`@qwen/qwen3-coder-480b-a35b-it`	Input: $ 0.29 /1M tokens; Output: $ 1.20 /1M tokens	Qwen3-Coder-480B-A35B-Instruct is Qwen3's most agentic code model, with significant performance on agentic coding, agentic browser use, and other foundational coding tasks, achieving results comparable to Claude Sonnet. Function calling.
`@qwen/qwen3-32b`	Input: $ 0.29 /1M tokens; Output: $ 0.59 /1M tokens	32B-parameter LLM with a 131K-token context window, offering advanced chain-of-thought reasoning, seamless tool calling, native JSON outputs, and robust multilingual fluency. Function calling. Reasoning.

venice

Model name	Pricing	Description
`@venice/dphn-24b-uncensored`	Input: $ 0.10 /1M tokens; Output: $ 0.45 /1M tokens	Venice Uncensored is a fine-tuned version of Mistral-Small-24B-Instruct-2501, created by dphn.ai in partnership with Venice.ai.

x-ai

Model name	Pricing	Description
`@x-ai/grok-4`	Input: $ 3.00 /1M tokens; Output: $ 15.00 /1M tokens	xAI's latest and greatest flagship model, offering unparalleled performance in natural language, math, and reasoning: the perfect jack of all trades. Input: accepts images. Function calling. Reasoning.
`@x-ai/grok-3`	Input: $ 3.00 /1M tokens; Output: $ 15.00 /1M tokens	xAI's flagship model that excels at enterprise use cases like data extraction, coding, and text summarization. Possesses deep domain knowledge in finance, healthcare, law, and science. Input: accepts images. Function calling.
`@x-ai/grok-code-fast`	Input: $ 0.20 /1M tokens; Input (cached): $ 0.02 /1M tokens; Output: $ 1.50 /1M tokens	Grok Code Fast 1 is a speedy and economical reasoning model that excels at agentic coding. Function calling. Reasoning.
`@x-ai/grok-3-mini`	Input: $ 0.30 /1M tokens; Output: $ 0.50 /1M tokens	xAI's lightweight model that thinks before responding. Great for simple or logic-based tasks that do not require deep domain knowledge. The raw thinking traces are accessible. Input: accepts images. Function calling. Reasoning.
`@x-ai/grok-4.1-fast-reasoning`	Input: $ 0.20 /1M tokens; Input (cached): $ 0.05 /1M tokens; Output: $ 0.50 /1M tokens	Grok 4.1 Fast Reasoning is xAI's most capable tool-calling model, engineered for production-grade agentic applications with a 2M token context window. Input: accepts images. Function calling. Reasoning.
`@x-ai/grok-4.1-fast`	Input: $ 0.20 /1M tokens; Input (cached): $ 0.05 /1M tokens; Output: $ 0.50 /1M tokens	Grok 4.1 Fast Non-Reasoning is xAI's high-speed variant optimized for instant responses and straightforward queries, featuring a 2M token context window. Input: accepts images. Function calling.
`@x-ai/grok-4-fast-reasoning`	Input: $ 0.20 /1M tokens; Input (cached): $ 0.05 /1M tokens; Output: $ 0.50 /1M tokens	Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token context window. Input: accepts images. Function calling. Reasoning.
`@x-ai/grok-4-fast`	Input: $ 0.20 /1M tokens; Input (cached): $ 0.05 /1M tokens; Output: $ 0.50 /1M tokens	Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token context window. Input: accepts images. Function calling.

xiaomi

Model name	Pricing	Description
`@xiaomi/mimo-v2-flash`	Input: $ 0.10 /1M tokens; Output: $ 0.30 /1M tokens	MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. Function calling. Reasoning.

z-ai

Model name	Pricing	Description
`@z-ai/glm-4.7`	Input: $ 0.60 /1M tokens; Input (cached): $ 0.11 /1M tokens; Output: $ 2.20 /1M tokens	GLM-4.7 is Z.AI’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. Function calling. Reasoning.
`@z-ai/glm-4.6`	Input: $ 0.60 /1M tokens; Output: $ 2.00 /1M tokens	GLM-4.6 is a high-capacity LLM with a 200K-token context window, strong coding and reasoning abilities, and enhanced tool-use capabilities. Function calling. Reasoning.
