Models

Available Models

The Agent API provides direct access to models from multiple first-party providers, with transparent token-based pricing. Rates are updated monthly and match each provider's own pricing, with no markup. Charges are based on actual token consumption, and every API response includes exact token counts, so you can see what each request costs.

Not every third-party model supports every feature (e.g., reasoning, tools). Check the provider documentation linked in the table for each model's capabilities.

Model	Input Price	Output Price	Cache Read Price	Provider Documentation
Perplexity Models
`perplexity/sonar`	$0.25 / 1M tokens	$2.50 / 1M tokens	$0.0625 / 1M tokens	Sonar
Anthropic Models
`anthropic/claude-opus-4-6`	$5 / 1M tokens	$25 / 1M tokens	$0.50 / 1M tokens	Claude Opus 4.6
`anthropic/claude-opus-4-5`	$5 / 1M tokens	$25 / 1M tokens	$0.50 / 1M tokens	Claude Opus 4.5
`anthropic/claude-sonnet-4-6`	$3 / 1M tokens	$15 / 1M tokens	$0.30 / 1M tokens	Claude Sonnet 4.6
`anthropic/claude-sonnet-4-5`	$3 / 1M tokens	$15 / 1M tokens	$0.30 / 1M tokens	Claude Sonnet 4.5
`anthropic/claude-haiku-4-5`	$1 / 1M tokens	$5 / 1M tokens	$0.10 / 1M tokens	Claude Haiku 4.5
OpenAI Models
`openai/gpt-5.4`	$2.50 / 1M tokens	$15.00 / 1M tokens	$0.25 / 1M tokens	GPT-5.4
`openai/gpt-5.2`	$1.75 / 1M tokens	$14 / 1M tokens	$0.175 / 1M tokens	GPT-5.2
`openai/gpt-5.1`	$1.25 / 1M tokens	$10 / 1M tokens	$0.125 / 1M tokens	GPT-5.1
`openai/gpt-5-mini`	$0.25 / 1M tokens	$2 / 1M tokens	$0.025 / 1M tokens	GPT-5 Mini
Google Models
`google/gemini-3.1-pro-preview`	$2.00 / 1M tokens (≤200k context); $4.00 / 1M tokens (>200k context)	$12.00 / 1M tokens (≤200k context); $18.00 / 1M tokens (>200k context)	90% discount	Gemini 3.1 Pro
`google/gemini-3-flash-preview`	$0.50 / 1M tokens	$3.00 / 1M tokens	90% discount	Gemini 3.0 Flash
`google/gemini-2.5-pro`	$1.25 / 1M tokens (≤200k context); $2.50 / 1M tokens (>200k context)	$10.00 / 1M tokens (≤200k context); $15.00 / 1M tokens (>200k context)	90% discount	Gemini 2.5 Pro
`google/gemini-2.5-flash`	$0.30 / 1M tokens	$2.50 / 1M tokens	90% discount	Gemini 2.5 Flash
NVIDIA Models
`nvidia/nemotron-3-super-120b-a12b`	$0.25 / 1M tokens	$2.50 / 1M tokens	—	Nemotron 3 Super 120B
xAI Models
`xai/grok-4-1-fast-non-reasoning`	$0.20 / 1M tokens	$0.50 / 1M tokens	$0.05 / 1M tokens	Grok 4.1
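
The tiered Gemini rows and the "90% discount" cache entry can be turned into concrete per-token rates with a small helper. The sketch below makes two assumptions that the table does not state: that the tier is selected by the request's input token count, and that the discount is taken off the input rate of the selected tier. `gemini_rates` is a hypothetical helper name, not part of any SDK; check the provider documentation before relying on either assumption.

```python
from decimal import Decimal

# Rates for google/gemini-3.1-pro-preview, copied from the table (USD per 1M tokens).
TIER_LIMIT = 200_000  # context size above which the higher tier applies
RATES = {
    "low": {"input": Decimal("2.00"), "output": Decimal("12.00")},   # <=200k context
    "high": {"input": Decimal("4.00"), "output": Decimal("18.00")},  # >200k context
}
CACHE_DISCOUNT = Decimal("0.90")  # "90% discount" in the Cache Read Price column


def gemini_rates(input_tokens: int) -> dict:
    """Return per-1M-token rates for one request (hypothetical helper).

    Assumption: the tier is selected by the input token count, and the
    cache read rate is the tier's input rate reduced by the 90% discount.
    """
    tier = "low" if input_tokens <= TIER_LIMIT else "high"
    rates = dict(RATES[tier])
    rates["cache_read"] = rates["input"] * (1 - CACHE_DISCOUNT)
    return rates


print(gemini_rates(150_000))
print(gemini_rates(250_000))
```

Under these assumptions, a 150,000-token prompt is billed at the lower tier with a cache read rate of $0.20 / 1M tokens, and a 250,000-token prompt at the higher tier with $0.40 / 1M tokens.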

See your costs in real time: every response includes a usage field with exact counts of input tokens, output tokens, and cache read tokens. Multiply each count by the matching rate in the pricing table above to get the cost of the request.

{
  "usage": {
    "input_tokens": 150,
    "output_tokens": 320,
    "total_tokens": 470
  }
}
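
As a worked example, the usage payload above can be priced against the `perplexity/sonar` row of the table. This is a minimal sketch: `request_cost` is a hypothetical helper, and the `cache_read_tokens` key is an assumed field name, since the sample payload does not show how cache reads are reported.

```python
from decimal import Decimal

# USD per 1M tokens for perplexity/sonar, copied from the pricing table.
SONAR_RATES = {
    "input": Decimal("0.25"),
    "output": Decimal("2.50"),
    "cache_read": Decimal("0.0625"),
}
PER_MILLION = Decimal(1_000_000)


def request_cost(usage: dict, rates: dict) -> Decimal:
    """Estimate the USD cost of one request from its usage field (hypothetical helper)."""
    cost = usage["input_tokens"] * rates["input"] / PER_MILLION
    cost += usage["output_tokens"] * rates["output"] / PER_MILLION
    # Assumption: cache reads, when present, are reported under this key
    # and billed separately from input_tokens.
    cost += usage.get("cache_read_tokens", 0) * rates["cache_read"] / PER_MILLION
    return cost


usage = {"input_tokens": 150, "output_tokens": 320, "total_tokens": 470}
print(request_cost(usage, SONAR_RATES))
```

For this payload the input side contributes 150 × $0.25 / 1M = $0.0000375 and the output side 320 × $2.50 / 1M = $0.0008, for a total of $0.0008375. `Decimal` is used instead of `float` so that small per-request amounts add up without rounding drift.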

Using a Model

from perplexity import Perplexity

client = Perplexity()

# Select a model by its provider-prefixed ID from the table above.
response = client.responses.create(
    model="nvidia/nemotron-3-super-120b-a12b",
    input="Explain the difference between supervised and unsupervised learning in machine learning.",
    max_output_tokens=300,  # limit the length of the generated output
)

print(f"Response ID: {response.id}")
print(response.output_text)

Configuration Options

The Agent API supports two ways to configure models:

Presets — Pre-configured model setups optimized for specific use cases
Models — Direct model selection, including third-party models

Model Fallback

For high-availability applications, you can specify multiple models in a fallback chain. When one model fails or is unavailable, the API automatically tries the next model in the chain.

Example:

response = client.responses.create(
    models=["nvidia/nemotron-3-super-120b-a12b", "openai/gpt-5.4", "google/gemini-3-flash-preview"],
    input="Your question here"
)
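
The "try the next model in the chain" behavior can be pictured with a small client-side sketch. This only illustrates the ordering semantics described above; it is not how the API implements fallback, and `first_successful` and `fake_call` are hypothetical names that stand in for real requests.

```python
from typing import Callable, Sequence


def first_successful(models: Sequence[str], call: Callable[[str], str]) -> tuple[str, str]:
    """Try each model in order and return (model, result) for the first that succeeds."""
    errors = []
    for model in models:
        try:
            return model, call(model)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_call(model: str) -> str:
    """Stand-in for a real request: the first model in the chain is 'unavailable'."""
    if model == "nvidia/nemotron-3-super-120b-a12b":
        raise ConnectionError("model unavailable")
    return f"answer from {model}"


chain = ["nvidia/nemotron-3-super-120b-a12b", "openai/gpt-5.4", "google/gemini-3-flash-preview"]
print(first_successful(chain, fake_call))
```

Here the first model raises an error, so the result comes from `openai/gpt-5.4`, and the third model is never tried. Because the answering model can differ from the first one requested, the applicable row of the pricing table can differ too.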

For detailed examples, pricing information, and best practices, see the Model Fallback documentation.

Next Steps

Model Fallback

Learn how to use model fallback chains for higher availability.

Presets

Explore available presets and their configurations.

Agent API Quickstart

Get started with your first Agent API call.

API Reference

View complete endpoint documentation.
