# Models

> Explore the models available for the Agent API, including Perplexity presets and third-party models.

## Available Models

The Agent API provides access to models from multiple providers. Each model is served directly by its first-party provider, with transparent token-based pricing.

Pricing rates are updated monthly and **reflect direct first-party provider pricing with no markup**. All charges are based on actual token consumption, and every API response includes exact token counts so you know your costs per request.

<Tip>
  Looking for pre-configured model setups? See [**Presets**](/docs/agent-api/presets) — optimized for specific use cases.
</Tip>

<Tabs>
  <Tab title="Perplexity">
    <Card title="Perplexity">
      Sonar — Perplexity's grounded search model.
    </Card>

    | Model              | Input (\$/1M) | Output (\$/1M) | Cache (\$/1M) | Docs                                                        |
    | ------------------ | ------------- | -------------- | ------------- | ----------------------------------------------------------- |
    | `perplexity/sonar` | 0.25          | 2.50           | 0.0625        | [Sonar](https://docs.perplexity.ai/docs/sonar/models/sonar) |
  </Tab>

  <Tab title="Anthropic">
    <Card title="Anthropic">
      Claude Opus (highest reasoning), Sonnet (balanced), and Haiku (fastest, cheapest).
    </Card>

    | Model                         | Input (\$/1M) | Output (\$/1M) | Cache (\$/1M) | Docs                                                                  |
    | ----------------------------- | ------------- | -------------- | ------------- | --------------------------------------------------------------------- |
    | `anthropic/claude-opus-4-7`   | 5             | 25             | 0.50          | [Claude Opus 4.7](https://www.anthropic.com/news/claude-opus-4-7)     |
    | `anthropic/claude-opus-4-6`   | 5             | 25             | 0.50          | [Claude Opus 4.6](https://www.anthropic.com/news/claude-opus-4-6)     |
    | `anthropic/claude-opus-4-5`   | 5             | 25             | 0.50          | [Claude Opus 4.5](https://www.anthropic.com/news/claude-opus-4-5)     |
    | `anthropic/claude-sonnet-4-6` | 3             | 15             | 0.30          | [Claude Sonnet 4.6](https://www.anthropic.com/news/claude-sonnet-4-6) |
    | `anthropic/claude-sonnet-4-5` | 3             | 15             | 0.30          | [Claude Sonnet 4.5](https://www.anthropic.com/news/claude-sonnet-4-5) |
    | `anthropic/claude-haiku-4-5`  | 1             | 5              | 0.10          | [Claude Haiku 4.5](https://www.anthropic.com/news/claude-haiku-4-5)   |
  </Tab>

  <Tab title="OpenAI">
    <Card title="OpenAI">
      GPT-5 family — flagship, mini, and nano variants.
    </Card>

    | Model                 | Input (\$/1M) | Output (\$/1M) | Cache (\$/1M) | Docs                                                                 |
    | --------------------- | ------------- | -------------- | ------------- | -------------------------------------------------------------------- |
    | `openai/gpt-5.5`      | 5.00          | 30.00          | 0.50          | [GPT-5.5](https://developers.openai.com/api/docs/models/gpt-5.5)     |
    | `openai/gpt-5.4`      | 2.50          | 15.00          | 0.25          | [GPT-5.4](https://platform.openai.com/docs/models/gpt-5.4)           |
    | `openai/gpt-5.4-mini` | 0.75          | 4.50           | 0             | [GPT-5.4 Mini](https://platform.openai.com/docs/models/gpt-5.4-mini) |
    | `openai/gpt-5.4-nano` | 0.20          | 1.25           | 0             | [GPT-5.4 Nano](https://platform.openai.com/docs/models/gpt-5.4-nano) |
    | `openai/gpt-5.2`      | 1.75          | 14             | 0.175         | [GPT-5.2](https://platform.openai.com/docs/models/gpt-5.2)           |
    | `openai/gpt-5.1`      | 1.25          | 10             | 0.125         | [GPT-5.1](https://platform.openai.com/docs/models/gpt-5.1)           |
    | `openai/gpt-5`        | 1.25          | 10             | 0.125         | [GPT-5](https://platform.openai.com/docs/models/gpt-5)               |
    | `openai/gpt-5-mini`   | 0.25          | 2              | 0.025         | [GPT-5 Mini](https://platform.openai.com/docs/models/gpt-5-mini)     |
  </Tab>

  <Tab title="Google">
    <Card title="Google">
      Gemini 3 family — Pro for long-context, Flash and Flash Lite for speed.
    </Card>

    | Model                                  | Input (\$/1M)                  | Output (\$/1M)                   | Cache (\$/1M) | Docs                                                                                                        |
    | -------------------------------------- | ------------------------------ | -------------------------------- | ------------- | ----------------------------------------------------------------------------------------------------------- |
    | `google/gemini-3.1-pro-preview`        | 2.00 (≤200k)<br />4.00 (>200k) | 12.00 (≤200k)<br />18.00 (>200k) | 90% off input | [Gemini 3.1 Pro](https://ai.google.dev/gemini-api/docs/models#gemini-3.1-pro-preview)                       |
    | `google/gemini-3.1-flash-lite`         | 0.25                           | 1.50                             | 90% off input | [Gemini 3.1 Flash Lite](https://ai.google.dev/gemini-api/docs/models/gemini-3.1-flash-lite)                 |
    | `google/gemini-3.1-flash-lite-preview` | 0.25                           | 1.50                             | 90% off input | [Gemini 3.1 Flash Lite Preview](https://ai.google.dev/gemini-api/docs/models#gemini-3.1-flash-lite-preview) |
    | `google/gemini-3.5-flash`              | 1.50                           | 9.00                             | 0.15          | [Gemini 3.5 Flash](https://ai.google.dev/gemini-api/docs/models/gemini-3.5-flash)                           |
    | `google/gemini-3-flash-preview`        | 0.50                           | 3.00                             | 90% off input | [Gemini 3.0 Flash](https://ai.google.dev/gemini-api/docs/models#gemini-3-flash-preview)                     |
  </Tab>

  <Tab title="xAI">
    <Card title="xAI">
      Grok 4.3 and 4.20 variants — reasoning, non-reasoning, and multi-agent.
    </Card>

    | Model                         | Input (\$/1M) | Output (\$/1M) | Cache (\$/1M) | Docs                                                           |
    | ----------------------------- | ------------- | -------------- | ------------- | -------------------------------------------------------------- |
    | `xai/grok-4.3`                | 1.25          | 2.50           | 0.20          | [Grok 4.3](https://docs.x.ai/developers/models)                |
    | `xai/grok-4.20-reasoning`     | 1.25          | 2.50           | 0.20          | [Grok 4.20 Reasoning](https://docs.x.ai/developers/models)     |
    | `xai/grok-4.20-non-reasoning` | 1.25          | 2.50           | 0.20          | [Grok 4.20 Non Reasoning](https://docs.x.ai/developers/models) |
    | `xai/grok-4.20-multi-agent`   | 1.25          | 2.50           | 0.20          | [Grok 4.20 Multi-Agent](https://docs.x.ai/developers/models)   |
  </Tab>

  <Tab title="NVIDIA">
    <Card title="NVIDIA">
      Nemotron 3 Super — NVIDIA's open-weight reasoning model.
    </Card>

    | Model                               | Input (\$/1M) | Output (\$/1M) | Cache (\$/1M) | Docs                                                                           |
    | ----------------------------------- | ------------- | -------------- | ------------- | ------------------------------------------------------------------------------ |
    | `nvidia/nemotron-3-super-120b-a12b` | 0.25          | 2.50           | —             | [Nemotron 3 Super 120B](https://research.nvidia.com/labs/nemotron/Nemotron-3/) |
  </Tab>
</Tabs>
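
The Google table prices `google/gemini-3.1-pro-preview` in two tiers and lists cache reads as "90% off input". The sketch below shows one way to turn those entries into a dollar estimate. It assumes the tier is selected by the request's input token count and that cached tokens are billed at 10% of that tier's input rate. Neither rule is spelled out on this page, so check them against the provider's pricing documentation. The helper name and its parameters are hypothetical.

```python
# Rates copied from the Google table above, in USD per 1M tokens.
TIER_LIMIT = 200_000
RATES = {
    "low": {"input": 2.00, "output": 12.00},   # input of 200k tokens or fewer
    "high": {"input": 4.00, "output": 18.00},  # input above 200k tokens
}
CACHE_DISCOUNT = 0.90  # "90% off input"


def estimate_gemini_pro_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate request cost in USD (hypothetical helper).

    Assumptions: the tier depends on input_tokens, and cached_tokens is
    the cached portion of input_tokens.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    tier = RATES["low"] if input_tokens <= TIER_LIMIT else RATES["high"]
    fresh_tokens = input_tokens - cached_tokens
    cached_rate = tier["input"] * (1 - CACHE_DISCOUNT)
    total = (
        fresh_tokens * tier["input"]
        + cached_tokens * cached_rate
        + output_tokens * tier["output"]
    )
    return total / 1_000_000
```

Under these assumptions, a request with 100,000 input tokens and 1,000 output tokens comes to about \$0.212, and crossing the 200k threshold raises both the input and the output rate.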

<Warning>
  Not all third-party models support all features (e.g., reasoning, tools). Check model documentation for specific capabilities.
</Warning>

## Using a Model

<CodeGroup>
  ```python Python theme={null}
  from perplexity import Perplexity

  client = Perplexity()

  response = client.responses.create(
      model="openai/gpt-5.5",
      input="Explain the difference between supervised and unsupervised learning in machine learning.",
      max_output_tokens=300,
  )

  print(f"Response ID: {response.id}")
  print(response.output_text)
  ```

  ```typescript TypeScript theme={null}
  import Perplexity from '@perplexity-ai/perplexity_ai';

  const client = new Perplexity();

  const response = await client.responses.create({
      model: "openai/gpt-5.5",
      input: "Explain the difference between supervised and unsupervised learning in machine learning.",
      max_output_tokens: 300,
  });

  console.log(`Response ID: ${response.id}`);
  console.log(response.output_text);
  ```

  ```bash cURL theme={null}
  curl https://api.perplexity.ai/v1/agent \
    -H "Authorization: Bearer $PERPLEXITY_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai/gpt-5.5",
      "input": "Explain the difference between supervised and unsupervised learning in machine learning.",
      "max_output_tokens": 300
    }' | jq
  ```
</CodeGroup>

<Tip>
  **See Your Costs in Real-Time:** Every response includes a `usage` field with exact token counts: input tokens, output tokens, and cache read tokens. To get the cost of a request, multiply each count by the matching per-1M rate in the pricing tables above.

  ```json theme={null}
  {
    "usage": {
      "input_tokens": 150,
      "output_tokens": 320,
      "total_tokens": 470
    }
  }
  ```
</Tip>
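
As a worked example of that calculation, the sketch below prices the `usage` object shown above at the `openai/gpt-5.5` rates from the OpenAI table. The `request_cost` helper is hypothetical, not part of the SDK, and it ignores cache read tokens because the example response does not show that field.

```python
# USD per 1M tokens for openai/gpt-5.5, copied from the OpenAI table above.
INPUT_RATE = 5.00
OUTPUT_RATE = 30.00


def request_cost(usage: dict, input_rate: float, output_rate: float) -> float:
    """Return the cost in USD of one request (hypothetical helper)."""
    input_cost = usage["input_tokens"] * input_rate / 1_000_000
    output_cost = usage["output_tokens"] * output_rate / 1_000_000
    return input_cost + output_cost


# The usage object from the example response above.
usage = {"input_tokens": 150, "output_tokens": 320, "total_tokens": 470}
cost = request_cost(usage, INPUT_RATE, OUTPUT_RATE)
print(f"${cost:.5f}")  # $0.01035
```

Output tokens dominate here: 320 output tokens cost \$0.0096, while 150 input tokens cost \$0.00075.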

## Model Fallback

For high-availability applications, you can specify multiple models in a fallback chain. When one model fails or is unavailable, the API automatically tries the next model in the chain.

<Card title="Model Fallback Chain" icon="square-rounded-arrow-down" href="/docs/agent-api/model-fallback">
  Learn how to use model fallback chains to ensure high availability and reliability by automatically trying multiple models when one fails.
</Card>

<Info>
  **Example:**

  ```python theme={null}
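  from perplexity import Perplexity

  client = Perplexity()

  # Models are tried in order; the next one is used if the previous fails or is unavailable.
  response = client.responses.create(
      models=["openai/gpt-5.5", "anthropic/claude-sonnet-4-6", "google/gemini-3-flash-preview"],
      input="Your question here",
  )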
  ```

  For detailed examples, pricing information, and best practices, see the [Model Fallback documentation](/docs/agent-api/model-fallback).
</Info>
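
To make the ordering concrete, here is a minimal client-side sketch of the same idea: try each model in turn and stop at the first success. It only illustrates the behavior described above and is not how the API implements fallback internally. `first_successful` and `fake_call` are hypothetical names, and `fake_call` stands in for a real request so the example runs without network access.

```python
from typing import Callable, Sequence


def first_successful(models: Sequence[str], call: Callable[[str], str]) -> tuple[str, str]:
    """Try each model in order and return (model, result) for the first success.

    Hypothetical helper: `call` stands in for whatever sends the request.
    """
    errors = []
    for model in models:
        try:
            return model, call(model)
        except Exception as exc:  # a failed or unavailable model
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_call(model: str) -> str:
    # Stand-in for a real request: pretend the first model is unavailable.
    if model == "openai/gpt-5.5":
        raise ConnectionError("model unavailable")
    return f"answer from {model}"


chain = ["openai/gpt-5.5", "anthropic/claude-sonnet-4-6", "google/gemini-3-flash-preview"]
model, result = first_successful(chain, fake_call)
print(model)  # anthropic/claude-sonnet-4-6
```

Because the chain is ordered, put the preferred model first and cheaper or more widely available models after it.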

## Next Steps

<CardGroup cols={2}>
  <Card title="Web Search" icon="screwdriver-wrench" href="/docs/agent-api/tools/web-search">
    Equip your model with web search for source-grounded context.
  </Card>

  <Card title="Prompt Guide" icon="lightbulb" href="/docs/agent-api/prompt-guide">
    Write prompts that get the most out of the Agent API.
  </Card>

  <Card title="Output Control" icon="wand-magic-sparkles" href="/docs/agent-api/output-control">
    Shape responses with structured outputs and JSON schemas.
  </Card>

  <Card title="Finance Search" icon="chart-line" href="/docs/agent-api/tools/finance-search">
    Query market data, filings, and ticker-level information.
  </Card>
</CardGroup>
