Overview

LiteLLM is a Python SDK and proxy server that gives you a single OpenAI-compatible interface to 100+ LLM providers. It supports both Perplexity’s Sonar models and the Agent API (which routes third-party models like GPT-5, Claude, and Gemini through Perplexity) as first-class integrations.
LiteLLM lets you swap providers without rewriting code, run a self-hosted proxy that fronts every model behind one API key, and track spend, latency, and errors per provider. Learn more at litellm.ai.

Installation

pip install litellm

API Key Setup

LiteLLM uses two environment variables depending on which Perplexity endpoint you’re calling:
# For Sonar chat completions (litellm.completion)
export PERPLEXITYAI_API_KEY="your_api_key_here"

# For Agent API responses (litellm.responses)
export PERPLEXITY_API_KEY="your_api_key_here"
In practice, set both to the same key.
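Since both variables usually hold the same key, one way to set them together from Python is sketched below. The helper name `set_perplexity_keys` is hypothetical and not part of LiteLLM; it only writes the two environment variables named above.

```python
import os

# The two environment variables this page names for Perplexity.
PERPLEXITY_KEY_VARS = ("PERPLEXITYAI_API_KEY", "PERPLEXITY_API_KEY")


def set_perplexity_keys(api_key: str) -> None:
    """Hypothetical helper: mirror one key into both environment variables."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    for name in PERPLEXITY_KEY_VARS:
        os.environ[name] = api_key


set_perplexity_keys("your_api_key_here")
```

Call it once at startup, before the first `completion` or `responses` call.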

Get API Key

Generate your Perplexity API key from the API portal.

Sonar Chat Completions

Call Perplexity’s Sonar models through litellm.completion with the perplexity/ model prefix:
from litellm import completion
import os

os.environ["PERPLEXITYAI_API_KEY"] = "your_api_key_here"

response = completion(
    model="perplexity/sonar-pro",
    messages=[
        {"role": "user", "content": "What are the latest fusion breakthroughs?"}
    ],
)

print(response.choices[0].message.content)

Streaming

from litellm import completion

response = completion(
    model="perplexity/sonar-pro",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
    stream=True,
)

for chunk in response:
    print(chunk)

Reasoning Effort

For reasoning-capable Sonar models, pass reasoning_effort to control depth:
from litellm import completion

response = completion(
    model="perplexity/sonar-reasoning",
    messages=[{"role": "user", "content": "Walk through your reasoning."}],
    reasoning_effort="high",  # "low" | "medium" | "high"
)

Supported Sonar Models

| Model | LiteLLM Identifier |
| --- | --- |
| sonar | perplexity/sonar |
| sonar-pro | perplexity/sonar-pro |
| sonar-reasoning | perplexity/sonar-reasoning |
| sonar-reasoning-pro | perplexity/sonar-reasoning-pro |
| sonar-deep-research | perplexity/sonar-deep-research |
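
Every identifier in the table is the Sonar model name with a `perplexity/` prefix. A small sketch of that rule, with a check against the names listed above, might look like this. `sonar_model_id` is a hypothetical helper, not a LiteLLM function.

```python
# Sonar model names taken from the table above.
SONAR_MODELS = frozenset({
    "sonar",
    "sonar-pro",
    "sonar-reasoning",
    "sonar-reasoning-pro",
    "sonar-deep-research",
})


def sonar_model_id(name: str) -> str:
    """Hypothetical helper: map a Sonar model name to its LiteLLM identifier."""
    if name not in SONAR_MODELS:
        raise ValueError(f"unknown Sonar model: {name!r}")
    return f"perplexity/{name}"


print(sonar_model_id("sonar-pro"))
```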

Agent API

Use litellm.responses to call the Agent API, which routes requests through Perplexity to third-party models and adds tool orchestration and presets.

Presets

from litellm import responses
import os

os.environ["PERPLEXITY_API_KEY"] = "your_api_key_here"

response = responses(
    model="perplexity/preset/pro-search",
    input="What are the latest developments in AI?",
    custom_llm_provider="perplexity",
)

print(response.output)
Available presets: fast-search, pro-search, deep-research, advanced-deep-research.
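
If the preset is chosen at runtime, it can help to validate it before making the call. The sketch below builds the keyword arguments used in the example above; `preset_request` is a hypothetical helper, and the preset names are copied from the list on this page.

```python
# Preset names as listed on this page.
PRESETS = ("fast-search", "pro-search", "deep-research", "advanced-deep-research")


def preset_request(preset: str, prompt: str) -> dict:
    """Hypothetical helper: build keyword arguments for litellm.responses."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset {preset!r}; expected one of {PRESETS}")
    return {
        "model": f"perplexity/preset/{preset}",
        "input": prompt,
        "custom_llm_provider": "perplexity",
    }


kwargs = preset_request("fast-search", "What are the latest developments in AI?")
print(kwargs["model"])
```

The resulting dictionary can then be unpacked into the call, as in `responses(**kwargs)`.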

Tool Use (web_search and fetch_url)

from litellm import responses

response = responses(
    model="perplexity/openai/gpt-5.2",
    input="Research quantum computing breakthroughs and cite sources.",
    custom_llm_provider="perplexity",
    tools=[
        {"type": "web_search"},
        {"type": "fetch_url"},
    ],
    instructions="Use web_search and fetch_url to gather citations.",
    max_output_tokens=1000,
    temperature=0.7,
)

print(response.output)

Structured Outputs

from litellm import responses

response = responses(
    model="perplexity/preset/pro-search",
    input="Extract key facts about the Eiffel Tower.",
    custom_llm_provider="perplexity",
    text={
        "format": {
            "type": "json_schema",
            "name": "facts",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "height_meters": {"type": "number"},
                    "year_built": {"type": "integer"},
                },
                "required": ["name", "height_meters", "year_built"],
            },
            "strict": True,
        }
    },
)

print(response.output)
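
The example above requests JSON matching the schema. A minimal sketch of parsing and checking such a result follows. It assumes you have already extracted the model's JSON text from the response into a string, which is not shown here because it depends on the response object. `parse_facts` is a hypothetical helper and the sample payload is illustrative, not real API output.

```python
import json

# Keys marked "required" in the schema above.
REQUIRED_KEYS = ("name", "height_meters", "year_built")


def parse_facts(json_text: str) -> dict:
    """Hypothetical helper: parse JSON text and check the required keys."""
    data = json.loads(json_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return data


# Illustrative payload only, not real API output.
sample = '{"name": "Eiffel Tower", "height_meters": 330, "year_built": 1889}'
facts = parse_facts(sample)
print(facts["name"], facts["year_built"])
```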

Supported Third-Party Models via Agent API

| Provider | Models |
| --- | --- |
| OpenAI | perplexity/openai/gpt-5.5, perplexity/openai/gpt-5.4, perplexity/openai/gpt-5.4-mini, perplexity/openai/gpt-5.2, perplexity/openai/gpt-5.1, perplexity/openai/gpt-5-mini |
| Anthropic | perplexity/anthropic/claude-opus-4-7, perplexity/anthropic/claude-opus-4-6, perplexity/anthropic/claude-sonnet-4-6, perplexity/anthropic/claude-opus-4-5, perplexity/anthropic/claude-sonnet-4-5, perplexity/anthropic/claude-haiku-4-5 |
| Google | perplexity/google/gemini-3.1-pro-preview, perplexity/google/gemini-3-flash-preview, perplexity/google/gemini-3.1-flash-lite |
| xAI | perplexity/xai/grok-4.20-non-reasoning |
| Perplexity | perplexity/perplexity/sonar |

See the Agent API model list for the canonical, up-to-date catalogue.
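
Agent API identifiers on this page follow a three-part shape, `perplexity/<provider>/<model>`. A small sketch for splitting one into its parts, for example to log spend per upstream provider, is shown below. `AgentRoute` and `parse_agent_model` are hypothetical names, not part of LiteLLM.

```python
from typing import NamedTuple


class AgentRoute(NamedTuple):
    provider: str
    model: str


def parse_agent_model(identifier: str) -> AgentRoute:
    """Hypothetical helper: split 'perplexity/<provider>/<model>' into its parts."""
    parts = identifier.split("/", 2)
    if len(parts) != 3 or parts[0] != "perplexity" or not parts[1] or not parts[2]:
        raise ValueError(f"not an Agent API identifier: {identifier!r}")
    return AgentRoute(provider=parts[1], model=parts[2])


route = parse_agent_model("perplexity/anthropic/claude-sonnet-4-5")
print(route.provider, route.model)
```

Two-part Sonar chat identifiers such as `perplexity/sonar-pro` are rejected by this check, while preset identifiers such as `perplexity/preset/pro-search` parse with `preset` in the provider slot.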

LiteLLM Proxy

Run LiteLLM as a self-hosted proxy that fronts Perplexity (and any other provider) behind a single OpenAI-compatible endpoint.

config.yaml

model_list:
  - model_name: perplexity-sonar-reasoning
    litellm_params:
      model: perplexity/sonar-reasoning
      api_key: os.environ/PERPLEXITYAI_API_KEY

  - model_name: perplexity-pro-search
    litellm_params:
      model: perplexity/preset/pro-search
      api_key: os.environ/PERPLEXITY_API_KEY

Start the Proxy

litellm --config /path/to/config.yaml

Call the Proxy

curl http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer anything" \
  -d '{
    "model": "perplexity-sonar-reasoning",
    "messages": [{"role": "user", "content": "Who won the World Cup in 2022?"}],
    "reasoning_effort": "high"
  }'
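
The same request can be made from Python. Below is a minimal standard-library sketch, assuming the proxy is listening on port 4000 as in the curl example. `build_proxy_request` is a hypothetical helper; it only constructs the request, which is sent when you pass it to `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_proxy_request(model, prompt, base_url="http://0.0.0.0:4000",
                        api_key="anything", **extra):
    """Hypothetical helper: build a chat completions request for the proxy."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        **extra,  # e.g. reasoning_effort="high"
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_proxy_request(
    "perplexity-sonar-reasoning",
    "Who won the World Cup in 2022?",
    reasoning_effort="high",
)

# Sending requires a running proxy, so it is left commented out here:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```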

LiteLLM Perplexity Docs

Official LiteLLM Perplexity provider docs.

LiteLLM Docs

Full LiteLLM documentation.

Perplexity Agent API

Agent API reference and presets.

Perplexity Models

Available Sonar and Agent API models.
