Perplexity with LiteLLM

Overview

LiteLLM is a Python SDK and proxy server that gives you a single OpenAI-compatible interface to 100+ LLM providers. It supports both Perplexity’s Sonar models and the Agent API, which routes third-party models such as GPT-5, Claude, and Gemini through Perplexity, as first-class integrations.

LiteLLM lets you swap providers without rewriting code, run a self-hosted proxy that fronts every model behind one API key, and track spend, latency, and errors per provider. Learn more at litellm.ai.

Installation

pip install litellm

API Key Setup

LiteLLM uses two environment variables depending on which Perplexity endpoint you’re calling:

# For Sonar chat completions (litellm.completion)
export PERPLEXITYAI_API_KEY="your_api_key_here"

# For Agent API responses (litellm.responses)
export PERPLEXITY_API_KEY="your_api_key_here"

In practice, set both to the same key.
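
If you prefer to configure the key from Python, the following is a minimal sketch that writes one key to both variable names. The helper name set_perplexity_keys is hypothetical and not part of LiteLLM; only the two environment variable names come from this page.

```python
import os


def set_perplexity_keys(api_key: str) -> None:
    """Expose one Perplexity key under both variable names LiteLLM reads.

    Hypothetical helper for illustration; not part of LiteLLM.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    # Read by litellm.completion for Sonar chat completions
    os.environ["PERPLEXITYAI_API_KEY"] = api_key
    # Read by litellm.responses for the Agent API
    os.environ["PERPLEXITY_API_KEY"] = api_key


set_perplexity_keys("your_api_key_here")
```

Call it once at startup, before the first completion or responses call.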

Get API Key

Generate your Perplexity API key from the API portal.

Sonar Chat Completions

Call Perplexity’s Sonar models through litellm.completion with the perplexity/ model prefix:

from litellm import completion
import os

os.environ["PERPLEXITYAI_API_KEY"] = "your_api_key_here"

response = completion(
    model="perplexity/sonar-pro",
    messages=[
        {"role": "user", "content": "What are the latest fusion breakthroughs?"}
    ],
)

print(response.choices[0].message.content)

Streaming

from litellm import completion

response = completion(
    model="perplexity/sonar-pro",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
    stream=True,
)

for chunk in response:
    print(chunk)
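
Printing whole chunks is useful for debugging, but most applications only want the generated text. The sketch below joins the incremental text into one string. It assumes each chunk follows the OpenAI-style streaming shape (chunk.choices[0].delta.content, which may be None); the helper name collect_stream and the stand-in chunks are hypothetical, used here so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text deltas of a streamed completion into one string.

    Assumes OpenAI-style chunks: chunk.choices[0].delta.content.
    """
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks may carry no choices
        text = getattr(chunk.choices[0].delta, "content", None)
        if text:  # skip None or empty deltas
            parts.append(text)
    return "".join(parts)


def _fake_chunk(text):
    # Stand-in for a streamed chunk, for illustration only
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_chunks = [_fake_chunk("Quantum "), _fake_chunk("computing"), _fake_chunk(None)]
print(collect_stream(fake_chunks))
```

With a real stream, pass the object returned by completion(..., stream=True) in place of fake_chunks.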

Reasoning Effort

For reasoning-capable Sonar models, pass reasoning_effort to control depth:

from litellm import completion

response = completion(
    model="perplexity/sonar-reasoning",
    messages=[{"role": "user", "content": "Walk through your reasoning."}],
    reasoning_effort="high",  # "low" | "medium" | "high"
)
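
A typo in reasoning_effort would otherwise only surface when the request is sent, so it can help to validate the value first. This is a minimal sketch under that assumption; the helper name reasoning_kwargs is hypothetical, and the allowed values are the three listed in the comment above.

```python
ALLOWED_EFFORTS = ("low", "medium", "high")


def reasoning_kwargs(
    prompt: str,
    effort: str = "medium",
    model: str = "perplexity/sonar-reasoning",
) -> dict:
    """Build keyword arguments for litellm.completion with a checked effort.

    Hypothetical helper for illustration; not part of LiteLLM.
    """
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {ALLOWED_EFFORTS}, got {effort!r}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }


kwargs = reasoning_kwargs("Walk through your reasoning.", effort="high")
# Send the request with: completion(**kwargs)
```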

Supported Sonar Models

Model	LiteLLM Identifier
`sonar`	`perplexity/sonar`
`sonar-pro`	`perplexity/sonar-pro`
`sonar-reasoning`	`perplexity/sonar-reasoning`
`sonar-reasoning-pro`	`perplexity/sonar-reasoning-pro`
`sonar-deep-research`	`perplexity/sonar-deep-research`

Agent API

Use litellm.responses to call the Agent API, which routes requests through Perplexity to third-party models and adds tool orchestration and presets.

Presets

from litellm import responses
import os

os.environ["PERPLEXITY_API_KEY"] = "your_api_key_here"

response = responses(
    model="perplexity/preset/pro-search",
    input="What are the latest developments in AI?",
    custom_llm_provider="perplexity",
)

print(response.output)

Available presets: fast-search, pro-search, deep-research, advanced-deep-research.

Tool Use (`web_search` and `fetch_url`)

from litellm import responses

response = responses(
    model="perplexity/openai/gpt-5.2",
    input="Research quantum computing breakthroughs and cite sources.",
    custom_llm_provider="perplexity",
    tools=[
        {"type": "web_search"},
        {"type": "fetch_url"},
    ],
    instructions="Use web_search and fetch_url to gather citations.",
    max_output_tokens=1000,
    temperature=0.7,
)

print(response.output)
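
response.output is a list of items rather than a plain string. The sketch below pulls out only the assistant text. It assumes the items follow the OpenAI Responses shape (items of type "message" whose content holds "output_text" parts); the helper name output_text, the stand-in response, and the "tool_step" item type are hypothetical, so check the real payload before relying on this.

```python
from typing import Any


def _get(obj: Any, key: str, default: Any = None) -> Any:
    """Read a field from either a dict or an attribute-style object."""
    if isinstance(obj, dict):
        return obj.get(key, default)
    return getattr(obj, key, default)


def output_text(response: Any) -> str:
    """Concatenate the text parts of all message items in response.output.

    Assumes the OpenAI Responses item shape; hypothetical helper.
    """
    texts = []
    for item in _get(response, "output", []) or []:
        if _get(item, "type") != "message":
            continue  # skip non-message items such as tool activity
        for part in _get(item, "content", []) or []:
            if _get(part, "type") == "output_text":
                texts.append(_get(part, "text", ""))
    return "".join(texts)


# Stand-in payload for illustration; "tool_step" is a made-up item type
fake_response = {
    "output": [
        {"type": "tool_step"},
        {"type": "message", "content": [{"type": "output_text", "text": "Cited answer."}]},
    ]
}
print(output_text(fake_response))
```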

Structured Outputs

from litellm import responses

response = responses(
    model="perplexity/preset/pro-search",
    input="Extract key facts about the Eiffel Tower.",
    custom_llm_provider="perplexity",
    text={
        "format": {
            "type": "json_schema",
            "name": "facts",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "height_meters": {"type": "number"},
                    "year_built": {"type": "integer"},
                },
                "required": ["name", "height_meters", "year_built"],
            },
            "strict": True,
        }
    },
)

print(response.output)
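
Once you have the model's JSON text, you will usually want to parse it and confirm the required fields are present. This is a minimal sketch of that step using only the standard library. The helper name parse_facts and the sample string are illustrative, not API output, and how you extract the JSON text from the response depends on the response shape.

```python
import json

# Mirrors the "required" list in the schema above
REQUIRED_FIELDS = ("name", "height_meters", "year_built")


def parse_facts(json_text: str) -> dict:
    """Parse structured output and check that every required field exists.

    Hypothetical helper for illustration; not part of LiteLLM.
    """
    data = json.loads(json_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return data


# Hand-written sample payload, not a real API response
sample = '{"name": "Eiffel Tower", "height_meters": 330, "year_built": 1889}'
facts = parse_facts(sample)
print(facts["name"], facts["year_built"])
```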

Supported Third-Party Models via Agent API

Provider	Models
OpenAI	`perplexity/openai/gpt-5.5`, `perplexity/openai/gpt-5.4`, `perplexity/openai/gpt-5.4-mini`, `perplexity/openai/gpt-5.2`, `perplexity/openai/gpt-5.1`, `perplexity/openai/gpt-5-mini`
Anthropic	`perplexity/anthropic/claude-opus-4-7`, `perplexity/anthropic/claude-opus-4-6`, `perplexity/anthropic/claude-sonnet-4-6`, `perplexity/anthropic/claude-opus-4-5`, `perplexity/anthropic/claude-sonnet-4-5`, `perplexity/anthropic/claude-haiku-4-5`
Google	`perplexity/google/gemini-3.1-pro-preview`, `perplexity/google/gemini-3-flash-preview`, `perplexity/google/gemini-3.1-flash-lite`
xAI	`perplexity/xai/grok-4.20-non-reasoning`
Perplexity	`perplexity/perplexity/sonar`

See the Agent API model list for the canonical, up-to-date catalogue.
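
The identifiers in the table follow a perplexity/<vendor>/<model> pattern. As a small illustration of that naming scheme, the sketch below splits an identifier into its vendor and model parts. The helper name split_agent_model is hypothetical, and the pattern is inferred from the table rather than from a formal specification.

```python
def split_agent_model(identifier: str) -> tuple:
    """Split 'perplexity/<vendor>/<model>' into (vendor, model).

    Hypothetical helper; the pattern is inferred from the table above.
    """
    parts = identifier.split("/", 2)
    if len(parts) != 3 or parts[0] != "perplexity" or not all(parts):
        raise ValueError(f"not an Agent API identifier: {identifier!r}")
    _, vendor, model = parts
    return vendor, model


vendor, model = split_agent_model("perplexity/openai/gpt-5.2")
print(vendor, model)
```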

LiteLLM Proxy

Run LiteLLM as a self-hosted proxy that fronts Perplexity (and any other provider) behind a single OpenAI-compatible endpoint.

config.yaml

model_list:
  - model_name: perplexity-sonar-reasoning
    litellm_params:
      model: perplexity/sonar-reasoning
      api_key: os.environ/PERPLEXITYAI_API_KEY

  - model_name: perplexity-pro-search
    litellm_params:
      model: perplexity/preset/pro-search
      api_key: os.environ/PERPLEXITY_API_KEY

Start the Proxy

litellm --config /path/to/config.yaml

Call the Proxy

curl http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer anything" \
  -d '{
    "model": "perplexity-sonar-reasoning",
    "messages": [{"role": "user", "content": "Who won the World Cup in 2022?"}],
    "reasoning_effort": "high"
  }'
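
The same request can be made from Python without extra dependencies. The sketch below builds the equivalent of the curl call with the standard library. The helper names build_proxy_request and send are hypothetical, and the URL, bearer token, and model alias are the placeholder values from the curl example and config.yaml above.

```python
import json
import urllib.request


def build_proxy_request(
    prompt: str,
    model: str = "perplexity-sonar-reasoning",
    base_url: str = "http://0.0.0.0:4000",
    api_key: str = "anything",
) -> urllib.request.Request:
    """Build the POST request that the curl example sends to the proxy."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": "high",
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant text (proxy must be running)."""
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


request = build_proxy_request("Who won the World Cup in 2022?")
# With the proxy running: print(send(request))
```

Because the proxy is OpenAI-compatible, an OpenAI client pointed at the same base URL is another option.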

Links & Resources

LiteLLM Perplexity Docs

Official LiteLLM Perplexity provider docs.

LiteLLM Docs

Full LiteLLM documentation.

Perplexity Agent API

Agent API reference and presets.

Perplexity Models

Available Sonar and Agent API models.

Support

Need help with the integration?

Browse the LiteLLM documentation
Review our FAQ