Overview

Generate AI responses with web-grounded knowledge using either the Python or TypeScript SDKs. Both SDKs provide full support for chat completions, streaming responses, async operations, and comprehensive error handling.
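
Install the SDK and construct a client before running the examples below. A minimal setup sketch (the pip package name and the explicit api_key argument are assumptions to verify against the SDK README; by default the client reads the PERPLEXITY_API_KEY environment variable):
# Assumed install command; confirm the exact package name in the SDK README:
#   pip install perplexityai
import os

from perplexity import Perplexity

# Assumption: an explicit api_key overrides the PERPLEXITY_API_KEY
# environment variable the client otherwise reads by default.
client = Perplexity(api_key=os.environ["PERPLEXITY_API_KEY"])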

Quick Start

from perplexity import Perplexity

client = Perplexity()

completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Tell me about the latest developments in AI",
        }
    ],
    model="sonar",
)

print(f"Response: {completion.choices[0].message.content}")

Features

Model Selection

Choose from different Sonar models based on your needs:
# Standard Sonar model for general queries
completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "What is quantum computing?"}],
    model="sonar"
)

# Sonar Pro for more complex queries
completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "Analyze the economic implications of renewable energy adoption"}],
    model="sonar-pro"
)

# Sonar Reasoning for complex analytical tasks
completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "Solve this complex mathematical problem step by step"}],
    model="sonar-reasoning"
)

Conversation Context

Build multi-turn conversations with context:
messages = [
    {"role": "system", "content": "You are a helpful research assistant."},
    {"role": "user", "content": "What are the main causes of climate change?"},
    {"role": "assistant", "content": "The main causes of climate change include..."},
    {"role": "user", "content": "What are some potential solutions?"}
]

completion = client.chat.completions.create(
    messages=messages,
    model="sonar"
)
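
To continue the thread, append the assistant's reply before adding the next user turn; a minimal sketch reusing the call above (the follow-up question is illustrative):
# Append the assistant's reply so the next turn carries full context.
messages.append({
    "role": "assistant",
    "content": completion.choices[0].message.content,
})
messages.append({"role": "user", "content": "Which of those solutions scales fastest?"})

follow_up = client.chat.completions.create(
    messages=messages,
    model="sonar"
)
print(follow_up.choices[0].message.content)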

Web Search Options

Control how the model searches and uses web information:
completion = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "What are the latest developments in renewable energy?"}
    ],
    model="sonar",
    web_search_options={
        "search_recency_filter": "week",  # Focus on recent results
        "search_domain_filter": ["energy.gov", "iea.org", "irena.org"],  # Trusted sources
        "max_search_results": 10
    }
)

Response Customization

Customize response format and behavior:
completion = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "Explain machine learning in simple terms"}
    ],
    model="sonar",
    max_tokens=500,  # Limit response length
    temperature=0.7,  # Control creativity
    top_p=0.9,       # Control diversity
    presence_penalty=0.1,  # Reduce repetition
    frequency_penalty=0.1
)

Streaming Responses

Get real-time response streaming for better user experience:
stream = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "Write a summary of recent AI breakthroughs"}
    ],
    model="sonar",
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
For comprehensive streaming documentation, including metadata collection, error handling, advanced patterns, and raw HTTP examples, see the Streaming Guide.
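
Streaming also works with the async client; a sketch assuming AsyncPerplexity mirrors the sync streaming interface (the async client itself appears again under Concurrent Operations below):
import asyncio

from perplexity import AsyncPerplexity

async def stream_summary(query: str) -> None:
    # Assumption: the async client mirrors the sync interface,
    # yielding chunks via `async for` when stream=True.
    client = AsyncPerplexity()
    stream = await client.chat.completions.create(
        messages=[{"role": "user", "content": query}],
        model="sonar",
        stream=True,
    )
    async for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)

asyncio.run(stream_summary("Write a summary of recent AI breakthroughs"))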

Async Chat Completions

For long-running or batch processing tasks, use the async endpoints:

Creating Async Requests

# Start an async completion request
async_request = client.async_.chat.completions.create(
    messages=[
        {"role": "user", "content": "Write a comprehensive analysis of renewable energy trends"}
    ],
    model="sonar-pro",
    max_tokens=2000
)

print(f"Request submitted with ID: {async_request.request_id}")
print(f"Status: {async_request.status}")

Checking Request Status

# Check the status of an async request
request_id = "req_123abc456def789"
status = client.async_.chat.completions.get(request_id)

print(f"Status: {status.status}")
if status.status == "completed":
    print(f"Response: {status.result.choices[0].message.content}")
elif status.status == "failed":
    print(f"Error: {status.error}")

Listing Async Requests

# List recent async requests
requests = client.async_.chat.completions.list(
    limit=10,
    status="completed"
)

for request in requests.data:
    print(f"ID: {request.id}, Status: {request.status}")

Advanced Usage

Error Handling

Handle chat-specific errors:
import perplexity

try:
    completion = client.chat.completions.create(
        messages=[{"role": "user", "content": "What is AI?"}],
        model="sonar",
        max_tokens=50000  # Exceeds limit
    )
except perplexity.BadRequestError as e:
    print(f"Invalid request parameters: {e}")
except perplexity.RateLimitError:
    print("Rate limit exceeded, please retry later")
except perplexity.APIStatusError as e:
    print(f"API error: {e.status_code}")

Custom Instructions

Use system messages for consistent behavior:
system_prompt = """You are an expert research assistant specializing in technology and science. 
Always provide well-sourced, accurate information and cite your sources. 
Format your responses with clear headings and bullet points when appropriate."""

completion = client.chat.completions.create(
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Explain quantum computing applications"}
    ],
    model="sonar-pro"
)

Concurrent Operations

Handle multiple conversations efficiently:
import asyncio

from perplexity import AsyncPerplexity

async def handle_multiple_chats(user_messages):
    client = AsyncPerplexity()

    tasks = [
        client.chat.completions.create(
            messages=[{"role": "user", "content": msg}],
            model="sonar"
        )
        for msg in user_messages
    ]

    return await asyncio.gather(*tasks, return_exceptions=True)
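
Because return_exceptions=True, failed requests come back as exception objects instead of raising; a usage sketch:
results = asyncio.run(handle_multiple_chats([
    "What is quantum computing?",
    "Summarize this week's renewable energy news",
]))

for result in results:
    if isinstance(result, Exception):
        print(f"Request failed: {result}")
    else:
        print(result.choices[0].message.content)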

Best Practices

1. Use appropriate models

Choose the right model for your use case: sonar for general queries, sonar-pro for complex analysis, sonar-reasoning for analytical tasks.
# For quick factual queries
simple_query = client.chat.completions.create(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    model="sonar"
)

# For complex analysis
complex_query = client.chat.completions.create(
    messages=[{"role": "user", "content": "Analyze the economic impact of AI on employment"}],
    model="sonar-pro"
)

2. Implement streaming for long responses

Use streaming for better user experience with lengthy responses.
def stream_response(query):
    stream = client.chat.completions.create(
        messages=[{"role": "user", "content": query}],
        model="sonar",
        stream=True
    )
    
    response = ""
    for chunk in stream:
        if chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            print(content, end="", flush=True)
            response += content
    
    return response

3. Handle rate limits gracefully

Implement exponential backoff for production applications.
import time
import random

def chat_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                messages=messages,
                model="sonar"
            )
        except perplexity.RateLimitError:
            if attempt == max_retries - 1:
                raise
            delay = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(delay)

4. Optimize for specific use cases

Configure parameters based on your application’s needs.
# For factual Q&A
factual_config = {
    "temperature": 0.1,  # Low creativity for accuracy
    "top_p": 0.9,
    "search_recency_filter": "month"
}

# For creative writing
creative_config = {
    "temperature": 0.8,  # Higher creativity
    "top_p": 0.95,
    "presence_penalty": 0.1,
    "frequency_penalty": 0.1
}

# Usage
factual_response = client.chat.completions.create(
    messages=[{"role": "user", "content": "What is the current inflation rate?"}],
    model="sonar",
    **factual_config
)
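
# The creative profile is applied the same way (the prompt is illustrative only):
creative_response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Write a short story about a city powered by tides"}],
    model="sonar",
    **creative_config
)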

Resources