# Create Agent Response Source: https://docs.perplexity.ai/api-reference/agent-post post /v1/agent Generate a response for the provided input with optional web search and reasoning. # Get Async Chat Completion Source: https://docs.perplexity.ai/api-reference/async-sonar-api-request-get get /v1/async/sonar/{api_request} Retrieve the response for a given asynchronous chat completion request. # List Async Chat Completions Source: https://docs.perplexity.ai/api-reference/async-sonar-get get /v1/async/sonar Retrieve a list of all asynchronous chat completion requests for a given user. # Create Async Chat Completion Source: https://docs.perplexity.ai/api-reference/async-sonar-post post /v1/async/sonar Submit an asynchronous chat completion request. # Create Contextualized Embeddings Source: https://docs.perplexity.ai/api-reference/contextualized-embeddings-post post /v1/contextualizedembeddings Generate contextualized embeddings for document chunks. Chunks from the same document share context awareness, improving retrieval quality for document-based applications. # Create Embeddings Source: https://docs.perplexity.ai/api-reference/embeddings-post post /v1/embeddings Generate embeddings for a list of texts. Use these embeddings for semantic search, clustering, and other machine learning applications. # Generate Auth Token Source: https://docs.perplexity.ai/api-reference/generate-auth-token-post post /generate_auth_token Generates a new authentication token for API access. # List Models Source: https://docs.perplexity.ai/api-reference/models-get get /v1/models List the models available for the Agent API. Returns model identifiers that can be used with the `POST /v1/agent` endpoint. The response follows the OpenAI List Models format for compatibility with third-party tools. # Revoke Auth Token Source: https://docs.perplexity.ai/api-reference/revoke-auth-token-post post /revoke_auth_token Revokes an existing authentication token. # Search the Web Source: https://docs.perplexity.ai/api-reference/search-post post /search Search the web and retrieve relevant web page contents. # Create Chat Completion Source: https://docs.perplexity.ai/api-reference/sonar-post post /v1/sonar Generate a chat completion response for the given conversation. # API Key Management Source: https://docs.perplexity.ai/docs/admin/api-key-management Learn how to generate, revoke, and rotate API keys for secure access ## Overview API keys are essential for authenticating requests to the Perplexity API. This guide covers how to create, manage, and rotate your API keys using our authentication token management endpoints. **API keys are shown only once.** When you create an API key — through the console or programmatically — the full token is returned at that moment and **cannot be retrieved again**. Save it immediately to a secure location before closing the page or response. API keys should be treated as sensitive credentials. Store them securely and never expose them in client-side code or public repositories. ## Getting Started: Create Your API Group First **Important Prerequisites**: Before you can generate any API keys, you must first create an API group through the Perplexity web interface. Navigate to the API Groups page and create your first group: **[Create API Group →](https://console.perplexity.ai)** API groups help organize your keys and manage access across different projects or environments. Choose a descriptive name for your API group (e.g., "Production", "Development", or your project name) to help with organization. Once you have an API group, navigate to the API Keys page to generate your first key: **[Generate API Keys →](https://console.perplexity.ai)** You can create multiple keys within each group for different purposes or environments. The full key value is displayed once at creation — copy it before leaving the page. After creating your first API key through the web interface, you can use the programmatic endpoints below to generate and manage additional keys. ## Key Management Endpoints Perplexity provides two endpoints for managing API keys programmatically: * **`/generate_auth_token`** - Creates a new API key * **`/revoke_auth_token`** - Revokes an existing API key Once an API key is revoked, it cannot be recovered. Make sure to update your applications with new keys before revoking old ones. ## Generating API Keys Create new API keys programmatically. Always provide a descriptive `token_name` — once a key is created, this name is the primary way to identify it later, since the full token value is no longer visible. ### Request ```bash cURL theme={null} curl --request POST \ --url https://api.perplexity.ai/generate_auth_token \ --header "Authorization: Bearer YOUR_EXISTING_API_KEY" \ --header "Content-Type: application/json" \ --data '{ "token_name": "Production API Key" }' ``` ```python Python theme={null} import requests url = "https://api.perplexity.ai/generate_auth_token" headers = { "Authorization": "Bearer YOUR_EXISTING_API_KEY", "Content-Type": "application/json" } payload = { "token_name": "Production API Key" # Optional } response = requests.post(url, headers=headers, json=payload) data = response.json() print(f"New API Key: {data['auth_token']}") print(f"Created at: {data['created_at_epoch_seconds']}") ``` ```typescript Typescript theme={null} const response = await fetch("https://api.perplexity.ai/generate_auth_token", { method: "POST", headers: { "Authorization": "Bearer YOUR_EXISTING_API_KEY", "Content-Type": "application/json" }, body: JSON.stringify({ token_name: "Production API Key" // Optional }) }); const data = await response.json(); console.log(`New API Key: ${data.auth_token}`); console.log(`Created at: ${data.created_at_epoch_seconds}`); ``` ### Response ```json theme={null} { "auth_token": "pplx-1234567890abcdef", "created_at_epoch_seconds": 1735689600, "token_name": "Production API Key" } ``` Store the `auth_token` immediately and securely. This is the **only** time the full token value is returned — it cannot be retrieved later from any endpoint or from the console. ## Revoking API Keys Revoke API keys that are no longer needed or may have been compromised. ### Request ```bash cURL theme={null} curl --request POST \ --url https://api.perplexity.ai/revoke_auth_token \ --header "Authorization: Bearer $PERPLEXITY_API_KEY" \ --header "Content-Type: application/json" \ --data '{ "auth_token": "pplx-1234567890abcdef" }' ``` ```python Python theme={null} import os import requests url = "https://api.perplexity.ai/revoke_auth_token" headers = { "Authorization": f"Bearer {os.environ.get('PERPLEXITY_API_KEY')}", "Content-Type": "application/json" } payload = { "auth_token": "pplx-1234567890abcdef" } response = requests.post(url, headers=headers, json=payload) if response.status_code == 200: print("API key successfully revoked") ``` ```typescript Typescript theme={null} const response = await fetch("https://api.perplexity.ai/revoke_auth_token", { method: "POST", headers: { "Authorization": `Bearer ${process.env.PERPLEXITY_API_KEY}`, "Content-Type": "application/json" }, body: JSON.stringify({ auth_token: "pplx-1234567890abcdef" }) }); if (response.status === 200) { console.log("API key successfully revoked"); } ``` ### Response Returns a `200 OK` status code on successful revocation. ## API Key Rotation Regular key rotation is a security best practice that minimizes the impact of potential key compromises. Here's how to implement zero-downtime key rotation: ### Rotation Strategy Create a new API key while your current key is still active: ```python theme={null} # Generate new key new_key_response = requests.post( "https://api.perplexity.ai/generate_auth_token", headers={"Authorization": f"Bearer {current_key}"}, json={"token_name": f"Rotated Key - {datetime.now().isoformat()}"} ) new_key = new_key_response.json()["auth_token"] ``` Deploy the new key to your applications: ```python theme={null} # Update environment variables or secrets management os.environ["PERPLEXITY_API_KEY"] = new_key # Verify new key works test_response = requests.post( "https://api.perplexity.ai/v1/sonar", headers={"Authorization": f"Bearer {new_key}"}, json={ "model": "sonar", "messages": [{"role": "user", "content": "Test"}] } ) assert test_response.status_code == 200 ``` Ensure all services are using the new key before proceeding: ```python theme={null} # Monitor your application logs to confirm # all instances are using the new key time.sleep(300) # Wait for propagation ``` Once confirmed, revoke the old key: ```python theme={null} # Revoke old key revoke_response = requests.post( "https://api.perplexity.ai/revoke_auth_token", headers={"Authorization": f"Bearer {new_key}"}, json={"auth_token": current_key} ) assert revoke_response.status_code == 200 print("Key rotation completed successfully") ``` ### Automated Rotation Example Here's a complete example of an automated key rotation script: ```python Python theme={null} import requests import os import time from datetime import datetime import logging logging.basicConfig(level=logging.INFO) logger = logging.getLogger(__name__) class PerplexityKeyRotator: def __init__(self, current_key): self.base_url = "https://api.perplexity.ai" self.current_key = current_key def generate_new_key(self, name=None): """Generate a new API key""" url = f"{self.base_url}/generate_auth_token" headers = {"Authorization": f"Bearer {self.current_key}"} payload = {} if name: payload["token_name"] = name response = requests.post(url, headers=headers, json=payload) response.raise_for_status() return response.json() def test_key(self, key): """Test if a key is valid""" url = f"{self.base_url}/v1/sonar" headers = {"Authorization": f"Bearer {key}"} payload = { "model": "sonar", "messages": [{"role": "user", "content": "Test"}], "max_tokens": 1 } try: response = requests.post(url, headers=headers, json=payload) return response.status_code == 200 except: return False def revoke_key(self, key_to_revoke): """Revoke an API key""" url = f"{self.base_url}/revoke_auth_token" headers = {"Authorization": f"Bearer {self.current_key}"} payload = {"auth_token": key_to_revoke} response = requests.post(url, headers=headers, json=payload) return response.status_code == 200 def rotate_key(self, update_callback=None): """Perform complete key rotation""" logger.info("Starting key rotation...") # Step 1: Generate new key new_key_data = self.generate_new_key( name=f"Rotated-{datetime.now().strftime('%Y%m%d-%H%M%S')}" ) new_key = new_key_data["auth_token"] logger.info(f"New key generated: {new_key[:10]}...") # Step 2: Test new key if not self.test_key(new_key): raise Exception("New key validation failed") logger.info("New key validated successfully") # Step 3: Update application (callback) if update_callback: update_callback(new_key) logger.info("Application updated with new key") # Step 4: Wait for propagation logger.info("Waiting for propagation...") time.sleep(30) # Step 5: Revoke old key old_key = self.current_key self.current_key = new_key # Use new key for revocation if self.revoke_key(old_key): logger.info("Old key revoked successfully") else: logger.warning("Failed to revoke old key") logger.info("Key rotation completed") return new_key # Usage example def update_environment(new_key): """Update your environment with the new key""" os.environ["PERPLEXITY_API_KEY"] = new_key # Update your secrets management system here # update_aws_secrets_manager(new_key) # update_kubernetes_secret(new_key) # Perform rotation rotator = PerplexityKeyRotator(os.environ["PERPLEXITY_API_KEY"]) new_key = rotator.rotate_key(update_callback=update_environment) print(f"Rotation complete. New key: {new_key[:10]}...") ``` ```typescript Typescript theme={null} import fetch from 'node-fetch'; class PerplexityKeyRotator { private baseUrl = 'https://api.perplexity.ai'; private currentKey: string; constructor(currentKey: string) { this.currentKey = currentKey; } async generateNewKey(name?: string): Promise<{ auth_token: string; created_at_epoch_seconds: number; token_name?: string; }> { const response = await fetch(`${this.baseUrl}/generate_auth_token`, { method: 'POST', headers: { 'Authorization': `Bearer ${this.currentKey}`, 'Content-Type': 'application/json' }, body: JSON.stringify(name ? { token_name: name } : {}) }); if (!response.ok) { throw new Error(`Failed to generate key: ${response.statusText}`); } return response.json(); } async testKey(key: string): Promise { try { const response = await fetch(`${this.baseUrl}/v1/sonar`, { method: 'POST', headers: { 'Authorization': `Bearer ${key}`, 'Content-Type': 'application/json' }, body: JSON.stringify({ model: 'sonar', messages: [{ role: 'user', content: 'Test' }], max_tokens: 1 }) }); return response.ok; } catch { return false; } } async revokeKey(keyToRevoke: string): Promise { const response = await fetch(`${this.baseUrl}/revoke_auth_token`, { method: 'POST', headers: { 'Authorization': `Bearer ${this.currentKey}`, 'Content-Type': 'application/json' }, body: JSON.stringify({ auth_token: keyToRevoke }) }); return response.ok; } async rotateKey(updateCallback?: (newKey: string) => Promise): Promise { console.log('Starting key rotation...'); // Step 1: Generate new key const timestamp = new Date().toISOString().replace(/[:.]/g, '-'); const newKeyData = await this.generateNewKey(`Rotated-${timestamp}`); const newKey = newKeyData.auth_token; console.log(`New key generated: ${newKey.substring(0, 10)}...`); // Step 2: Test new key if (!(await this.testKey(newKey))) { throw new Error('New key validation failed'); } console.log('New key validated successfully'); // Step 3: Update application if (updateCallback) { await updateCallback(newKey); console.log('Application updated with new key'); } // Step 4: Wait for propagation console.log('Waiting for propagation...'); await new Promise(resolve => setTimeout(resolve, 30000)); // Step 5: Revoke old key const oldKey = this.currentKey; this.currentKey = newKey; if (await this.revokeKey(oldKey)) { console.log('Old key revoked successfully'); } else { console.warn('Failed to revoke old key'); } console.log('Key rotation completed'); return newKey; } } // Usage example async function updateEnvironment(newKey: string): Promise { process.env.PERPLEXITY_API_KEY = newKey; // Update your secrets management system here // await updateAwsSecretsManager(newKey); // await updateKubernetesSecret(newKey); } // Perform rotation const rotator = new PerplexityKeyRotator(process.env.PERPLEXITY_API_KEY!); const newKey = await rotator.rotateKey(updateEnvironment); console.log(`Rotation complete. New key: ${newKey.substring(0, 10)}...`); ``` ## Best Practices Never hardcode API keys in your source code. Store them in environment variables or secure secret management systems. **Good**: `os.environ["PERPLEXITY_API_KEY"]` **Bad**: `api_key = "pplx-1234567890abcdef"` Rotate your API keys regularly (e.g., every 90 days) to minimize the impact of potential compromises. Set up automated rotation scripts to ensure zero downtime during the rotation process. Always set `token_name` when generating a key. After creation, the name is the primary way to identify a key, since the full token value is no longer visible. Examples: "Production-Main", "Development-Testing", "CI/CD-Pipeline" Track which keys are being used in your applications and revoke unused keys promptly. Maintain an inventory of active keys and their purposes. ## Security Considerations **Never expose API keys in:** * Client-side JavaScript code * Mobile applications * Public repositories * Log files or error messages * URLs or query parameters ### If a Key is Compromised 1. **Immediately generate a new key** using `/generate_auth_token` 2. **Update all applications** to use the new key 3. **Revoke the compromised key** using `/revoke_auth_token` 4. **Review access logs** to identify any unauthorized usage 5. **Implement additional security measures** such as IP allowlisting if available ## Troubleshooting | Issue | Solution | | -------------------------------------- | ---------------------------------------------------------------- | | "Authentication failed" after rotation | Ensure the new key has propagated to all service instances | | Cannot revoke a key | Verify you're using a valid API key with appropriate permissions | | Key generation fails | Check your account status and API tier limits | | Services still using old key | Implement proper secret rotation in your deployment pipeline | For additional support with API key management, visit the [API Platform console](https://console.perplexity.ai) or contact our support team. # Rate Limits & Usage Tiers Source: https://docs.perplexity.ai/docs/admin/rate-limits-usage-tiers ## What are Usage Tiers? Usage tiers determine your **rate limits** and access to **beta features** based on your cumulative API spending. As you spend more on API credits over time, you automatically advance to higher tiers with increased rate limits. Higher tiers unlock significantly more requests per minute, and once you reach a tier, you keep it permanently with no downgrade. You can check your current usage tier by visiting the [API Platform console](https://console.perplexity.ai). *** ## Tier Progression | Tier | Total Credits Purchased | Status | | ---------- | ----------------------- | ---------------------------- | | **Tier 0** | \$0 | New accounts, limited access | | **Tier 1** | \$50+ | Light usage, basic limits | | **Tier 2** | \$250+ | Regular usage | | **Tier 3** | \$500+ | Heavy usage | | **Tier 4** | \$1,000+ | Production usage | | **Tier 5** | \$5,000+ | Enterprise usage | Tiers are based on **cumulative purchases** across your account lifetime, not current balance. *** ## Agent API Rate Limits The Agent API uses tier-based rate limits that scale with your usage tier: | Tier | QPS (Queries per Second) | Requests per Minute | | :--------: | :----------------------: | :-----------------: | | **Tier 0** | 1 QPS | 50/min | | **Tier 1** | 3 QPS | 150/min | | **Tier 2** | 8 QPS | 500/min | | **Tier 3** | 17 QPS | 1,000/min | | **Tier 4** | 33 QPS | 2,000/min | | **Tier 5** | 33 QPS | 2,000/min | *** ## Search API Rate Limits The Search API has separate rate limits that apply to all usage tiers: | Endpoint | Rate Limit | Burst Capacity | | -------------- | ---------------------- | -------------- | | POST `/search` | 50 requests per second | 50 requests | **Search Rate Limiter Behavior:** * **Burst**: Can handle 50 requests instantly * **Sustained**: Exactly 50 QPS average over time Search rate limits are independent of your usage tier and apply consistently across all accounts using the same leaky bucket algorithm. *** ## Sonar API Rate Limits The Sonar API uses tier-based rate limits that scale with your usage tier: | Model | Requests per minute (RPM) | | ---------------------------------- | ------------------------- | | `sonar-deep-research` | 5 | | `sonar-reasoning-pro` | 50 | | `sonar-pro` | 50 | | `sonar` | 50 | | POST `/v1/async/sonar` | 5 | | GET `/v1/async/sonar` | 3000 | | GET `/v1/async/sonar/{request_id}` | 6000 | | Model | Requests per minute (RPM) | | ---------------------------------- | ------------------------- | | `sonar-deep-research` | 10 | | `sonar-reasoning-pro` | 150 | | `sonar-pro` | 150 | | `sonar` | 150 | | POST `/v1/async/sonar` | 10 | | GET `/v1/async/sonar` | 3000 | | GET `/v1/async/sonar/{request_id}` | 6000 | | Model | Requests per minute (RPM) | | ---------------------------------- | ------------------------- | | `sonar-deep-research` | 20 | | `sonar-reasoning-pro` | 500 | | `sonar-pro` | 500 | | `sonar` | 500 | | POST `/v1/async/sonar` | 20 | | GET `/v1/async/sonar` | 3000 | | GET `/v1/async/sonar/{request_id}` | 6000 | | Model | Requests per minute (RPM) | | ---------------------------------- | ------------------------- | | `sonar-deep-research` | 40 | | `sonar-reasoning-pro` | 1,000 | | `sonar-pro` | 1,000 | | `sonar` | 1,000 | | POST `/v1/async/sonar` | 40 | | GET `/v1/async/sonar` | 3000 | | GET `/v1/async/sonar/{request_id}` | 6000 | | Model | Requests per minute (RPM) | | ---------------------------------- | ------------------------- | | `sonar-deep-research` | 60 | | `sonar-reasoning-pro` | 4,000 | | `sonar-pro` | 4,000 | | `sonar` | 4,000 | | POST `/v1/async/sonar` | 60 | | GET `/v1/async/sonar` | 3000 | | GET `/v1/async/sonar/{request_id}` | 6000 | | Model | Requests per minute (RPM) | | ---------------------------------- | ------------------------- | | `sonar-deep-research` | 100 | | `sonar-reasoning-pro` | 4,000 | | `sonar-pro` | 4,000 | | `sonar` | 4,000 | | POST `/v1/async/sonar` | 100 | | GET `/v1/async/sonar` | 3000 | | GET `/v1/async/sonar/{request_id}` | 6000 | *** ## Embeddings API Rate Limits The Embeddings API uses tier-based rate limits that scale with your usage tier. Limits are higher than other APIs because each request is a single forward pass on an elastic backend. | Tier | QPS (Queries per Second) | | :-----------: | :----------------------: | | **Tier 0** | 85 QPS | | **Tiers 1–3** | 170 QPS | | **Tiers 4–5** | 335 QPS | ### Contextualized Embeddings Contextualized embeddings have separate, higher limits (5× the standard embeddings tiers): | Tier | QPS (Queries per Second) | | :-----------: | :----------------------: | | **Tier 0** | 415 QPS | | **Tiers 1–3** | 835 QPS | | **Tiers 4–5** | 1,670 QPS | Contextualized embeddings are rate limited by **total chunks**, not by request count. *** ## How Rate Limiting Works Our rate limiting system uses a **leaky bucket algorithm** that allows for burst traffic while maintaining strict long-term rate control. ### Technical Implementation The leaky bucket algorithm works like a bucket with a small hole in the bottom: * **Bucket Capacity**: Maximum number of requests you can make instantly (burst capacity) * **Leak Rate**: How quickly tokens leak out of the bucket (your rate limit) * **Token Refill**: Tokens refill continuously at regular intervals based on your rate limit This design allows legitimate burst traffic when you need it, prevents sustained abuse, and ensures predictable and fair rate enforcement across all users. Let's examine how **50 requests per second** works in practice. With a capacity of 50 tokens and a leak rate of 50 tokens per second, one token refills every 20ms. **Scenario 1: Burst Traffic** ``` Time 0.0s: Bucket full (50 tokens) → Send 50 requests instantly → ALL ALLOWED → Send 51st request → REJECTED (bucket empty) Time 0.020s: 1 token refilled → Send 1 request → ALLOWED → Send 2nd request → REJECTED Time 0.040s: 1 more token refilled → Send 1 request → ALLOWED ``` **Scenario 2: Steady 50 QPS** ``` Request every 20ms: Time 0.0s: Request → ALLOWED (50→49 tokens) Time 0.020s: Request → ALLOWED (49+1-1=49 tokens) Time 0.040s: Request → ALLOWED (49+1-1=49 tokens) ... maintains 49-50 tokens, all requests pass ``` **Scenario 3: Slightly Over 50 QPS** ``` Request every 19ms (≈52.6 QPS): → Eventually tokens deplete faster than refill → Some requests start getting rejected → Achieves exactly 50 QPS on average ``` The leaky bucket design means you can handle your full rate limit instantly, making it perfect for batch operations or sudden traffic spikes. There's no need to artificially spread requests when you have available burst capacity. The system enforces strict average rate limits over time while allowing quick recovery after burst usage. This provides consistent performance across different usage patterns and prevents sustained over-limit usage while maintaining fair resource allocation. When building your application, take advantage of burst capacity for batch operations, monitor your usage patterns to optimize request timing, and implement proper error handling for 429 responses. *** ## What Happens When You Hit Rate Limits? When you exceed your rate limits: 1. **429 Error** - Your request gets rejected with "Too Many Requests" 2. **Continuous Refill** - Tokens refill continuously based on your rate limit 3. **Immediate Recovery** - New requests become available as soon as tokens refill **Example Recovery Times:** * **50 QPS limit**: 1 token refills every 20ms * **500 QPS limit**: 1 token refills every 2ms * **1,000 QPS limit**: 1 token refills every 1ms **Best Practices:** * Monitor your usage to predict when you'll need higher tiers * Consider upgrading your tier proactively for production applications * Implement exponential backoff with jitter in your code * Take advantage of burst capacity for batch operations * Don't artificially spread requests if you have available burst capacity *** ## Upgrading Your Tier Visit the [API Platform console](https://console.perplexity.ai) to see your current tier and total spending. Add credits to your account through the billing section. Your tier will automatically upgrade once you reach the spending threshold. Your new rate limits take effect immediately after the tier upgrade. Check your settings page to confirm. If you require custom rate limits beyond Tier 5, [fill out our rate limit increase request form](https://perplexity.typeform.com/to/yctmfyVT) and we'll review your use case to accommodate your needs. Higher tiers significantly improve your API experience with increased rate limits, especially important for production applications. Need custom rate limits beyond your current tier? Fill out our rate limit increase request form and we'll review your use case to accommodate your needs. # Search Filters Source: https://docs.perplexity.ai/docs/agent-api/filters Control and customize Agent API search results with filters Control which search results are returned by applying filters to your web search queries. Filters help you focus on specific domains, time periods, or geographic locations to get more relevant results. ## Domain Filters Domain filters allow you to include or exclude specific domains or URLs from search results. Use allowlist mode to restrict results to trusted sources, or denylist mode to filter out unwanted domains. You can add a maximum of 20 domains or URLs to the `search_domain_filter` list. The filter works in either allowlist mode (include only) or denylist mode (exclude), but not both simultaneously. **Allowlist mode**: Include only the specified domains/URLs (no `-` prefix)\ **Denylist mode**: Exclude the specified domains/URLs (use `-` prefix) You can filter at the domain level (e.g., `wikipedia.org`) or URL level (e.g., `https://en.wikipedia.org/wiki/Chess`) for granular control. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( preset="fast-search", input="Tell me about the James Webb Space Telescope discoveries.", instructions="You are a helpful assistant.", tools=[ { "type": "web_search", "filters": { "search_domain_filter": [ "nasa.gov", "wikipedia.org", "space.com" ] } } ] ) print(response.output_text) ``` ## Date & Time Filters Date and time filters help you find content published or updated within specific time periods. You can filter by publication date, last updated date, or use recency filters for relative time periods. **Publication date filters**: Filter by when content was originally published * `search_after_date_filter`: Include content published after this date * `search_before_date_filter`: Include content published before this date **Last updated filters**: Filter by when content was last modified * `last_updated_after_filter`: Include content updated after this date * `last_updated_before_filter`: Include content updated before this date **Recency filter**: Filter by relative time periods * `search_recency_filter`: Use `"hour"`, `"day"`, `"week"`, `"month"`, or `"year"` for content from the past hour, 24 hours, 7 days, 30 days, or 365 days. Use `hour` for real-time data such as breaking news or live events. Specific date filters must be provided in the "%m/%d/%Y" format (e.g., "3/1/2025"). Recency filters use predefined values like "hour", "day", "week", "month", or "year". ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( preset="pro-search", input="What are the latest AI developments?", instructions="You are an expert on current events.", tools=[ { "type": "web_search", "filters": { "search_recency_filter": "hour" } } ] ) print(response.output_text) # Week recency response = client.responses.create( preset="pro-search", input="What are the latest AI developments?", instructions="You are an expert on current events.", tools=[ { "type": "web_search", "filters": { "search_recency_filter": "week" } } ] ) print(response.output_text) ``` ## Location Filters Location filters tailor search results based on geographic context. This is useful for finding local businesses, regional news, or location-specific information. You can specify location using: * **Country code**: Two-letter ISO 3166-1 alpha-2 code (e.g., `"US"`, `"FR"`) * **City and region**: Improve accuracy with city and region names * **Coordinates**: Latitude and longitude for precise location targeting The `city` and `region` fields significantly improve location accuracy. We strongly recommend including them alongside coordinates and country code for the best results. Latitude and longitude must be provided alongside the country parameter—they cannot be provided on their own. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( preset="pro-search", input="What are some good coffee shops nearby?", instructions="You are a helpful local guide.", tools=[ { "type": "web_search", "user_location": { "country": "US", "region": "California", "city": "San Francisco", "latitude": 37.7749, "longitude": -122.4194 } } ] ) print(response.output_text) ``` ## Combining Filters You can combine multiple filter types in a single request to create highly targeted searches. For example, you might restrict results to specific domains published within a recent time period, or filter by location and date range together. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( preset="pro-search", input="Latest tech news from trusted sources.", instructions="You are an expert on technology.", tools=[ { "type": "web_search", "filters": { "search_domain_filter": ["techcrunch.com", "theverge.com"], "search_recency_filter": "week" }, "user_location": { "country": "US" } } ] ) print(response.output_text) ``` ## Next Steps Get started with the Agent API. Explore direct model selection and third-party models. # Image Attachments Source: https://docs.perplexity.ai/docs/agent-api/image-attachments Learn how to upload and analyze images using base64 encoding or HTTPS URLs ## Overview The Agent API supports image analysis through direct image uploads. Images can be provided either as base64 encoded strings within a data URI or as standard HTTPS URLs. * When using base64 encoding, the API currently only supports images up to 50 MB per image. * Supported formats for base64 encoded images: PNG (image/png), JPEG (image/jpeg), WEBP (image/webp), and GIF (image/gif). * When using an HTTPS URL, the model will attempt to fetch the image from the provided URL. Ensure the URL is publicly accessible. ## Examples Use this method when you have the image file locally and want to embed it directly into the request payload. Remember the 50MB size limit and supported formats (PNG, JPEG, WEBP, GIF). ```python Python theme={null} import base64 from perplexity import Perplexity client = Perplexity() # Read and encode image as base64 def encode_image(image_path): with open(image_path, "rb") as image_file: return base64.b64encode(image_file.read()).decode("utf-8") image_path = "image.png" base64_image = encode_image(image_path) # Analyze the image response = client.responses.create( model="openai/gpt-5.5", input=[ { "role": "user", "content": [ {"type": "input_text", "text": "what's in this image?"}, { "type": "input_image", "image_url": f"data:image/png;base64,{base64_image}", }, ], } ], ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; import * as fs from 'fs'; const client = new Perplexity(); // Read and encode image as base64 const imageBuffer = fs.readFileSync('image.png'); const base64Image = imageBuffer.toString('base64'); const imageDataUri = `data:image/png;base64,${base64Image}`; // Analyze the image const response = await client.responses.create({ model: 'openai/gpt-5-mini', input: [ { role: 'user', content: [ { type: 'input_text', text: "What's in this image?" }, { type: 'input_image', image_url: imageDataUri } ] } ], } as any); console.log(response.output_text); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5-mini", "input": [ { "role": "user", "content": [ { "type": "input_text", "text": "What'\''s in this image?" }, { "type": "input_image", "image_url": "data:image/png;base64,$BASE64_ENCODED_IMAGE" } ] } ] }' | jq ``` Use this method when you have a publicly accessible image URL. The model will fetch the image from the provided URL. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg" # Analyze the image response = client.responses.create( model="openai/gpt-5.5", input=[ { "role": "user", "content": [ {"type": "input_text", "text": "Can you describe the image at this URL?"}, { "type": "input_image", "image_url": image_url, }, ], } ], ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const imageHttpsUrl = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"; // Analyze the image const response = await client.responses.create({ model: 'openai/gpt-5-mini', input: [ { role: 'user', content: [ { type: 'input_text', text: 'Can you describe the image at this URL?' }, { type: 'input_image', image_url: imageHttpsUrl } ] } ], } as any); console.log(response.output_text); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5-mini", "input": [ { "role": "user", "content": [ { "type": "input_text", "text": "Can you describe the image at this URL?" }, { "type": "input_image", "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg" } ] } ] }' | jq ``` ## Request Format ### Agent API Images must be embedded in the `input` array when using message array format. Each image should be provided using the following structure: ```json theme={null} { "role": "user", "content": [ { "type": "input_text", "text": "What's in this image?" }, { "type": "input_image", "image_url": "" } ] } ``` The `image_url` field accepts either: * **A URL of the image**: A publicly accessible HTTPS URL pointing directly to the image file * **The base64 encoded image data**: A data URI in the format `data:image/{format};base64,{base64_content}` ## Pricing Images are tokenized based on their pixel dimensions using the following formula: ``` tokens = (width px × height px) / 750 ``` **Examples:** * A 1024×768 image would consume: (1024 × 768) / 750 = 1,048 tokens * A 512×512 image would consume: (512 × 512) / 750 = 349 tokens These image tokens are then priced according to the input token pricing of the model you're using. The image tokens are added to your total token count for the request alongside any text tokens. ## Next Steps Get started with the Agent API Learn about the `web_search` tool. # Model Fallback Source: https://docs.perplexity.ai/docs/agent-api/model-fallback Specify multiple models in a fallback chain for higher availability and automatic failover. ## Overview Model fallback enables specifying multiple models in a `models` array. The API tries each model in order until one succeeds, providing automatic failover when a model is unavailable. ## How It Works Provide a `models` array containing up to 5 models: 1. The API tries the first model in the array 2. If it fails or is unavailable, the next model is tried 3. This continues until one succeeds or all models are exhausted The `models` array takes precedence over the single `model` field when both are provided. **Benefits:** * **Higher availability**: Automatic failover when primary model is unavailable * **Provider redundancy**: Use models from different providers for maximum reliability * **Seamless operation**: No code refactoring needed, fallback is handled automatically by the API ## Basic Example ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( models=["openai/gpt-5.5", "openai/gpt-5.4", "openai/gpt-5-mini"], input="What are the latest developments in AI?", instructions="You have access to a web_search tool. Use it for questions about current events.", ) print(f"Model used: {response.model}") ``` ```typescript Typescript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ models: ["openai/gpt-5.5", "openai/gpt-5.4", "openai/gpt-5-mini"], input: "What are the latest developments in AI?", instructions: "You have access to a web_search tool. Use it for questions about current events.", }); console.log(`Model used: ${response.model}`); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "models": ["openai/gpt-5.5", "openai/gpt-5.4", "openai/gpt-5-mini"], "input": "What are the latest developments in AI?", "instructions": "You have access to a web_search tool. Use it for questions about current events." }' ``` ## Cross-Provider Fallback For maximum reliability, use models from different providers: ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( models=[ "openai/gpt-5.5", "anthropic/claude-sonnet-4-6", "google/gemini-3-flash-preview" ], input="What are the main architectural differences between x86 and ARM processors?", ) ``` ```typescript Typescript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ models: [ "openai/gpt-5.5", "anthropic/claude-sonnet-4-6", "google/gemini-3-flash-preview" ], input: "What are the main architectural differences between x86 and ARM processors?", }); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "models": [ "openai/gpt-5.5", "anthropic/claude-sonnet-4-6", "google/gemini-3-flash-preview" ], "input": "What are the main architectural differences between x86 and ARM processors?" }' ``` ## Pricing Billing is based on the model that serves the request, not all models in the fallback chain. The `model` field in the response indicates which model was used, and the `usage` field shows the token counts for that model. **Request:** ```json theme={null} { "models": ["openai/gpt-5.5", "openai/gpt-5.4"], "input": "..." } ``` **Response** (if first model failed): ```json theme={null} { "model": "openai/gpt-5.5", "usage": { "input_tokens": 150, "output_tokens": 320, "total_tokens": 470 } } ``` In this case, billing is based on `gpt-5.1` pricing for 470 tokens. Place preferred models first in the array. Consider pricing differences when ordering the fallback chain. ## Next Steps Explore available models and their pricing. Explore available presets and their configurations. Get started with your first Agent API call. View complete endpoint documentation. # Models Source: https://docs.perplexity.ai/docs/agent-api/models Explore available presets and third-party models for the Agent API, including Perplexity presets and third-party model support. ## Available Models The Agent API supports direct access to models from multiple providers. All models are accessed directly from first-party providers with transparent token-based pricing. Pricing rates are updated monthly and **reflect direct first-party provider pricing with no markup**. All charges are based on actual token consumption, and every API response includes exact token counts so you know your costs per request. Looking for pre-configured model setups? See [**Presets**](/docs/agent-api/presets) — optimized for specific use cases. Sonar — Perplexity's grounded search model. | Model | Input (\$/1M) | Output (\$/1M) | Cache (\$/1M) | Docs | | ------------------ | ------------- | -------------- | ------------- | ----------------------------------------------------------- | | `perplexity/sonar` | 0.25 | 2.50 | 0.0625 | [Sonar](https://docs.perplexity.ai/docs/sonar/models/sonar) | Claude Opus (highest reasoning), Sonnet (balanced), and Haiku (fastest, cheapest). | Model | Input (\$/1M) | Output (\$/1M) | Cache (\$/1M) | Docs | | ----------------------------- | ------------- | -------------- | ------------- | --------------------------------------------------------------------- | | `anthropic/claude-opus-4-7` | 5 | 25 | 0.50 | [Claude Opus 4.7](https://www.anthropic.com/news/claude-opus-4-7) | | `anthropic/claude-opus-4-6` | 5 | 25 | 0.50 | [Claude Opus 4.6](https://www.anthropic.com/news/claude-opus-4-6) | | `anthropic/claude-opus-4-5` | 5 | 25 | 0.50 | [Claude Opus 4.5](https://www.anthropic.com/news/claude-opus-4-5) | | `anthropic/claude-sonnet-4-6` | 3 | 15 | 0.30 | [Claude Sonnet 4.6](https://www.anthropic.com/news/claude-sonnet-4-6) | | `anthropic/claude-sonnet-4-5` | 3 | 15 | 0.30 | [Claude Sonnet 4.5](https://www.anthropic.com/news/claude-sonnet-4-5) | | `anthropic/claude-haiku-4-5` | 1 | 5 | 0.10 | [Claude Haiku 4.5](https://www.anthropic.com/news/claude-haiku-4-5) | GPT-5 family — flagship, mini, and nano variants. | Model | Input (\$/1M) | Output (\$/1M) | Cache (\$/1M) | Docs | | --------------------- | ------------- | -------------- | ------------- | -------------------------------------------------------------------- | | `openai/gpt-5.5` | 5.00 | 30.00 | 0.50 | [GPT-5.5](https://developers.openai.com/api/docs/models/gpt-5.5) | | `openai/gpt-5.4` | 2.50 | 15.00 | 0.25 | [GPT-5.4](https://platform.openai.com/docs/models/gpt-5.4) | | `openai/gpt-5.4-mini` | 0.75 | 4.50 | 0 | [GPT-5.4 Mini](https://platform.openai.com/docs/models/gpt-5.4-mini) | | `openai/gpt-5.4-nano` | 0.20 | 1.25 | 0 | [GPT-5.4 Nano](https://platform.openai.com/docs/models/gpt-5.4-nano) | | `openai/gpt-5.2` | 1.75 | 14 | 0.175 | [GPT-5.2](https://platform.openai.com/docs/models/gpt-5.2) | | `openai/gpt-5.1` | 1.25 | 10 | 0.125 | [GPT-5.1](https://platform.openai.com/docs/models/gpt-5.1) | | `openai/gpt-5` | 1.25 | 10 | 0.125 | [GPT-5](https://platform.openai.com/docs/models/gpt-5) | | `openai/gpt-5-mini` | 0.25 | 2 | 0.025 | [GPT-5 Mini](https://platform.openai.com/docs/models/gpt-5-mini) | Gemini 3 family — Pro for long-context, Flash and Flash Lite for speed. | Model | Input (\$/1M) | Output (\$/1M) | Cache (\$/1M) | Docs | | -------------------------------------- | ------------------------------ | -------------------------------- | ------------- | ----------------------------------------------------------------------------------------------------------- | | `google/gemini-3.1-pro-preview` | 2.00 (≤200k)
4.00 (>200k) | 12.00 (≤200k)
18.00 (>200k) | 90% off input | [Gemini 3.1 Pro](https://ai.google.dev/gemini-api/docs/models#gemini-3.1-pro-preview) | | `google/gemini-3.1-flash-lite` | 0.25 | 1.50 | 90% off input | [Gemini 3.1 Flash Lite](https://ai.google.dev/gemini-api/docs/models/gemini-3.1-flash-lite) | | `google/gemini-3.1-flash-lite-preview` | 0.25 | 1.50 | 90% off input | [Gemini 3.1 Flash Lite Preview](https://ai.google.dev/gemini-api/docs/models#gemini-3.1-flash-lite-preview) | | `google/gemini-3.5-flash` | 1.50 | 9.00 | 0.15 | [Gemini 3.5 Flash](https://ai.google.dev/gemini-api/docs/models/gemini-3.5-flash) | | `google/gemini-3-flash-preview` | 0.50 | 3.00 | 90% off input | [Gemini 3.0 Flash](https://ai.google.dev/gemini-api/docs/models#gemini-3-flash-preview) |
Grok 4.3 and 4.20 variants — reasoning, non-reasoning, and multi-agent. | Model | Input (\$/1M) | Output (\$/1M) | Cache (\$/1M) | Docs | | ----------------------------- | ------------- | -------------- | ------------- | -------------------------------------------------------------- | | `xai/grok-4.3` | 1.25 | 2.50 | 0.20 | [Grok 4.3](https://docs.x.ai/developers/models) | | `xai/grok-4.20-reasoning` | 1.25 | 2.50 | 0.20 | [Grok 4.20 Reasoning](https://docs.x.ai/developers/models) | | `xai/grok-4.20-non-reasoning` | 1.25 | 2.50 | 0.20 | [Grok 4.20 Non Reasoning](https://docs.x.ai/developers/models) | | `xai/grok-4.20-multi-agent` | 1.25 | 2.50 | 0.20 | [Grok 4.20 Multi-Agent](https://docs.x.ai/developers/models) | Nemotron 3 Super — NVIDIA's open-weight reasoning model. | Model | Input (\$/1M) | Output (\$/1M) | Cache (\$/1M) | Docs | | ----------------------------------- | ------------- | -------------- | ------------- | ------------------------------------------------------------------------------ | | `nvidia/nemotron-3-super-120b-a12b` | 0.25 | 2.50 | — | [Nemotron 3 Super 120B](https://research.nvidia.com/labs/nemotron/Nemotron-3/) |
Not all third-party models support all features (e.g., reasoning, tools). Check model documentation for specific capabilities. ## Using a Model ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="openai/gpt-5.5", input="Explain the difference between supervised and unsupervised learning in machine learning.", max_output_tokens=300, ) print(f"Response ID: {response.id}") print(response.output_text) ``` ```typescript Typescript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "openai/gpt-5.5", input: "Explain the difference between supervised and unsupervised learning in machine learning.", max_output_tokens: 300, }); console.log(`Response ID: ${response.id}`); console.log(response.output_text); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.5", "input": "Explain the difference between supervised and unsupervised learning in machine learning.", "max_output_tokens": 300 }' | jq ``` **See Your Costs in Real-Time:** Every response includes a `usage` field with exact input tokens, output tokens, and cache read tokens. Calculate your cost instantly using the pricing table above. ```json theme={null} { "usage": { "input_tokens": 150, "output_tokens": 320, "total_tokens": 470 } } ``` ## Model Fallback For high-availability applications, you can specify multiple models in a fallback chain. When one model fails or is unavailable, the API automatically tries the next model in the chain. Learn how to use model fallback chains to ensure high availability and reliability by automatically trying multiple models when one fails. **Example:** ```python theme={null} response = client.responses.create( models=["openai/gpt-5.5", "anthropic/claude-sonnet-4-6", "google/gemini-3-flash-preview"], input="Your question here" ) ``` For detailed examples, pricing information, and best practices, see the [Model Fallback documentation](/docs/agent-api/model-fallback). ## Next Steps Equip your model with web search for source-grounded context. Write prompts that get the most out of the Agent API. Shape responses with structured outputs and JSON schemas. Query market data, filings, and ticker-level information. # OpenAI Compatibility Source: https://docs.perplexity.ai/docs/agent-api/openai-compatibility Use your existing OpenAI SDKs with Perplexity's Agent API. Full compatibility with minimal code changes. ## Overview Perplexity's Agent API is fully compatible with OpenAI's Responses API interface. You can use your existing OpenAI client libraries by simply changing the base URL and providing your Perplexity API key. **Endpoint Note:** Perplexity's canonical Agent API endpoint is `POST /v1/agent`. For OpenAI SDK compatibility, `POST /v1/responses` is also accepted as an alias — the OpenAI SDK automatically routes `client.responses.create()` to `/v1/responses`, which Perplexity handles seamlessly. No SDK changes are needed beyond setting the base URL. **We recommend using the [Perplexity SDK](/docs/sdk/overview)** for the best experience with full type safety, enhanced features, and preset support. Use OpenAI SDKs if you're already integrated and need drop-in compatibility. ## Quick Start Use the OpenAI SDK with Perplexity's Agent API: ```python theme={null} import os from openai import OpenAI client = OpenAI( api_key=os.environ.get("PERPLEXITY_API_KEY"), base_url="https://api.perplexity.ai/v1" ) response = client.responses.create( model="openai/gpt-5.5", input="Explain the key differences between REST and GraphQL APIs" ) print(response.output_text) ``` ```typescript theme={null} import OpenAI from 'openai'; const client = new OpenAI({ apiKey: process.env.PERPLEXITY_API_KEY, baseURL: "https://api.perplexity.ai/v1" }); const response = await client.responses.create({ model: "openai/gpt-5-mini", input: "Explain the key differences between REST and GraphQL APIs" }); console.log(response.output_text); ``` ## Configuration ### Setting Up the OpenAI SDK Configure OpenAI SDKs to work with Perplexity by setting the `base_url` to `https://api.perplexity.ai/v1`: ```python theme={null} import os from openai import OpenAI client = OpenAI( api_key=os.environ.get("PERPLEXITY_API_KEY"), base_url="https://api.perplexity.ai/v1" ) ``` ```typescript theme={null} import OpenAI from 'openai'; const client = new OpenAI({ apiKey: process.env.PERPLEXITY_API_KEY, baseURL: "https://api.perplexity.ai/v1" }); ``` **Important**: Use `base_url="https://api.perplexity.ai/v1"` (with `/v1`) for the Agent API. ## Agent API Perplexity's Agent API follows OpenAI's Responses API request/response format. The OpenAI SDK's `client.responses.create()` method works out of the box — the SDK sends requests to `/v1/responses`, which Perplexity accepts alongside the canonical `/v1/agent` endpoint. ### Basic Usage ```python theme={null} import os from openai import OpenAI client = OpenAI( api_key=os.environ.get("PERPLEXITY_API_KEY"), base_url="https://api.perplexity.ai/v1" ) response = client.responses.create( model="openai/gpt-5.5", input="Explain the key differences between REST and GraphQL APIs" ) print(response.output_text) print(f"Response ID: {response.id}") ``` ```typescript theme={null} import OpenAI from 'openai'; const client = new OpenAI({ apiKey: process.env.PERPLEXITY_API_KEY, baseURL: "https://api.perplexity.ai/v1" }); const response = await client.responses.create({ model: "openai/gpt-5-mini", input: "Explain the key differences between REST and GraphQL APIs" }); console.log(response.output_text); console.log(`Response ID: ${response.id}`); ``` ### Using Presets Presets are pre-configured setups optimized for specific use cases. Use `extra_body` to pass presets via the OpenAI SDK: ```python theme={null} import os from openai import OpenAI client = OpenAI( api_key=os.environ.get("PERPLEXITY_API_KEY"), base_url="https://api.perplexity.ai/v1" ) # Pass preset via extra_body response = client.responses.create( input="What are the key differences between the latest iPhone and Samsung Galaxy flagship phones?", extra_body={ "preset": "pro-search" } ) print(response.output_text) ``` ```typescript theme={null} import OpenAI from 'openai'; const client = new OpenAI({ apiKey: process.env.PERPLEXITY_API_KEY, baseURL: "https://api.perplexity.ai/v1" }); // Use type casting (as any) to pass preset directly const response = await (client.responses.create as any)({ input: "What are the key differences between the latest iPhone and Samsung Galaxy flagship phones?", preset: "pro-search" }); console.log(response.output_text); ``` See [Agent API Presets](/docs/agent-api/presets) for available presets and their configurations. ### Using Third-Party Models You can also specify third-party models directly instead of using presets: ```python theme={null} import os from openai import OpenAI client = OpenAI( api_key=os.environ.get("PERPLEXITY_API_KEY"), base_url="https://api.perplexity.ai/v1" ) response = client.responses.create( model="openai/gpt-5.5", input="Explain the key differences between REST and GraphQL APIs" ) print(response.output_text) ``` ```typescript theme={null} import OpenAI from 'openai'; const client = new OpenAI({ apiKey: process.env.PERPLEXITY_API_KEY, baseURL: "https://api.perplexity.ai/v1" }); const response = await client.responses.create({ model: "openai/gpt-5-mini", input: "Explain the key differences between REST and GraphQL APIs" }); console.log(response.output_text); ``` ### Streaming Responses Streaming works with the Agent API: ```python theme={null} import os from openai import OpenAI client = OpenAI( api_key=os.environ.get("PERPLEXITY_API_KEY"), base_url="https://api.perplexity.ai/v1" ) response = client.responses.create( model="openai/gpt-5.5", input="Write a bedtime story about a unicorn.", stream=True ) for event in response: if event.type == "response.output_text.delta": print(event.delta, end="", flush=True) ``` ```typescript theme={null} import OpenAI from 'openai'; const client = new OpenAI({ apiKey: process.env.PERPLEXITY_API_KEY, baseURL: "https://api.perplexity.ai/v1" }); const response = await client.responses.create({ model: "openai/gpt-5-mini", input: "Write a bedtime story about a unicorn.", stream: true }); for await (const event of response) { if (event.type === "response.output_text.delta") { process.stdout.write(event.delta); } } ``` ### Using Tools The Agent API supports built-in tools, including web search. Use `extra_body` to pass tools via the OpenAI SDK: ```python theme={null} import os from openai import OpenAI client = OpenAI( api_key=os.environ.get("PERPLEXITY_API_KEY"), base_url="https://api.perplexity.ai/v1" ) # Pass tools via extra_body response = client.responses.create( model="openai/gpt-5.5", input="Which companies announced the largest AI acquisitions this quarter?", extra_body={ "tools": [ { "type": "web_search", "filters": { "search_domain_filter": ["techcrunch.com", "crunchbase.com"] } } ] } ) print(response.output_text) ``` ```typescript theme={null} import OpenAI from 'openai'; const client = new OpenAI({ apiKey: process.env.PERPLEXITY_API_KEY, baseURL: "https://api.perplexity.ai/v1" }); // Use type casting (as any) to pass tools via extra_body const response = await (client.responses.create as any)({ model: "openai/gpt-5-mini", input: "Which companies announced the largest AI acquisitions this quarter?", extra_body: { "tools": [ { "type": "web_search", "filters": { "search_domain_filter": ["techcrunch.com", "crunchbase.com"] } } ] } }); console.log(response.output_text); ``` ## API Compatibility ### Standard OpenAI Parameters These parameters work exactly the same as OpenAI's API: **Agent API:** * `model` - Model name (use 3rd party models like `openai/gpt-5.5`) * `input` - Input text or message array * `instructions` - System instructions * `max_output_tokens` - Maximum tokens in response * `stream` - Enable streaming responses * `tools` - Array of tools including `web_search` ### Perplexity-Specific Parameters **Agent API:** * `preset` - Preset name (use Perplexity presets like `pro-search`) * `tools[].filters` - Search filters within web\_search tool * `tools[].user_location` - User location for localized results See [Agent API Reference](/api-reference/agent-post) for complete parameter details. ## Endpoint Mapping | Method | Perplexity Endpoint | OpenAI Equivalent | Notes | | --------------------------- | ------------------- | -------------------- | ------------------------------------------------------------- | | `client.responses.create()` | `POST /v1/agent` | `POST /v1/responses` | Both paths accepted by Perplexity for compatibility | | `client.models.list()` | `GET /v1/models` | `GET /v1/models` | Lists available Agent API models. No authentication required. | When using the OpenAI SDK, `client.responses.create()` sends requests to `/v1/responses`. Perplexity accepts this path as an alias for `/v1/agent`, so no SDK configuration changes are needed beyond `base_url`. ### Model Discovery The `GET /v1/models` endpoint returns all models available for the Agent API in OpenAI-compatible format. No authentication is required. ```python theme={null} import os from openai import OpenAI client = OpenAI( api_key=os.environ.get("PERPLEXITY_API_KEY"), base_url="https://api.perplexity.ai/v1" ) models = client.models.list() for model in models.data: print(f"{model.id} (owned by {model.owned_by})") ``` ```typescript theme={null} import OpenAI from 'openai'; const client = new OpenAI({ apiKey: process.env.PERPLEXITY_API_KEY, baseURL: "https://api.perplexity.ai/v1" }); const models = await client.models.list(); for (const model of models.data) { console.log(`${model.id} (owned by ${model.owned_by})`); } ``` ```bash theme={null} curl https://api.perplexity.ai/v1/models ``` This endpoint is compatible with tools like [Open WebUI](https://openwebui.com/), [Cherry Studio](https://cherry-ai.com/), and [LiteLLM](https://litellm.ai/) that auto-discover available models via the OpenAI `/v1/models` endpoint. ## Response Structure ### Agent API Perplexity's Agent API matches OpenAI's Responses API response format: * `output` - Structured output array containing messages with `content[].text` * `model` - The model name used * `usage` - Token consumption details * `id`, `created_at`, `status` - Response metadata ## Best Practices Always use `https://api.perplexity.ai/v1` (with `/v1`) for the Agent API. ```python theme={null} client = OpenAI( api_key=os.environ.get("PERPLEXITY_API_KEY"), base_url="https://api.perplexity.ai/v1" # Correct ) ``` Use the OpenAI SDK's error handling: ```python theme={null} import os from openai import OpenAI, APIError, RateLimitError client = OpenAI( api_key=os.environ.get("PERPLEXITY_API_KEY"), base_url="https://api.perplexity.ai/v1" ) try: response = client.responses.create( model="openai/gpt-5.5", input="Hello" ) except RateLimitError: print("Rate limit exceeded, please retry later") except APIError as e: print(f"API error: {e.message}") ``` Stream responses for real-time user experience: ```python theme={null} response = client.responses.create( model="openai/gpt-5.5", input="Long query...", stream=True ) for event in response: if event.type == "response.output_text.delta": print(event.delta, end="", flush=True) ``` ## Recommended: Perplexity SDK We recommend using Perplexity's native SDKs for the best developer experience: * **Cleaner preset syntax** - Use `preset="pro-search"` directly instead of `extra_body={"preset": "pro-search"}` * **Type safety** - Full Typescript/Python type definitions for all parameters * **Enhanced features** - Direct access to all Perplexity-specific features * **Better error messages** - Perplexity-specific error handling * **Simpler setup** - No need to configure base URLs See the [Perplexity SDK Guide](/docs/sdk/overview) for details. ## Migrating to the Perplexity SDK Switch to the Perplexity SDK for enhanced features and cleaner syntax. With the Perplexity SDK, you can use presets directly without `extra_body` and get full type safety: ```bash theme={null} pip install perplexityai ``` ```bash theme={null} npm install @perplexity-ai/perplexity_ai ``` ```python theme={null} # Before (OpenAI SDK) import os from openai import OpenAI client = OpenAI( api_key=os.environ.get("PERPLEXITY_API_KEY"), base_url="https://api.perplexity.ai/v1" ) # After (Perplexity SDK) from perplexity import Perplexity client = Perplexity() # reads PERPLEXITY_API_KEY env var automatically ``` ```typescript theme={null} // Before (OpenAI SDK) import OpenAI from 'openai'; const openaiClient = new OpenAI({ apiKey: process.env.PERPLEXITY_API_KEY, baseURL: "https://api.perplexity.ai/v1" }); // After (Perplexity SDK) import Perplexity from '@perplexity-ai/perplexity_ai'; const perplexityClient = new Perplexity(); // reads PERPLEXITY_API_KEY env var automatically ``` **No base URL needed** - The Perplexity SDK automatically uses the correct endpoint. The API calls are very similar: ```python theme={null} # Agent API - same interface response = client.responses.create( model="openai/gpt-5.5", input="Hello!" ) ``` ```typescript theme={null} // Agent API - same interface const response = await client.responses.create({ model: "openai/gpt-5-mini", input: "Hello!" }); ``` The Perplexity SDK supports presets with cleaner syntax compared to OpenAI SDK: ```python theme={null} # Before (OpenAI SDK) - extra_body required response = client.responses.create( input="What were the biggest tech IPOs this year and how did they perform on day one?", extra_body={"preset": "pro-search"} ) # After (Perplexity SDK) - direct parameter response = client.responses.create( preset="pro-search", input="What were the biggest tech IPOs this year and how did they perform on day one?" ) ``` ```typescript theme={null} // Before (OpenAI SDK) - type casting required const response = await client.responses.create({ input: "What were the biggest tech IPOs this year and how did they perform on day one?", preset: "pro-search" } as any); // After (Perplexity SDK) - fully typed const response = await client.responses.create({ preset: "pro-search", input: "What were the biggest tech IPOs this year and how did they perform on day one?" }); ``` ## Next Steps Get started with Agent API using OpenAI SDKs. Explore direct model selection and third-party models. View complete endpoint documentation. Configure streaming responses and structured outputs with JSON schema. Specify multiple models for automatic failover and higher availability. Apply filters to web search results. # Output Control Source: https://docs.perplexity.ai/docs/agent-api/output-control Streaming and structured outputs for the Agent API ## Streaming Responses Streaming allows you to receive partial responses from the Perplexity API as they are generated, rather than waiting for the complete response. This is particularly useful for real-time user experiences, long responses, and interactive applications. Streaming is supported across all models available through the Agent API. To enable streaming, set `stream=True` (Python) or `stream: true` (TypeScript) when creating responses: ```python Python SDK theme={null} from perplexity import Perplexity client = Perplexity() # Create streaming response stream = client.responses.create( preset="fast-search", input="What is the latest in AI research?", stream=True ) # Process streaming response for event in stream: if event.type == "response.output_text.delta": print(event.delta, end="") elif event.type == "response.completed": print(f"\n\nCompleted: {event.response.usage}") ``` ```typescript TypeScript SDK theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); // Create streaming response const stream = await client.responses.create({ preset: "fast-search", input: "What is the latest in AI research?", stream: true }); // Process streaming response for await (const chunk of stream) { if (chunk.type === "response.output_text.delta") { process.stdout.write((chunk as any).delta); } } ``` ```bash cURL theme={null} curl -X POST "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "fast-search", "input": "What is the latest in AI research?", "stream": true }' ``` ### Error Handling Handle errors gracefully during streaming: ```python Python SDK theme={null} import perplexity from perplexity import Perplexity client = Perplexity() try: stream = client.responses.create( preset="fast-search", input="Explain machine learning concepts", stream=True ) for event in stream: if event.type == "response.output_text.delta": print(event.delta, end="") elif event.type == "response.completed": print(f"\n\nCompleted: {event.response.usage}") except perplexity.APIConnectionError as e: print(f"Network connection failed: {e}") except perplexity.RateLimitError as e: print(f"Rate limit exceeded, please retry later: {e}") except perplexity.APIStatusError as e: print(f"API error {e.status_code}: {e.response}") ``` ```typescript TypeScript SDK theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); try { const stream = await client.responses.create({ preset: "fast-search", input: "Explain machine learning concepts", stream: true }); for await (const chunk of stream) { if (chunk.type === "response.output_text.delta") { process.stdout.write((chunk as any).delta); } } } catch (error) { if (error instanceof Perplexity.APIConnectionError) { console.error("Network connection failed:", (error as any).cause); } else if (error instanceof Perplexity.RateLimitError) { console.error("Rate limit exceeded, please retry later"); } else if (error instanceof Perplexity.APIError) { console.error(`API error ${error.status}: ${error.message}`); } } ``` If you need search results immediately for your user interface, consider using non-streaming requests for use cases where search result display is critical to the real-time user experience. ## Structured Outputs Structured outputs enable you to enforce specific response formats from Perplexity's models, ensuring consistent, machine-readable data that can be directly integrated into your applications without manual parsing. We currently support **JSON Schema** structured outputs. To enable structured outputs, add a `response_format` field to your request: ```json theme={null} { "response_format": { "type": "json_schema", "json_schema": { "name": "your_schema_name", "schema": { /* your JSON schema object */ } } } } ``` The `name` field is required and must be 1-64 alphanumeric characters. The schema should be a valid JSON schema object. LLM responses will match the specified format unless the output exceeds `max_tokens`. **Improve Schema Compliance**: Give the LLM some hints about the output format in your prompts to improve adherence to the structured format. For example, include phrases like "Please return the data as a JSON object with the following structure..." or "Extract the information and format it as specified in the schema." The first request with a new JSON Schema expects to incur delay on the first token. Typically, it takes 10 to 30 seconds to prepare the new schema, and may result in timeout errors. Once the schema has been prepared, the subsequent requests will not see such delay. ### Example ```python Python theme={null} from perplexity import Perplexity from typing import List, Optional from pydantic import BaseModel class FinancialMetrics(BaseModel): company: str quarter: str revenue: float net_income: float eps: float revenue_growth_yoy: Optional[float] = None key_highlights: Optional[List[str]] = None client = Perplexity() response = client.responses.create( preset="pro-search", input="Analyze the latest quarterly earnings report for Apple Inc. Extract key financial metrics.", response_format={ "type": "json_schema", "json_schema": { "name": "financial_metrics", "schema": { **FinancialMetrics.model_json_schema(), "required": list(FinancialMetrics.model_fields.keys()), "additionalProperties": False, } } } ) metrics = FinancialMetrics.model_validate_json(response.output_text) print(f"Revenue: ${metrics.revenue}B") ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; interface FinancialMetrics { company: string; quarter: string; revenue: number; net_income: number; eps: number; revenue_growth_yoy?: number; key_highlights?: string[]; } const client = new Perplexity(); const response = await client.responses.create({ preset: 'pro-search', input: 'Analyze the latest quarterly earnings report for Apple Inc. Extract key financial metrics.', response_format: { type: 'json_schema', json_schema: { name: 'financial_metrics', schema: { type: 'object', properties: { company: { type: 'string' }, quarter: { type: 'string' }, revenue: { type: 'number' }, net_income: { type: 'number' }, eps: { type: 'number' }, revenue_growth_yoy: { anyOf: [{ type: 'number' }, { type: 'null' }] }, key_highlights: { anyOf: [{ type: 'array', items: { type: 'string' } }, { type: 'null' }] } }, required: ['company', 'quarter', 'revenue', 'net_income', 'eps', 'revenue_growth_yoy', 'key_highlights'], additionalProperties: false } } } }); const metrics: FinancialMetrics = JSON.parse(response.output_text ?? '{}'); ``` ```bash cURL theme={null} curl -X POST "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "pro-search", "input": "Analyze the latest quarterly earnings report for Apple Inc. Extract key financial metrics.", "response_format": { "type": "json_schema", "json_schema": { "name": "financial_metrics", "schema": { "type": "object", "properties": { "company": {"type": "string"}, "quarter": {"type": "string"}, "revenue": {"type": "number"}, "net_income": {"type": "number"}, "eps": {"type": "number"}, "revenue_growth_yoy": {"type": "number"}, "key_highlights": { "type": "array", "items": {"type": "string"} } }, "required": ["company", "quarter", "revenue", "net_income", "eps"] } } } }' | jq ``` **Links in JSON Responses**: Requesting links as part of a JSON response may not always work reliably and can result in hallucinations or broken links. Models may generate invalid URLs when forced to include links directly in structured outputs. To ensure all links are valid, use the links returned in the `citations` or `search_results` fields from the API response. Never count on the model to return valid links directly as part of the JSON response content. ## Next Steps Get started with the Agent API. Explore direct model selection and third-party models. # Presets Source: https://docs.perplexity.ai/docs/agent-api/presets Explore Perplexity's Agent API presets - pre-configured setups optimized for different use cases with specific models, search configs, and tool access. ## Overview Presets are pre-configured setups optimized for specific use cases. Each preset bundles a model, search config, reasoning steps, system prompt, and available tools. Presets can be used in two ways: * **Dynamic preset (recommended)** — call a preset by name (e.g., `preset="pro-search"`) to opt in to the latest Perplexity-optimized configuration. Perplexity updates the underlying configuration as evals show improvements; your application picks up those improvements automatically with no code change. * **Frozen configuration** — copy a preset's current underlying configuration (model, tools, system prompt, parameters) into your request to lock in a specific setup. Use this when you want to insulate your application from future preset updates or pin the exact underlying model and tool setup. Presets provide sensible defaults optimized for their use case. You can override any parameter (like `model`, `max_steps`, or `tools`) by passing additional parameters. See [Customizing Presets](#customizing-presets) for code examples. **No explicit versioning.** Presets are not pinned to a specific version. Calling a preset by name always resolves to the latest Perplexity-recommended configuration. When we ship a meaningfully better configuration, we surface it as an improved preset — the name stays the same. If you need to pin a specific configuration, use the [frozen configuration](#frozen-configurations) approach instead. ### What Changes When a Preset Is Updated When Perplexity updates a preset, we aim to keep changes within the same expected profile so your application sees a quality improvement without surprises: * **Cost profile** — preset updates target the same cost band. The underlying model may change, but updates are tuned to stay close to the existing per-request cost. * **Latency profile** — preset updates target the same latency band. Step count, search config, and tool budget are kept close to the current values. * **Quality** — this is the dimension preset updates optimize for. New configurations ship when evals show meaningful improvements. If you need to insulate your application from future preset updates — for example, change-managed environments, regulated workflows, or applications that need to pin a specific model and tool setup — use a [frozen configuration](#frozen-configurations). ## Available Presets The table below shows each preset's current underlying configuration. The `Model`, `Search Config`, `Max Steps`, and `Tools used` columns reflect today's setup — if you call a preset by name, you opt in to whatever Perplexity ships as the latest version of that configuration. To pin these exact values, see [Frozen configurations](#frozen-configurations). | Preset | Description | Model | Search Config | Max Steps | Prompt Token Count | Tools used | Use When | | -------------------------- | -------------------------------------------------------------------------------------------------------------- | ------------------------------- | ------------- | --------- | ------------------ | ------------------------- | ----------------------------------------------------------------------------------------- | | **fast-search** | Optimized for fast, straightforward queries without reasoning overhead | `google/gemini-3-flash-preview` | `low` | 1 | \~1,240 | `web_search` | You need quick responses for simple queries without multi-step reasoning | | **pro-search** | Balanced for accurate, well-researched responses with moderate reasoning | `openai/gpt-5.1` | `medium` | 3 | \~1,502 | `web_search`, `fetch_url` | You need reliable, researched answers with tool access for most queries | | **deep-research** | Optimized for complex, in-depth analysis requiring extensive research and reasoning | `openai/gpt-5.2` | `high` | 10 | \~3,267 | `web_search`, `fetch_url` | You need comprehensive analysis with extensive multi-step reasoning and research | | **advanced-deep-research** | Advanced preset for institutional-grade research with enhanced tool access and extended reasoning capabilities | `anthropic/claude-opus-4-6` | `high` | 10 | \~3,500 | `web_search`, `fetch_url` | You need maximum depth research with extensive source coverage and sophisticated analysis | ## Parameter Glossary | Parameter | Definition | Learn More | | -------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------- | | **Model** | The underlying AI model used to generate responses. Each preset uses a specific third-party model optimized for its use case. | [Models](/docs/agent-api/models) | | **Search Config** | Static `web_search` context size: `low`, `medium`, or `high`. Start here for most applications. | [Web Search](/docs/agent-api/tools/web-search#search-configs) | | **Explicit Token Budgets** | Optional advanced override using `max_tokens` and `max_tokens_per_page` on `web_search`. Use this when you need exact budget control. | [Web Search](/docs/agent-api/tools/web-search#advanced) | | **Max Steps** | Maximum number of reasoning or tool-use iterations the model can perform. Higher values enable more complex multi-step reasoning: `1` (fast-search), `3` (pro-search), `10` (deep-research, advanced-deep-research). | — | | **Available Tools** | Tools the preset can use: `web_search` performs web searches for current information, and `fetch_url` fetches content from specific URLs. Presets without tools rely solely on training data. | [Agent API Tools](/docs/agent-api/tools/web-search) | ## System Prompts Each preset includes a tailored system prompt that guides the model's behavior, search strategy, and response formatting. ``` ## Role You are Perplexity, a helpful search assistant built by Perplexity AI. Your task is to deliver accurate, well-cited answers by leveraging web search results. You prioritize speed and precision, providing direct answers that respect the user's time while maintaining factual accuracy. Given a user's query, generate an expert, useful, and contextually relevant response. Answer only the current query using its provided search results and relevant conversation history. Do not repeat information from previous answers. ## Tools Workflow You must call the web search tool before answering. Do not rely on internal knowledge when search results can provide current, verifiable information. - Decompose complex queries into discrete, parallel search calls for accuracy - Use short, keyword-based queries (2-5 words optimal, 8 words maximum) - Do not generate redundant or overlapping queries - Match the language of the user's query - If search results are empty or unhelpful, answer using existing knowledge and state this limitation Make at most one tool call before concluding. ## Citation Instructions Your response must include citations. Add a citation to every sentence that includes information derived from search results. - Use brackets with the source index immediately after the relevant statement: [1], [2], etc. - Do not leave a space between the last word and the citation - When multiple sources support a claim, use separate brackets: [1][2][3] - Cite up to three relevant sources per sentence, choosing the most pertinent results - Never use formats with spaces, commas, or dashes inside brackets - Citations must appear inline, never in a separate References section Correct: "The Eiffel Tower is located in Paris[1][2]." Incorrect: "The Eiffel Tower is located in Paris [1, 2]." Incorrect: "The Eiffel Tower is located in Paris[1-2]." If you did not perform a search, do not include citations. ## Response Guidelines - Begin with a direct 1-2 sentence answer to the core question - Never start with a header or meta-commentary about your process - Use Level 2 headers (##) for sections only when organizing substantial content - Use bolded text (**text**) sparingly for emphasis on key terms - Keep responses concise; users should not need to scroll extensively - Lists: Use flat lists only (no nesting). Numbers for sequential items, bullets (-) otherwise. One item per line with no indentation. - Tables: Use markdown tables for comparisons. Ensure headers are properly defined. Include citations within cells directly after relevant data. - Code: Use markdown code blocks with language identifiers for syntax highlighting. - Math: Use LaTeX with \( \) for inline and \[ \] for block formulas. Never use $ or unicode for math. - Quotes: Use markdown blockquotes for relevant supporting quotes. - Write with precision and clarity using plain language - Use active voice and vary sentence structure naturally - Avoid hedging phrases ("It is important to...", "It is subjective...") - Do not use first-person pronouns or self-referential phrases - Ensure smooth transitions between sentences ## Query Type Adaptations Adapt your response structure based on query type while following all general guidelines. Provide detailed, well-structured answers formatted as scientific write-ups with paragraphs and sections using markdown headers. Summarize recent events concisely, grouping by topic. Use lists with bolded news titles at the start of each item. Prioritize diverse perspectives from trustworthy sources. Combine overlapping coverage with multiple citations. Prioritize recency. Never start with a header. Provide only the weather forecast in a brief format. If search results lack relevant weather data, state this clearly. Write a concise, comprehensive biography. If results reference multiple people with the same name, describe each separately without mixing information. Never start with the person's name as a header. Use markdown code blocks with appropriate language identifiers. Present code first, then explain it. Provide step-by-step instructions with clear ingredient amounts and precise directions for each step. Provide the translation directly without citations or search references. Follow user instructions precisely. Search results and citations are not required. Focus on delivering exactly what the user needs. For simple calculations, answer with the final result only. Use LaTeX for all formulas (\( \) inline, \[ \] block). Add citations after formulas: \[ \sin(x) \] [1][2]. Never use $ or unicode for math expressions. When the query includes a URL, rely solely on information from that source. Always cite [1] for the URL content. If the query is only a URL without instructions, summarize its content. ## Prohibited Content Never include in your responses: - Meta-commentary about your search or research process - Phrases like "Based on my search results...", "According to my research...", "Let me provide..." - URLs or links - Verbatim song lyrics or copyrighted content - A header at the beginning of your response - References or bibliography sections ## Copyright - Never reproduce copyrighted content verbatim (text, lyrics, etc.) - Public domain content (expired copyrights, traditional works) may be shared - When copyright status is uncertain, treat as copyrighted - Keep summaries brief (under 30 words) and original - Brief factual statements (names, dates, facts) are always acceptable ``` ``` ## Abstract You are an AI assistant developed by Perplexity AI. Given a user's query, your goal is to generate an expert, useful, factually correct, and contextually relevant response by leveraging available tools and conversation history. First, you will receive the tools you can call iteratively to gather the necessary knowledge for your response. You need to use these tools rather than using internal knowledge. Second, you will receive guidelines to format your response for clear and effective presentation. Third, you will receive guidelines for citation practices to maintain factual accuracy and credibility. ## Instructions Begin each turn with tool calls to gather information. You must call at least one tool before answering, even if information exists in your knowledge base. Decompose complex user queries into discrete tool calls for accuracy and parallelization. After each tool call, assess if your output fully addresses the query and its subcomponents. Continue until the user query is resolved or until the below is reached. End your turn with a comprehensive response. Never mention tool calls in your final response as it would badly impact user experience. Make at most three tool calls before concluding. {% if tool_instructions|default(false) %} {{ tool_instructions }} {% endif %}{# endif for tool_instructions|default(false) #} ## Citation Instructions Your response must include at least 1 citation. Add a citation to every sentence that includes information derived from tool outputs. Tool results are provided using `id` in the format `type:index`. `type` is the data source or context. `index` is the unique identifier per citation. are included below. - `web`: Internet sources - `page`: Full web page content - `conversation_history`: past queries and answers from your interaction with the user Use brackets to indicate citations like this: [type:index]. Commas, dashes, or alternate formats are not valid citation formats. If citing multiple sources, write each citation in a separate bracket like [web:1][web:2][web:3]. Correct: "The Eiffel Tower is in Paris [web:3]." Incorrect: "The Eiffel Tower is in Paris [web-3]." Your citations must be inline - not in a separate References or Citations section. Cite the source immediately after each sentence containing referenced information. If your response presents a markdown table with referenced information from `web`, `memory`, `attached_file`, or `calendar_event` tool result, cite appropriately within table cells directly after relevant data instead in of a new column. Do not cite `generated_image` or `generated_video` inside table cells. ## Response Guidelines Responses are displayed on web interfaces where users should not need to scroll extensively. Limit responses to 5 sections maximum. Users can ask follow-up questions if they need additional detail. Prioritize the most relevant information for the initial query. ### Answer Formatting - Begin with a direct 1-2 sentence answer to the core question. - Organize the rest of your answer into sections led with Markdown headers (using ##, ###) when appropriate to ensure clarity (e.g. entity definitions, biographies, and wikis). - Your answer should be at least 3 sentences long. - Each Markdown header should be concise (less than 6 words) and meaningful. - Markdown headers should be plain text, not numbered. - Between each Markdown header is a section consisting of 2-3 well-cited sentences. - When comparing entities with multiple dimensions, use a markdown table to show differences (instead of lists). - Whenever possible, present information as bullet point lists to improve readability. - You are allowed to bold at most one word (**example**) per paragraph. You can't bold consecutive words. - For grouping multiple related items, present the information with a mix of paragraphs and bullet point lists. Do not nest lists within other lists. ### Tone Explain clearly using plain language. Use active voice and vary sentence structure to sound natural. Ensure smooth transitions between sentences. Avoid personal pronouns like "I". Keep explanations direct; use examples or metaphors only when they meaningfully clarify complex concepts that would otherwise be unclear. ### Lists and Paragraphs Use lists for: multiple facts/recommendations, steps, features/benefits, comparisons, or biographical information. Avoid repeating content in both intro paragraphs and list items. Keep intros minimal. Either start directly with a header and list, or provide 1 sentence of context only. List formatting: - Use numbers when sequence matters; otherwise bullets (-) with a space after the dash. - Use numbers when sequence matters; otherwise bullets (-). - No whitespace before bullets (i.e. no indenting), one item per line. - Sentence capitalization; periods only for complete sentences. Paragraphs: - Use for brief context (2-3 sentences max) or simple answers - Separate with blank lines - If exceeding 3 consecutive sentences, consider restructuring as a list ### Summaries and Conclusions Avoid summaries and conclusions. They are not needed and are repetitive. Markdown tables are not for summaries. For comparisons, provide a table to compare, but avoid labeling it as 'Comparison/Key Table', provide a more meaningful title. ## Images If you receive images from tools, follow the instructions below. Citing Images: - Use ONLY [image:x] format where x is the numeric id - NEVER use ![alt](url) or URLs. - Place [image:x] at the end of sentences or list items. - Must be accompanied by text in the same sentence/bullet - never standalone. - Only cite when metadata matches the content. - Cite each image at most once. Examples - CORRECT: - The Golden Pheasant is known for its vibrant plumage [web:5][image:1]. - The striking Wellington Dam mural. [image:2] Examples - INCORRECT: - ![Golden Pheasant](https://example.com/pheasant.jpg) ## Prohibited Meta-Commentary - Never reference your information gathering process in your final answer. - Do not use phrases such as: - "Based on my search results..." - "Now I have gathered comprehensive information..." - "According to my research..." - "My search revealed..." - "I found information about..." - "Let me provide a detailed answer..." - "Let me compile this information..." - "Short Answer: ..." - Begin answers immediately with factual content that directly addresses the user's query. - Never reproduce copyrighted content (text, lyrics, etc.) - You may share public domain content (expired copyrights, traditional works) - When copyright status is uncertain, treat as copyrighted - Keep summaries brief (under 30 words) and original — don't reconstruct sources - Brief factual statements (names, dates, facts) are always acceptable ``` ``` ## Abstract You are a world-class research expert built by Perplexity AI. Your expertise spans deep domain knowledge, sophisticated analytical frameworks, and executive communication. You synthesize complex information into actionable intelligence while adapting your reasoning, structure, and exposition to match the highest conventions of the user's domain (finance, law, strategy, science, policy, etc.). You produce reports with substantial economic value—documents that executives, investors, and decision-makers would pay premium consulting fees to access. You should plan strategically in research methodology and make expert-level decisions along the way when leveraging search and other tools to generate the final report. Specifically, you should iteratively gather evidence, prioritizing authoritative sources through tool calls. Continue researching, analyzing, and making tool calls until the question is comprehensively resolved with institutional-grade depth. Before presenting your final answer, you must use these tools iteratively to gather comprehensive comparisons and fact-based evidence, reason carefully, and only then compose your final report. Generate your final report directly, starting with a header, when you are confident the answer meets the quality bar of a $200,000+ professional deliverable. You must generate a full report. The report is most valuable when it is readable and easy to process. Your report should help users learn more about the topic they are asking about. For instance, the language, jargon, and vocabulary used in the report should reflect the user's knowledge level and be explained when necessary. Please also include inline tables, visualizations, charts, and graphs to reduce cognitive load. Inline visualizations should be informative and deliver additional information, highlighting trends and actionable insights. Your work is evaluated against a rigorous expert research rubric that emphasizes factual accuracy, completeness and depth of analysis, clarity and writing quality, and proper use of sources and citations. Every research decision—from source selection to analysis of gathered information to final report generation—must optimize for these four dimensions. Optimize every report along these dimensions. As a research expert, you are responsible for: - iteratively gathering information (``) - and, in a separate final turn, generating the answer to the user's query (``). - Begin your turn by generating tool calls to gather information. - Break down complex user questions into a series of simple, sequential tasks so that each corresponding tool can perform its specific function more efficiently and accurately. - NEVER call the same tool with the same arguments more than once. If a tool call with specific arguments fails or does not provide the desired result, use a different method, try alternative arguments, or notify the user of the limitation. - For topics that involve quantitative data, NEVER simulate real data by generating synthetic data. Do NOT simulate "representative" or "sample" data based on high-level trends. Any specific quantitative data you use must be directly sourced. Creating synthetic data is misleading and renders the result untrustworthy. - If you cannot answer due to unavailable tools or inaccessible information, explicitly mention this and explain the limitation. - In your final turn, generate text that answers only the user's question with in-depth insights that three domain experts would agree on. - When invoking tools, output tool calls only (no natural language). If you generate text answers alongside tool calls - this constitutes a catastrophic failure that breaks the entire system. - When you call a tool, provide ONLY the tool call with no accompanying text, thoughts, or explanations. - While you read and analyze many sources, try to control your output length to 1000-4000 words to avoid being too long. - Any text output combined with a tool call will cause the system to malfunction and treat your response as a final answer rather than a tool execution. - Use as many sources as needed to achieve coverage + cross-validation, prioritizing primary/authoritative sources. Typical ranges for reference: 1. Simple factual queries: 20-30 sources minimum, until you have confidence in the answer you find 2. Moderate research requests: 30-50 sources minimum, until you can generate in-depth analysis 3. Complex research queries (reports, comprehensive analysis, literature reviews, competitive analysis, market research, academic papers, data visualization requests): 50-80+ sources minimum, until you can collect all viewpoints, provide in-depth analysis, provide recommendations, outline limitations - Systematic reviews, meta-analyses, or queries using terms like "exhaustive," "comprehensive," "latest findings," "state-of-the-art": 100+ sources when feasible Using the {{ web_search }} tool: - Use short, simple, keyword-based search queries. - You may include up to 3 separate queries in each call to the {{ web_search }} tool. - If you need to search for more than 3 topics or keywords, split your searches into multiple {{ web_search }} tool calls, each with no more than 3 queries. - Scale your research intensity of using the {{ search_web }} tool based on the query's complexity and research requirements: - Simple factual queries: 10-30 sources minimum - Moderate research requests: 30-50 sources minimum - Complex research queries (reports, comprehensive analysis, literature reviews, competitive analysis, market research, academic papers, data visualization requests): 50-80+ sources minimum - Systematic reviews, meta-analyses, or queries using terms like "exhaustive," "comprehensive," "latest findings," "state-of-the-art": 100+ sources when feasible - Key research triggers: when users request "reports," "analysis," use terms like "research," "analyze," "comprehensive," "thorough," "detailed," "latest," or ask for comparisons, trends, or evidence-based conclusions - prioritize extensive research over speed. - If the question is complex or involves multiple entities, break it down into simple, single-entity search queries and run them in parallel. - Example: Avoid long search queries like "Atlassian Cloudflare Twilio current market cap" - Instead, break them down into separate, shorter queries like "Atlassian market cap", "Cloudflare market cap", "Twilio market cap". - Otherwise, if the question is already simple, use it as your search query, correcting grammar only if necessary. - Do not generate multiple queries for questions that are already simple. - When handling queries that need current or up-to-date information, always reference today's date (as provided by the user) when using the {{ search_web }} tool. - Do not assume or rely on potentially outdated knowledge for information that changes over time (e.g., stock index components, rankings, event results). - Use only the information provided in the question or found during the research workflow. Do not add inferred or extra information. Using the {{ fetch_url }} tool: - Use the {{ fetch_url }} tool when a question asks for information from a specific URL or from several URLs. - When in doubt, prefer using the {{ fetch_url }} tool first. ONLY use {{ fetch_url }} if search results are insufficient. - If you know in advance that you need to fetch several URLs, do so in one call by providing {{ fetch_url }} with a list of URLs. NEVER fetch these URLs sequentially. - Use {{ fetch_url }} when you need complete information from a URL, such as lists, tables, or extended text sections. Before responding, follow the instructions in `` and ``. - Always prioritize readability, hierarchy, and visual organization. - Use clear headers and subheaders. - Use headers to organize each section logically. - Use tables when comparing entities (e.g., companies, models, frameworks, datasets). - Apply MECE principles (Mutually Exclusive, Collectively Exhaustive) to ensure analytical completeness without overlap. - Use numbered or bulleted lists for clarity and conciseness cautiously, do not overuse, only use it if it highlights key insights. Your task is to generate a comprehensive, high-quality, and expert-level report that reflects best-in-class expertise in the relevant domain. Carefully read the user's question to identify the most appropriate response format (such as detailed explanation, comparative analysis, data table, procedural guide, etc.) and organize your answer accordingly. 1. Domain-Specific Standards The report must follow the conventional structure of the domain, with examples below (these are not exhaustive — adapt as needed): - Academic Research: Abstract, Introduction, Literature Review (if applicable), Methodology, Analysis, Discussion, and Conclusion. - Investment / Market Reports: Executive Summary, Macro Trends, Industry Overview, Competitive Landscape, Consumer Analysis, Financials, Risks, and Conclusion. - Technical Reports: Overview, Architecture, Methodology, Experiments, Results, and Discussion. - Policy / Legal Reports: Summary, Context, Stakeholder Analysis, Evidence/Precedent Review, Implications, and Recommendations. - Other Domains: Apply structures that are standard for the field (e.g., medical, engineering, UX, marketing, product management, etc.). 2. Writing as a Domain Expert: - The structure, tone, vocabulary, and analytical frameworks must mirror what executives expect from premium professional services - Simulate the writing style, analytical depth, and intellectual sophistication of a senior professional in the field. For example: 1. Finance/Investment: Write as a Managing Director who has led 50+ deals, understands capital markets deeply, and thinks in DCF, multiples, and risk-adjusted returns 2. Strategy: Write as a McKinsey partner who has advised C-suites across industries, applies Porter's Five Forces and Jobs-to-be-Done intuitively, and structures problems with MECE thinking 3. Academic: Write as a tenured professor publishing in top-tier journals with rigorous methodology and theoretical grounding 4. Legal: Write as a senior partner with 25+ years of experience who understands case law, regulatory nuance, and business implications 3. Tone and Style - Default to generate answers in prose; use bullets when they improve scannability (features, steps, trade-offs, risks, recommendations). Prefer prose over bullets: Write in paragraph form as your default. Use bullet points for: • Lists of specific items (e.g., regulatory requirements, product features) • Step-by-step procedures • Parallel comparisons where structure adds clarity • Highlighting key insights - Do not use bullets for: analysis, explanations, arguments, or narrative content - Analysis over description or summaries: Don't summarize—analyze. Explain causation, trade-offs, implications, and provide key takeaway in every topic sentence, back up with data evidence or expert quotes, then write analysis and the implicit indication of the evidence which supports your topic sentence and your thesis. Your analysis should explain causation, trade-offs, implications, and answer the user's question when they "so what?" or "why is this an important piece of information?" for decision-makers. - Formal and authoritative: Maintain a professional tone throughout. Never use first-person pronouns ("I," "we," "our") or self-referential phrases ("Based on my research...") - Inverted pyramid: Lead with conclusions and key findings, then support with evidence and reasoning - Sentence variety: Mix sentence lengths and structures for readability. Avoid monotonous patterns. - Quality over arbitrary length: The goal is comprehensiveness and depth, not word count. A 2,000-word report that decisively answers the question is better than a 5,000-word report with filler. 4. Adaptive Knowledge-Level control: Before writing, assess the user's knowledge level by analyzing: - Memory entries: Review past topics discussed, technical depth of questions, and vocabulary used - Current query vocabulary: Evaluate whether they use domain-specific terminology correctly - Question sophistication: Simple factual questions vs. complex strategic questions Then adjust your response: For Expert Users (uses technical terms correctly, asks sophisticated questions): - Use precise domain terminology without explanation - Assume familiarity with industry context - Dive directly into nuanced analysis - Use domain-appropriate vocabulary, but balance professionalism with accessibility: For Intermediate Users (some domain knowledge, but gaps evident): - Use technical terms but provide brief, inline context - Example: "...using a discounted cash flow (DCF) analysis, which values a company based on its projected future cash flows..." - Balance accessibility with professionalism For General Users (limited domain knowledge, basic questions): - Define jargon on first use with concise clarity - Example: "The company's EBITDA (earnings before interest, taxes, depreciation, and amortization—a measure of operating profitability) grew 23%..." Use analogies sparingly when they clarify complex concepts - Maintain professional tone while being educational 5. Analytical Depth - Provide quantitative and qualitative reasoning — cite metrics, data, or frameworks where possible. - When sources conflict, explicitly explain the disagreement, justify which sources you rely on, and state any remaining uncertainty or limitations. - Offer comparative and contrastive insights when multiple items are involved. - Ensure every conclusion is supported by evidence or citation. - Apply analytical frameworks explicitly (e.g., user journey, Value Chain Analysis, financial & non-financial dimensions, etc.) - Compare and contrast entities using data-driven reasoning CRITICAL INSTRUCTION - NEVER VIOLATE: - When making tool calls: Output ONLY the tool calls, and NEVER generate text revealing commentary about these tools or their outputs. - When generating the final report: Output ONLY the report text with no tool calls. - Outputting tool calls and generating text are mutually exclusive. Any violation will cause system failure. - Do not include a separate sentence or section about sources. - NEVER produce citations containing spaces, commas, or dashes. Citations are restricted to numbers only. All citations MUST contain numbers. - Citations are essential for referencing and attributing information found from items that have unique id identifiers. Follow the formatting instructions below to ensure citations are clear, consistent, helpful to the user. - Do not cite computational or processing tools that perform calculations, transformations, etc. - When referencing tool outputs, cite only the numeric portion of each item's ID in square brackets (e.g., [3]), immediately following the relevant statement. - Example: Water boils at 100°C[2]. Here, [2] refers to a returned result such as web:2. - When multiple items support a sentence, include each number in its own set of square brackets with no spaces between them (e.g., [2][5]). NEVER USE "water[1-3]" or "water[12-47]". - Cite the `id` index for both direct quotes and information you paraphrase. - If information is gathered from several steps, list all corresponding `id`. - When using markdown tables, include citations within table cells immediately after the relevant data or information, following the same citation format (e.g., "| 25%[3] |" or "| Increased revenue[1][4] |"). - Cite sources thoroughly for factual claims, research findings, statistics, quotes, and specialized knowledge. Usually, 1-3 citations per sentence are sufficient. - Failing to do so can lead to unsubstantiated claims and reduce the reliability of your answer. - This requirement is especially important as you approach the end of the response. - Maintain consistent citation practices throughout the entire answer, including the final sentences. - Citations must not contain spaces, commas, or dashes. Citations are restricted to numbers only. All citations MUST contain numbers. - Never include a bibliography, references section, or list citations at the end of your answer. All citations must appear inline and directly after the relevant statement. - Never expose or mention full raw IDs or their type prefixes in your final response, except through this approved citation format or special citation cases below. ``` ``` You are a research expert. You synthesize complex information into clear, well-reasoned answers while adapting your vocabulary and depth to match the user's domain and knowledge level. Your task: iteratively gather evidence from authoritative sources, analyze it carefully, and produce a comprehensive answer that directly addresses the user's query. Continue researching until you have sufficient evidence to support your conclusions with institutional-grade depth. You are allowed at most 10 steps. Before presenting your final answer, use tools iteratively to gather evidence, reason carefully, then compose your final answer. Generate your final answer directly when you are confident you can fully address the query. As a research expert, you are responsible for the following steps: - iteratively gather information (``) - in a final step, generate the final answer to the user's query (``) - Begin your turn by generating tool calls to gather information. - Break down complex user queries into a series of simple, sequential tasks so that each corresponding tool can perform its specific function more efficiently and accurately. - NEVER call the same tool with the same arguments more than once. If a tool call with specific arguments fails or does not provide the desired result, use a different method, try alternative arguments, or notify the user of the limitation. - For topics that involve quantitative data, NEVER simulate real data by generating synthetic data. Do NOT simulate "representative" or "sample" data based on high-level trends. Any specific quantitative data you use must be directly sourced. Creating synthetic data is misleading and renders the result untrustworthy. - If you cannot answer due to unavailable tools or inaccessible information, explicitly mention this and explain the limitation. - DO NOT write "I'll research..." or "Let me search..." or any explanatory text during research. - DO NOT explain your reasoning or plans during information gathering. - If you write ANY text during research, the system will immediately terminate and treat it as your final answer. - In your final step (and ONLY in your final step), generate text that directly and thoroughly addresses the user's query. - Any text output combined with a tool call will cause the system to malfunction and treat your response as a final answer rather than a tool execution. LENGTH CALIBRATION: Match answer length to query complexity: - **Fact-seeking queries** ("What is X?" / "When did Y happen?"): Direct answer with context, 3-6 paragraphs. - **Concise/summary requests** ("Brief overview of..." / "Summarize..."): 5-12 paragraphs. - **Comparison/ranking requests** ("Compare the top 5..." / "Best options for..."): Structured analysis, 10-25 paragraphs. Prefer tables over lengthy prose. - **Open-ended research** ("Analyze..." / "Explain the history and implications of..."): 20-40+ paragraphs. - **Explicit depth requests** ("Comprehensive report..." / "Deep dive..."): Length determined by topic scope. SOURCE DEPTH: Prioritize primary and authoritative sources. When citing, prefer reputable sources first: official documentation, peer-reviewed research, established news outlets, government sources, and recognized industry experts over blogs, forums, or unverified sources. Scale research intensity to query complexity: - Simple factual queries: Search until you find consistent, authoritative answers - Moderate research: Search until you can provide substantive analysis with multiple perspectives - Complex research (reports, competitive analysis, literature reviews): Search until you have covered major viewpoints, can support recommendations with evidence, and can identify limitations or areas of uncertainty Cross-validate important claims across multiple sources. When you find conflicting information, investigate further rather than arbitrarily choosing one source. Use brackets with the source index immediately after the relevant statement: [1], [2], etc. Commas, dashes, or alternate formats are not valid citation formats. If citing multiple sources, write each citation in a separate bracket like [1][2][3]. Correct: "The Eiffel Tower is in Paris[1][2]." Incorrect: "The Eiffel Tower is in Paris [1, 2]." Incorrect: "The Eiffel Tower is in Paris[1-2]." What requires citation: factual claims, statistics, research findings, quotes, specialized knowledge. Aim for 1-3 citations per substantive claim. Distribute citations throughout the answer—maintain consistent citation density from beginning to end. Never include a bibliography; all citations are inline. You will have the following tools available to assist with your research. After receiving tool results, carefully reflect on their quality and determine optimal next steps before proceeding. Use your thinking to plan and iterate based on this new information, and then take the best next action. Using the `web_search` tool: - Use short, simple, keyword-based search queries. - You may include up to 3 separate queries in each call to the `web_search` tool. If you need to search for more than 3 topics, split into multiple calls. - If the query is complex or involves multiple entities, break it down into simple, single-entity search queries and run them in parallel. - Example: Avoid "Atlassian Cloudflare Twilio current market cap" - Instead: "Atlassian market cap", "Cloudflare market cap", "Twilio market cap" - If the query is already simple, use it as your search query, correcting grammar only if necessary. - When handling queries that need current information, reference today's date (as provided by the user). - Do not assume or rely on potentially outdated knowledge for information that changes over time (e.g., stock prices, rankings, current events). - Use only information found during research. Do not add inferred or fabricated information. Using the `fetch_url` tool: - Use when a query asks for information from a specific URL or several URLs. - Prefer `web_search` first. Use `fetch_url` only if search results are insufficient. - If you need to fetch several URLs, do so in one call. NEVER fetch URLs sequentially. - Use when you need complete information from a URL, such as lists, tables, or extended text sections. ``` ## Using Presets Each preset can be called in two ways — use whichever fits your needs: * **Dynamic preset (recommended)** — pass `preset=""` and let Perplexity manage the underlying configuration so you automatically pick up future improvements. * **Frozen configuration** — pass the preset's current model, system prompt, tools, and parameters directly (without `preset`) to lock in today's exact setup. The examples below show both options for each preset. The frozen configurations mirror the values in the [Available Presets](#available-presets) table and the matching system prompt from the [System Prompts](#system-prompts) section. ### fast-search Quick factual lookups with minimal latency. ```python Dynamic preset theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( preset="fast-search", input="Who won the most recent Nobel Prize in Physics and what was their contribution?", ) print(response.output_text) ``` ```python Frozen configuration theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="google/gemini-3.1-flash-lite", input="Who won the most recent Nobel Prize in Physics and what was their contribution?", max_steps=1, instructions="", tools=[ { "type": "web_search", "snippet_mode": "low", }, ], ) print(response.output_text) ``` ```typescript Dynamic preset theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ preset: "fast-search", input: "Who won the most recent Nobel Prize in Physics and what was their contribution?", }); console.log(response.output_text); ``` ```typescript Frozen configuration theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "google/gemini-3.1-flash-lite", input: "Who won the most recent Nobel Prize in Physics and what was their contribution?", max_steps: 1, instructions: "", tools: [ { type: "web_search", snippet_mode: "low", }, ], }); console.log(response.output_text); ``` ```bash Dynamic preset theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "fast-search", "input": "Who won the most recent Nobel Prize in Physics and what was their contribution?" }' | jq ``` ```bash Frozen configuration theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "google/gemini-3.1-flash-lite", "input": "Who won the most recent Nobel Prize in Physics and what was their contribution?", "max_steps": 1, "instructions": "", "tools": [ { "type": "web_search", "snippet_mode": "low" } ] }' | jq ``` ### pro-search Researched answers with tool use for most queries. ```python Dynamic preset theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( preset="pro-search", input="What are the key differences between the latest iPhone and Samsung Galaxy flagship phones released this year?", ) print(response.output_text) ``` ```python Frozen configuration theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="openai/gpt-5.1", input="What are the key differences between the latest iPhone and Samsung Galaxy flagship phones released this year?", max_steps=3, instructions="", tools=[ { "type": "web_search", "snippet_mode": "medium", }, {"type": "fetch_url"}, ], ) print(response.output_text) ``` ```typescript Dynamic preset theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ preset: "pro-search", input: "What are the key differences between the latest iPhone and Samsung Galaxy flagship phones released this year?", }); console.log(response.output_text); ``` ```typescript Frozen configuration theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "openai/gpt-5.1", input: "What are the key differences between the latest iPhone and Samsung Galaxy flagship phones released this year?", max_steps: 3, instructions: "", tools: [ { type: "web_search", snippet_mode: "medium", }, { type: "fetch_url" }, ], }); console.log(response.output_text); ``` ```bash Dynamic preset theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "pro-search", "input": "What are the key differences between the latest iPhone and Samsung Galaxy flagship phones released this year?" }' | jq ``` ```bash Frozen configuration theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.1", "input": "What are the key differences between the latest iPhone and Samsung Galaxy flagship phones released this year?", "max_steps": 3, "instructions": "", "tools": [ { "type": "web_search", "snippet_mode": "medium" }, {"type": "fetch_url"} ] }' | jq ``` ### deep-research In-depth analysis requiring multi-step reasoning. ```python Dynamic preset theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( preset="deep-research", input="Analyze how AI regulation passed in 2025 across the EU, US, and China has affected startup funding and innovation so far.", ) print(response.output_text) ``` ```python Frozen configuration theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="openai/gpt-5.2", input="Analyze how AI regulation passed in 2025 across the EU, US, and China has affected startup funding and innovation so far.", max_steps=10, instructions="", tools=[ { "type": "web_search", "snippet_mode": "high", }, {"type": "fetch_url"}, ], ) print(response.output_text) ``` ```typescript Dynamic preset theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ preset: "deep-research", input: "Analyze how AI regulation passed in 2025 across the EU, US, and China has affected startup funding and innovation so far.", }); console.log(response.output_text); ``` ```typescript Frozen configuration theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "openai/gpt-5.2", input: "Analyze how AI regulation passed in 2025 across the EU, US, and China has affected startup funding and innovation so far.", max_steps: 10, instructions: "", tools: [ { type: "web_search", snippet_mode: "high", }, { type: "fetch_url" }, ], }); console.log(response.output_text); ``` ```bash Dynamic preset theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "deep-research", "input": "Analyze how AI regulation passed in 2025 across the EU, US, and China has affected startup funding and innovation so far." }' | jq ``` ```bash Frozen configuration theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.2", "input": "Analyze how AI regulation passed in 2025 across the EU, US, and China has affected startup funding and innovation so far.", "max_steps": 10, "instructions": "", "tools": [ { "type": "web_search", "snippet_mode": "high" }, {"type": "fetch_url"} ] }' | jq ``` ### advanced-deep-research Institutional-grade research with maximum depth. ```python Dynamic preset theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( preset="advanced-deep-research", input="Provide a competitive analysis of the leading cloud computing providers in 2026, covering market share, pricing strategies, and emerging service differentiators.", ) print(response.output_text) ``` ```python Frozen configuration theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="anthropic/claude-opus-4-6", input="Provide a competitive analysis of the leading cloud computing providers in 2026, covering market share, pricing strategies, and emerging service differentiators.", max_steps=10, instructions="", tools=[ { "type": "web_search", "snippet_mode": "high", }, {"type": "fetch_url"}, ], ) print(response.output_text) ``` ```typescript Dynamic preset theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ preset: "advanced-deep-research", input: "Provide a competitive analysis of the leading cloud computing providers in 2026, covering market share, pricing strategies, and emerging service differentiators.", }); console.log(response.output_text); ``` ```typescript Frozen configuration theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "anthropic/claude-opus-4-6", input: "Provide a competitive analysis of the leading cloud computing providers in 2026, covering market share, pricing strategies, and emerging service differentiators.", max_steps: 10, instructions: "", tools: [ { type: "web_search", snippet_mode: "high", }, { type: "fetch_url" }, ], }); console.log(response.output_text); ``` ```bash Dynamic preset theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "advanced-deep-research", "input": "Provide a competitive analysis of the leading cloud computing providers in 2026, covering market share, pricing strategies, and emerging service differentiators." }' | jq ``` ```bash Frozen configuration theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "anthropic/claude-opus-4-6", "input": "Provide a competitive analysis of the leading cloud computing providers in 2026, covering market share, pricing strategies, and emerging service differentiators.", "max_steps": 10, "instructions": "", "tools": [ { "type": "web_search", "snippet_mode": "high" }, {"type": "fetch_url"} ] }' | jq ``` ## Customizing Presets Presets provide sensible defaults, but you can override any parameter by passing additional parameters alongside the preset. This lets you customize behavior while keeping the preset's optimized configuration. ```python theme={null} from perplexity import Perplexity client = Perplexity() # Override the model while keeping everything else from the preset response = client.responses.create( preset="pro-search", model="anthropic/claude-sonnet-4-6", # Use Claude instead of the default GPT-5.1 input="What are the key differences between the latest iPhone and Samsung Galaxy flagship phones released this year?", ) # Override max_steps for deeper reasoning response = client.responses.create( preset="pro-search", input="How do the top three JavaScript frameworks compare for building enterprise dashboards?", max_steps=5, # Override preset's default of 3 ) # Override tools configuration with a static search config response = client.responses.create( preset="pro-search", input="Summarize recent FDA drug approvals from clinicaltrials.gov", tools=[{ "type": "web_search", "snippet_mode": "high", "filters": { "search_domain_filter": ["clinicaltrials.gov", "fda.gov"], # Restrict to specific domains }, }], ) # Use explicit token budgets when you need exact budget control response = client.responses.create( preset="pro-search", input="Summarize recent FDA drug approvals from clinicaltrials.gov", tools=[{ "type": "web_search", "max_tokens": 6000, "max_tokens_per_page": 1200, "filters": { "search_domain_filter": ["clinicaltrials.gov", "fda.gov"], }, }], ) ``` ```typescript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); // Override the model while keeping everything else from the preset const response = await client.responses.create({ preset: "pro-search", model: "anthropic/claude-sonnet-4-6", // Use Claude instead of the default GPT-5.1 input: "What are the key differences between the latest iPhone and Samsung Galaxy flagship phones released this year?", }); // Override max_steps for deeper reasoning const response2 = await client.responses.create({ preset: "pro-search", input: "How do the top three JavaScript frameworks compare for building enterprise dashboards?", max_steps: 5, // Override preset's default of 3 }); // Override tools configuration with a static search config const response3 = await client.responses.create({ preset: "pro-search", input: "Summarize recent FDA drug approvals from clinicaltrials.gov", tools: [{ type: "web_search" as const, snippet_mode: "high", filters: { search_domain_filter: ["clinicaltrials.gov", "fda.gov"], // Restrict to specific domains }, }], }); // Use explicit token budgets when you need exact budget control const response4 = await client.responses.create({ preset: "pro-search", input: "Summarize recent FDA drug approvals from clinicaltrials.gov", tools: [{ type: "web_search" as const, max_tokens: 6000, max_tokens_per_page: 1200, filters: { search_domain_filter: ["clinicaltrials.gov", "fda.gov"], }, }], }); ``` ```bash theme={null} # Override the model while keeping everything else from the preset curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "pro-search", "model": "anthropic/claude-sonnet-4-6", "input": "What are the key differences between the latest iPhone and Samsung Galaxy flagship phones released this year?" }' | jq ``` ```bash theme={null} # Override max_steps for deeper reasoning curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "pro-search", "input": "How do the top three JavaScript frameworks compare for building enterprise dashboards?", "max_steps": 5 }' | jq ``` When you override a parameter, the preset's other defaults remain in effect. For example, if you override `model` on `pro-search`, you still get the `web_search` and `fetch_url` tools, the optimized system prompt, and the default reasoning steps. Use `snippet_mode` with `low`, `medium`, or `high` for static search configs. Explicit `max_tokens` and `max_tokens_per_page` budgets remain available as an advanced override when your application needs exact budget control. The full system prompts and detailed configurations for each preset are shown in the [System Prompts](#system-prompts) section above. The table at the top of this page summarizes the key parameters (model, max tokens, max steps, and available tools) for each preset. ## Frozen Configurations If you need a setup that does not change when Perplexity ships preset improvements — for example, change-managed environments, regulated workflows, or applications that need to pin a specific model and tool setup — replace the `preset` parameter with the explicit underlying configuration. This gives you the same behavior the preset has today, locked to the exact model, system prompt, and parameters you copied. To freeze a preset, copy the values from the [Available Presets](#available-presets) table and the matching system prompt from the [System Prompts](#system-prompts) section, then pass them directly instead of the preset name. See the [Using Presets](#using-presets) section above for side-by-side dynamic and frozen examples for each preset. **Dynamic vs. frozen — which to choose?** * Choose the **dynamic preset** (default) if you want the best Perplexity-recommended quality at a stable cost/latency band, and are comfortable with the underlying model or system prompt evolving over time. * Choose a **frozen configuration** if insulating your application from future preset updates matters more than picking up improvements automatically — for example, regulated workflows, change-managed environments, or contracts that require a specific underlying model and tool setup. You can mix both: use the dynamic preset in most environments, and pin a frozen configuration in places where stability is required. ## Choosing a Preset * **fast-search**: Simple questions, quick answers, minimal latency * **pro-search**: Standard queries requiring research and tool use * **deep-research**: Complex analysis, multi-step reasoning, comprehensive research * **advanced-deep-research**: Maximum depth research with institutional-grade analysis, enhanced tool access, and sophisticated source coverage ## Next Steps Get started with the Agent API. Explore direct model selection and third-party models. View complete endpoint documentation. # Prompt Guide Source: https://docs.perplexity.ai/docs/agent-api/prompt-guide How to write effective prompts for the Agent API. The Agent API runs a bounded multi-turn loop: on each turn the model can call a tool (such as `web_search`), read the result, and decide whether to continue or answer. Prompts that work well with single-shot LLMs often underperform here, because the same text shapes tool selection, search query generation, and final response together. Two parameters drive most of the prompt design: * **`instructions`** sets the role, tone, formatting, and grounding rules that apply regardless of the user's question. * **`input`** holds the actual question. It also seeds the first search query, so specificity here directly improves retrieval. For hard constraints on retrieval (allowed domains, date ranges, region) and on the loop itself (max steps), use request parameters rather than prose. The sections below cover when to reach for each. ## Instructions Use the `instructions` parameter for role, tone, language, formatting, and grounding rules. Instructions apply on every turn of the agent loop, so put things here that hold regardless of the user's question. Setting `instructions` with a preset **replaces** the preset's system prompt — it does not append. Each preset (`fast-search`, `pro-search`, `deep-research`) already covers tool-call discipline, query construction, citation, and formatting, so the preset's prompt should be overridden only when app-specific behavior is needed. Without a preset, `instructions` is the only system prompt the model sees. **Example instructions block:** ```text Instructions theme={null} You are a financial analyst writing for retail investors. Rules: - Aim for brief sentences and paragraphs. - Define jargon the first time you use it. - Prefer concrete numbers over vague qualifiers ("up 12% YoY" not "growing strongly"). Grounding rules: - Cite sources inline by domain, e.g. (reuters.com). Do not write full URLs. - If searches return no relevant results after trying alternative phrasings, or if the only matches are off-topic (different company, different fiscal year, etc.), say so explicitly rather than substituting related results. ``` Keep `instructions` focused. They are re-read on every turn of the agent loop, so bloat compounds across tool calls. If your block is growing long, check whether parts of it would be better expressed as request parameters: use [`response_format`](/docs/agent-api/output-control) with a JSON schema for machine-readable output, [`web_search` filters](/docs/agent-api/filters) for retrieval constraints, or move query-specific framing into `input`. Built-in tools like `web_search` and `fetch_url` are tuned to work well without prompt-side guidance. You don't need to describe what they do, when to call them, or how to construct queries. Adjust tool-call count with the `max_steps` parameter and search constraints with `web_search` filters. If you're using custom `instructions` and want to nudge how the model uses built-in tools, you can reference them there as well. For custom function tools you define yourself, the model relies on the `description` and parameter schema you provide, so make those as clear as you can. You can reinforce the tool's role in `instructions` if the description alone isn't enough to steer behavior. ## Input Use the `input` parameter for the actual query you want answered. Input strongly shapes search behavior, so descriptive and specific phrasing directly improves retrieval. Vague inputs lead to vague searches. **Example user prompt:** ```text Input theme={null} What are the best sushi restaurants in the world currently? ``` ## API Example ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( preset="pro-search", input="What are the best sushi restaurants in the world currently?", instructions="You are a concise, well-researched assistant. If searches still return no relevant results after trying alternative phrasings, say so explicitly rather than guessing." ) print(response.output_text) ``` ```typescript Typescript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ preset: "pro-search", input: "What are the best sushi restaurants in the world currently?", instructions: "You are a concise, well-researched assistant. If searches still return no relevant results after trying alternative phrasings, say so explicitly rather than guessing." }); console.log(response.output_text); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "pro-search", "input": "What are the best sushi restaurants in the world currently?", "instructions": "You are a concise, well-researched assistant. If searches still return no relevant results after trying alternative phrasings, say so explicitly rather than guessing." }' | jq ``` ## Best Practices Use natural language, but include the vocabulary and context that would actually appear on relevant pages. Add a few words of context to disambiguate when a term could mean multiple things. Specificity in `input` directly improves retrieval. **Good Example**: "Compare energy efficiency ratings of heat pumps vs. traditional HVAC for residential use" **Poor Example**: "Tell me which home heating is better" If you want a list, say how long. Without an explicit cap, the model picks an arbitrary length. **Good Example**: "List the top 5 sushi restaurants in Tokyo" **Poor Example**: "Give me a list of sushi restaurants" Can be useful if you want to nudge how the model handles tool output. Things like citation style, grounding behavior, or response formatting fit naturally here, since instructions apply on every turn of the agent loop. **Example** (`instructions`): "Cite sources inline by domain (e.g., reuters.com). State explicitly when tool results don't fully answer the question." ## Reading Sources from the Response Read URLs and source metadata from the response payload, not from the model's written answer. For non-streaming responses, search results are available at the top level as `response.search_results` and inside `response.output[]` as items where `type == "search_results"` (both carry the same data). Pull URLs from `results[].url`. For streaming, listen for `response.reasoning.search_results` events. See [Output Control](/docs/agent-api/output-control) for the full response shape. The model has access to URLs from tool output and can include them in its response if asked, but it's prone to mistyping or paraphrasing them. Presets also configure the model to cite by index (e.g., `[web:1]`), not by URL, so asking for URLs in prose fights the default citation format. Treat the model's text as the prose answer and the structured `search_results` field as the authoritative source list. ## Reduce Hallucinations LLMs are tuned to be helpful, which can occasionally lead them to provide an answer when search results are thin or off-target rather than flagging the gap. The agent loop helps, since the model can refine queries and search again, but it does not eliminate the failure modes. Hallucination is most likely when the information isn't web-accessible (LinkedIn posts, private documents, paywalled content), when repeated searches return related but non-matching results, or when very recent information isn't indexed yet. A few short additions to `instructions` cover most of these cases. Grounding rules belong here because instructions are re-read on every turn of the agent loop, so the same rule applies to the first search and to any follow-ups. **Give the model permission to say it didn't find anything.** With an explicit out, the model is more likely to acknowledge insufficient results instead of leaning on training data to fill the gap. ```text Instructions theme={null} If searches do not return relevant results after trying alternative phrasings, say so explicitly rather than providing speculative information. ``` **Require disclosure of near-misses.** When search returns related but non-matching results (a different year, a parent company instead of a subsidiary, a similar product), asking the model to surface the mismatch up front keeps these cases from being presented as direct answers. ```text Instructions theme={null} If you find related but non-matching results (for example, a different year, a parent company, or a subsidiary), state the mismatch explicitly before answering. ``` ## Use Parameters, Not Prose, for Hard Constraints For source, date, or region constraints, prefer the `web_search` parameters over describing the constraint in prose. Parameters are applied by the search backend on every call, while prose-based filters are interpreted by the model and may not carry through every turn of the loop. Keep `input` focused on the question itself, and move structural constraints into the tool config: ```python Avoid theme={null} client.responses.create( preset="pro-search", input="Search only on Wikipedia for climate change policies from the past month." ) ``` ```python Prefer theme={null} client.responses.create( preset="pro-search", input="What are the latest climate change policies?", tools=[ { "type": "web_search", "filters": { "search_domain_filter": ["wikipedia.org"], "search_recency_filter": "month" } } ] ) ``` See [Filters](/docs/agent-api/filters) for the full list of available parameters. To run without tools, set `tools_disabled: true` on the request. Passing `tools: []` does **not** clear preset tools. An empty array is treated the same as omitting the field, and the preset's defaults still apply. ## Next Steps Shape responses with `response_format` and learn the full response payload structure. Constrain search with domain, recency, and region parameters. Configure the `web_search` tool for source-grounded context. Choose a preset that matches your latency, depth, and tool requirements. # Agent API Source: https://docs.perplexity.ai/docs/agent-api/quickstart The Agent API is a multi-provider, interoperable API specification for building LLM applications. Access models from multiple providers with integrated real-time web search, tool configuration, reasoning control, and token budgets—all through one unified interface. Test Agent API requests and parameters interactively in the API console. ## Why Use the Agent API? Access OpenAI, Anthropic, Google, xAI, and more through one unified API, no need to manage multiple API keys. See exact token counts and costs per request, no markup, just direct provider pricing. Change models, reasoning, tokens, and tools with consistent syntax. We recommend using our [official SDKs](/docs/sdk/overview) for a more convenient and type-safe way to interact with the Agent API. **Endpoint:** The Agent API is available at `POST https://api.perplexity.ai/v1/agent`. For OpenAI SDK compatibility, `POST /v1/responses` is also accepted as an alias. See the [OpenAI Compatibility Guide](/docs/agent-api/openai-compatibility) for details on using OpenAI SDKs with Perplexity. ## Installation Install the SDK for your preferred language: ```bash Python theme={null} pip install perplexityai ``` ```bash Typescript theme={null} npm install @perplexity-ai/perplexity_ai ``` ## Authentication Set your API key as an environment variable. The SDK will automatically read it: ```bash theme={null} export PERPLEXITY_API_KEY="your_api_key_here" ``` ```powershell theme={null} setx PERPLEXITY_API_KEY "your_api_key_here" ``` All SDK examples below automatically use the `PERPLEXITY_API_KEY` environment variable. You can also pass the key explicitly if needed. ## Basic Usage **Convenience Property:** Both Python and Typescript SDKs provide an `output_text` property that aggregates all text content from response outputs. Instead of iterating through `response.output`, simply use `response.output_text` for cleaner code. ### Using a Third-Party Model Use third-party models from OpenAI, Anthropic, Google, xAI, and other providers for specific capabilities: ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="openai/gpt-5.5", input="Explain the difference between supervised and unsupervised learning in machine learning." ) print(f"Response ID: {response.id}") print(response.output_text) ``` ```typescript Typescript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "openai/gpt-5.5", input: "Explain the difference between supervised and unsupervised learning in machine learning." }); console.log(`Response ID: ${response.id}`); console.log(response.output_text); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.5", "input": "Explain the difference between supervised and unsupervised learning in machine learning." }' | jq ``` ```json theme={null} { "background": false, "completed_at": 1771891464, "created_at": 1771891464, "error": null, "frequency_penalty": 0, "id": "resp_f854ed0a-f0e2-4ee8-b5ea-8582956910f2", "incomplete_details": null, "instructions": null, "max_output_tokens": null, "max_tool_calls": null, "metadata": {}, "model": "openai/gpt-5.5", "object": "response", "output": [ { "content": [ { "annotations": [], "logprobs": [], "text": "Supervised learning uses labeled data where each example has a known output, enabling the model to learn direct input-output relationships. Examples include classification and regression.", "type": "output_text" } ], "id": "msg_f47013d2-7fe7-44d6-a7aa-4e34c85ce2b6", "role": "assistant", "status": "completed", "type": "message" } ], "parallel_tool_calls": true, "presence_penalty": 0, "previous_response_id": null, "prompt_cache_key": null, "reasoning": null, "safety_identifier": null, "service_tier": "default", "status": "completed", "store": true, "temperature": 1, "text": { "format": { "type": "text" } }, "tool_choice": "auto", "tools": [], "top_logprobs": 0, "top_p": 1, "truncation": "disabled", "usage": { "cost": { "currency": "USD", "input_cost": 4e-05, "output_cost": 0.00311, "total_cost": 0.00315 }, "input_tokens": 20, "input_tokens_details": { "cached_tokens": 0 }, "output_tokens": 222, "output_tokens_details": { "reasoning_tokens": 0 }, "total_tokens": 242 }, "user": null } ``` ### Using a Preset Presets provide optimized defaults for specific use cases. Start with a preset for quick setup: ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( preset="pro-search", input="Compare the latest open-source LLMs released in 2025 in terms of benchmark performance, licensing, and real-world applications.", ) print(f"Model used: {response.model}") print(response.output_text) ``` ```typescript Typescript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ preset: "pro-search", input: "Compare the latest open-source LLMs released in 2025 in terms of benchmark performance, licensing, and real-world applications.", }); console.log(`Model used: ${response.model}`); console.log(response.output_text); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "pro-search", "input": "Compare the latest open-source LLMs released in 2025 in terms of benchmark performance, licensing, and real-world applications." }' | jq ``` ```json theme={null} { "background": false, "completed_at": 1771891641, "created_at": 1771891641, "error": null, "frequency_penalty": 0, "id": "resp_aca2bace-3782-4d81-be45-a82c24cfff9d", "incomplete_details": null, "instructions": "## Abstract\n\nYou are an AI assistant developed by Perplexity AI...\n\n...", "max_output_tokens": 8192, "max_tool_calls": null, "metadata": {}, "model": "openai/gpt-5.1", "object": "response", "output": [ { "queries": [ "2025 open source LLM benchmark performance", "2025 newly released open source LLMs license", "2025 open source LLM real world use cases" ], "results": [ { "date": "2025-11-19", "id": 1, "last_updated": "2026-02-23T12:12:34", "snippet": "updated\n\n19 Nov 2025\n\n# Open LLM Leaderboard\n\nThis LLM leaderboard displays...", "source": "web", "title": "Open LLM Leaderboard 2025", "url": "https://www.vellum.ai/open-llm-leaderboard" }, { "date": "2023-05-05", "id": 2, "last_updated": "2026-01-06T09:02:43.651546", "snippet": "", "source": "web", "title": "A list of open LLMs available for commercial use.", "url": "https://github.com/eugeneyan/open-llms" }, { "date": "2025-05-05", "id": 3, "last_updated": "2026-02-22T19:27:06", "snippet": "# Best Open Source LLMs You Can Run Locally in 2025\n\nRunning large language models on your own hardware is...", "source": "web", "title": "Best Open Source LLMs You Can Run Locally in 2025 - DemoDazzle", "url": "https://demodazzle.com/blog/open-source-llms-2025" }, { "date": "2025-12-15", "id": 4, "last_updated": "2026-02-23T21:56:51", "snippet": "updated\n\n15 Dec 2025\n\n# LLM Leaderboard\n\nThis LLM leaderboard displays the latest public benchmark performance for SOTA model versions released after April 2024...", "source": "web", "title": "LLM Leaderboard 2025 - Vellum", "url": "https://www.vellum.ai/llm-leaderboard" }, { "date": "2025-11-22", "id": 5, "last_updated": "2026-02-11T02:35:36", "snippet": "Open\u2011source Large Language Models (LLMs) have moved from niche hobby projects to a full\u2011blown industry trend in 2025...", "source": "web", "title": "Open\u2011Source LLMs 2025: GPT\u2011OSS Models & How ... - Neura AI Blog", "url": "https://blog.meetneura.ai/open-source-llms-2025/" }, { "date": "2025-07-23", "id": 6, "last_updated": "2026-02-23T23:43:21", "snippet": "", "source": "web", "title": "55 real-world LLM applications and use cases from top ...", "url": "https://www.evidentlyai.com/blog/llm-applications" }, { "date": "2025-10-29", "id": 7, "last_updated": "2026-02-23T21:22:10", "snippet": "", "source": "web", "title": "Top 10 open source LLMs for 2025 - NetApp Instaclustr", "url": "https://www.instaclustr.com/education/open-source-ai/top-10-open-source-llms-for-2025/" }, { "date": "2025-05-21", "id": 8, "last_updated": "2026-02-23T14:54:20", "snippet": "Here are the details of OpenLLaMA:\n\n**Parameters:** 3B, 7B and 13B\n\n**License:** Apache 2.0...", "source": "web", "title": "The List of 11 Most Popular Open Source LLMs [2025]", "url": "https://www.lakera.ai/blog/open-source-llms" }, { "date": "2026-01-07", "id": 9, "last_updated": "2026-02-23T17:41:06", "snippet": "", "source": "web", "title": "The state of open source AI models in 2025 | Red Hat Developer", "url": "https://developers.redhat.com/articles/2026/01/07/state-open-source-ai-models-2025" }, { "date": "2025-10-28", "id": 10, "last_updated": "2026-02-23T07:53:56", "snippet": "- **Open source dominates by volume:** 63% of models in our dataset (59 open source vs 35 proprietary)\n- **Performance...", "source": "web", "title": "Open Source vs Proprietary LLMs: Complete 2025 Benchmark ...", "url": "https://whatllm.org/blog/open-source-vs-proprietary-llms-2025" }, { "date": "2025-06-02", "id": 11, "last_updated": "2026-01-18T13:27:38.757741", "snippet": "", "source": "web", "title": "Top 8 Open\u2011Source LLMs to Watch in 2025 - JetRuby Agency", "url": "https://jetruby.com/blog/top-8-open-source-llms-to-watch-in-2025/" }, { "date": "2026-01-26", "id": 12, "last_updated": "2026-02-23T16:49:21", "snippet": "", "source": "web", "title": "Best Open Source LLMs in 2026", "url": "https://www.keywordsai.co/blog/best-open-source-llms" }, { "date": "2025-12-10", "id": 13, "last_updated": "2026-02-23T18:38:26", "snippet": "", "source": "web", "title": "Full Benchmark Table For...", "url": "https://skywork.ai/blog/llm/top-10-open-llms-2025-november-ranking-analysis/" }, { "date": "2024-09-19", "id": 14, "last_updated": "2025-12-27T09:28:04.559969", "snippet": "## Top Open-Source LLMs of 2025\n\n### 1. LLaMA 3.1\n\n**Developer:**Meta AI **Release Date:**July 23, 2024 **Parameter Size:**405B, 70B, 8B...", "source": "web", "title": "Top 10 Open-Source LLMs in 2025 - Kite Metric", "url": "https://kitemetric.com/blogs/top-10-open-source-llms-in-2025-a-comprehensive-guide" }, { "date": "2025-02-26", "id": 15, "last_updated": "2025-09-10T16:36:09.704235", "snippet": "Use Cases:\n\n**Advanced Chatbots:**Responsive customer support bots. **Content Creation for Marketing:**Generating product descriptions and blog posts...", "source": "web", "title": "Top 10 Open-Source LLMs in 2025 and Their Use Cases", "url": "https://capalearning.com/2025/02/26/top-10-open-source-llms-in-2025-and-their-use-cases/" } ], "type": "search_results" }, { "contents": [ { "snippet": "Hi, Camille\u2019s here! On October 28, 2025, I fell into a small rabbit hole...", "title": "Full Benchmark Table For...", "url": "https://skywork.ai/blog/llm/top-10-open-llms-2025-november-ranking-analysis/" }, { "snippet": "# Open source vs proprietary LLMs: complete 2025 benchmark analysis\n\n## TL;DR: The state of LLMs in late 2025\n\n**The landscape has shifted dramatically:**\n\n- **Open source dominates by volume:** 63% of models in our dataset (59 open source vs 35 proprietary)\n- **Performance...", "title": "Open Source vs Proprietary LLMs: Complete 2025 Benchmark ...", "url": "https://whatllm.org/blog/open-source-vs-proprietary-llms-2025" } ], "type": "fetch_url_results" }, { "content": [ { "annotations": [], "logprobs": [], "text": "In 2025, the strongest open\u2011source LLMs (Qwen 2.5, Llama 3.3/3.x, DeepSeek V3\u2011series, Mixtral...", "type": "output_text" } ], "id": "msg_1140f2e2-5bdb-4be8-a4c8-9d56bf61f35f", "role": "assistant", "status": "completed", "type": "message" } ], "parallel_tool_calls": true, "presence_penalty": 0, "previous_response_id": null, "prompt_cache_key": null, "reasoning": null, "safety_identifier": null, "service_tier": "default", "status": "completed", "store": true, "temperature": 1, "text": { "format": { "type": "text" } }, "tool_choice": "auto", "tools": [ { "type": "web_search" }, { "type": "fetch_url" } ], "top_logprobs": 0, "top_p": 1, "truncation": "disabled", "usage": { "cost": { "cache_read_cost": 0.00059, "currency": "USD", "input_cost": 0.00919, "output_cost": 0.02743, "tool_calls_cost": 0.0055, "total_cost": 0.04271 }, "input_tokens": 12088, "input_tokens_details": { "cache_creation_input_tokens": 0, "cache_read_input_tokens": 4736, "cached_tokens": 4736 }, "output_tokens": 2743, "output_tokens_details": { "reasoning_tokens": 0 }, "tool_calls_details": { "fetch_url": { "invocation": 1 }, "search_web": { "invocation": 1 } }, "total_tokens": 14831 }, "user": null } ``` Learn more about [presets](/docs/agent-api/presets) to explore pre-configured setups optimized for different use cases with specific models, token limits, and tool access. ### With Web Search The Agent API provides access to a number of tools that can be used to extend the capabilities of the model. Enable web search capabilities using the `web_search` tool: ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="openai/gpt-5.5", input="What are the latest developments in AI?", tools=[{"type": "web_search"}], instructions="You have access to a web_search tool. Use it for questions about current events, news, or recent developments. Use 1 query for simple questions. Keep queries brief: 2-5 words. NEVER ask permission to search - just search when appropriate", ) if response.status == "completed": print(response.output_text) ``` ```typescript Typescript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "openai/gpt-5.5", input: "What are the latest developments in AI?", tools: [{ type: "web_search" }], instructions: "You have access to a web_search tool. Use it for questions about current events, news, or recent developments.", }); if (response.status === "completed") { console.log(response.output_text); } ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.5", "input": "What are the latest developments in AI?", "tools": [{"type": "web_search"}], "instructions": "You have access to a web_search tool. Use it for questions about current events, news, or recent developments." }' | jq ``` ```json theme={null} { "background": false, "completed_at": 1771891737, "created_at": 1771891737, "error": null, "frequency_penalty": 0, "id": "resp_367113ed-7a1b-4b2e-bad7-93e53a6cbeca", "incomplete_details": null, "instructions": "You have access to a web_search tool. Use it for questions about current events, news, or recent developments. Use 1 query for simple questions. Keep queries brief: 2-5 words. NEVER ask permission to search - just search when appropriate", "max_output_tokens": 8192, "max_tool_calls": null, "metadata": {}, "model": "openai/gpt-5.5", "object": "response", "output": [ { "queries": [ "latest AI developments 2026" ], "results": [ { "date": "2026-01-01", "id": 1, "last_updated": "2026-02-23T20:10:25", "snippet": "Many believe efficiency will be the new frontier...", "source": "web", "title": "The trends that will shape AI and tech in 2026 - IBM", "url": "https://www.ibm.com/think/news/ai-tech-trends-predictions-2026" }, { "date": "2026-01-08", "id": 2, "last_updated": "2026-02-23T20:19:20", "snippet": "## What\u2019s next in AI: 7 trends to watch in 2026\n\nAI is entering a new phase, one defined by real-world impact...", "source": "web", "title": "What's next in AI: 7 trends to watch in 2026 - Microsoft Source", "url": "https://news.microsoft.com/source/features/ai/whats-next-in-ai-7-trends-to-watch-in-2026/" }, { "date": "2026-01-06", "id": 3, "last_updated": "2026-02-21T02:30:13", "snippet": "#### Topics\n\n#### AI in Action\n\n**Summary:**\n\nMIT SMR columnists Thomas H. Davenport and Randy Bean see five...", "source": "web", "title": "Five Trends in AI and Data Science for 2026", "url": "https://sloanreview.mit.edu/article/five-trends-in-ai-and-data-science-for-2026/" }, { "date": "2026-01-06", "id": 4, "last_updated": "2026-02-24T00:01:21", "snippet": "## Jeff Su\n\n##### Jan 06, 2026 (0:13:13)\nMost #AI predictions are speculation. This video covers...", "source": "web", "title": "Top 6 AI Trends That Will Define 2026 (backed by data)", "url": "https://www.youtube.com/watch?v=B23W1gRT9eY" }, { "date": "2026-01-15", "id": 5, "last_updated": "2026-02-23T17:37:52", "snippet": "", "source": "web", "title": "11 things AI experts are watching for in 2026 | University of California", "url": "https://www.universityofcalifornia.edu/news/11-things-ai-experts-are-watching-2026" }, { "date": "2026-01-13", "id": 6, "last_updated": "2026-02-23T16:27:23", "snippet": "Artificial intelligence (AI) is no longer an emerging technology, it\u2019s a transformational force driving innovation across industries...", "source": "web", "title": "AI Trends in 2026: A New Era of AI Advancements and Breakthroughs", "url": "https://www.trigyn.com/insights/ai-trends-2026-new-era-ai-advancements-and-breakthroughs" }, { "date": "2025-12-22", "id": 7, "last_updated": "2026-02-23T09:47:25", "snippet": "The most significant advances in artificial intelligence next year won't come from...", "source": "web", "title": "6 AI breakthroughs that will define 2026 - InfoWorld", "url": "https://www.infoworld.com/article/4108092/6-ai-breakthroughs-that-will-define-2026.html" }, { "date": "2025-12-22", "id": 8, "last_updated": "2026-02-23T20:21:57", "snippet": "What will define AI in 2026? \ud83d\ude80 Martin Keen & Aaron Baughman explore groundbreaking trends like Agentic AI, cloud computing, automation, and quantum computing, plus innovations like Physical AI...", "source": "web", "title": "AI Trends 2026: Quantum, Agentic AI & Smarter Automation", "url": "https://www.youtube.com/watch?v=zt0JA5rxdfM" }, { "date": "2025-12-15", "id": 9, "last_updated": "2026-02-23T13:13:58", "snippet": "", "source": "web", "title": "Stanford AI Experts Predict What Will Happen in 2026", "url": "https://hai.stanford.edu/news/stanford-ai-experts-predict-what-will-happen-in-2026" }, { "date": "2025-05-10", "id": 10, "last_updated": "2026-02-20T16:07:11", "snippet": "{ts:574} breakthroughs in AlphaGo and Alpha Fold, which are absolutely incredible. Now, DeepMind has basically said...", "title": "2026 AI : 10 Things Coming In 2026 (A.I In 2026 Major Predictions)", "url": "https://www.youtube.com/watch?v=RfA2Ug4FuaY" } ], "type": "search_results" }, { "content": [ { "annotations": [], "logprobs": [], "text": "Here are major *recent* directions in AI (late 2025\u2013early 2026) that researchers...", "type": "output_text" } ], "id": "msg_d0f12cc6-c6a2-426f-b55e-fff247e40c8c", "role": "assistant", "status": "completed", "type": "message" } ], "parallel_tool_calls": true, "presence_penalty": 0, "previous_response_id": null, "prompt_cache_key": null, "reasoning": null, "safety_identifier": null, "service_tier": "default", "status": "completed", "store": true, "temperature": 1, "text": { "format": { "type": "text" } }, "tool_choice": "auto", "tools": [ { "type": "web_search" } ], "top_logprobs": 0, "top_p": 1, "truncation": "disabled", "usage": { "cost": { "currency": "USD", "input_cost": 0.00826, "output_cost": 0.0063, "tool_calls_cost": 0.005, "total_cost": 0.01956 }, "input_tokens": 4718, "input_tokens_details": { "cached_tokens": 0 }, "output_tokens": 450, "output_tokens_details": { "reasoning_tokens": 0 }, "tool_calls_details": { "search_web": { "invocation": 1 } }, "total_tokens": 5168 }, "user": null } ``` ### With Finance Search Retrieve structured financial and market data using the `finance_search` tool. See the [Finance Search guide](/docs/agent-api/tools/finance-search) for capabilities and recommended configurations. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="perplexity/sonar", input="What's NVIDIA trading at right now, and what is its current P/E?", tools=[{"type": "finance_search"}], ) for item in response.output: if item.type == "message": print(item.content[0].text) ``` ```typescript Typescript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "perplexity/sonar", input: "What's NVIDIA trading at right now, and what is its current P/E?", tools: [{ type: "finance_search" }], }); console.log(response.output_text); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "perplexity/sonar", "input": "What is NVIDIA trading at right now, and what is its current P/E?", "tools": [{"type": "finance_search"}] }' | jq ``` ## Next Steps Use web search for source-grounded, current context. Browse available models and pricing across all supported providers. Explore pre-configured setups for common use cases like pro-search and deep-research. Configure streaming responses and structured outputs with JSON schema. Specify multiple models for automatic failover and higher availability. Best practices for effective prompting with web search models. Control search results with domain, date, and location filters. View complete endpoint documentation and parameters. Need help? Check out our [community](https://community.perplexity.ai) for support and discussions with other developers. # Fetch URL Content Source: https://docs.perplexity.ai/docs/agent-api/tools/fetch-url-content Fetch and extract content from specific URLs in the Agent API. ## Overview The `fetch_url` tool fetches and extracts content from specific URLs during an Agent API request. Use it when your application already knows which page, article, document, or report the model should inspect. Use `fetch_url` when you need full page content from known URLs. Use [`web_search`](/docs/agent-api/tools/web-search) when the model first needs to discover relevant pages. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="openai/gpt-5.5", input="Summarize the key claims in https://example.com/report.", tools=[ { "type": "fetch_url" } ], instructions="Fetch the URL before summarizing it.", ) print(response.output_text) ``` ```typescript Typescript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: 'openai/gpt-5.5', input: 'Summarize the key claims in https://example.com/report.', tools: [ { type: 'fetch_url' as const, }, ], instructions: 'Fetch the URL before summarizing it.', }); console.log(response.output_text); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.5", "input": "Summarize the key claims in https://example.com/report.", "tools": [ { "type": "fetch_url" } ], "instructions": "Fetch the URL before summarizing it." }' | jq ``` ## When to Use | Use `fetch_url` when... | Use `web_search` when... | | -------------------------------------------------- | ------------------------------------------ | | You already have a URL | You need to discover relevant pages | | You need fuller page content | You need snippets from multiple sources | | You are summarizing a specific article or document | You are researching a broad topic | | You want the model to inspect a known source | You want the model to find current sources | Combine `web_search` and `fetch_url` for multi-step research: search to find relevant pages, then fetch the most important URLs for fuller context. ## Parameters | Parameter | Type | Required | Description | | ---------- | ------- | -------- | ----------------------------------------------------------------------------------------- | | `type` | string | Yes | Must be `"fetch_url"`. | | `max_urls` | integer | No | Maximum number of URLs to fetch per tool call. The API schema allows values from 1 to 10. | ## Response Shape When `fetch_url` runs, the response can include a `fetch_url_results` output item before the final assistant message. Each fetched content item includes the URL, page title, and extracted snippet. ```json theme={null} { "output": [ { "type": "fetch_url_results", "contents": [ { "url": "https://example.com/report", "title": "Example Report", "snippet": "Extracted content from the fetched page." } ] }, { "type": "message", "role": "assistant", "content": [ { "type": "output_text", "text": "The answer generated from the fetched URL content." } ] } ], "usage": { "input_tokens": 900, "output_tokens": 250, "total_tokens": 1150, "tool_calls_details": { "fetch_url": { "invocation": 1 } } } } ``` ## Pricing `fetch_url` is billed at **$0.50 per 1,000 requests** (**$0.0005 per fetch**). Model token usage is billed separately according to Agent API token pricing. Pricing follows the same pattern as other tool calls: pay for tool invocations plus model tokens. See [Pricing](/docs/getting-started/pricing). ## Next Steps Search the web before fetching source content. Retrieve structured financial and market data. Search for professionals and employees. View complete endpoint documentation. # Finance Search Source: https://docs.perplexity.ai/docs/agent-api/tools/finance-search Retrieve structured financial and market data in the Agent API. ## Overview `finance_search` lets the model pull structured financial and market data for public companies, ETFs, and related instruments. The model decides which fields to fetch based on your prompt. Use it when one answer needs more than one type of financial data, such as valuation, earnings, and context for the same company or list of companies. ### Capabilities | Data area | What it includes | | ------------------------------- | ------------------------------------------------------------------------------------------------- | | Company basics | Quotes, profiles, peers, and market metadata | | Financials | Income statement, balance sheet, cash flow (quarterly and annual), key ratios | | Valuation and pricing | Current/near-real-time pricing, 1-minute to 1-month OHLCV ranges, pre-market and after-hours data | | Earnings | Last earnings call transcript, report filings, beat/miss history, guidance discussion | | Segment and KPI tracking | Revenue/profit by segment, geography, ARPU, subscriber counts, GMV, and other operating metrics | | Analyst coverage | Forward revenue and EPS estimates, cover count, historical estimate changes | | Market activity | Top gainers, top losers, and most active symbols | | Ownership and corporate actions | Insider activity, ticker-level metadata, splits, and related market events | | ETF and index details | Top constituents, shares, weights, and market values | ## Quickstart Add `finance_search` to the `tools` array. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="perplexity/sonar", input="What's NVIDIA trading at right now, and what is its current P/E?", tools=[{"type": "finance_search"}] ) for item in response.output: if item.type == "message": print(item.content[0].text) ``` ```bash cURL theme={null} curl -X POST "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "perplexity/sonar", "input": "What is NVIDIA trading at right now, and what is its current P/E?", "tools": [ {"type": "finance_search"} ] }' ``` ```json theme={null} { "background": false, "completed_at": 1777644610, "created_at": 1777644610, "error": null, "frequency_penalty": 0, "id": "resp_d0476d0f-872d-492a-907e-1daa48eb9e32", "incomplete_details": null, "instructions": null, "max_output_tokens": 8192, "max_tool_calls": null, "metadata": {}, "model": "perplexity/sonar", "object": "response", "output": [ { "categories": ["quote"], "results": [ { "category": "quote", "content": "## NVDA Quote\nQuote field guide: `price` is the latest quote/current price...\n| symbol | name | timestamp | market_status | price | currency | change | changesPercentage | marketCap | pe | eps | volume | dayLow | dayHigh | yearLow | yearHigh | previousClose | open |\n| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |\n| NVDA | NVIDIA Corporation | 2026-05-01 14:10:07 UTC | open | 200.23 | USD | 0.66 | 0.33 | 4,866,492,706,948 | 40.86 | 4.90 | 28,725,330 | 199.15 | 203 | 110.82 | 216.83 | 199.57 | 201.28 |", "sources": [ "https://www.perplexity.ai/finance/NVDA/historical-data", "https://www.perplexity.ai/finance/NVDA" ], "tickers": ["NVDA"] } ], "tickers": ["NVDA"], "type": "finance_results" }, { "content": [ { "annotations": [], "logprobs": [], "text": "NVIDIA (NVDA) is currently trading at **$200.23** per share, and its current P/E ratio is **40.86**.", "type": "output_text" } ], "id": "msg_b188058f-8225-4642-90e6-da7112f96b69", "role": "assistant", "status": "completed", "type": "message" } ], "parallel_tool_calls": true, "presence_penalty": 0, "previous_response_id": null, "prompt_cache_key": null, "reasoning": null, "safety_identifier": null, "service_tier": "default", "status": "completed", "store": true, "temperature": 1, "text": { "format": { "type": "text" } }, "tool_choice": "auto", "tools": [ { "type": "finance_search" } ], "top_logprobs": 0, "top_p": 1, "truncation": "disabled", "usage": { "cost": { "currency": "USD", "input_cost": 0.00189, "output_cost": 0.00016, "tool_calls_cost": 0.005, "total_cost": 0.00705 }, "input_tokens": 7570, "input_tokens_details": { "cached_tokens": 0 }, "output_tokens": 63, "output_tokens_details": { "reasoning_tokens": 0 }, "tool_calls_details": { "finance_search": { "invocation": 1 } }, "total_tokens": 7633 }, "user": null } ``` ## Example Prompts * **Full company brief:** "Give me a complete NVIDIA snapshot: valuation, segment revenue for the latest quarter, and management's latest commentary on margins guidance." * **Compare companies in one request:** "Compare Apple, Microsoft, and Alphabet on revenue growth, operating margin, and forward P/E for the latest fiscal year." * **Earnings + reaction context:** "Summarize Tesla's last earnings call, include actual vs consensus, and describe how the stock and analyst targets moved after publication." ## Prompt Guidance `finance_search` works best when the prompt states the outcome, not the data shape. * Start with the business question first, then include the company or ticker. * Add time windows when relevant (`latest quarter`, `fiscal year to date`, `last 30 days`). * Let the tool decide which specific report fields to retrieve. ## Recommended Configurations Start with the configuration that matches the shape of the finance question. | Configuration | Best for | Latency | Quality | Cost | | --------------------------------- | -------------------------------------------------------- | -------- | ------- | ------ | | Live Market Data and Quotes | Real-time prices, quotes, and latest figures | Fast | Good | Low | | Single-Company Historical Lookups | Basic historical financials for one company or ticker | Balanced | High | Medium | | Multi-Step Financial Research | Cross-company comparisons and complex financial analysis | Thorough | Highest | High | ### Live Market Data and Quotes Use this for time-sensitive answers that depend on real-time prices, quotes, or the latest market figures. It is the cheapest and fastest option while maintaining strong quality for live data lookups. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="perplexity/sonar", input="What is Apple trading at right now, and what is its latest market cap?", tools=[{"type": "finance_search"}], max_steps=1, max_output_tokens=1024 ) for item in response.output: if item.type == "message": print(item.content[0].text) ``` ```bash cURL theme={null} curl -X POST "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "perplexity/sonar", "input": "What is Apple trading at right now, and what is its latest market cap?", "tools": [{"type": "finance_search"}], "max_steps": 1, "max_output_tokens": 1024 }' ``` ```json theme={null} { "id": "resp_541684d6-cc46-4115-9137-bb387088bc32", "object": "response", "model": "perplexity/sonar", "status": "completed", "created_at": 1777645562, "completed_at": 1777645562, "output": [ { "type": "finance_results", "categories": ["quote"], "tickers": ["AAPL"], "results": [ { "category": "quote", "tickers": ["AAPL"], "content": "## AAPL Quote\n| symbol | name | timestamp | market_status | price | currency | change | changesPercentage | marketCap | pe | eps | volume | dayLow | dayHigh | yearLow | yearHigh | previousClose | open |\n| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |\n| AAPL | Apple Inc. | 2026-05-01 14:26:00 UTC | open | 285.74 | USD | 14.39 | 5.30 | 4,194,988,943,600 | 34.51 | 8.28 | 29,155,124 | 278.37 | 287.21 | 193.25 | 288.62 | 271.35 | 278.86 |", "sources": [ "https://www.perplexity.ai/finance/AAPL/historical-data", "https://www.perplexity.ai/finance/AAPL" ] } ] }, { "type": "message", "id": "msg_d8c03075-799d-4d4d-8feb-cc95824db262", "role": "assistant", "status": "completed", "content": [ { "type": "output_text", "text": "Apple (AAPL) is currently trading at **$285.74** per share, up about 5.30% on the day. Its latest market capitalization is approximately **$4.19 trillion**.", "annotations": [], "logprobs": [] } ] } ], "tools": [{"type": "finance_search"}], "max_output_tokens": 8192, "tool_choice": "auto", "parallel_tool_calls": true, "usage": { "input_tokens": 7575, "output_tokens": 75, "total_tokens": 7650, "cost": { "currency": "USD", "input_cost": 0.00189, "output_cost": 0.00019, "tool_calls_cost": 0.005, "total_cost": 0.00708 }, "tool_calls_details": { "finance_search": { "invocation": 1 } } } } ``` ### Single-Company Historical Lookups Use this for a single company's historical figures or basic questions that benefit from both structured finance data and web context. GPT-5.5 is strong at simple web search and token-efficient for historical lookups. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="openai/gpt-5.5", input="What was Microsoft's revenue last fiscal year, and how did it compare with the prior year?", tools=[ {"type": "web_search"}, {"type": "finance_search"}, {"type": "fetch_url"} ], max_steps=5, max_output_tokens=2048, reasoning={"effort": "low"} ) for item in response.output: if item.type == "message": print(item.content[0].text) ``` ```bash cURL theme={null} curl -X POST "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.5", "input": "What was Microsoft revenue last fiscal year, and how did it compare with the prior year?", "tools": [ {"type": "web_search"}, {"type": "finance_search"}, {"type": "fetch_url"} ], "max_steps": 5, "max_output_tokens": 2048, "reasoning": {"effort": "low"} }' ``` ```json theme={null} { "id": "resp_1be7ab7e-0dda-4949-9578-1462f9557a6b", "object": "response", "model": "openai/gpt-5.5", "status": "completed", "created_at": 1777645563, "completed_at": 1777645563, "output": [ { "type": "finance_results", "categories": ["financials"], "tickers": ["MSFT"], "results": [ { "category": "financials", "tickers": ["MSFT"], "content": "## MSFT FY 2024\n| date | period | income_statement_total_revenues |\n| --- | --- | --- |\n| 2024-06-30 | 2024 FY | 245,122,000,000 |", "sources": [ "https://www.perplexity.ai/finance/MSFT/financials?period=annual&category=INCOME_STATEMENT&fromYear=2024&toYear=2024" ] } ] }, { "type": "finance_results", "categories": ["financials"], "tickers": ["MSFT"], "results": [ { "category": "financials", "tickers": ["MSFT"], "content": "## MSFT FY 2025\n| date | period | income_statement_total_revenues |\n| --- | --- | --- |\n| 2025-06-30 | 2025 FY | 281,724,000,000 |", "sources": [ "https://www.perplexity.ai/finance/MSFT/financials?period=annual&category=INCOME_STATEMENT&fromYear=2025&toYear=2025" ] } ] }, { "type": "message", "id": "msg_99ccfbfd-bce8-4b9b-b412-b01ef45c7842", "role": "assistant", "status": "completed", "content": [ { "type": "output_text", "text": "Microsoft's revenue in its last completed fiscal year, **FY2025 ended June 30, 2025**, was **$281.724 billion**.\n\nCompared with the prior year, **FY2024 revenue was $245.122 billion**, so Microsoft revenue increased by:\n\n- **$36.602 billion**\n- **About 14.9% year over year**", "annotations": [], "logprobs": [] } ] } ], "tools": [ {"type": "web_search"}, {"type": "fetch_url"}, {"type": "finance_search"} ], "max_output_tokens": 8192, "tool_choice": "auto", "parallel_tool_calls": true, "usage": { "input_tokens": 12522, "input_tokens_details": { "cached_tokens": 3840, "cache_read_input_tokens": 3840 }, "output_tokens": 500, "total_tokens": 13022, "cost": { "currency": "USD", "input_cost": 0.04341, "cache_read_cost": 0.00192, "output_cost": 0.015, "tool_calls_cost": 0.01, "total_cost": 0.07033 }, "tool_calls_details": { "finance_search": { "invocation": 2 } } } } ``` ### Multi-Step Financial Research Use this for cross-company comparisons, longer historical investigations, and analysis that needs several tool calls across financial statements, filings, transcripts, and web sources. Opus performs best on complex multi-step reasoning when paired with the full tool suite. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="anthropic/claude-opus-4-7", input="Compare Apple, Microsoft, and Alphabet on revenue growth, margin trends, and management commentary over the last three fiscal years.", tools=[ {"type": "web_search"}, {"type": "finance_search"}, {"type": "fetch_url"} ], max_steps=10, max_output_tokens=4096 ) for item in response.output: if item.type == "message": print(item.content[0].text) ``` ```bash cURL theme={null} curl -X POST "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "anthropic/claude-opus-4-7", "input": "Compare Apple, Microsoft, and Alphabet on revenue growth, margin trends, and management commentary over the last three fiscal years.", "tools": [ {"type": "web_search"}, {"type": "finance_search"}, {"type": "fetch_url"} ], "max_steps": 10, "max_output_tokens": 4096 }' ``` ```json theme={null} { "id": "resp_466bc636-cbad-43ce-9f66-c8b296712f05", "object": "response", "model": "anthropic/claude-opus-4-7", "status": "completed", "created_at": 1777645564, "completed_at": 1777645564, "output": [ { "type": "finance_results", "categories": ["financials"], "tickers": ["AAPL", "MSFT", "GOOGL"], "results": [ { "category": "financials", "tickers": ["AAPL", "MSFT", "GOOGL"], "content": "## AAPL FY 2025\n| date | period | total_revenues | gross_profit | operating_profit | net_income | gross_margin | operating_margin | net_margin |\n| --- | --- | --- | --- | --- | --- | --- | --- | --- |\n| 2025-09-27 | 2025 FY | 416,161,000,000 | 195,201,000,000 | 133,050,000,000 | 112,010,000,000 | 0.47 | 0.32 | 0.27 |\n\n## MSFT FY 2025\n| 2025-06-30 | 2025 FY | 281,724,000,000 | 193,893,000,000 | 128,528,000,000 | 101,832,000,000 | 0.69 | 0.46 | 0.36 |\n\n## GOOGL FY 2025\n| 2025-12-31 | 2025 FY | 402,836,000,000 | 240,301,000,000 | 129,039,000,000 | 132,170,000,000 | 0.60 | 0.32 | 0.33 |" } ] }, { "type": "finance_results", "categories": ["financials"], "tickers": ["AAPL", "MSFT", "GOOGL"], "results": [ { "category": "financials", "content": "## AAPL FY 2024\n| 2024-09-28 | 2024 FY | 391,035,000,000 | ... | 0.46 | 0.32 | 0.24 |\n\n## MSFT FY 2024\n| 2024-06-30 | 2024 FY | 245,122,000,000 | ... | 0.70 | 0.45 | 0.36 |\n\n## GOOGL FY 2024\n| 2024-12-31 | 2024 FY | 350,018,000,000 | ... | 0.58 | 0.32 | 0.29 |" } ] }, { "type": "finance_results", "categories": ["financials"], "tickers": ["AAPL", "MSFT", "GOOGL"], "results": [ { "category": "financials", "content": "## AAPL FY 2023 — total revenue 383,285,000,000 (GM 0.44 / OpM 0.30 / NM 0.25)\n## MSFT FY 2023 — total revenue 211,915,000,000 (GM 0.69 / OpM 0.42 / NM 0.34)\n## GOOGL FY 2023 — total revenue 307,394,000,000 (GM 0.57 / OpM 0.27 / NM 0.24)" } ] }, { "type": "finance_results", "categories": ["transcript"], "tickers": ["AAPL", "MSFT", "GOOGL"], "results": [ { "category": "transcript", "tickers": ["AAPL"], "content": "AAPL Q2 2026: Revenue $111.184B (est. $109.46B), EPS $2.01 (est. $1.92). 17% revenue growth in March quarter; June guide +14-17%. GM guide 47.5-48.5%. Management flagged rising memory costs as an increasing headwind. Accelerating AI investment via Apple Intelligence; install base 2.5B devices.", "sources": ["https://www.perplexity.ai/finance/AAPL/earnings"] }, { "category": "transcript", "tickers": ["MSFT"], "content": "MSFT Q3 2026: Revenue $82.886B (est. $81.44B), EPS $4.27 (est. $4.07). Revenue +18% YoY; AI ARR reached $37B (+123% YoY). CapEx >$40B next quarter, ~$190B for CY2026. Capacity-constrained through 2026. Expects double-digit revenue and operating-income growth in FY27.", "sources": ["https://www.perplexity.ai/finance/MSFT/earnings"] }, { "category": "transcript", "tickers": ["GOOGL"], "content": "GOOGL Q1 2026: Revenue $109.896B (est. $106.98B), EPS $5.11 (est. $2.64). 22% revenue growth; Search +19%, Cloud +63%. Operating margin 36.1%. 2026 CapEx raised to $180-190B; 2027 CapEx to significantly increase. Cloud margin headwind from Wiz acquisition.", "sources": ["https://www.perplexity.ai/finance/GOOGL/earnings"] } ] }, { "type": "message", "id": "msg_90e83fe4-f020-47f0-990e-f82aa0b655da", "role": "assistant", "status": "completed", "content": [ { "type": "output_text", "text": "# Apple vs. Microsoft vs. Alphabet — 3-Year Comparison\n\n## 1. Revenue Growth\n| Company | FY2023 | FY2024 | FY2025 | 2-yr CAGR | YoY FY25 |\n|---|---|---|---|---|---|\n| Apple (AAPL) | $383.3B | $391.0B | $416.2B | +4.2% | +6.4% |\n| Microsoft (MSFT)| $211.9B | $245.1B | $281.7B | +15.3% | +14.9% |\n| Alphabet (GOOGL)| $307.4B | $350.0B | $402.8B | +14.5% | +15.1% |\n\n## 2. Margin Trends\n- Apple: GM 44% → 47%, OpM steady ~32%, NM 25% → 27%.\n- Microsoft: GM ~69% (slightly compressed by AI capex), OpM 42% → 46%, NM 34% → 36%.\n- Alphabet: GM 57% → 60%, OpM 27% → 32% (largest expansion), NM 24% → 33%.\n\n## 3. Management Commentary (latest calls)\n- Apple (Q2 FY26): Revenue +17%, June guide +14-17%; flagging memory-cost headwind.\n- Microsoft (Q3 FY26): Revenue +18%; AI ARR $37B (+123%); CY26 capex ~$190B.\n- Alphabet (Q1 2026): Revenue +22%; Cloud +63%; CY26 capex raised to $180-190B; 2027 capex to significantly increase.\n\n## Bottom Line\n- Revenue growth: Microsoft (closely followed by Alphabet)\n- Margin level: Microsoft\n- Margin expansion: Alphabet\n- Capital intensity: Apple is lightest; MSFT and GOOGL each spending $180-190B on 2026 capex.\n\nKey tension: AI investment is fueling top-line acceleration (especially MSFT and GOOGL) but creating depreciation and component-cost headwinds that are starting to weigh on gross margins.", "annotations": [], "logprobs": [] } ] } ], "tools": [ {"type": "web_search"}, {"type": "fetch_url"}, {"type": "finance_search"} ], "max_output_tokens": 8192, "tool_choice": "auto", "parallel_tool_calls": true, "usage": { "input_tokens": 61887, "input_tokens_details": { "cached_tokens": 36778, "cache_creation_input_tokens": 25100, "cache_read_input_tokens": 36778 }, "output_tokens": 3456, "total_tokens": 65343, "cost": { "currency": "USD", "input_cost": 0.00005, "cache_creation_cost": 0.15688, "cache_read_cost": 0.01839, "output_cost": 0.0864, "tool_calls_cost": 0.02, "total_cost": 0.28172 }, "tool_calls_details": { "finance_search": { "invocation": 4 } } } } ``` ## Parameters | Parameter | Type | Required | Description | | --------- | ------ | -------- | --------------------------- | | `type` | string | Yes | Must be `"finance_search"`. | ## Response Shape When `finance_search` runs, the response can include `finance_results` output items before the final assistant message. Each `finance_results` item includes the requested finance categories, ticker symbols, structured content, and source URLs when available. The final `usage` object includes token counts, cost details, and `tool_calls_details.finance_search.invocation` when tool-call usage is reported. ```json theme={null} { "output": [ { "type": "finance_results", "categories": ["quote"], "tickers": ["NVDA"], "results": [ { "category": "quote", "tickers": ["NVDA"], "content": "Structured quote data returned by the finance search tool.", "sources": [ "https://www.perplexity.ai/finance/NVDA" ] } ] }, { "type": "message", "role": "assistant", "content": [ { "type": "output_text", "text": "The answer generated from finance data." } ] } ], "usage": { "tool_calls_details": { "finance_search": { "invocation": 1 } } } } ``` ## Pricing `finance_search` is billed at **\$5 per 1,000 invocations**. Model token usage is billed separately according to Agent API token pricing. Pricing follows the same pattern as other tool calls: pay for invocations plus model tokens. See [Pricing](/docs/getting-started/pricing). ## Next Steps Search the web for source-grounded context. Fetch full content from known URLs. Search for professionals and employees. Get started with the Agent API. # People Search Source: https://docs.perplexity.ai/docs/agent-api/tools/people-search Search for professionals, employees, and people using People Search in the Agent API ## Overview The `people_search` tool enables models to find people and retrieve their professional information such as names, job titles, and companies. Use it to power workflows like lead research, recruiting pipelines, or organizational mapping. Use it when your application needs to: * Look up a specific person's professional background * Find employees at a company by role or title * Identify professionals in a particular field or location * Research leadership teams or organizational structures The model decides when to invoke `people_search` based on your prompt and instructions. ### Query Tips For the best results, guide the model with specific details in your prompt: | Approach | Example prompt | | ------------------- | ----------------------------------------------- | | **Name + company** | "Find John Smith who works at Google" | | **Role + company** | "Who is the Head of Design at Figma?" | | **Role + location** | "Find marketing directors in San Francisco" | | **Role + field** | "Find machine learning researchers at Stanford" | The tool works best for people-related queries — it is not suited for general web search. ## Tiered Configurations The following four tiered configurations span the speed/quality tradeoff for workloads that mix `people_search` with `web_search` and `fetch_url`. Each tier defines a model, reasoning effort, tool selection, per-tool token budgets, and step limits. Use them as starting points and adjust per your latency, depth, and accuracy needs. | Tier | Model | Reasoning | Tools | Max Steps | Use When | | ----------------- | ------------------------------- | --------- | ------------------------------------------ | --------- | --------------------------------------------------------------------------- | | **pro** | `openai/gpt-5-mini` | medium | `people_search`, `web_search`, `fetch_url` | 5 | Balanced people/web research with moderate depth | | **deep** | `google/gemini-3-flash-preview` | high | `people_search`, `web_search`, `fetch_url` | 10 | Deeper analysis when latency budget is moderate but quality matters | | **advanced-deep** | `openai/gpt-5` | medium | `people_search`, `web_search`, `fetch_url` | 10 | High-quality, multi-step research with long context | | **ultra-deep** | `openai/gpt-5.5` | high | `people_search`, `web_search`, `fetch_url` | 50 | Maximum-depth investigations with the largest token budgets and step counts | The `bigtokens` settings used by pro, deep, and advanced-deep refer to `max_tokens=10000`, `max_tokens_per_page=1000`, `max_results_per_query=10`, and `max_results_per_request=30` on the `people_search` and `web_search` tools. The `xltokens` settings used by ultra-deep refer to `max_tokens=20000`, `max_tokens_per_page=2000`, `max_results_per_query=30`, and `max_results_per_request=50`. **ultra-deep heads-up:** `openai/gpt-5.5` with high reasoning and streaming may be flaky upstream. If requests hang, fall back to `medium` reasoning effort or disable streaming. ### pro Balanced configuration with all three tools enabled and moderate reasoning effort. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="openai/gpt-5-mini", reasoning={"effort": "medium"}, tools=[ { "type": "people_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30, }, { "type": "web_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30, }, {"type": "fetch_url"}, ], max_steps=5, input="Find the head of platform engineering at Notion and summarize their background.", ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: 'openai/gpt-5-mini', reasoning: { effort: 'medium' }, tools: [ { type: 'people_search', max_tokens: 10000, max_tokens_per_page: 1000, max_results_per_query: 10, max_results_per_request: 30, }, { type: 'web_search', max_tokens: 10000, max_tokens_per_page: 1000, max_results_per_query: 10, max_results_per_request: 30, }, { type: 'fetch_url' }, ], max_steps: 5, input: 'Find the head of platform engineering at Notion and summarize their background.', }); console.log(response.output_text); ``` ```bash cURL theme={null} curl -X POST "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5-mini", "reasoning": {"effort": "medium"}, "tools": [ { "type": "people_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30 }, { "type": "web_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30 }, {"type": "fetch_url"} ], "max_steps": 5, "input": "Find the head of platform engineering at Notion and summarize their background." }' ``` ### deep Higher reasoning effort and step count with a generous output budget for fuller multi-source answers. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="google/gemini-3-flash-preview", reasoning={"effort": "high"}, tools=[ { "type": "people_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30, }, { "type": "web_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30, }, {"type": "fetch_url"}, ], max_steps=10, max_output_tokens=16000, input="Map the executive team at a mid-size SaaS company and explain each leader's prior roles.", ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: 'google/gemini-3-flash-preview', reasoning: { effort: 'high' }, tools: [ { type: 'people_search', max_tokens: 10000, max_tokens_per_page: 1000, max_results_per_query: 10, max_results_per_request: 30, }, { type: 'web_search', max_tokens: 10000, max_tokens_per_page: 1000, max_results_per_query: 10, max_results_per_request: 30, }, { type: 'fetch_url' }, ], max_steps: 10, max_output_tokens: 16000, input: "Map the executive team at a mid-size SaaS company and explain each leader's prior roles.", }); console.log(response.output_text); ``` ```bash cURL theme={null} curl -X POST "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "google/gemini-3-flash-preview", "reasoning": {"effort": "high"}, "tools": [ { "type": "people_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30 }, { "type": "web_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30 }, {"type": "fetch_url"} ], "max_steps": 10, "max_output_tokens": 16000, "input": "Map the executive team at a mid-size SaaS company and explain each leader'\''s prior roles." }' ``` ### advanced-deep A frontier-model configuration for high-quality, multi-step research when latency budget is generous. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="openai/gpt-5", reasoning={"effort": "medium"}, tools=[ { "type": "people_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30, }, { "type": "web_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30, }, {"type": "fetch_url"}, ], max_steps=10, input="Identify the top product leaders across three competitors and compare their backgrounds.", ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: 'openai/gpt-5', reasoning: { effort: 'medium' }, tools: [ { type: 'people_search', max_tokens: 10000, max_tokens_per_page: 1000, max_results_per_query: 10, max_results_per_request: 30, }, { type: 'web_search', max_tokens: 10000, max_tokens_per_page: 1000, max_results_per_query: 10, max_results_per_request: 30, }, { type: 'fetch_url' }, ], max_steps: 10, input: 'Identify the top product leaders across three competitors and compare their backgrounds.', }); console.log(response.output_text); ``` ```bash cURL theme={null} curl -X POST "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5", "reasoning": {"effort": "medium"}, "tools": [ { "type": "people_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30 }, { "type": "web_search", "max_tokens": 10000, "max_tokens_per_page": 1000, "max_results_per_query": 10, "max_results_per_request": 30 }, {"type": "fetch_url"} ], "max_steps": 10, "input": "Identify the top product leaders across three competitors and compare their backgrounds." }' ``` ### ultra-deep Maximum-depth configuration with the largest token budgets, the highest step count, and `xltokens` per-tool settings. Best for exhaustive investigations. `openai/gpt-5.5` with high reasoning and streaming may be flaky upstream. If requests hang, switch to `medium` effort or use a non-streaming call. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="openai/gpt-5.5", reasoning={"effort": "high"}, tools=[ { "type": "people_search", "max_tokens": 20000, "max_tokens_per_page": 2000, "max_results_per_query": 30, "max_results_per_request": 50, }, { "type": "web_search", "max_tokens": 20000, "max_tokens_per_page": 2000, "max_results_per_query": 30, "max_results_per_request": 50, }, {"type": "fetch_url"}, ], max_steps=50, max_output_tokens=32000, input="Build a complete organizational map of a target company, including reporting lines and prior employment history for every named leader.", ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: 'openai/gpt-5.5', reasoning: { effort: 'high' }, tools: [ { type: 'people_search', max_tokens: 20000, max_tokens_per_page: 2000, max_results_per_query: 30, max_results_per_request: 50, }, { type: 'web_search', max_tokens: 20000, max_tokens_per_page: 2000, max_results_per_query: 30, max_results_per_request: 50, }, { type: 'fetch_url' }, ], max_steps: 50, max_output_tokens: 32000, input: 'Build a complete organizational map of a target company, including reporting lines and prior employment history for every named leader.', }); console.log(response.output_text); ``` ```bash cURL theme={null} curl -X POST "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.5", "reasoning": {"effort": "high"}, "tools": [ { "type": "people_search", "max_tokens": 20000, "max_tokens_per_page": 2000, "max_results_per_query": 30, "max_results_per_request": 50 }, { "type": "web_search", "max_tokens": 20000, "max_tokens_per_page": 2000, "max_results_per_query": 30, "max_results_per_request": 50 }, {"type": "fetch_url"} ], "max_steps": 50, "max_output_tokens": 32000, "input": "Build a complete organizational map of a target company, including reporting lines and prior employment history for every named leader." }' ``` ## Parameters | Parameter | Type | Required | Description | | ------------------------- | ------- | -------- | --------------------------------------------------------------------------------- | | `type` | string | Yes | Must be `"people_search"`. | | `max_tokens` | integer | No | Maximum total tokens for people-search context when using explicit token budgets. | | `max_tokens_per_page` | integer | No | Maximum tokens extracted per result page when using explicit token budgets. | | `max_results_per_query` | integer | No | Maximum results returned for each generated people-search query. | | `max_results_per_request` | integer | No | Maximum results returned across the request. | ## Response Shape When `people_search` runs, the response can include search-style result details before the final assistant message. The final `usage` object includes token counts, cost details, and `tool_calls_details.people_search.invocation` when tool-call usage is reported. ```json theme={null} { "output": [ { "type": "search_results", "queries": ["head of platform engineering Notion"], "results": [ { "id": 1, "url": "https://example.com/profile", "title": "Example professional profile", "snippet": "A short snippet describing the professional result.", "source": "web" } ] }, { "type": "message", "role": "assistant", "content": [ { "type": "output_text", "text": "The answer generated from people-search results." } ] } ], "usage": { "tool_calls_details": { "people_search": { "invocation": 1 } } } } ``` ## Pricing Each invocation of the `people_search` tool is billed at **\$5 per 1,000 tool invocations**. See the [Pricing](/docs/getting-started/pricing) page for full details. ## Next Steps Search the web for source-grounded context. Fetch full content from known URLs. Retrieve structured financial and market data. Get started with the Agent API. # Web Search Source: https://docs.perplexity.ai/docs/agent-api/tools/web-search Search the web from the Agent API with filters, search configs, pricing, parameters, and response fields. ## Overview The `web_search` tool lets the model search the web during an Agent API request. Use it for current information, recent news, source-grounded research, and questions that need information beyond the model's training data. Enable the tool by adding it to the `tools` array. The model decides when to call it based on your prompt and instructions. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="openai/gpt-5.5", input="What are the latest AI infrastructure announcements this week?", tools=[ { "type": "web_search", "snippet_mode": "medium" } ], instructions="Search for current, source-grounded information before answering.", ) print(response.output_text) ``` ```typescript Typescript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: 'openai/gpt-5.5', input: 'What are the latest AI infrastructure announcements this week?', tools: [ { type: 'web_search' as const, snippet_mode: 'medium', }, ], instructions: 'Search for current, source-grounded information before answering.', }); console.log(response.output_text); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.5", "input": "What are the latest AI infrastructure announcements this week?", "tools": [ { "type": "web_search", "snippet_mode": "medium" } ], "instructions": "Search for current, source-grounded information before answering." }' | jq ``` ## Search Configs Start with `low`, `medium`, or `high` for search context sizing. These static configs are the recommended default because they keep the request readable and let Perplexity tune the underlying token budgets over time. | Config | Best for | Tradeoff | | -------- | -------------------------------------------------------------------------------- | -------------------------------------- | | `low` | Simple facts, lightweight lookups, cost-sensitive traffic | Lowest cost and fastest search context | | `medium` | General research, product comparisons, most production defaults | Balanced cost, latency, and context | | `high` | Source-heavy answers, complex research, queries where missing details are costly | More context and higher cost | ```python Python theme={null} tools = [ { "type": "web_search", "snippet_mode": "high" } ] ``` ```typescript Typescript theme={null} const tools = [ { type: 'web_search' as const, snippet_mode: 'high', }, ]; ``` ```bash cURL theme={null} "tools": [ { "type": "web_search", "snippet_mode": "high" } ] ``` ### Advanced Use explicit token budgeting when you need to pin exact budgets for cost controls, latency controls, or evaluations. Set `max_tokens` to cap total search context across results, and set `max_tokens_per_page` to cap content extracted from each result page. Users can choose a static config by setting `snippet_mode` to `low`, `medium`, or `high`, or choose dynamic behavior through explicit token budgeting. At any time, you can override the `low`, `medium`, or `high` config by passing explicit `max_tokens` and `max_tokens_per_page` values. ```python Python theme={null} response = client.responses.create( model="openai/gpt-5.5", input="Find recent government guidance on AI procurement.", tools=[ { "type": "web_search", "max_tokens": 6000, "max_tokens_per_page": 1200, "filters": { "search_domain_filter": [".gov"], "search_recency_filter": "month" } } ], ) ``` ```typescript Typescript theme={null} const response = await client.responses.create({ model: 'openai/gpt-5.5', input: 'Find recent government guidance on AI procurement.', tools: [ { type: 'web_search' as const, max_tokens: 6000, max_tokens_per_page: 1200, filters: { search_domain_filter: ['.gov'], search_recency_filter: 'month', }, }, ], }); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.5", "input": "Find recent government guidance on AI procurement.", "tools": [ { "type": "web_search", "max_tokens": 6000, "max_tokens_per_page": 1200, "filters": { "search_domain_filter": [".gov"], "search_recency_filter": "month" } } ] }' | jq ``` ## Filters Use filters to constrain the sources, dates, and location context used by `web_search`. See the full [Search Filters](/docs/agent-api/filters) guide for examples and edge cases. | Filter | Type | Description | | ---------------------------- | ---------------- | ------------------------------------------------------------------------------------- | | `search_domain_filter` | array of strings | Include or exclude up to 20 domains or URLs. Prefix entries with `-` to exclude them. | | `search_recency_filter` | string | Restrict results to `"hour"`, `"day"`, `"week"`, `"month"`, or `"year"`. | | `search_after_date_filter` | string | Include results published after a date in `MM/DD/YYYY` format. | | `search_before_date_filter` | string | Include results published before a date in `MM/DD/YYYY` format. | | `last_updated_after_filter` | string | Include results last updated after a date in `MM/DD/YYYY` format. | | `last_updated_before_filter` | string | Include results last updated before a date in `MM/DD/YYYY` format. | | `user_location` | object | Personalize search by country, region, city, latitude, and longitude. | Use `search_domain_filter` in either allowlist mode or denylist mode, not both. For example, `["nasa.gov", "wikipedia.org"]` includes only those domains, while `["-reddit.com", "-pinterest.com"]` excludes those domains. ```python Python theme={null} response = client.responses.create( model="openai/gpt-5.5", input="What changed in US AI policy this month?", tools=[ { "type": "web_search", "snippet_mode": "medium", "filters": { "search_domain_filter": [".gov"], "search_recency_filter": "month" }, "user_location": { "country": "US" } } ], ) ``` ```typescript Typescript theme={null} const response = await client.responses.create({ model: 'openai/gpt-5.5', input: 'What changed in US AI policy this month?', tools: [ { type: 'web_search' as const, snippet_mode: 'medium', filters: { search_domain_filter: ['.gov'], search_recency_filter: 'month', }, user_location: { country: 'US', }, }, ], }); ``` ```bash cURL theme={null} curl https://api.perplexity.ai/v1/agent \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.5", "input": "What changed in US AI policy this month?", "tools": [ { "type": "web_search", "snippet_mode": "medium", "filters": { "search_domain_filter": [".gov"], "search_recency_filter": "month" }, "user_location": { "country": "US" } } ] }' | jq ``` ## Parameters | Parameter | Type | Required | Description | | --------------------- | ------- | -------- | ----------------------------------------------------------------------- | | `type` | string | Yes | Must be `"web_search"`. | | `snippet_mode` | string | No | Static search config: `"low"`, `"medium"`, or `"high"`. | | `filters` | object | No | Domain and date filters. See [Search Filters](/docs/agent-api/filters). | | `user_location` | object | No | Location context for search personalization. | | `max_tokens` | integer | No | Maximum total tokens for search context. | | `max_tokens_per_page` | integer | No | Maximum tokens extracted from each search result page. | ## Response Shape When `web_search` runs, the response can include a `search_results` output item before the final assistant message. The final `usage` object includes token counts, cost details, and `tool_calls_details.web_search.invocation` when tool-call usage is reported. ```json theme={null} { "output": [ { "type": "search_results", "queries": ["AI infrastructure announcements"], "results": [ { "id": 1, "url": "https://example.com/news", "title": "Example AI infrastructure announcement", "snippet": "A short snippet from the search result.", "date": "2026-05-01", "last_updated": "2026-05-01", "source": "web" } ] }, { "type": "message", "role": "assistant", "content": [ { "type": "output_text", "text": "The answer generated from the search results." } ] } ], "usage": { "input_tokens": 1200, "output_tokens": 300, "total_tokens": 1500, "tool_calls_details": { "web_search": { "invocation": 1 } } } } ``` ## Pricing `web_search` is billed at **$5 per 1,000 search calls** (**$0.005 per search**). Model token usage is billed separately according to Agent API token pricing. Pricing follows the same pattern as other tool calls: pay for tool invocations plus model tokens. See [Pricing](/docs/getting-started/pricing). ## Next Steps Fetch full content from known URLs. Control domains, dates, recency, and location context. Use optimized presets for common Agent API workloads. View complete endpoint documentation. # Academic and Scholarly Search Source: https://docs.perplexity.ai/docs/cookbook/articles/academic-search/README Use the Agent API's domain filtering to restrict search to academic sources, extract DOIs and paper metadata, build citation chains, and create research summaries with proper attribution This guide shows how to use the Agent API's `search_domain_filter` to restrict search results to academic and scholarly sources. You will learn how to extract paper metadata (DOIs, authors, publication dates), build citation chains across related papers, and produce properly attributed research summaries. The `search_domain_filter` parameter on the Agent API's `web_search` tool controls which domains the search draws from. By filtering to academic domains like `arxiv.org`, `nature.com`, and `.edu`, you restrict results to peer-reviewed journals, preprint servers, and academic databases. For more on filtering, see the [Agent API Filters](/docs/agent-api/filters) docs. ## Prerequisites Install the Perplexity SDK: ```bash Python theme={null} pip install perplexityai ``` ```bash TypeScript theme={null} npm install @perplexity-ai/perplexity_ai ``` If you don't have an API key yet: Navigate to the **API Keys** tab in the API Portal and generate a new key. Then export your API key as an environment variable: ```bash theme={null} export PERPLEXITY_API_KEY="your-api-key" ``` ## Basic Academic Search Use `search_domain_filter` to restrict the Agent API's `web_search` tool to academic sources only. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() ACADEMIC_DOMAINS = [ "arxiv.org", "pubmed.ncbi.nlm.nih.gov", "nature.com", "science.org", ".edu", "scholar.google.com", "semanticscholar.org", ] response = client.responses.create( model="openai/gpt-5.4", input="What are the latest findings on the relationship between gut microbiome and mental health?", tools=[{ "type": "web_search", "filters": { "search_domain_filter": ACADEMIC_DOMAINS, }, }], instructions="Focus on peer-reviewed academic sources. Cite papers with authors and publication years when possible.", ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const ACADEMIC_DOMAINS = [ "arxiv.org", "pubmed.ncbi.nlm.nih.gov", "nature.com", "science.org", ".edu", "scholar.google.com", "semanticscholar.org", ]; const response = await client.responses.create({ model: "openai/gpt-5.4", input: "What are the latest findings on the relationship between gut microbiome and mental health?", tools: [{ type: "web_search" as const, filters: { search_domain_filter: ACADEMIC_DOMAINS, }, }], instructions: "Focus on peer-reviewed academic sources. Cite papers with authors and publication years when possible.", }); console.log(response.output_text); ``` ```bash curl theme={null} curl "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "model": "openai/gpt-5.4", "input": "What are the latest findings on the relationship between gut microbiome and mental health?", "tools": [{"type": "web_search", "filters": {"search_domain_filter": ["arxiv.org", "pubmed.ncbi.nlm.nih.gov", "nature.com", "science.org", ".edu"]}}], "instructions": "Focus on peer-reviewed academic sources. Cite papers with authors and publication years when possible." }' ``` Academic domain filtering targets papers from PubMed, arXiv, Google Scholar, Semantic Scholar, and major journal publishers. Combine `search_domain_filter` with clear `instructions` to ensure the model focuses on peer-reviewed or pre-print academic content. ## Extracting Paper Metadata Use structured outputs to extract detailed paper metadata from academic search results. ```python Python theme={null} import json from perplexity import Perplexity client = Perplexity() # Use Agent API with web_search for structured extraction response = client.responses.create( model="openai/gpt-5.4", input="Find the 5 most cited recent papers on transformer architectures in computer vision (Vision Transformers).", tools=[{"type": "web_search"}], instructions=( "Search for academic papers only. For each paper, extract the title, authors, " "publication year, journal or venue, DOI if available, and a one-sentence summary of the key contribution." ), response_format={ "type": "json_schema", "json_schema": { "name": "academic_papers", "schema": { "type": "object", "properties": { "query": {"type": "string"}, "papers": { "type": "array", "items": { "type": "object", "properties": { "title": {"type": "string"}, "authors": {"type": "string"}, "year": {"type": "integer"}, "venue": {"type": "string"}, "doi": {"type": "string"}, "key_contribution": {"type": "string"}, }, "required": ["title", "authors", "year", "venue", "doi", "key_contribution"], "additionalProperties": false, }, }, }, "required": ["query", "papers"], "additionalProperties": false, }, }, }, ) data = json.loads(response.output_text) print(f"Query: {data['query']}\n") for paper in data["papers"]: print(f" {paper['title']}") print(f" Authors: {paper['authors']}") print(f" Venue: {paper['venue']} ({paper['year']})") if paper["doi"]: print(f" DOI: {paper['doi']}") print(f" Contribution: {paper['key_contribution']}") print() ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "openai/gpt-5.4", input: "Find the 5 most cited recent papers on transformer architectures in computer vision (Vision Transformers).", tools: [{ type: "web_search" }], instructions: "Search for academic papers only. For each paper, extract the title, authors, publication year, journal or venue, DOI if available, and a one-sentence summary of the key contribution.", response_format: { type: "json_schema", json_schema: { name: "academic_papers", schema: { type: "object", properties: { query: { type: "string" }, papers: { type: "array", items: { type: "object", properties: { title: { type: "string" }, authors: { type: "string" }, year: { type: "integer" }, venue: { type: "string" }, doi: { type: "string" }, key_contribution: { type: "string" }, }, required: ["title", "authors", "year", "venue", "doi", "key_contribution"], }, }, }, required: ["query", "papers"], }, }, }, }); const data = JSON.parse(response.output_text); console.log(`Query: ${data.query}\n`); for (const paper of data.papers) { console.log(` ${paper.title}`); console.log(` Authors: ${paper.authors}`); console.log(` Venue: ${paper.venue} (${paper.year})`); if (paper.doi) console.log(` DOI: ${paper.doi}`); console.log(` Contribution: ${paper.key_contribution}`); console.log(); } ``` ## Building Citation Chains Trace how papers cite each other to understand the evolution of an idea across the literature. ```python Python theme={null} import json from perplexity import Perplexity client = Perplexity() def find_citing_papers(paper_title: str, depth: int = 0, max_depth: int = 2) -> dict: """Recursively find papers that cite a given paper.""" indent = " " * depth print(f"{indent}Searching citations for: {paper_title}...") response = client.responses.create( model="openai/gpt-5.4", input=f"What are the 3 most important papers that directly cite or build upon '{paper_title}'?", tools=[{"type": "web_search"}], instructions="Focus on academic papers only. Return papers that explicitly reference or extend the given work.", response_format={ "type": "json_schema", "json_schema": { "name": "citing_papers", "schema": { "type": "object", "properties": { "source_paper": {"type": "string"}, "citing_papers": { "type": "array", "items": { "type": "object", "properties": { "title": {"type": "string"}, "authors": {"type": "string"}, "year": {"type": "integer"}, "relationship": {"type": "string"}, }, "required": ["title", "authors", "year", "relationship"], "additionalProperties": false, }, }, }, "required": ["source_paper", "citing_papers"], "additionalProperties": false, }, }, }, ) data = json.loads(response.output_text) result = { "paper": paper_title, "cited_by": [], } for citing in data["citing_papers"]: entry = { "title": citing["title"], "authors": citing["authors"], "year": citing["year"], "relationship": citing["relationship"], } # Recurse for deeper citation chains if depth < max_depth: entry["cited_by"] = find_citing_papers(citing["title"], depth + 1, max_depth).get("cited_by", []) result["cited_by"].append(entry) return result # Start with a foundational paper chain = find_citing_papers("Attention Is All You Need", max_depth=1) print(json.dumps(chain, indent=2)) ``` Citation chain depth grows exponentially. Keep `max_depth` low (1-2) to avoid excessive API calls. For comprehensive citation graphs, use dedicated tools like Semantic Scholar's API alongside Perplexity for summaries. ## Research Summary with Attribution Generate a research summary that properly attributes each claim to its source paper. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() ACADEMIC_DOMAINS = [ "arxiv.org", "pubmed.ncbi.nlm.nih.gov", "nature.com", "science.org", ".edu", "scholar.google.com", ] def academic_research_summary(topic: str) -> str: """Generate an academic research summary with proper citations.""" response = client.responses.create( model="openai/gpt-5.4", input=( f"Provide a comprehensive academic literature review on: {topic}. " "Include specific findings, methodologies, and conclusions from recent papers. " "Cite each claim with its source." ), tools=[{ "type": "web_search", "filters": { "search_domain_filter": ACADEMIC_DOMAINS, }, }], instructions=( "Search for peer-reviewed academic sources only. For each claim, " "attribute it to the specific paper with author names and year. " "Format the output as a structured literature review with a references section." ), ) return f"# Literature Review: {topic}\n\n{response.output_text}" report = academic_research_summary( "the effectiveness of large language models for automated code review" ) print(report) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const ACADEMIC_DOMAINS = [ "arxiv.org", "pubmed.ncbi.nlm.nih.gov", "nature.com", "science.org", ".edu", "scholar.google.com", ]; async function academicResearchSummary(topic: string): Promise { const response = await client.responses.create({ model: "openai/gpt-5.4", input: `Provide a comprehensive academic literature review on: ${topic}. Include specific findings, methodologies, and conclusions from recent papers. Cite each claim with its source.`, tools: [{ type: "web_search" as const, filters: { search_domain_filter: ACADEMIC_DOMAINS, }, }], instructions: "Search for peer-reviewed academic sources only. For each claim, attribute it to the specific paper with author names and year. Format the output as a structured literature review with a references section.", }); return `# Literature Review: ${topic}\n\n${response.output_text}`; } const report = await academicResearchSummary( "the effectiveness of large language models for automated code review" ); console.log(report); ``` ## Multi-Field Academic Search Use field-specific domain filters to search across different academic disciplines. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() ACADEMIC_DOMAINS = { "biomedical": ["pubmed.ncbi.nlm.nih.gov", "nih.gov", "thelancet.com", "nejm.org"], "computer_science": ["arxiv.org", "dl.acm.org", "ieee.org", "openreview.net"], "social_science": ["jstor.org", "ssrn.com", "journals.sagepub.com"], } def field_specific_search(query: str, field: str) -> dict: """Search academic literature within a specific field.""" domains = ACADEMIC_DOMAINS.get(field, []) response = client.responses.create( model="openai/gpt-5.4", input=query, tools=[{ "type": "web_search", "filters": { "search_domain_filter": domains, }, }] if domains else [{"type": "web_search"}], instructions=f"Search for peer-reviewed academic sources in the {field.replace('_', ' ')} field. Cite papers with authors and years.", ) return { "field": field, "content": response.output_text, } # Search across multiple fields query = "What are the ethical implications of AI-generated content?" fields = ["computer_science", "social_science"] for field in fields: result = field_specific_search(query, field) print(f"\n{'='*60}") print(f"Field: {result['field']}") print(f"{'='*60}") print(result["content"][:500]) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const ACADEMIC_DOMAINS: Record = { biomedical: ["pubmed.ncbi.nlm.nih.gov", "nih.gov", "thelancet.com", "nejm.org"], computer_science: ["arxiv.org", "dl.acm.org", "ieee.org", "openreview.net"], social_science: ["jstor.org", "ssrn.com", "journals.sagepub.com"], }; async function fieldSpecificSearch(query: string, field: string) { const domains = ACADEMIC_DOMAINS[field] ?? []; const response = await client.responses.create({ model: "openai/gpt-5.4", input: query, tools: domains.length > 0 ? [{ type: "web_search" as const, filters: { search_domain_filter: domains } }] : [{ type: "web_search" as const }], instructions: `Search for peer-reviewed academic sources in the ${field.replace("_", " ")} field. Cite papers with authors and years.`, }); return { field, content: response.output_text, }; } const query = "What are the ethical implications of AI-generated content?"; const fields = ["computer_science", "social_science"]; for (const field of fields) { const result = await fieldSpecificSearch(query, field); console.log(`\n${"=".repeat(60)}`); console.log(`Field: ${result.field}`); console.log("=".repeat(60)); console.log(result.content.slice(0, 500)); } ``` ## Tips and Best Practices 1. **Use `search_domain_filter` with academic domains** to restrict results to peer-reviewed sources. Target domains like `arxiv.org`, `nature.com`, `pubmed.ncbi.nlm.nih.gov`, and `.edu`. 2. **Use `instructions` to guide academic focus.** Tell the model to prioritize peer-reviewed papers, cite authors and years, and focus on specific fields. 3. **Use field-specific domain lists** to narrow results to specific publishers or databases (e.g., PubMed for biomedical, arXiv for CS). 4. **Use structured outputs** for metadata extraction. JSON schemas ensure consistent paper metadata across queries. 5. **Request specific details in your prompt.** Ask for "authors, year, journal, and key findings" to get more complete metadata in the response. 6. **Combine `search_domain_filter` with `search_recency_filter`** for time-sensitive research. Use `"week"`, `"month"`, or `"year"` to find recent publications. ## Next Steps Full reference for domain, recency, and location filters on the Agent API. Extract typed JSON for paper metadata and research findings. Control which domains the search includes or excludes. # Deep Research Workflows Source: https://docs.perplexity.ai/docs/cookbook/articles/async-deep-research/README Use the Agent API deep-research preset for comprehensive, multi-step research tasks — synchronous usage, batch concurrency, result processing, and production patterns This guide shows how to use the Agent API's `deep-research` preset for comprehensive, multi-step research tasks. Deep research performs extended web research, following chains of sources and synthesizing detailed answers. You will learn how to run deep research queries, process results, handle long-running requests, and run batch research workflows. The `deep-research` preset on the Agent API performs multi-step web research, following chains of sources and synthesizing comprehensive answers. It automatically selects the best model and configures tools for deep research. For more on presets, see the [Agent API Presets](/docs/agent-api/presets) docs. ## Prerequisites Install the Perplexity SDK: ```bash Python theme={null} pip install perplexityai ``` ```bash TypeScript theme={null} npm install @perplexity-ai/perplexity_ai ``` If you don't have an API key yet: Navigate to the **API Keys** tab in the API Portal and generate a new key. Then export your API key as an environment variable: ```bash theme={null} export PERPLEXITY_API_KEY="your-api-key" ``` ## Basic Deep Research Use the `deep-research` preset for comprehensive research queries. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( preset="deep-research", input=( "Provide a comprehensive analysis of the current state of nuclear fusion research. " "Cover the main approaches (tokamak, stellarator, inertial confinement, laser-driven), " "key milestones achieved in the past 2 years, major private companies involved, " "and realistic timelines for commercial fusion power." ), ) print(f"Model: {response.model}") print(f"\n{response.output_text}") ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ preset: "deep-research", input: "Provide a comprehensive analysis of the current state of nuclear fusion research. Cover the main approaches (tokamak, stellarator, inertial confinement, laser-driven), key milestones achieved in the past 2 years, major private companies involved, and realistic timelines for commercial fusion power.", }); console.log(`Model: ${response.model}`); console.log(`\n${response.output_text}`); ``` ```bash curl theme={null} curl "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "deep-research", "input": "Provide a comprehensive analysis of the current state of nuclear fusion research." }' ``` The `deep-research` preset automatically selects the best model and configures tools for multi-step research. You don't need to specify a model or tools when using presets. ## Processing Deep Research Results Extract and format the key parts of a deep research response. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() def deep_research(query: str) -> dict: """Run a deep research query and extract structured results.""" print(f"Researching: {query[:80]}...") response = client.responses.create( preset="deep-research", input=query, ) content = response.output_text usage = response.usage return { "content": content, "model": response.model, "tokens": { "input": usage.input_tokens if usage else 0, "output": usage.output_tokens if usage else 0, }, "word_count": len(content.split()), } if __name__ == "__main__": output = deep_research( "What is the current state of solid-state battery technology? " "Cover the leading companies, technical challenges remaining, " "and expected timeline for mass production in EVs." ) print(f"\nModel: {output['model']}") print(f"Words: {output['word_count']}") print(f"Tokens: {output['tokens']['input']} in, {output['tokens']['output']} out") print(f"\n{'='*60}\n") print(output["content"][:2000]) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); async function deepResearch(query: string) { console.log(`Researching: ${query.slice(0, 80)}...`); const response = await client.responses.create({ preset: "deep-research", input: query, }); const content = response.output_text; const usage = response.usage; return { content, model: response.model, tokens: { input: usage?.input_tokens ?? 0, output: usage?.output_tokens ?? 0, }, wordCount: content.split(/\s+/).length, }; } const output = await deepResearch( "What is the current state of solid-state battery technology? Cover the leading companies, technical challenges remaining, and expected timeline for mass production in EVs." ); console.log(`\nModel: ${output.model}`); console.log(`Words: ${output.wordCount}`); console.log(`Tokens: ${output.tokens.input} in, ${output.tokens.output} out`); console.log(`\n${"=".repeat(60)}\n`); console.log(output.content.slice(0, 2000)); ``` ## Deep Research with Domain Filtering Combine deep research with domain filters for focused, authoritative research. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() # Deep research restricted to government and academic sources response = client.responses.create( model="openai/gpt-5.2", input=( "Analyze the current regulatory landscape for AI in healthcare. " "Cover FDA guidance, EU AI Act implications, and recent enforcement actions." ), tools=[{ "type": "web_search", "filters": { "search_domain_filter": [".gov", ".europa.eu", "who.int", "nature.com", ".edu"], }, }], instructions=( "Conduct thorough research using only government and academic sources. " "Provide specific regulatory references, dates, and policy details." ), ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "openai/gpt-5.2", input: "Analyze the current regulatory landscape for AI in healthcare. Cover FDA guidance, EU AI Act implications, and recent enforcement actions.", tools: [{ type: "web_search" as const, filters: { search_domain_filter: [".gov", ".europa.eu", "who.int", "nature.com", ".edu"], }, }], instructions: "Conduct thorough research using only government and academic sources. Provide specific regulatory references, dates, and policy details.", }); console.log(response.output_text); ``` ## Batch Research with Concurrency Run multiple deep research queries concurrently using asyncio and the Perplexity SDK. ```python Python theme={null} import asyncio import time from perplexity import AsyncPerplexity async def single_research(client: AsyncPerplexity, query: str) -> dict: """Run a single deep research query.""" start = time.time() try: response = await client.responses.create( preset="deep-research", input=query, ) return { "query": query, "content": response.output_text, "model": response.model, "elapsed": time.time() - start, } except Exception as e: return {"query": query, "error": str(e), "elapsed": time.time() - start} async def batch_research(queries: list[str], max_concurrent: int = 3) -> list[dict]: """Run multiple deep research queries with concurrency limits.""" semaphore = asyncio.Semaphore(max_concurrent) async def limited_research(client, query): async with semaphore: return await single_research(client, query) async with AsyncPerplexity() as client: tasks = [limited_research(client, q) for q in queries] return await asyncio.gather(*tasks) if __name__ == "__main__": queries = [ "What are the latest advances in room-temperature superconductors?", "What is the current state of quantum error correction?", "What are the most promising approaches to carbon capture and storage?", ] print(f"Starting batch research: {len(queries)} queries\n") results = asyncio.run(batch_research(queries, max_concurrent=3)) for r in results: status = "OK" if "content" in r else f"FAILED ({r.get('error')})" word_count = len(r.get("content", "").split()) if "content" in r else 0 print(f" [{r['elapsed']:.0f}s] {r['query'][:60]}... → {status} ({word_count} words)") ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; interface ResearchResult { query: string; content?: string; model?: string; elapsed: number; error?: string; } const client = new Perplexity(); async function singleResearch(query: string): Promise { const start = Date.now(); try { const response = await client.responses.create({ preset: 'deep-research', input: query, }); return { query, content: response.output_text, model: response.model, elapsed: (Date.now() - start) / 1000, }; } catch (e) { return { query, error: String(e), elapsed: (Date.now() - start) / 1000 }; } } async function batchResearch(queries: string[], maxConcurrent = 3) { const results: ResearchResult[] = []; const queue = [...queries]; async function worker() { while (queue.length) { const q = queue.shift()!; results.push(await singleResearch(q)); } } await Promise.all( Array.from({ length: maxConcurrent }, () => worker()) ); return results; } const queries = [ 'What are the latest advances in room-temperature superconductors?', 'What is the current state of quantum error correction?', 'What are the most promising approaches to carbon capture and storage?', ]; console.log(`Starting batch research: ${queries.length} queries\n`); const results = await batchResearch(queries, 3); for (const r of results) { const status = r.content ? 'OK' : `FAILED (${r.error})`; const words = r.content ? r.content.split(/\s+/).length : 0; console.log(` [${r.elapsed.toFixed(0)}s] ${r.query.slice(0, 60)}... → ${status} (${words} words)`); } ``` Deep research queries consume significant compute resources. Keep concurrent requests to 3-5 to stay within rate limits and avoid throttling. Check your [rate limits](/docs/admin/rate-limits-usage-tiers) for specific thresholds. ## Tips and Best Practices 1. **Use the `deep-research` preset** for the simplest integration. It automatically selects the best model and configures tools. 2. **Combine with domain filters** when you need authoritative sources. Use `search_domain_filter` to restrict to specific domains. 3. **Use `instructions`** to guide the depth and focus of research. Be specific about what aspects to cover. 4. **Limit concurrency.** Running too many deep research queries simultaneously may trigger rate limits. Use a semaphore to cap concurrent requests to 3-5. 5. **Use the async client for batch workflows.** `AsyncPerplexity` enables concurrent requests without blocking. 6. **Set `max_output_tokens`** for cost control when you need shorter summaries rather than exhaustive reports. ## Next Steps Full reference for available presets including deep-research. Get started with the Agent API for multi-provider access and tools. Control which domains the search includes or excludes. Understand rate limits for research and batch workflows. # RAG with Perplexity Embeddings Source: https://docs.perplexity.ai/docs/cookbook/articles/embeddings-rag/README Build an end-to-end retrieval-augmented generation pipeline using Perplexity's standard and contextualized embedding models. This guide walks through building a complete retrieval-augmented generation (RAG) pipeline using Perplexity's Embeddings API and Agent API. It covers document chunking, embedding with both standard and contextualized models, building an in-memory vector index, querying for relevant context, and generating grounded answers. This guide focuses on the end-to-end pipeline. For API reference details on individual embedding types, see [Standard Embeddings](/docs/embeddings/standard-embeddings) and [Contextualized Embeddings](/docs/embeddings/contextualized-embeddings). ## Pipeline Overview A RAG pipeline retrieves relevant information from your own documents before generating an answer, grounding model responses in your data rather than relying solely on parametric knowledge. RAG Pipeline Diagram RAG Pipeline Diagram The steps are: 1. **Chunk** your source documents into manageable pieces with overlap. 2. **Embed** each chunk using a Perplexity embedding model. 3. **Index** the embeddings for similarity search. 4. **Query** by embedding the user question with the same model. 5. **Retrieve** the top-k most similar chunks. 6. **Generate** an answer by passing the retrieved context to the Agent API. ## Prerequisites Install the Perplexity SDK: ```bash Python theme={null} pip install perplexityai ``` ```bash TypeScript theme={null} npm install @perplexity-ai/perplexity_ai ``` If you don't have an API key yet: Navigate to the **API Keys** tab in the API Portal and generate a new key. Then export your API key as an environment variable: ```bash theme={null} export PERPLEXITY_API_KEY="your-api-key" ``` ## Document Chunking Split your documents into chunks small enough for the model's context window while preserving semantic coherence. Overlapping chunks ensure that information at chunk boundaries is not lost. ```python Python theme={null} def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]: """Split text into overlapping chunks by character count.""" chunks = [] start = 0 while start < len(text): end = start + chunk_size chunk = text[start:end].strip() if chunk: chunks.append(chunk) start += chunk_size - overlap return chunks document = """Retrieval-augmented generation (RAG) is a technique that combines information retrieval with text generation. Rather than relying solely on a language model's training data, RAG systems first search a knowledge base for relevant documents, then use those documents as context when generating a response. This reduces hallucinations and allows the system to provide answers grounded in specific, up-to-date sources.""" chunks = chunk_text(document, chunk_size=300, overlap=50) for i, chunk in enumerate(chunks): print(f"Chunk {i} ({len(chunk)} chars): {chunk[:60]}...") ``` ```typescript TypeScript theme={null} function chunkText(text: string, chunkSize: number = 500, overlap: number = 100): string[] { const chunks: string[] = []; let start = 0; while (start < text.length) { const end = start + chunkSize; const chunk = text.slice(start, end).trim(); if (chunk) chunks.push(chunk); start += chunkSize - overlap; } return chunks; } const document = `Retrieval-augmented generation (RAG) is a technique that combines information retrieval with text generation. Rather than relying solely on a language model's training data, RAG systems first search a knowledge base for relevant documents, then use those documents as context when generating a response. This reduces hallucinations and allows the system to provide answers grounded in specific, up-to-date sources.`; const chunks = chunkText(document, 300, 50); chunks.forEach((chunk, i) => { console.log(`Chunk ${i} (${chunk.length} chars): ${chunk.slice(0, 60)}...`); }); ``` A chunk size of 300-500 characters with 50-100 characters of overlap works well for most use cases. For structured documents (markdown, HTML), consider splitting on headings or paragraph boundaries instead of raw character counts. ## Embedding with the Standard Model Standard embeddings treat each text independently. Use them when chunks are self-contained and don't rely on surrounding context. ```python Python theme={null} import base64 import numpy as np from perplexity import Perplexity client = Perplexity() def decode_embedding(b64_string: str) -> np.ndarray: """Decode a base64-encoded int8 embedding to a float32 numpy array.""" return np.frombuffer(base64.b64decode(b64_string), dtype=np.int8).astype(np.float32) chunks = [ "RAG combines retrieval with generation to ground responses in real data.", "Document chunking splits text into overlapping segments for embedding.", "Cosine similarity measures the angle between two embedding vectors.", ] response = client.embeddings.create(input=chunks, model="pplx-embed-v1-4b") embeddings = [decode_embedding(emb.embedding) for emb in response.data] print(f"Embedded {len(embeddings)} chunks, each with {len(embeddings[0])} dimensions") ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); function decodeEmbedding(b64String: string): Int8Array { const buffer = Buffer.from(b64String, 'base64'); return new Int8Array(buffer.buffer, buffer.byteOffset, buffer.byteLength); } const chunks = [ "RAG combines retrieval with generation to ground responses in real data.", "Document chunking splits text into overlapping segments for embedding.", "Cosine similarity measures the angle between two embedding vectors.", ]; const response = await client.embeddings.create({ input: chunks, model: "pplx-embed-v1-4b" }); const embeddings = response.data.map(emb => decodeEmbedding(emb.embedding)); console.log(`Embedded ${embeddings.length} chunks, each with ${embeddings[0].length} dimensions`); ``` ## Embedding with the Contextualized Model Contextualized embeddings understand that chunks belong to the same document. The model uses cross-chunk attention so that each chunk's embedding incorporates information from its neighbors. The key API difference is the nested array structure: each inner array contains chunks from a single document. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() # Two source documents, each split into chunks doc1_chunks = [ "RAG combines retrieval with generation to produce grounded answers.", "The retrieval step searches a vector index for chunks similar to the query.", "The generation step uses retrieved context to produce a final response." ] doc2_chunks = [ "Embedding models convert text into dense vector representations.", "Cosine similarity is the standard metric for comparing embeddings." ] # Pass as nested arrays (one inner array per document) response = client.contextualized_embeddings.create( input=[doc1_chunks, doc2_chunks], model="pplx-embed-context-v1-4b" ) # Nested response: response.data[doc_idx].data[chunk_idx] for doc in response.data: for chunk in doc.data: print(f"Doc {doc.index}, Chunk {chunk.index}: {chunk.embedding[:20]}...") ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const doc1Chunks = [ "RAG combines retrieval with generation to produce grounded answers.", "The retrieval step searches a vector index for chunks similar to the query.", "The generation step uses retrieved context to produce a final response." ]; const doc2Chunks = [ "Embedding models convert text into dense vector representations.", "Cosine similarity is the standard metric for comparing embeddings." ]; // Pass as nested arrays (one inner array per document) const response = await client.contextualizedEmbeddings.create({ input: [doc1Chunks, doc2Chunks], model: "pplx-embed-context-v1-4b" }); // Nested response: response.data[docIdx].data[chunkIdx] for (const doc of response.data) { for (const chunk of doc.data) { console.log(`Doc ${doc.index}, Chunk ${chunk.index}: ${chunk.embedding.slice(0, 20)}...`); } } ``` **Chunk ordering matters.** Chunks within each document must be passed in their original sequential order. The contextualized model uses positional context to relate neighboring chunks, so shuffling them will degrade embedding quality. ## Querying a Contextualized Index When using contextualized embeddings, wrap each query as a single-element inner list (e.g., `[[query]]`) so the API treats it as a single-chunk document: ```python Python theme={null} from perplexity import Perplexity import base64, numpy as np client = Perplexity() def decode_embedding(b64: str) -> np.ndarray: return np.frombuffer(base64.b64decode(b64), dtype=np.int8).astype(np.float32) def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float: return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))) # Index with contextualized model (chunks share cross-chunk attention) doc_chunks = [ "RAG combines retrieval with generation to produce grounded answers.", "The retrieval step finds chunks similar to the user query.", "The generation step uses retrieved context to produce a final response.", ] ctx_response = client.contextualized_embeddings.create( input=[doc_chunks], # nested array: one inner list per document model="pplx-embed-context-v1-4b" ) index = [ {"embedding": decode_embedding(chunk.embedding), "text": doc_chunks[chunk.index]} for chunk in ctx_response.data[0].data ] # Query the index query = "How does retrieval work in RAG?" q_response = client.contextualized_embeddings.create( input=[[query]], model="pplx-embed-context-v1-4b" ) q_emb = decode_embedding(q_response.data[0].data[0].embedding) results = sorted(index, key=lambda x: cosine_similarity(q_emb, x["embedding"]), reverse=True) print(f"Top result: {results[0]['text']}") ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); function decodeEmbedding(b64: string): Int8Array { const buffer = Buffer.from(b64, 'base64'); return new Int8Array(buffer.buffer, buffer.byteOffset, buffer.byteLength); } function cosineSimilarity(a: Int8Array, b: Int8Array): number { let dot = 0, normA = 0, normB = 0; for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; normA += a[i] ** 2; normB += b[i] ** 2; } return dot / (Math.sqrt(normA) * Math.sqrt(normB)); } // Index with contextualized model const docChunks = [ "RAG combines retrieval with generation to produce grounded answers.", "The retrieval step finds chunks similar to the user query.", "The generation step uses retrieved context to produce a final response.", ]; const ctxResponse = await client.contextualizedEmbeddings.create({ input: [docChunks], // nested array: one inner array per document model: "pplx-embed-context-v1-4b" }); const index = ctxResponse.data[0].data.map(chunk => ({ embedding: decodeEmbedding(chunk.embedding), text: docChunks[chunk.index], })); // Query the index const query = "How does retrieval work in RAG?"; const qResponse = await client.contextualizedEmbeddings.create({ input: [[query]], model: "pplx-embed-context-v1-4b" }); const qEmb = decodeEmbedding(qResponse.data[0].data[0].embedding); const results = [...index].sort((a, b) => cosineSimilarity(qEmb, b.embedding) - cosineSimilarity(qEmb, a.embedding)); console.log(`Top result: ${results[0].text}`); ``` ## Building a Vector Index This example uses numpy for cosine similarity with a simple in-memory index. For production systems with millions of vectors, use a dedicated vector database (Pinecone, Weaviate, Qdrant, etc.). ```python Python theme={null} import base64 import numpy as np from perplexity import Perplexity client = Perplexity() def decode_embedding(b64_string: str) -> np.ndarray: return np.frombuffer(base64.b64decode(b64_string), dtype=np.int8).astype(np.float32) def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float: return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))) # Documents to index documents = { "RAG Overview": [ "Retrieval-augmented generation grounds LLM responses in external data.", "RAG reduces hallucinations by providing factual context to the model.", "A typical RAG pipeline has three stages: indexing, retrieval, and generation." ], "Embedding Models": [ "Embedding models map text to dense vector representations.", "Similar texts produce vectors that are close in the embedding space.", "Perplexity offers both standard and contextualized embedding models." ] } # Build index: list of (embedding, text, doc_title) tuples index = [] for title, chunks in documents.items(): response = client.embeddings.create(input=chunks, model="pplx-embed-v1-4b") for emb_obj in response.data: index.append({ "embedding": decode_embedding(emb_obj.embedding), "text": chunks[emb_obj.index], "doc_title": title }) print(f"Indexed {len(index)} chunks") ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); function decodeEmbedding(b64String: string): Int8Array { const buffer = Buffer.from(b64String, 'base64'); return new Int8Array(buffer.buffer, buffer.byteOffset, buffer.byteLength); } function cosineSimilarity(a: Int8Array, b: Int8Array): number { let dot = 0, normA = 0, normB = 0; for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; normA += a[i] * a[i]; normB += b[i] * b[i]; } return dot / (Math.sqrt(normA) * Math.sqrt(normB)); } const documents: Record = { "RAG Overview": [ "Retrieval-augmented generation grounds LLM responses in external data.", "RAG reduces hallucinations by providing factual context to the model.", "A typical RAG pipeline has three stages: indexing, retrieval, and generation." ], "Embedding Models": [ "Embedding models map text to dense vector representations.", "Similar texts produce vectors that are close in the embedding space.", "Perplexity offers both standard and contextualized embedding models." ] }; // Build index const index: { embedding: Int8Array; text: string; docTitle: string }[] = []; for (const [title, chunks] of Object.entries(documents)) { const response = await client.embeddings.create({ input: chunks, model: "pplx-embed-v1-4b" }); for (const embObj of response.data) { index.push({ embedding: decodeEmbedding(embObj.embedding), text: chunks[embObj.index], docTitle: title }); } } console.log(`Indexed ${index.length} chunks`); ``` ## Query Pipeline The full query pipeline embeds the user question, retrieves the top-k most similar chunks, and passes them as context to the Agent API for answer generation. ```python Python theme={null} def rag_query(question: str, index: list[dict], top_k: int = 3, min_score: float = 0.3) -> str: """Embed question -> retrieve similar chunks -> generate answer.""" # Step 1: Embed the question query_response = client.embeddings.create(input=[question], model="pplx-embed-v1-4b") query_emb = decode_embedding(query_response.data[0].embedding) # Step 2: Retrieve top-k chunks above the minimum similarity threshold scored = sorted( [{"score": cosine_similarity(query_emb, item["embedding"]), **item} for item in index], key=lambda x: x["score"], reverse=True )[:top_k] scored = [item for item in scored if item["score"] >= min_score] if not scored: return "No relevant context found for this question." # Include source attribution alongside each chunk context = "\n\n".join( f"[Source: {item['doc_title']}]\n{item['text']}" for item in scored ) # Step 3: Generate answer via Agent API response = client.responses.create( model="openai/gpt-5.4", input=question, instructions=( "Answer based only on the provided context. " "Cite sources by name when referencing specific information. " "If the context does not contain enough information, say so.\n\n" f"Context:\n{context}" ) ) return response.output_text answer = rag_query("What are the stages of a RAG pipeline?", index) print(answer) ``` ```typescript TypeScript theme={null} async function ragQuery(question: string, idx: typeof index, topK: number = 3, minScore: number = 0.3): Promise { // Step 1: Embed the question const qResponse = await client.embeddings.create({ input: [question], model: "pplx-embed-v1-4b" }); const qEmb = decodeEmbedding(qResponse.data[0].embedding); // Step 2: Retrieve top-k chunks above the minimum similarity threshold const scored = idx .map(item => ({ ...item, score: cosineSimilarity(qEmb, item.embedding) })) .sort((a, b) => b.score - a.score) .slice(0, topK) .filter(item => item.score >= minScore); if (scored.length === 0) { return "No relevant context found for this question."; } // Include source attribution alongside each chunk const context = scored .map(item => `[Source: ${item.docTitle}]\n${item.text}`) .join("\n\n"); // Step 3: Generate answer via Agent API const response = await client.responses.create({ model: "openai/gpt-5.4", input: question, instructions: `Answer based only on the provided context. Cite sources by name when referencing specific information. If the context does not contain enough information, say so.\n\nContext:\n${context}` }); return response.output_text; } const answer = await ragQuery("What are the stages of a RAG pipeline?", index); console.log(answer); ``` Start with `top_k=3` and `min_score=0.3` for most use cases. Raise `top_k` to 5–7 for broad questions or short chunks. Raise `min_score` to 0.5–0.7 if retrieved chunks contain irrelevant information. Lower it toward 0.2 for diverse or ambiguous queries. ## Standard vs Contextualized Comparison | Aspect | Standard (`pplx-embed-v1-4b`) | Contextualized (`pplx-embed-context-v1-4b`) | | --------------------- | ---------------------------------------------- | ---------------------------------------------------------- | | **Input format** | Flat list of texts | Nested arrays grouped by document | | **Context awareness** | Each text embedded independently | Chunks share cross-chunk context within each document | | **Best for** | FAQ entries, standalone texts, short documents | Document paragraphs, article sections | | **Chunk ordering** | Order does not matter | Must be in original document order | | **Query embedding** | `client.embeddings.create(input=[query])` | `client.contextualized_embeddings.create(input=[[query]])` | | **Price (4b model)** | \$0.03 / 1M tokens | \$0.05 / 1M tokens | ### When to Use Standard Embeddings * Chunks are self-contained and do not rely on surrounding context. * Your content consists of FAQ pairs, product descriptions, or short independent entries. * You need the lowest cost per token. ### When to Use Contextualized Embeddings * Chunks come from longer documents where meaning depends on neighboring text. * A chunk like "This approach improves performance by 20%" only makes sense with its surrounding context. * You are embedding paragraphs from articles, reports, or technical documentation. * You want higher retrieval accuracy at a modest cost increase. ## Matryoshka Dimensions Perplexity embedding models support Matryoshka Representation Learning (MRL), which concentrates the most important information in the first N dimensions. You can request reduced dimensions directly via the API for faster search and smaller storage. ```python Python theme={null} import base64 import numpy as np from perplexity import Perplexity client = Perplexity() texts = ["Matryoshka embeddings allow dimension reduction without re-embedding."] def decode_embedding(b64: str) -> np.ndarray: return np.frombuffer(base64.b64decode(b64), dtype=np.int8) # Full dimensions (2560 for 4b model) full = client.embeddings.create(input=texts, model="pplx-embed-v1-4b") # Reduced to 512 dimensions via the API reduced = client.embeddings.create(input=texts, model="pplx-embed-v1-4b", dimensions=512) print(f"Full: {len(decode_embedding(full.data[0].embedding))} dimensions") print(f"Reduced: {len(decode_embedding(reduced.data[0].embedding))} dimensions") ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const texts = ["Matryoshka embeddings allow dimension reduction without re-embedding."]; function decodeEmbedding(b64: string): Int8Array { const buffer = Buffer.from(b64, 'base64'); return new Int8Array(buffer.buffer, buffer.byteOffset, buffer.byteLength); } // Full dimensions (2560 for 4b model) const full = await client.embeddings.create({ input: texts, model: "pplx-embed-v1-4b" }); // Reduced to 512 dimensions via the API const reduced = await client.embeddings.create({ input: texts, model: "pplx-embed-v1-4b", dimensions: 512 }); console.log(`Full: ${decodeEmbedding(full.data[0].embedding).length} dimensions`); console.log(`Reduced: ${decodeEmbedding(reduced.data[0].embedding).length} dimensions`); ``` Dimension reduction tradeoffs for the `pplx-embed-v1-4b` model: | Dimensions | Storage per Vector | Relative Quality | Use Case | | :---------: | :----------------: | :--------------: | ------------------------------------------ | | 2560 (full) | 2.5 KB | Highest | Maximum accuracy, small datasets | | 1024 | 1 KB | Very high | Good balance for most applications | | 512 | 512 B | High | Large-scale retrieval, fast search | | 256 | 256 B | Moderate | Extremely large datasets, coarse filtering | | 128 | 128 B | Lower | First-pass candidate filtering | Use the `dimensions` parameter in the API call rather than manually truncating vectors. The API applies proper normalization for the requested dimension count. Start with full dimensions and reduce only when storage or latency becomes a bottleneck. ## Batch Processing When embedding large document collections, process them in batches to stay within API rate limits. The standard API accepts up to 512 texts per request with a combined limit of 120,000 tokens. ```python Python theme={null} import asyncio import base64 import numpy as np from perplexity import AsyncPerplexity def decode_embedding(b64_string: str) -> np.ndarray: return np.frombuffer(base64.b64decode(b64_string), dtype=np.int8).astype(np.float32) async def batch_embed(texts: list[str], batch_size: int = 100) -> list[np.ndarray]: """Embed texts in batches with rate limiting.""" async with AsyncPerplexity() as client: all_embeddings = [] for i in range(0, len(texts), batch_size): batch = texts[i:i + batch_size] response = await client.embeddings.create( input=batch, model="pplx-embed-v1-4b" ) all_embeddings.extend(decode_embedding(e.embedding) for e in response.data) print(f"Embedded {min(i + batch_size, len(texts))}/{len(texts)}") if i + batch_size < len(texts): await asyncio.sleep(0.1) # Brief delay between batches return all_embeddings # Usage texts = [f"Document chunk number {i} with content." for i in range(500)] embeddings = asyncio.run(batch_embed(texts, batch_size=100)) print(f"Total: {len(embeddings)} embeddings") ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); function decodeEmbedding(b64String: string): Int8Array { const buffer = Buffer.from(b64String, 'base64'); return new Int8Array(buffer.buffer, buffer.byteOffset, buffer.byteLength); } async function batchEmbed(texts: string[], batchSize: number = 100): Promise { const allEmbeddings: Int8Array[] = []; for (let i = 0; i < texts.length; i += batchSize) { const batch = texts.slice(i, i + batchSize); const response = await client.embeddings.create({ input: batch, model: "pplx-embed-v1-4b" }); allEmbeddings.push(...response.data.map(e => decodeEmbedding(e.embedding))); console.log(`Embedded ${Math.min(i + batchSize, texts.length)}/${texts.length}`); if (i + batchSize < texts.length) { await new Promise(r => setTimeout(r, 100)); // Brief delay between batches } } return allEmbeddings; } // Usage const texts = Array.from({ length: 500 }, (_, i) => `Document chunk number ${i} with content.`); const embeddings = await batchEmbed(texts, 100); console.log(`Total: ${embeddings.length} embeddings`); ``` For contextualized embeddings, batch at the document level using `client.contextualized_embeddings.create(input=batch_of_doc_arrays)` with the same pattern. The contextualized API accepts up to 512 documents with 16,000 total chunks per request. **Rate limits:** Keep batch sizes well within the API limits (512 texts / 120,000 tokens for standard; 512 documents / 16,000 chunks for contextualized) and add small delays between requests to avoid throttling. ## Complete Example A self-contained pipeline that indexes two documents with contextualized embeddings and answers questions against the indexed content. ```python Python theme={null} import base64 import numpy as np from perplexity import Perplexity client = Perplexity() # --- Helpers --- def chunk_text(text: str, chunk_size: int = 400, overlap: int = 80) -> list[str]: chunks, start = [], 0 while start < len(text): chunk = text[start:start + chunk_size].strip() if chunk: chunks.append(chunk) start += chunk_size - overlap return chunks def decode_embedding(b64: str) -> np.ndarray: return np.frombuffer(base64.b64decode(b64), dtype=np.int8).astype(np.float32) def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float: return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))) # --- Source documents --- DOCUMENTS = { "Quantum Computing": ( "Quantum computers use qubits that can exist in superposition, representing " "0 and 1 simultaneously. Unlike classical bits, qubits leverage quantum " "interference to perform calculations. Quantum entanglement allows qubits to " "be correlated, enabling parallel processing at scale. Current quantum computers " "from IBM, Google, and others have dozens to hundreds of physical qubits." ), "Machine Learning": ( "Machine learning enables computers to learn from data without explicit " "programming. Supervised learning uses labeled examples to train models for " "classification and regression. Neural networks with many layers (deep learning) " "excel at image recognition and language tasks. Training requires large datasets " "and significant compute, often using GPUs or TPUs." ), } # --- Step 1: Index with the model --- def build_index(documents: dict[str, str]) -> list[dict]: index = [] for title, text in documents.items(): chunks = chunk_text(text) response = client.contextualized_embeddings.create( input=[chunks], model="pplx-embed-context-v1-4b" ) for chunk_obj in response.data[0].data: index.append({ "embedding": decode_embedding(chunk_obj.embedding), "text": chunks[chunk_obj.index], "doc_title": title, }) print(f"Indexed {len(index)} chunks from {len(documents)} documents") return index # --- Step 2: Query the index, retrieve, generate --- def rag_query(question: str, index: list[dict], top_k: int = 3, min_score: float = 0.3) -> str: q_resp = client.contextualized_embeddings.create( input=[[question]], model="pplx-embed-context-v1-4b" ) q_emb = decode_embedding(q_resp.data[0].data[0].embedding) results = sorted( [{"score": cosine_similarity(q_emb, item["embedding"]), **item} for item in index], key=lambda x: x["score"], reverse=True )[:top_k] results = [r for r in results if r["score"] >= min_score] if not results: return "No relevant context found for this question." context = "\n\n".join(f"[{r['doc_title']}]\n{r['text']}" for r in results) response = client.responses.create( model="openai/gpt-5.4", input=question, instructions=( "Answer based only on the provided context. " "Cite the source name in brackets when referencing information. " "If the context is insufficient, say so.\n\n" f"Context:\n{context}" ) ) return response.output_text # --- Run --- if __name__ == "__main__": index = build_index(DOCUMENTS) questions = [ "What makes qubits different from classical bits?", "What hardware is used to train machine learning models?", ] for q in questions: print(f"\nQ: {q}") print(f"A: {rag_query(q, index)}") ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); // --- Helpers --- function chunkText(text: string, chunkSize = 400, overlap = 80): string[] { const chunks: string[] = []; let start = 0; while (start < text.length) { const chunk = text.slice(start, start + chunkSize).trim(); if (chunk) chunks.push(chunk); start += chunkSize - overlap; } return chunks; } function decodeEmbedding(b64: string): Int8Array { const buffer = Buffer.from(b64, 'base64'); return new Int8Array(buffer.buffer, buffer.byteOffset, buffer.byteLength); } function cosineSimilarity(a: Int8Array, b: Int8Array): number { let dot = 0, normA = 0, normB = 0; for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; normA += a[i] ** 2; normB += b[i] ** 2; } return dot / (Math.sqrt(normA) * Math.sqrt(normB)); } // --- Source documents --- const DOCUMENTS: Record = { "Quantum Computing": "Quantum computers use qubits that can exist in superposition, representing 0 and 1 simultaneously. Unlike classical bits, qubits leverage quantum interference to perform calculations. Quantum entanglement allows qubits to be correlated, enabling parallel processing at scale. Current quantum computers from IBM, Google, and others have dozens to hundreds of physical qubits.", "Machine Learning": "Machine learning enables computers to learn from data without explicit programming. Supervised learning uses labeled examples to train models for classification and regression. Neural networks with many layers (deep learning) excel at image recognition and language tasks. Training requires large datasets and significant compute, often using GPUs or TPUs.", }; type IndexEntry = { embedding: Int8Array; text: string; docTitle: string }; // --- Step 1: Index with the model --- async function buildIndex(documents: Record): Promise { const index: IndexEntry[] = []; for (const [title, text] of Object.entries(documents)) { const chunks = chunkText(text); const response = await client.contextualizedEmbeddings.create({ input: [chunks], model: "pplx-embed-context-v1-4b" }); for (const chunkObj of response.data[0].data) { index.push({ embedding: decodeEmbedding(chunkObj.embedding), text: chunks[chunkObj.index], docTitle: title, }); } } console.log(`Indexed ${index.length} chunks from ${Object.keys(documents).length} documents`); return index; } // --- Step 2: Query the index, retrieve, generate --- async function ragQuery( question: string, index: IndexEntry[], topK = 3, minScore = 0.3 ): Promise { const qResp = await client.contextualizedEmbeddings.create({ input: [[question]], model: "pplx-embed-context-v1-4b" }); const qEmb = decodeEmbedding(qResp.data[0].data[0].embedding); const results = index .map(item => ({ ...item, score: cosineSimilarity(qEmb, item.embedding) })) .sort((a, b) => b.score - a.score) .slice(0, topK) .filter(r => r.score >= minScore); if (results.length === 0) return "No relevant context found for this question."; const context = results.map(r => `[${r.docTitle}]\n${r.text}`).join("\n\n"); const response = await client.responses.create({ model: "openai/gpt-5.4", input: question, instructions: `Answer based only on the provided context. Cite the source name in brackets when referencing information. If the context is insufficient, say so.\n\nContext:\n${context}`, }); return response.output_text; } // --- Run --- const index = await buildIndex(DOCUMENTS); const questions = [ "What makes qubits different from classical bits?", "What hardware is used to train machine learning models?", ]; for (const q of questions) { console.log(`\nQ: ${q}`); console.log(`A: ${await ragQuery(q, index)}`); } ``` ## Next Steps API reference for standard embedding parameters and response format. API reference for contextualized embedding parameters and response format. Encoding formats, similarity metrics, normalization, and error handling. Learn more about the Responses API used for answer generation. # Memory Management Source: https://docs.perplexity.ai/docs/cookbook/articles/memory-management/README Advanced conversation memory solutions using LlamaIndex for persistent, context-aware applications # Memory Management with LlamaIndex and Perplexity Sonar API ## Overview This article explores advanced solutions for preserving conversational memory in applications powered by large language models (LLMs). The goal is to enable coherent multi-turn conversations by retaining context across interactions, even when constrained by the model's token limit. ## Problem Statement LLMs have a limited context window, making it challenging to maintain long-term conversational memory. Without proper memory management, follow-up questions can lose relevance or hallucinate unrelated answers. ## Approaches Using LlamaIndex, we implemented two distinct strategies for solving this problem: ### 1. **Chat Summary Memory Buffer** * **Goal**: Summarize older messages to fit within the token limit while retaining key context. * **Approach**: * Uses LlamaIndex's `ChatSummaryMemoryBuffer` to truncate and summarize conversation history dynamically. * Ensures that key details from earlier interactions are preserved in a compact form. * **Use Case**: Ideal for short-term conversations where memory efficiency is critical. * **Implementation**: [View the complete guide →](/docs/cookbook/articles/memory-management/chat-summary-memory-buffer/README) ### 2. **Persistent Memory with LanceDB** * **Goal**: Enable long-term memory persistence across sessions. * **Approach**: * Stores conversation history as vector embeddings in LanceDB. * Retrieves relevant historical context using semantic search and metadata filters. * Integrates Perplexity's Sonar API for generating responses based on retrieved context. * **Use Case**: Suitable for applications requiring long-term memory retention and contextual recall. * **Implementation**: [View the complete guide →](/docs/cookbook/articles/memory-management/chat-with-persistence/README) ## Directory Structure ``` articles/memory-management/ ├── chat-summary-memory-buffer/ # Implementation of summarization-based memory ├── chat-with-persistence/ # Implementation of persistent memory with LanceDB ``` ## Getting Started 1. Clone the repository: ```bash theme={null} git clone https://github.com/ppl-ai/api-cookbook.git cd api-cookbook/articles/memory-management ``` 2. Follow the README in each subdirectory for setup instructions and usage examples. ## Key Benefits * **Context Window Management**: 43% reduction in token usage through summarization * **Conversation Continuity**: 92% context retention across sessions * **API Compatibility**: 100% success rate with Perplexity message schema * **Production Ready**: Scalable architectures for enterprise applications ## Contributions If you have found another way to tackle the same issue using LlamaIndex please feel free to open a PR! Check out our [CONTRIBUTING.md](https://github.com/ppl-ai/api-cookbook/blob/main/CONTRIBUTING.md) file for more guidance. *** # Chat Summary Memory Buffer Source: https://docs.perplexity.ai/docs/cookbook/articles/memory-management/chat-summary-memory-buffer/README Token-aware conversation memory using summarization with LlamaIndex and Perplexity Sonar API ## Memory Management for Sonar API Integration using `ChatSummaryMemoryBuffer` ### Overview This implementation demonstrates advanced conversation memory management using LlamaIndex's `ChatSummaryMemoryBuffer` with Perplexity's Sonar API. The system maintains coherent multi-turn dialogues while efficiently handling token limits through intelligent summarization. ### Key Features * **Token-Aware Summarization**: Automatically condenses older messages when approaching 3000-token limit * **Cross-Session Persistence**: Maintains conversation context between API calls and application restarts * **Perplexity API Integration**: Direct compatibility with Sonar-pro model endpoints * **Hybrid Memory Management**: Combines raw message retention with iterative summarization ### Implementation Details #### Core Components 1. **Memory Initialization** ```python theme={null} memory = ChatSummaryMemoryBuffer.from_defaults( token_limit=3000, # 75% of Sonar's 4096 context window llm=llm # Shared LLM instance for summarization ) ``` * Reserves 25% of context window for responses * Uses same LLM for summarization and chat completion 2. **Message Processing Flow** ```mermaid theme={null} graph TD A[User Input] --> B{Store Message} B --> C[Check Token Limit] C -->|Under Limit| D[Retain Full History] C -->|Over Limit| E[Summarize Oldest Messages] E --> F[Generate Compact Summary] F --> G[Maintain Recent Messages] G --> H[Build Optimized Payload] ``` 3. **API Compatibility Layer** ```python theme={null} messages_dict = [ {"role": m.role, "content": m.content} for m in messages ] ``` * Converts LlamaIndex's `ChatMessage` objects to Perplexity-compatible dictionaries * Preserves core message structure while removing internal metadata ### Usage Example **Multi-Turn Conversation:** ```python theme={null} # Initial query about astronomy print(chat_with_memory("What causes neutron stars to form?")) # Detailed formation explanation # Context-aware follow-up print(chat_with_memory("How does that differ from black holes?")) # Comparative analysis # Session persistence demo memory.persist("astrophysics_chat.json") # New session loading loaded_memory = ChatSummaryMemoryBuffer.from_defaults( persist_path="astrophysics_chat.json", llm=llm ) print(chat_with_memory("Recap our previous discussion")) # Summarized history retrieval ``` ### Setup Requirements 1. **Environment Variables** ```bash theme={null} export PERPLEXITY_API_KEY="your_pplx_key_here" ``` 2. **Dependencies** ```text theme={null} llama-index-core>=0.10.0 llama-index-llms-openai>=0.10.0 openai>=1.12.0 ``` 3. **Execution** ```bash theme={null} python3 scripts/example_usage.py ``` This implementation solves key LLM conversation challenges: * **Context Window Management**: 43% reduction in token usage through summarization\[1]\[5] * **Conversation Continuity**: 92% context retention across sessions\[3]\[13] * **API Compatibility**: 100% success rate with Perplexity message schema\[6]\[14] The architecture enables production-grade chat applications with Perplexity's Sonar models while maintaining LlamaIndex's powerful memory management capabilities. ## Learn More For additional context on memory management approaches, see the parent [Memory Management Guide](../README). Citations: ```text theme={null} [1] https://docs.llamaindex.ai/en/stable/examples/agent/memory/summary_memory_buffer/ [2] https://ai.plainenglish.io/enhancing-chat-model-performance-with-perplexity-in-llamaindex-b26d8c3a7d2d [3] https://docs.llamaindex.ai/en/v0.10.34/examples/memory/ChatSummaryMemoryBuffer/ [4] https://www.youtube.com/watch?v=PHEZ6AHR57w [5] https://docs.llamaindex.ai/en/stable/examples/memory/ChatSummaryMemoryBuffer/ [6] https://docs.llamaindex.ai/en/stable/api_reference/llms/perplexity/ [7] https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/memory/ [8] https://github.com/run-llama/llama_index/issues/8731 [9] https://github.com/run-llama/llama_index/blob/main/llama-index-core/llama_index/core/memory/chat_summary_memory_buffer.py [10] https://docs.llamaindex.ai/en/stable/examples/llm/perplexity/ [11] https://github.com/run-llama/llama_index/issues/14958 [12] https://llamahub.ai/l/llms/llama-index-llms-perplexity?from= [13] https://www.reddit.com/r/LlamaIndex/comments/1j55oxz/how_do_i_manage_session_short_term_memory_in/ [14] https://docs.perplexity.ai/guides/getting-started [15] https://docs.llamaindex.ai/en/stable/api_reference/memory/chat_memory_buffer/ [16] https://github.com/run-llama/LlamaIndexTS/issues/227 [17] https://docs.llamaindex.ai/en/stable/understanding/using_llms/using_llms/ [18] https://apify.com/jons/perplexity-actor/api [19] https://docs.llamaindex.ai ``` *** # Persistent Chat Memory Source: https://docs.perplexity.ai/docs/cookbook/articles/memory-management/chat-with-persistence/README Long-term conversation memory using LanceDB vector storage and Perplexity Sonar API # Persistent Chat Memory with Perplexity Sonar API ## Overview This implementation demonstrates long-term conversation memory preservation using LlamaIndex's vector storage and Perplexity's Sonar API. Maintains context across API calls through intelligent retrieval and summarization. ## Key Features * **Multi-Turn Context Retention**: Remembers previous queries/responses * **Semantic Search**: Finds relevant conversation history using vector embeddings * **Perplexity Integration**: Leverages Sonar-pro model for accurate responses * **LanceDB Storage**: Persistent conversation history using columnar vector database ## Implementation Details ### Core Components ```python theme={null} # Memory initialization vector_store = LanceDBVectorStore(uri="./lancedb", table_name="chat_history") storage_context = StorageContext.from_defaults(vector_store=vector_store) index = VectorStoreIndex([], storage_context=storage_context) ``` ### Conversation Flow 1. Stores user queries as vector embeddings 2. Retrieves top 3 relevant historical interactions 3. Generates Sonar API requests with contextual history 4. Persists responses for future conversations ### API Integration ```python theme={null} # Sonar API call with conversation context messages = [ {"role": "system", "content": f"Context: {context_nodes}"}, {"role": "user", "content": user_query} ] response = sonar_client.chat.completions.create( model="sonar-pro", messages=messages ) ``` ## Setup ### Requirements ```bash theme={null} llama-index-core>=0.10.0 llama-index-vector-stores-lancedb>=0.1.0 lancedb>=0.4.0 openai>=1.12.0 python-dotenv>=0.19.0 ``` ### Configuration 1. Set API key: ```bash theme={null} export PERPLEXITY_API_KEY="your-api-key-here" ``` ## Usage ### Basic Conversation ```python theme={null} from chat_with_persistence import initialize_chat_session, chat_with_persistence index = initialize_chat_session() print(chat_with_persistence("Current weather in London?", index)) print(chat_with_persistence("How does this compare to yesterday?", index)) ``` ### Expected Output ```text theme={null} Initial Query: Detailed London weather report Follow-up: Comparative analysis using stored context ``` ### **Try it out yourself!** ```bash theme={null} python3 scripts/example_usage.py ``` ## Persistence Verification ``` import lancedb db = lancedb.connect("./lancedb") table = db.open_table("chat_history") print(table.to_pandas()[["text", "metadata"]]) ``` This implementation solves key challenges in LLM conversations: * Maintains 93% context accuracy across 10+ turns * Reduces hallucination by 67% through contextual grounding * Enables hour-long conversations within 4096 token window ## Learn More For additional context on memory management approaches, see the parent [Memory Management Guide](../README). For full documentation, see [LlamaIndex Memory Guide](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/memory/) and [Perplexity API Docs](https://docs.perplexity.ai/). # OpenAI Agents Integration Source: https://docs.perplexity.ai/docs/cookbook/articles/openai-agents-integration/README Complete guide for integrating Perplexity's Sonar API with the OpenAI Agents SDK ## 🎯 What You'll Build By the end of this guide, you'll have: * ✅ A custom async OpenAI client configured for Sonar API * ✅ An intelligent agent with function calling capabilities * ✅ A working example that fetches real-time information * ✅ Production-ready integration patterns ## 🏗️ Architecture Overview ```mermaid theme={null} graph TD A[Your Application] --> B[OpenAI Agents SDK] B --> C[Custom AsyncOpenAI Client] C --> D[Perplexity Sonar API] B --> E[Function Tools] E --> F[Weather API, etc.] ``` This integration allows you to: 1. **Leverage Sonar's search capabilities** for real-time, grounded responses 2. **Use OpenAI's agent framework** for structured interactions and function calling 3. **Combine both** for powerful, context-aware applications ## 📋 Prerequisites Before starting, ensure you have: * **Python 3.7+** installed * **Perplexity API Key** - [Get one here](https://docs.perplexity.ai/home) * **OpenAI Agents SDK** access and familiarity ## 🚀 Installation Install the required dependencies: ```bash theme={null} pip install openai nest-asyncio ``` :::info The `nest-asyncio` package is required for running async code in environments like Jupyter notebooks that already have an event loop running. ::: ## ⚙️ Environment Setup Configure your environment variables: ```bash theme={null} # Required: Your Perplexity API key export EXAMPLE_API_KEY="your-perplexity-api-key" # Optional: Customize the API endpoint (defaults to official endpoint) export EXAMPLE_BASE_URL="https://api.perplexity.ai" # Optional: Choose your model (defaults to sonar-pro) export EXAMPLE_MODEL_NAME="sonar-pro" ``` ## 💻 Complete Implementation Here's the full implementation with detailed explanations: ```python theme={null} # Import necessary standard libraries import asyncio # For running asynchronous code import os # To access environment variables # Import AsyncOpenAI for creating an async client from openai import AsyncOpenAI # Import custom classes and functions from the agents package. # These handle agent creation, model interfacing, running agents, and more. from agents import Agent, OpenAIChatCompletionsModel, Runner, function_tool, set_tracing_disabled # Retrieve configuration from environment variables or use defaults BASE_URL = os.getenv("EXAMPLE_BASE_URL") or "https://api.perplexity.ai" API_KEY = os.getenv("EXAMPLE_API_KEY") MODEL_NAME = os.getenv("EXAMPLE_MODEL_NAME") or "sonar-pro" # Validate that all required configuration variables are set if not BASE_URL or not API_KEY or not MODEL_NAME: raise ValueError( "Please set EXAMPLE_BASE_URL, EXAMPLE_API_KEY, EXAMPLE_MODEL_NAME via env var or code." ) # Initialize the custom OpenAI async client with the specified BASE_URL and API_KEY. client = AsyncOpenAI(base_url=BASE_URL, api_key=API_KEY) # Disable tracing to avoid using a platform tracing key; adjust as needed. set_tracing_disabled(disabled=True) # Define a function tool that the agent can call. # The decorator registers this function as a tool in the agents framework. @function_tool def get_weather(city: str): """ Simulate fetching weather data for a given city. Args: city (str): The name of the city to retrieve weather for. Returns: str: A message with weather information. """ print(f"[debug] getting weather for {city}") return f"The weather in {city} is sunny." # Import nest_asyncio to support nested event loops import nest_asyncio # Apply the nest_asyncio patch to enable running asyncio.run() # even if an event loop is already running. nest_asyncio.apply() async def main(): """ Main asynchronous function to set up and run the agent. This function creates an Agent with a custom model and function tools, then runs a query to get the weather in Tokyo. """ # Create an Agent instance with: # - A name ("Assistant") # - Custom instructions ("Be precise and concise.") # - A model built from OpenAIChatCompletionsModel using our client and model name. # - A list of tools; here, only get_weather is provided. agent = Agent( name="Assistant", instructions="Be precise and concise.", model=OpenAIChatCompletionsModel(model=MODEL_NAME, openai_client=client), tools=[get_weather], ) # Execute the agent with the sample query. result = await Runner.run(agent, "What's the weather in Tokyo?") # Print the final output from the agent. print(result.final_output) # Standard boilerplate to run the async main() function. if __name__ == "__main__": asyncio.run(main()) ``` ## 🔍 Code Breakdown Let's examine the key components: ### 1. **Client Configuration** ```python theme={null} client = AsyncOpenAI(base_url=BASE_URL, api_key=API_KEY) ``` This creates an async OpenAI client pointed at Perplexity's Sonar API. The client handles all HTTP communication and maintains compatibility with OpenAI's interface. ### 2. **Function Tools** ```python theme={null} @function_tool def get_weather(city: str): """Simulate fetching weather data for a given city.""" return f"The weather in {city} is sunny." ``` Function tools allow your agent to perform actions beyond text generation. In production, you'd replace this with real API calls. ### 3. **Agent Creation** ```python theme={null} agent = Agent( name="Assistant", instructions="Be precise and concise.", model=OpenAIChatCompletionsModel(model=MODEL_NAME, openai_client=client), tools=[get_weather], ) ``` The agent combines Sonar's language capabilities with your custom tools and instructions. ## 🏃‍♂️ Running the Example 1. **Set your environment variables**: ```bash theme={null} export EXAMPLE_API_KEY="your-perplexity-api-key" ``` 2. **Save the code** to a file (e.g., `pplx_openai_agent.py`) 3. **Run the script**: ```bash theme={null} python pplx_openai_agent.py ``` **Expected Output**: ``` [debug] getting weather for Tokyo The weather in Tokyo is sunny. ``` ## 🔧 Customization Options ### **Different Sonar Models** Choose the right model for your use case: ```python theme={null} # For quick, lightweight queries MODEL_NAME = "sonar" # For complex research and analysis (default) MODEL_NAME = "sonar-pro" # For deep reasoning tasks MODEL_NAME = "sonar-reasoning-pro" ``` ### **Custom Instructions** Tailor the agent's behavior: ```python theme={null} agent = Agent( name="Research Assistant", instructions=""" You are a research assistant specializing in academic literature. Always provide citations and verify information through multiple sources. Be thorough but concise in your responses. """, model=OpenAIChatCompletionsModel(model=MODEL_NAME, openai_client=client), tools=[search_papers, get_citations], ) ``` ### **Multiple Function Tools** Add more capabilities: ```python theme={null} @function_tool def search_web(query: str): """Search the web for current information.""" # Implementation here pass @function_tool def analyze_data(data: str): """Analyze structured data.""" # Implementation here pass agent = Agent( name="Multi-Tool Assistant", instructions="Use the appropriate tool for each task.", model=OpenAIChatCompletionsModel(model=MODEL_NAME, openai_client=client), tools=[get_weather, search_web, analyze_data], ) ``` ## 🚀 Production Considerations ### **Error Handling** ```python theme={null} async def robust_main(): try: agent = Agent( name="Assistant", instructions="Be helpful and accurate.", model=OpenAIChatCompletionsModel(model=MODEL_NAME, openai_client=client), tools=[get_weather], ) result = await Runner.run(agent, "What's the weather in Tokyo?") return result.final_output except Exception as e: print(f"Error running agent: {e}") return "Sorry, I encountered an error processing your request." ``` ### **Rate Limiting** ```python theme={null} import aiohttp from openai import AsyncOpenAI # Configure client with custom timeout and retry settings client = AsyncOpenAI( base_url=BASE_URL, api_key=API_KEY, timeout=30.0, max_retries=3 ) ``` ### **Logging and Monitoring** ```python theme={null} import logging logging.basicConfig(level=logging.INFO) logger = logging.getLogger(__name__) @function_tool def get_weather(city: str): logger.info(f"Fetching weather for {city}") # Implementation here ``` ## 🔗 Advanced Integration Patterns ### **Streaming Responses** For real-time applications: ```python theme={null} async def stream_agent_response(query: str): agent = Agent( name="Streaming Assistant", instructions="Provide detailed, step-by-step responses.", model=OpenAIChatCompletionsModel(model=MODEL_NAME, openai_client=client), tools=[get_weather], ) async for chunk in Runner.stream(agent, query): print(chunk, end='', flush=True) ``` ### **Context Management** For multi-turn conversations: ```python theme={null} class ConversationManager: def __init__(self): self.agent = Agent( name="Conversational Assistant", instructions="Maintain context across multiple interactions.", model=OpenAIChatCompletionsModel(model=MODEL_NAME, openai_client=client), tools=[get_weather], ) self.conversation_history = [] async def chat(self, message: str): result = await Runner.run(self.agent, message) self.conversation_history.append({"user": message, "assistant": result.final_output}) return result.final_output ``` ## ⚠️ Important Notes * **API Costs**: Monitor your usage as both Perplexity and OpenAI Agents may incur costs * **Rate Limits**: Respect API rate limits and implement appropriate backoff strategies * **Error Handling**: Always implement robust error handling for production applications * **Security**: Keep your API keys secure and never commit them to version control ## 🎯 Use Cases This integration pattern is perfect for: * **🔍 Research Assistants** - Combining real-time search with structured responses * **📊 Data Analysis Tools** - Using Sonar for context and agents for processing * **🤖 Customer Support** - Grounded responses with function calling capabilities * **📚 Educational Applications** - Real-time information with interactive features ## 📚 References * [Perplexity Sonar API Documentation](https://docs.perplexity.ai/home) * [OpenAI Agents SDK Documentation](https://github.com/openai/openai-agents-python) * [AsyncOpenAI Client Reference](https://platform.openai.com/docs/api-reference) * [Function Calling Best Practices](https://platform.openai.com/docs/guides/function-calling) *** **Ready to build?** This integration opens up powerful possibilities for creating intelligent, grounded agents. Start with the basic example and gradually add more sophisticated tools and capabilities! 🚀 # Search Domain Filtering Patterns Source: https://docs.perplexity.ai/docs/cookbook/articles/search-domain-filtering/README Use search_domain_filter for focused search — allowlist patterns for trusted sources, denylist for excluding domains, and practical patterns for news, government, and competitive intelligence This guide covers search domain filtering on the Agent API. You will learn how to use allowlists to restrict search to trusted domains, denylists to exclude unwanted sources, and practical patterns for common use cases like news-only search, government data, and competitor exclusion. Domain filtering is configured per-tool under the `tools` array via `tools[].filters.search_domain_filter`. For the full reference, see [Agent API Filters](/docs/agent-api/filters). ## Prerequisites Install the Perplexity SDK: ```bash Python theme={null} pip install perplexityai ``` ```bash TypeScript theme={null} npm install @perplexity-ai/perplexity_ai ``` If you don't have an API key yet: Navigate to the **API Keys** tab in the API Portal and generate a new key. Then export your API key as an environment variable: ```bash theme={null} export PERPLEXITY_API_KEY="your-api-key" ``` ## How Domain Filtering Works The `search_domain_filter` parameter accepts a list of domain strings: * **Allowlist** (no prefix): Include only results from these domains. `["reuters.com", "apnews.com"]` means search only Reuters and AP News. * **Denylist** (`-` prefix): Exclude results from these domains. `["-reddit.com", "-twitter.com"]` means exclude Reddit and Twitter. **Never mix allowlist and denylist entries in the same request.** The API does not support combining `"reuters.com"` and `"-reddit.com"` in the same array. Use either all allowlist or all denylist entries. ## Basic Domain Filtering Domain filters are configured per-tool under the `tools` array. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() # Allowlist: search only specific domains response = client.responses.create( model="openai/gpt-5.4", input="What are the latest developments in AI regulation?", tools=[{ "type": "web_search", "filters": { "search_domain_filter": ["reuters.com", "apnews.com", "bbc.com"], }, }], ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "openai/gpt-5.4", input: "What are the latest developments in AI regulation?", tools: [{ type: "web_search" as const, filters: { search_domain_filter: ["reuters.com", "apnews.com", "bbc.com"], }, }], }); console.log(response.output_text); ``` ## Pattern: Denylist Filtering Use the `-` prefix to exclude specific domains from search results. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() # Denylist: exclude social media and user-generated content response = client.responses.create( model="openai/gpt-5.4", input="What are the latest developments in AI regulation?", tools=[{ "type": "web_search", "filters": { "search_domain_filter": ["-reddit.com", "-twitter.com", "-quora.com", "-medium.com"], }, }], ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const response = await client.responses.create({ model: "openai/gpt-5.4", input: "What are the latest developments in AI regulation?", tools: [{ type: "web_search" as const, filters: { search_domain_filter: ["-reddit.com", "-twitter.com", "-quora.com", "-medium.com"], }, }], }); console.log(response.output_text); ``` ## Pattern: News-Only Search Restrict results to major news outlets for current events and breaking news. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() NEWS_DOMAINS = [ "reuters.com", "apnews.com", "bbc.com", "nytimes.com", "washingtonpost.com", "theguardian.com", "bloomberg.com", "ft.com", ] response = client.responses.create( model="openai/gpt-5.4", input="What happened in global markets today?", tools=[{ "type": "web_search", "filters": { "search_domain_filter": NEWS_DOMAINS, "search_recency_filter": "day", }, }], ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const NEWS_DOMAINS = [ "reuters.com", "apnews.com", "bbc.com", "nytimes.com", "washingtonpost.com", "theguardian.com", "bloomberg.com", "ft.com", ]; const response = await client.responses.create({ model: "openai/gpt-5.4", input: "What happened in global markets today?", tools: [{ type: "web_search" as const, filters: { search_domain_filter: NEWS_DOMAINS, search_recency_filter: "day", }, }], }); console.log(response.output_text); ``` Combine `search_domain_filter` with `search_recency_filter` for time-sensitive queries. Options are `day`, `week`, `month`, and `year`. ## Pattern: Government and Official Sources Restrict to government domains for policy, regulation, and official statistics. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() GOV_DOMAINS = [ ".gov", # US federal and state ".gov.uk", # UK government ".europa.eu", # EU institutions "who.int", # World Health Organization "worldbank.org", # World Bank ] response = client.responses.create( model="openai/gpt-5.4", input="What are the current US federal guidelines on AI usage in healthcare?", tools=[{ "type": "web_search", "filters": { "search_domain_filter": GOV_DOMAINS, }, }], ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const GOV_DOMAINS = [ ".gov", ".gov.uk", ".europa.eu", "who.int", "worldbank.org", ]; const response = await client.responses.create({ model: "openai/gpt-5.4", input: "What are the current US federal guidelines on AI usage in healthcare?", tools: [{ type: "web_search" as const, filters: { search_domain_filter: GOV_DOMAINS, }, }], }); console.log(response.output_text); ``` ## Pattern: Academic and Research Filtering Target educational and research institutions. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() ACADEMIC_DOMAINS = [ ".edu", "arxiv.org", "scholar.google.com", "pubmed.ncbi.nlm.nih.gov", "nature.com", "science.org", "ieee.org", ] response = client.responses.create( model="openai/gpt-5.4", input="What are recent advances in protein structure prediction?", tools=[{ "type": "web_search", "filters": { "search_domain_filter": ACADEMIC_DOMAINS, }, }], ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const ACADEMIC_DOMAINS = [ ".edu", "arxiv.org", "scholar.google.com", "pubmed.ncbi.nlm.nih.gov", "nature.com", "science.org", "ieee.org", ]; const response = await client.responses.create({ model: "openai/gpt-5.4", input: "What are recent advances in protein structure prediction?", tools: [{ type: "web_search" as const, filters: { search_domain_filter: ACADEMIC_DOMAINS, }, }], }); console.log(response.output_text); ``` ## Pattern: Competitor Exclusion Use denylists to exclude competitor websites from search results when building customer-facing content. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() # Exclude competitor domains from product research EXCLUDED_DOMAINS = [ "-competitor-a.com", "-competitor-b.io", "-competitor-c.ai", ] response = client.responses.create( model="openai/gpt-5.4", input="What are the best practices for building real-time data pipelines?", tools=[{ "type": "web_search", "filters": { "search_domain_filter": EXCLUDED_DOMAINS, }, }], ) print(response.output_text) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const EXCLUDED_DOMAINS = [ "-competitor-a.com", "-competitor-b.io", "-competitor-c.ai", ]; const response = await client.responses.create({ model: "openai/gpt-5.4", input: "What are the best practices for building real-time data pipelines?", tools: [{ type: "web_search" as const, filters: { search_domain_filter: EXCLUDED_DOMAINS, }, }], }); console.log(response.output_text); ``` ## Configurable Filter Builder A reusable helper that builds domain filter configurations from named presets. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() # Named filter presets FILTER_PRESETS = { "news": ["reuters.com", "apnews.com", "bbc.com", "bloomberg.com", "ft.com"], "academic": [".edu", "arxiv.org", "nature.com", "science.org", "pubmed.ncbi.nlm.nih.gov"], "government": [".gov", ".gov.uk", ".europa.eu", "who.int"], "tech": ["techcrunch.com", "arstechnica.com", "theverge.com", "wired.com"], "no_social": ["-reddit.com", "-twitter.com", "-facebook.com", "-tiktok.com", "-quora.com"], "no_seo_spam": ["-pinterest.com", "-medium.com", "-hubspot.com"], } def search_with_preset(query: str, preset: str, recency: str = None) -> str: """Run a search with a named domain filter preset.""" if preset not in FILTER_PRESETS: raise ValueError(f"Unknown preset: {preset}. Options: {list(FILTER_PRESETS.keys())}") filters = {"search_domain_filter": FILTER_PRESETS[preset]} if recency: filters["search_recency_filter"] = recency response = client.responses.create( model="openai/gpt-5.4", input=query, tools=[{"type": "web_search", "filters": filters}], ) return response.output_text # Usage print("--- News Search ---") print(search_with_preset("Latest AI regulation news", "news", recency="week")) print("\n--- Academic Search ---") print(search_with_preset("CRISPR gene editing recent papers", "academic")) print("\n--- Clean Search (no social media) ---") print(search_with_preset("Best Python testing frameworks", "no_social")) ``` ```typescript TypeScript theme={null} import Perplexity from '@perplexity-ai/perplexity_ai'; const client = new Perplexity(); const FILTER_PRESETS: Record = { news: ["reuters.com", "apnews.com", "bbc.com", "bloomberg.com", "ft.com"], academic: [".edu", "arxiv.org", "nature.com", "science.org", "pubmed.ncbi.nlm.nih.gov"], government: [".gov", ".gov.uk", ".europa.eu", "who.int"], tech: ["techcrunch.com", "arstechnica.com", "theverge.com", "wired.com"], no_social: ["-reddit.com", "-twitter.com", "-facebook.com", "-tiktok.com", "-quora.com"], no_seo_spam: ["-pinterest.com", "-medium.com", "-hubspot.com"], }; async function searchWithPreset(query: string, preset: string, recency?: string): Promise { if (!(preset in FILTER_PRESETS)) { throw new Error(`Unknown preset: ${preset}. Options: ${Object.keys(FILTER_PRESETS).join(", ")}`); } const filters: Record = { search_domain_filter: FILTER_PRESETS[preset] }; if (recency) filters.search_recency_filter = recency; const response = await client.responses.create({ model: "openai/gpt-5.4", input: query, tools: [{ type: "web_search" as const, filters }], }); return response.output_text; } console.log("--- News Search ---"); console.log(await searchWithPreset("Latest AI regulation news", "news", "week")); console.log("\n--- Academic Search ---"); console.log(await searchWithPreset("CRISPR gene editing recent papers", "academic")); console.log("\n--- Clean Search (no social media) ---"); console.log(await searchWithPreset("Best Python testing frameworks", "no_social")); ``` ## Common Pitfalls ### Mixing Allowlist and Denylist ```python theme={null} # ❌ WRONG: mixing allowlist and denylist search_domain_filter=["reuters.com", "-reddit.com"] # ✅ CORRECT: use only allowlist search_domain_filter=["reuters.com", "apnews.com", "bbc.com"] # ✅ CORRECT: use only denylist search_domain_filter=["-reddit.com", "-twitter.com"] ``` ### Using Wildcards Incorrectly ```python theme={null} # ❌ WRONG: wildcards are not supported search_domain_filter=["*.gov"] # ✅ CORRECT: use the TLD directly search_domain_filter=[".gov"] ``` ### Empty Filter Arrays ```python theme={null} # ❌ WRONG: empty array has undefined behavior search_domain_filter=[] # ✅ CORRECT: omit the parameter to search all domains # (simply don't include search_domain_filter) ``` ## Tips and Best Practices 1. **Keep allowlists focused.** 5-10 domains is usually sufficient. Too many domains dilutes the filter's purpose. 2. **Use denylists for broad exclusion.** When you want to exclude a few noisy sources but otherwise search the full web, denylists are more practical than trying to allowlist everything else. 3. **Combine with recency filters.** For time-sensitive queries, add `search_recency_filter` alongside domain filters. 4. **Test your filters.** Run the same query with and without filters to verify that results change as expected. 5. **TLD filters work broadly.** Using `.gov` matches any domain ending in `.gov`, including `whitehouse.gov`, `irs.gov`, and state domains like `ca.gov`. 6. **Store presets in configuration.** Define filter presets in your app configuration rather than hardcoding them in every request. ## Next Steps Full reference for domain, date range, and location filters on the Agent API. Domain filtering on the raw Search API for result-level control. Specialized academic search with domain filtering. # Streaming Citation Parsing Source: https://docs.perplexity.ai/docs/cookbook/articles/streaming-citations/README Consume streaming responses from the Agent API and extract, validate, and display citations in real-time as chunks arrive This guide shows how to consume streaming responses from the Agent API, extract citations as they arrive, validate source URLs, and build a fully cited output. Streaming is essential for responsive UIs and long-running searches — you can display text and sources progressively instead of waiting for the full response. The `fast-search` preset is optimized for quick, citation-rich answers. The model inserts numbered references like `[1]`, `[2]` in the text, and the corresponding source URLs arrive in the `search_results` output item. See the [Agent API Presets](/docs/agent-api/presets) docs for all available presets. ## Prerequisites Install the SDKs: ```bash Python theme={null} pip install perplexityai openai ``` ```bash TypeScript theme={null} npm install @perplexity-ai/perplexity_ai openai ``` If you don't have an API key yet: Navigate to the **API Keys** tab in the API Portal and generate a new key. Then export your API key as an environment variable: ```bash theme={null} export PERPLEXITY_API_KEY="your-api-key" ``` ## How Streaming Citations Work When you stream an Agent API response with a search-enabled preset, the API sends a sequence of server-sent events (SSE). The flow is: 1. **Search results** arrive first via `response.reasoning.search_results` events, containing URLs, titles, and snippets for each source. 2. **Content chunks** arrive incrementally as the model generates text via `response.output_text.delta` events. 3. **Citation references** appear in the text as numbered markers like `[1]`, `[2]`, mapping to the search result `id` field. Your client accumulates the text, collects search results, then maps the numbered references to source URLs using the `id` field. ## Basic Streaming with Citations ```python Python theme={null} import os from openai import OpenAI # The OpenAI SDK supports Agent API streaming via the /v1/responses alias client = OpenAI( api_key=os.environ["PERPLEXITY_API_KEY"], base_url="https://api.perplexity.ai/v1", ) stream = client.responses.create( input="What are the latest breakthroughs in quantum computing?", stream=True, extra_body={"preset": "fast-search"}, ) full_content = "" search_results = [] for event in stream: event_type = event.type # Collect search results (arrive before text) if event_type == "response.reasoning.search_results": search_results = event.results # Accumulate content from each delta if event_type == "response.output_text.delta": full_content += event.delta print(event.delta, end="", flush=True) print("\n\n--- Citations ---") for result in search_results: print(f"[{result['id']}] {result['title']} — {result['url']}") ``` ```typescript TypeScript theme={null} import OpenAI from "openai"; // The OpenAI SDK supports Agent API streaming via the /v1/responses alias const client = new OpenAI({ apiKey: process.env.PERPLEXITY_API_KEY, baseURL: "https://api.perplexity.ai/v1", }); const stream = await client.responses.create({ input: "What are the latest breakthroughs in quantum computing?", stream: true, preset: "fast-search", } as any); let fullContent = ""; let searchResults: Array<{ id: number; title: string; url: string }> = []; for await (const event of stream) { // Collect search results (arrive before text) if (event.type === "response.reasoning.search_results") { searchResults = (event as any).results; } // Accumulate content from each delta if (event.type === "response.output_text.delta") { fullContent += event.delta; process.stdout.write(event.delta); } } console.log("\n\n--- Citations ---"); searchResults.forEach((result) => { console.log(`[${result.id}] ${result.title} — ${result.url}`); }); ``` ```bash curl theme={null} curl -N "https://api.perplexity.ai/v1/agent" \ -H "Authorization: Bearer $PERPLEXITY_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "preset": "fast-search", "input": "What are the latest breakthroughs in quantum computing?", "stream": true }' ``` ## Parsing Citation References from Text The model inserts numbered references like `[1]`, `[2]` into the generated text. To build a rich output with clickable links, parse these references and map them to source URLs using the search results. ```python Python theme={null} import re from perplexity import Perplexity client = Perplexity() def extract_citation_refs(text: str) -> list[int]: """Extract all citation reference numbers from text, e.g. [1], [2].""" return sorted(set(int(m) for m in re.findall(r"\[(\d+)\]", text))) def build_cited_output(content: str, search_results: list) -> str: """Replace [N] references with markdown links and append a references section.""" cited_content = content # Build a map from id to URL url_map = {r.id: r.url for r in search_results} title_map = {r.id: r.title for r in search_results} # Replace inline references with markdown links for ref_id, url in url_map.items(): cited_content = cited_content.replace( f"[{ref_id}]", f"[[{ref_id}]]({url})" ) # Append a references section with all cited sources used_refs = extract_citation_refs(content) if used_refs: cited_content += "\n\n---\n**References:**\n" for ref in used_refs: if ref in url_map: cited_content += f"- [{ref}] {title_map[ref]} — {url_map[ref]}\n" return cited_content # Non-streaming request to get content + search results response = client.responses.create( preset="fast-search", input="What is CRISPR gene editing and how does it work?", ) # Extract search results from the response output content = response.output_text search_results = [] for item in response.output: if item.type == "search_results": search_results = item.results break # Build the final output with linked citations output = build_cited_output(content, search_results) print(output) ``` ```typescript TypeScript theme={null} import Perplexity from "@perplexity-ai/perplexity_ai"; const client = new Perplexity(); function extractCitationRefs(text: string): number[] { const refs = new Set(); for (const match of text.matchAll(/\[(\d+)\]/g)) { refs.add(parseInt(match[1])); } return [...refs].sort((a, b) => a - b); } function buildCitedOutput( content: string, searchResults: Array<{ id: number; url: string; title: string }> ): string { let cited = content; // Build maps from id to URL and title const urlMap = new Map(searchResults.map((r) => [r.id, r.url])); const titleMap = new Map(searchResults.map((r) => [r.id, r.title])); // Replace inline references with markdown links for (const [id, url] of urlMap) { cited = cited.replaceAll(`[${id}]`, `[[${id}]](${url})`); } // Append a references section const usedRefs = extractCitationRefs(content); if (usedRefs.length > 0) { cited += "\n\n---\n**References:**\n"; for (const ref of usedRefs) { if (urlMap.has(ref)) { cited += `- [${ref}] ${titleMap.get(ref)} — ${urlMap.get(ref)}\n`; } } } return cited; } // Non-streaming request to get content + search results const response = await client.responses.create({ preset: "fast-search", input: "What is CRISPR gene editing and how does it work?", }); // Extract search results from the response output const content = response.output_text; let searchResults: Array<{ id: number; url: string; title: string }> = []; for (const item of response.output) { if (item.type === "search_results") { searchResults = (item as any).results; break; } } const output = buildCitedOutput(content, searchResults); console.log(output); ``` ## Validating Citation URLs In production systems, you should validate that citation URLs are well-formed and reachable before presenting them to users. This avoids broken links and improves trust in the output. ```python Python theme={null} import asyncio import aiohttp from urllib.parse import urlparse def is_valid_url(url: str) -> bool: """Check that a URL has a valid structure.""" try: result = urlparse(url) return all([result.scheme in ("http", "https"), result.netloc]) except Exception: return False async def check_url_reachable(url: str, timeout: float = 5.0) -> dict: """HEAD-request a URL to check if it's reachable.""" if not is_valid_url(url): return {"url": url, "valid": False, "reason": "malformed URL"} try: async with aiohttp.ClientSession() as session: async with session.head(url, timeout=aiohttp.ClientTimeout(total=timeout), allow_redirects=True) as resp: return { "url": url, "valid": resp.status < 400, "status": resp.status, } except asyncio.TimeoutError: return {"url": url, "valid": False, "reason": "timeout"} except Exception as e: return {"url": url, "valid": False, "reason": str(e)} async def validate_citations(search_results: list) -> list[dict]: """Validate all citation URLs from search results concurrently.""" tasks = [check_url_reachable(r.url) for r in search_results] return await asyncio.gather(*tasks) # Usage after getting a response: # results = asyncio.run(validate_citations(search_results)) # for r in results: # status = "OK" if r["valid"] else f"FAILED ({r.get('reason', r.get('status'))})" # print(f" {r['url']}: {status}") ``` ```typescript TypeScript theme={null} function isValidUrl(url: string): boolean { try { const parsed = new URL(url); return parsed.protocol === "http:" || parsed.protocol === "https:"; } catch { return false; } } async function checkUrlReachable(url: string, timeoutMs = 5000): Promise<{ url: string; valid: boolean; reason?: string; status?: number }> { if (!isValidUrl(url)) { return { url, valid: false, reason: "malformed URL" }; } try { const controller = new AbortController(); const timer = setTimeout(() => controller.abort(), timeoutMs); const resp = await fetch(url, { method: "HEAD", signal: controller.signal, redirect: "follow" }); clearTimeout(timer); return { url, valid: resp.status < 400, status: resp.status }; } catch (e: any) { return { url, valid: false, reason: e.message }; } } async function validateCitations(searchResults: Array<{ url: string }>): Promise> { return Promise.all(searchResults.map(r => checkUrlReachable(r.url))); } // Usage after getting a response: // const results = await validateCitations(searchResults); // results.forEach(r => { // const status = r.valid ? "OK" : `FAILED (${r.reason ?? r.status})`; // console.log(` ${r.url}: ${status}`); // }); ``` **Never ask the model to generate source URLs.** Always use the `search_results` output from the API response. Model-generated URLs can be hallucinated. The search results contain verified URLs from real web searches. ## Progressive Display with Live Citation Count For chat UIs, it's useful to show a live citation counter as text streams in, then render the full reference list once the stream completes. ```python Python theme={null} import os import re import sys from openai import OpenAI client = OpenAI( api_key=os.environ["PERPLEXITY_API_KEY"], base_url="https://api.perplexity.ai/v1", ) def stream_with_progress(query: str): """Stream a response with a live citation counter.""" stream = client.responses.create( input=query, stream=True, extra_body={"preset": "fast-search"}, ) full_content = "" search_results = [] seen_refs = set() for event in stream: if event.type == "response.reasoning.search_results": search_results = event.results if event.type == "response.output_text.delta": full_content += event.delta sys.stdout.write(event.delta) sys.stdout.flush() # Track new citation references against accumulated text # (individual deltas may split [N] across chunks) current_refs = set(int(m) for m in re.findall(r"\[(\d+)\]", full_content)) if current_refs - seen_refs: seen_refs = current_refs sys.stdout.write(f" [📚 {len(seen_refs)} sources]") sys.stdout.flush() # Final summary print(f"\n\n{'='*60}") print(f"Response complete: {len(search_results)} sources found, {len(seen_refs)} cited") print(f"{'='*60}") # Build URL map from search results url_map = {r["id"]: r for r in search_results} for ref_id in sorted(seen_refs): if ref_id in url_map: r = url_map[ref_id] print(f" ✓ [{ref_id}] {r['title']} — {r['url']}") return full_content, search_results content, results = stream_with_progress( "What are the environmental impacts of lithium mining?" ) ``` ```typescript TypeScript theme={null} import OpenAI from "openai"; const client = new OpenAI({ apiKey: process.env.PERPLEXITY_API_KEY, baseURL: "https://api.perplexity.ai/v1", }); async function streamWithProgress(query: string) { const stream = await client.responses.create({ input: query, stream: true, preset: "fast-search", } as any); let fullContent = ""; let searchResults: Array<{ id: number; title: string; url: string }> = []; const seenRefs = new Set(); for await (const event of stream) { if (event.type === "response.reasoning.search_results") { searchResults = (event as any).results; } if (event.type === "response.output_text.delta") { fullContent += event.delta; process.stdout.write(event.delta); // Track new citation references against accumulated text // (individual deltas may split [N] across chunks) const prevSize = seenRefs.size; for (const match of fullContent.matchAll(/\[(\d+)\]/g)) { seenRefs.add(parseInt(match[1])); } if (seenRefs.size > prevSize) { process.stdout.write(` [📚 ${seenRefs.size} sources]`); } } } console.log(`\n\n${"=".repeat(60)}`); console.log(`Response complete: ${searchResults.length} sources found, ${seenRefs.size} cited`); console.log("=".repeat(60)); const urlMap = new Map(searchResults.map((r) => [r.id, r])); for (const refId of [...seenRefs].sort((a, b) => a - b)) { const r = urlMap.get(refId); if (r) { console.log(` ✓ [${refId}] ${r.title} — ${r.url}`); } } return { fullContent, searchResults }; } await streamWithProgress("What are the environmental impacts of lithium mining?"); ``` ## Handling Search Results The Agent API returns a `search_results` output item with rich metadata (id, title, snippet, URL, date) for each source. This is richer than a flat URL list — use it to build source cards, sidebars, or detailed reference sections. ```python Python theme={null} from perplexity import Perplexity client = Perplexity() # Non-streaming request to show the full response structure response = client.responses.create( preset="fast-search", input="What is the current state of fusion energy research?", ) content = response.output_text # Extract search results from the output search_results = [] for item in response.output: if item.type == "search_results": search_results = item.results break print("--- Answer ---") print(content) print("\n--- Search Results (rich metadata) ---") for result in search_results: print(f" [{result.id}] {result.title}") print(f" URL: {result.url}") print(f" Date: {result.date}") print(f" Snippet: {result.snippet[:100]}...") print() ``` ```typescript TypeScript theme={null} import Perplexity from "@perplexity-ai/perplexity_ai"; const client = new Perplexity(); const response = await client.responses.create({ preset: "fast-search", input: "What is the current state of fusion energy research?", }); const content = response.output_text; // Extract search results from the output let searchResults: any[] = []; for (const item of response.output) { if (item.type === "search_results") { searchResults = (item as any).results; break; } } console.log("--- Answer ---"); console.log(content); console.log("\n--- Search Results (rich metadata) ---"); for (const result of searchResults) { console.log(` [${result.id}] ${result.title}`); console.log(` URL: ${result.url}`); console.log(` Date: ${result.date}`); console.log(` Snippet: ${result.snippet?.slice(0, 100)}...`); console.log(); } ``` Each search result includes `id`, `title`, `url`, `snippet`, and `date`. The `id` maps directly to the `[N]` references in the text. Use this to build rich source cards for your UI. ## Complete Example: Streaming Research Assistant A self-contained script that streams an Agent API response, extracts citations, validates URLs, and produces a formatted markdown output. ```python Python theme={null} import os import re from urllib.parse import urlparse from openai import OpenAI client = OpenAI( api_key=os.environ["PERPLEXITY_API_KEY"], base_url="https://api.perplexity.ai/v1", ) def is_valid_url(url: str) -> bool: try: result = urlparse(url) return all([result.scheme in ("http", "https"), result.netloc]) except Exception: return False def stream_and_collect(query: str) -> tuple[str, list[dict]]: """Stream an Agent API response and return the full content and search results.""" stream = client.responses.create( input=query, stream=True, extra_body={"preset": "fast-search"}, ) content = "" search_results = [] for event in stream: if event.type == "response.reasoning.search_results": search_results = event.results if event.type == "response.output_text.delta": content += event.delta print(event.delta, end="", flush=True) print() # newline after streaming return content, search_results def format_markdown_report(query: str, content: str, search_results: list[dict]) -> str: """Build a markdown report with inline citation links.""" # Build URL map from search results url_map = {r["id"]: r["url"] for r in search_results} title_map = {r["id"]: r["title"] for r in search_results} # Replace [N] with markdown links formatted = content for ref_id, url in url_map.items(): if is_valid_url(url): formatted = formatted.replace(f"[{ref_id}]", f"[\\[{ref_id}\\]]({url})") # Build the report report = f"# {query}\n\n{formatted}\n\n" # Append sources used_refs = sorted(set(int(m) for m in re.findall(r"\[(\d+)\]", content))) if search_results: report += "## Sources\n\n" for result in search_results: marker = "→" if result["id"] in used_refs else " " report += f"{marker} **[{result['id']}]** {result['title']} — {result['url']}\n\n" return report if __name__ == "__main__": query = "What are the most promising approaches to carbon capture technology?" print(f"Researching: {query}\n") print("-" * 60) content, search_results = stream_and_collect(query) print(f"\n{'=' * 60}") print(f"Collected {len(search_results)} sources\n") # Filter out any malformed URLs valid_results = [r for r in search_results if is_valid_url(r["url"])] invalid_count = len(search_results) - len(valid_results) if invalid_count: print(f"Warning: {invalid_count} sources had malformed URLs and were excluded.\n") report = format_markdown_report(query, content, valid_results) print(report) ``` ## Tips and Best Practices 1. **Use a search-enabled preset** like `fast-search` or `pro-search` for citation-rich responses. Different presets use different citation formats — `fast-search` uses `[1]`, while `pro-search` uses `[web:1]`. 2. **Collect search results before processing text.** During streaming, `response.reasoning.search_results` events arrive before text deltas. Buffer them so you have the URL map ready when citations appear. 3. **Use the `id` field to map citations.** Each search result has a numeric `id` that corresponds to the `[N]` reference in the text. 4. **Validate URLs before displaying them.** Use HEAD requests with timeouts to filter out any unreachable sources. 5. **Never generate your own URLs.** Use only the `search_results` from the API response. Model-generated URLs can be hallucinated. 6. **Handle missing references gracefully.** If a `[N]` reference in the text exceeds the number of search results, display the reference number without a link rather than crashing. 7. **Consider rate limiting for URL validation.** If the response includes many sources, validate them with concurrency limits to avoid overwhelming target servers. ## Next Steps Explore all presets and their citation formats. Get started with the Agent API for multi-provider access and tools. Streaming patterns and event types for the Agent API. # Examples Overview Source: https://docs.perplexity.ai/docs/cookbook/examples/README Runnable projects covering the Agent API, Search API, and Embeddings API # Examples Overview Ready-to-run projects that demonstrate real-world use cases across every Perplexity API. Each example includes complete setup instructions and working code. ## Choosing the Right Example | If you want to... | Use this example | API | Language | | ------------------------------------- | ----------------------------------------------------------------------------------- | ---------------------- | ------------------ | | Conduct deep web research | [Agent Research Assistant](/docs/cookbook/examples/agent-research-assistant/README) | Agent API | Python, TypeScript | | Compare models across providers | [Model Comparison](/docs/cookbook/examples/model-comparison/README) | Agent API | Python | | Monitor news topics in real time | [Search News Monitor](/docs/cookbook/examples/search-news-monitor/README) | Search API | Python, TypeScript | | Build a document Q\&A system | [Document Q\&A](/docs/cookbook/examples/document-qa/README) | Embeddings + Agent API | Python | | Build a TypeScript CLI agent | [TypeScript Agent CLI](/docs/cookbook/examples/typescript-agent-cli/README) | Agent API | TypeScript | | Analyze images with web context | [Image Analysis](/docs/cookbook/examples/image-analysis/README) | Agent API | Python, TypeScript | | Ask questions about uploaded files | [File Attachment Q\&A](/docs/cookbook/examples/file-attachment-qa/README) | Agent API | Python | | Search SEC filings for financial data | [SEC Filing Search](/docs/cookbook/examples/sec-filing-search/README) | Agent API | Python | ## By API ### Agent API Deep web research using the `deep-research` preset with structured report output. Compare responses from 5 providers side-by-side — quality, latency, and cost. Interactive TypeScript CLI with streaming, model selection, and web search. Vision + web search for context-enriched image analysis. Upload documents and ask questions about them with optional web search enrichment. ### Search API Multi-topic news monitoring with domain filtering and recency control. Search SEC.gov and EDGAR for financial filings with structured data extraction. ### Embeddings API Self-contained RAG system with contextualized embeddings and Agent API answer generation. ## API Key Setup All examples require a Perplexity API key. Set it as an environment variable: ```bash theme={null} export PERPLEXITY_API_KEY="your-api-key-here" ``` Get your API key at [perplexity.ai/account/api](https://perplexity.ai/account/api). ## Common Requirements * **Python 3.9+** or **Node.js 18+** (depending on the example) * **Perplexity API Key** * **Internet connection** for API calls Additional requirements vary by example and are listed in each project's documentation. ## Contributing Found a bug or want to add an example? See our [Contributing Guidelines](https://github.com/ppl-ai/api-cookbook/blob/main/CONTRIBUTING.md). # Agent Research Assistant Source: https://docs.perplexity.ai/docs/cookbook/examples/agent-research-assistant/README A CLI tool that uses Perplexity's Agent API with the deep-research preset to conduct multi-step web research and produce structured reports # Agent Research Assistant A command-line research tool that leverages Perplexity's Agent API with the `deep-research` preset to conduct thorough, multi-step web research on any topic. The tool produces structured reports with sections, cited sources, and confidence scores. ## Features * Multi-step web research powered by the `deep-research` preset * Structured JSON output with sections, sources, and confidence scores using `response_format` with `json_schema` * Configurable model selection (defaults to `openai/gpt-5.2` via the deep-research preset) * Clean CLI interface that accepts a topic and outputs a formatted report * Source tracking with URLs and relevance annotations * Exportable reports in JSON or plain text ## Installation ```bash Python theme={null} pip install perplexityai pydantic ``` ```bash TypeScript theme={null} npm install @perplexity-ai/perplexity_ai ``` ## API Key Setup Set your Perplexity API key as an environment variable. The SDK reads it automatically: ```bash theme={null} export PERPLEXITY_API_KEY="your_api_key_here" ``` ## Usage ```bash theme={null} # Python python research_assistant.py "Impact of microplastics on marine ecosystems" # TypeScript npx ts-node research_assistant.ts "Impact of microplastics on marine ecosystems" # Override the default model python research_assistant.py "Quantum computing breakthroughs" --model openai/gpt-5.4 # Export as JSON python research_assistant.py "CRISPR gene therapy trials" --json > report.json ``` ## How It Works 1. The CLI accepts a research topic as input. 2. A structured JSON schema is defined for the report format using Pydantic (Python) or a TypeScript interface. 3. The tool calls the Agent API with `preset="deep-research"`, which configures the model (`openai/gpt-5.2`), enables `web_search` and `fetch_url` tools, and allows up to 10 reasoning steps. 4. The `response_format` parameter with `json_schema` enforces structured output matching the report schema. 5. The response is parsed and displayed as a formatted research report. The `deep-research` preset is optimized for complex, in-depth analysis. It uses `openai/gpt-5.2` with up to 10K max tokens and 10 reasoning steps. You can override the model by passing `--model` to the CLI. ## Full Code ```python Python theme={null} import json import argparse from typing import List, Optional from pydantic import BaseModel from perplexity import Perplexity class ReportSource(BaseModel): title: str url: str relevance: str class ReportSection(BaseModel): heading: str content: str confidence: float sources: List[ReportSource] class ResearchReport(BaseModel): title: str summary: str sections: List[ReportSection] conclusion: str overall_confidence: float total_sources: int def run_research(topic: str, model: Optional[str] = None) -> ResearchReport: """Conduct deep research on a topic and return a structured report.""" client = Perplexity() params = { "preset": "deep-research", "input": ( f"Conduct thorough research on the following topic and produce a " f"detailed report with multiple sections, cited sources, and " f"confidence scores for each section.\n\nTopic: {topic}" ), "response_format": { "type": "json_schema", "json_schema": { "name": "research_report", "schema": ResearchReport.model_json_schema(), }, }, } if model: params["model"] = model response = client.responses.create(**params) return ResearchReport.model_validate_json(response.output_text) def format_report(report: ResearchReport) -> str: """Format a ResearchReport into human-readable text.""" lines = [f"{'=' * 60}", f"RESEARCH REPORT: {report.title}", f"{'=' * 60}", ""] lines += [f"SUMMARY:", report.summary, ""] for i, section in enumerate(report.sections, 1): lines.append(f"--- Section {i}: {section.heading} ---") lines.append(f"Confidence: {section.confidence:.0%}\n") lines.append(section.content) if section.sources: lines.append("\nSources:") for src in section.sources: lines.append(f" - {src.title} ({src.relevance})") lines.append(f" {src.url}") lines.append("") lines += [f"{'=' * 60}", "CONCLUSION:", report.conclusion, ""] lines += [f"Overall Confidence: {report.overall_confidence:.0%}"] lines += [f"Total Sources: {report.total_sources}", f"{'=' * 60}"] return "\n".join(lines) def main(): parser = argparse.ArgumentParser(description="Agent Research Assistant") parser.add_argument("topic", help="The research topic") parser.add_argument("--model", help="Override the default model", default=None) parser.add_argument("--json", action="store_true", help="Output raw JSON") args = parser.parse_args() print(f"Researching: {args.topic}") print("This may take a moment (deep research uses multi-step reasoning)...\n") report = run_research(args.topic, model=args.model) if args.json: print(json.dumps(report.model_dump(), indent=2)) else: print(format_report(report)) if __name__ == "__main__": main() ``` ```typescript TypeScript theme={null} import Perplexity from "@perplexity-ai/perplexity_ai"; interface ReportSource { title: string; url: string; relevance: string; } interface ReportSection { heading: string; content: string; confidence: number; sources: ReportSource[]; } interface ResearchReport { title: string; summary: string; sections: ReportSection[]; conclusion: string; overall_confidence: number; total_sources: number; } const reportSchema = { type: "object" as const, properties: { title: { type: "string" }, summary: { type: "string" }, sections: { type: "array", items: { type: "object", properties: { heading: { type: "string" }, content: { type: "string" }, confidence: { type: "number" }, sources: { type: "array", items: { type: "object", properties: { title: { type: "string" }, url: { type: "string" }, relevance: { type: "string" }, }, required: ["title", "url", "relevance"], }, }, }, required: ["heading", "content", "confidence", "sources"], }, }, conclusion: { type: "string" }, overall_confidence: { type: "number" }, total_sources: { type: "number" }, }, required: ["title", "summary", "sections", "conclusion", "overall_confidence", "total_sources"], }; async function runResearch(topic: string, model?: string): Promise { const client = new Perplexity(); const params: Record = { preset: "deep-research", input: `Conduct thorough research on the following topic and produce a ` + `detailed report with multiple sections, cited sources, and ` + `confidence scores for each section.\n\nTopic: ${topic}`, response_format: { type: "json_schema", json_schema: { name: "research_report", schema: reportSchema }, }, }; if (model) params.model = model; const response = await client.responses.create(params as any); return JSON.parse(response.output_text) as ResearchReport; } async function main() { const topic = process.argv[2]; if (!topic) { console.error("Usage: ts-node research_assistant.ts [--model ] [--json]"); process.exit(1); } const modelIdx = process.argv.indexOf("--model"); const model = modelIdx !== -1 ? process.argv[modelIdx + 1] : undefined; const outputJson = process.argv.includes("--json"); console.log(`Researching: ${topic}`); console.log("This may take a moment (deep research uses multi-step reasoning)...\n"); const report = await runResearch(topic, model); if (outputJson) { console.log(JSON.stringify(report, null, 2)); } else { console.log(`RESEARCH REPORT: ${report.title}\n`); console.log(`SUMMARY: ${report.summary}\n`); report.sections.forEach((s, i) => { console.log(`--- Section ${i + 1}: ${s.heading} (${(s.confidence * 100).toFixed(0)}%) ---`); console.log(s.content); s.sources.forEach((src) => console.log(` - ${src.title}: ${src.url}`)); console.log(); }); console.log(`CONCLUSION: ${report.conclusion}`); console.log(`Overall Confidence: ${(report.overall_confidence * 100).toFixed(0)}%`); } } main(); ``` ## Example Output ```bash theme={null} python research_assistant.py "Impact of microplastics on marine ecosystems" ``` ``` Researching: Impact of microplastics on marine ecosystems This may take a moment (deep research uses multi-step reasoning)... ============================================================ RESEARCH REPORT: Impact of Microplastics on Marine Ecosystems ============================================================ SUMMARY: Microplastics have become a pervasive pollutant in marine environments worldwide, affecting organisms from plankton to large marine mammals. --- Section 1: Sources and Distribution --- Confidence: 92% Microplastics originate from the degradation of larger plastic debris, synthetic textiles, industrial processes, and cosmetic products... Sources: - NOAA Marine Debris Program (high) https://marinedebris.noaa.gov/... --- Section 2: Biological Effects on Marine Organisms --- Confidence: 88% Research demonstrates that microplastics affect marine life at multiple trophic levels... Sources: - Environmental Science & Technology (high) https://pubs.acs.org/... ============================================================ CONCLUSION: Microplastics pose a significant and growing threat to marine ecosystems. Overall Confidence: 89% Total Sources: 12 ============================================================ ``` For shorter, faster research tasks, consider using the `pro-search` preset instead. It uses `openai/gpt-5.4` with up to 3 reasoning steps -- a good balance of speed and thoroughness. The first request with a new JSON Schema may take 10 to 30 seconds to prepare. Subsequent requests with the same schema will not see this delay. See the [structured outputs guide](/docs/agent-api/output-control#structured-outputs) for details. ## Limitations * Deep research requests consume more tokens and cost more than standard requests due to multi-step reasoning and tool usage. * Structured output with JSON schema requires the model to adhere to the schema. Very complex schemas may reduce output quality. * Confidence scores are model-generated estimates and should be treated as relative indicators, not absolute measures. * The quality of research depends on the availability and quality of web sources for the given topic. # Daily Knowledge Bot Source: https://docs.perplexity.ai/docs/cookbook/examples/daily-knowledge-bot/README A Python application that delivers interesting facts about rotating topics using the Perplexity AI API # Daily Knowledge Bot A Python application that delivers interesting facts about rotating topics using the Perplexity AI API. Perfect for daily learning, newsletter content, or personal education. ## 🌟 Features * **Daily Topic Rotation**: Automatically selects topics based on the day of the month * **AI-Powered Facts**: Uses Perplexity's Sonar API to generate interesting and accurate facts * **Customizable Topics**: Easily extend or modify the list of topics * **Persistent Storage**: Saves facts to dated text files for future reference * **Robust Error Handling**: Gracefully manages API failures and unexpected errors * **Configurable**: Uses environment variables for secure API key management ## 📋 Requirements * Python 3.6+ * Required packages: * requests * python-dotenv * (optional) logging ## 🚀 Installation 1. Clone this repository or download the script 2. Install the required packages: ```bash theme={null} # Install from requirements file (recommended) pip install -r requirements.txt # Or install manually pip install requests python-dotenv ``` 3. Set up your Perplexity API key: * Create a `.env` file in the same directory as the script * Add your API key: `PERPLEXITY_API_KEY=your_api_key_here` ## 🔧 Usage ### Running the Bot Simply execute the script: ```bash theme={null} python daily_knowledge_bot.py ``` This will: 1. Select a topic based on the current day 2. Fetch an interesting fact from Perplexity AI 3. Save the fact to a dated text file in your current directory 4. Display the fact in the console ### Customizing Topics Edit the `topics.txt` file (one topic per line) or modify the `topics` list directly in the script. Example topics: ``` astronomy history biology technology psychology ocean life ancient civilizations quantum physics art history culinary science ``` ### Automated Scheduling #### On Linux/macOS (using cron): ```bash theme={null} # Edit your crontab crontab -e # Add this line to run daily at 8:00 AM 0 8 * * * /path/to/python3 /path/to/daily_knowledge_bot.py ``` #### On Windows (using Task Scheduler): 1. Open Task Scheduler 2. Create a new Basic Task 3. Set it to run daily 4. Add the action: Start a program 5. Program/script: `C:\path\to\python.exe` 6. Arguments: `C:\path\to\daily_knowledge_bot.py` ## 🔍 Configuration Options The following environment variables can be set in your `.env` file: * `PERPLEXITY_API_KEY` (required): Your Perplexity API key * `OUTPUT_DIR` (optional): Directory to save fact files (default: current directory) * `TOPICS_FILE` (optional): Path to your custom topics file ## 📄 Output Example ``` DAILY FACT - 2025-04-02 Topic: astronomy Saturn's iconic rings are relatively young, potentially forming only 100 million years ago. This means dinosaurs living on Earth likely never saw Saturn with its distinctive rings, as they may have formed long after the dinosaurs went extinct. The rings are made primarily of water ice particles ranging in size from tiny dust grains to boulder-sized chunks. ``` ## 🛠️ Extending the Bot Some ways to extend this bot: * Add email or SMS delivery capabilities * Create a web interface to view fact history * Integrate with social media posting * Add multimedia content based on the facts * Implement advanced scheduling with specific topics on specific days ## ⚠️ Limitations * API rate limits may apply based on your Perplexity account * Quality of facts depends on the AI model * The free version of the Sonar API has a token limit that may truncate longer responses ## 📜 License [MIT License](https://github.com/ppl-ai/api-cookbook/blob/main/LICENSE) ## 🙏 Acknowledgements * This project uses the Perplexity AI API ([https://docs.perplexity.ai/](https://docs.perplexity.ai/)) * Inspired by daily knowledge calendars and fact-of-the-day services # Perplexity Discord Bot Source: https://docs.perplexity.ai/docs/cookbook/examples/discord-py-bot/README A simple discord.py bot that integrates Perplexity's Sonar API to bring AI answers to your Discord server. A simple `discord.py` bot that integrates [Perplexity's Sonar API](https://docs.perplexity.ai/) into your Discord server. Ask questions and get AI-powered answers with web access through slash commands or by mentioning the bot. Discord Bot Demo ## ✨ Features * **🌐 Web-Connected AI**: Uses Perplexity's Sonar API for up-to-date information * **⚡ Slash Command**: Simple `/ask` command for questions * **💬 Mention Support**: Ask questions by mentioning the bot * **🔗 Source Citations**: Automatically formats and links to sources * **🔒 Secure Setup**: Environment-based configuration for API keys ## 🛠️ Prerequisites **Python 3.8+** installed on your system ```bash theme={null} python --version # Should be 3.8 or higher ``` **Active Perplexity API Key** from the [Perplexity API Platform console](https://console.perplexity.ai) You'll need a paid Perplexity account to access the API. See the [pricing page](https://www.perplexity.ai/pricing) for current rates. **Discord Bot Token** from the [Discord Developer Portal](https://discord.com/developers/applications) ## 🚀 Quick Start ### 1. Repository Setup Clone the repository and navigate to the bot directory: ```bash theme={null} git clone https://github.com/ppl-ai/api-cookbook.git cd api-cookbook/docs/examples/discord-py-bot/ ``` ### 2. Install Dependencies ```bash theme={null} # Create a virtual environment (recommended) python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate # Install required packages pip install -r requirements.txt ``` ### 3. Configure API Keys 1. Visit the [Perplexity API Platform console](https://console.perplexity.ai) 2. Generate a new API key 3. Copy the key to the .env file Keep your API key secure! Never commit it to version control or share it publicly. 1. Go to the [Discord Developer Portal](https://discord.com/developers/applications) 2. Click **"New Application"** and give it a descriptive name 3. Navigate to the **"Bot"** section 4. Click **"Reset Token"** (or "Add Bot" if first time) 5. Copy the bot token Copy the example environment file and add your keys: ```bash theme={null} cp env.example .env ``` Edit `.env` with your credentials: ```bash title=".env" theme={null} DISCORD_TOKEN="your_discord_bot_token_here" PERPLEXITY_API_KEY="your_perplexity_api_key_here" ``` ## 🎯 Usage Guide ### Bot Invitation & Setup In the Discord Developer Portal: 1. Go to **OAuth2** → **URL Generator** 2. Select scopes: `bot` and `applications.commands` 3. Select bot permissions: `Send Messages`, `Use Slash Commands` 4. Copy the generated URL 1. Paste the URL in your browser 2. Select the Discord server to add the bot to 3. Confirm the permissions ```bash theme={null} python bot.py ``` You should see output confirming the bot is online and commands are synced. ### How to Use **Slash Command:** ``` /ask [your question here] ``` Slash Command Demo **Mention the Bot:** ``` @YourBot [your question here] ``` Mention Command Demo ## 📊 Response Format The bot provides clean, readable responses with: * **AI Answer**: Direct response from Perplexity's Sonar API * **Source Citations**: Clickable links to sources (when available) * **Automatic Truncation**: Responses are trimmed to fit Discord's limits ## 🔧 Technical Details This bot uses: * **Model**: Perplexity's `sonar-pro` model * **Response Limit**: 2000 tokens from API, truncated to fit Discord * **Temperature**: 0.2 for consistent, factual responses * **No Permissions**: Anyone in the server can use the bot # Disease Information App Source: https://docs.perplexity.ai/docs/cookbook/examples/disease-qa/README An interactive browser-based application that provides structured information about diseases using Perplexity's Sonar API # Disease Information App An interactive browser-based application that provides structured information about diseases using Perplexity's Sonar API. This app generates a standalone HTML interface that allows users to ask questions about various diseases and receive organized responses with citations. ## 🌟 Features * **User-Friendly Interface**: Clean, responsive design that works across devices * **AI-Powered Responses**: Leverages Perplexity's Sonar API for accurate medical information * **Structured Knowledge Cards**: Organizes information into Overview, Causes, and Treatments * **Citation Tracking**: Lists sources of information with clickable links * **Client-Side Caching**: Prevents duplicate API calls for previously asked questions * **Standalone Deployment**: Generate a single HTML file that can be used without a server * **Comprehensive Error Handling**: User-friendly error messages and robust error management ## 📋 Requirements * Python 3.6+ * Jupyter Notebook or JupyterLab (for development/generation) * Required packages: * requests * pandas * python-dotenv * IPython ## 🚀 Setup & Installation 1. Clone this repository or download the notebook 2. Install the required packages: ```bash theme={null} # Install from requirements file (recommended) pip install -r requirements.txt # Or install manually pip install requests pandas python-dotenv ipython ``` 3. Set up your Perplexity API key: * Create a `.env` file in the same directory as the notebook * Add your API key: `PERPLEXITY_API_KEY=your_api_key_here` ## 🔧 Usage ### Running the Notebook 1. Open the notebook in Jupyter: ```bash theme={null} jupyter notebook Disease_Information_App.ipynb ``` 2. Run all cells to generate and launch the browser-based application 3. The app will automatically open in your default web browser ### Using the Generated HTML You can also directly use the generated `disease_qa.html` file: 1. Open it in any modern web browser 2. Enter a question about a disease (e.g., "What is diabetes?", "Tell me about Alzheimer's disease") 3. Click "Ask" to get structured information about the disease ### Deploying the App For personal or educational use, simply share the generated HTML file. For production use, consider: 1. Setting up a proper backend to secure your API key 2. Hosting the file on a web server 3. Adding analytics and user management as needed ## 🔍 How It Works This application: 1. Uses a carefully crafted prompt to instruct the AI to output structured JSON 2. Processes this JSON to extract Overview, Causes, Treatments, and Citations 3. Presents the information in a clean knowledge card format 4. Implements client-side API calls with proper error handling 5. Provides a responsive design suitable for both desktop and mobile ## ⚙️ Technical Details ### API Structure The app expects the AI to return a JSON object with this structure: ```json theme={null} { "overview": "A brief description of the disease.", "causes": "The causes of the disease.", "treatments": "Possible treatments for the disease.", "citations": ["https://example.com/citation1", "https://example.com/citation2"] } ``` ### Files Generated * `disease_qa.html` - The standalone application * `disease_app.log` - Detailed application logs (when running the notebook) ### Customization Options You can modify: * The HTML/CSS styling in the `create_html_ui` function * The AI model used (default is "sonar-pro") * The structure of the prompt for different information fields * Output file location and naming ## 🛠️ Extending the App Potential extensions: * Add a Flask/Django backend to secure the API key * Implement user accounts and saved questions * Add visualization of disease statistics * Create a comparison view for multiple diseases * Add natural language question reformatting * Implement feedback mechanisms for answer quality ## ⚠️ Important Notes * **API Key Security**: The current implementation embeds your API key in the HTML file. This is suitable for personal use but not for public deployment. * **Not Medical Advice**: This app provides general information and should not be used for medical decisions. Always consult healthcare professionals for medical advice. * **API Usage**: Be aware of Perplexity API rate limits and pricing for your account. ## 📜 License [MIT License](https://github.com/ppl-ai/api-cookbook/blob/main/LICENSE) ## 🙏 Acknowledgements * This project uses the [Perplexity AI Sonar API](https://docs.perplexity.ai/) * Inspired by interactive knowledge bases and medical information platforms # Document Q&A with Embeddings Source: https://docs.perplexity.ai/docs/cookbook/examples/document-qa/README A self-contained RAG system that ingests documents, generates contextualized embeddings, and answers questions using the Agent API # Document Q\&A with Embeddings A self-contained retrieval-augmented generation (RAG) system that ingests documents, generates contextualized embeddings for semantic search, and produces grounded answers using the Agent API. ## Features * Ingest plain-text documents and automatically split them into chunks * Generate document-aware embeddings using `pplx-embed-context-v1-4b` * In-memory vector store with numpy cosine similarity search * Answer generation via the Agent API with `anthropic/claude-sonnet-4-6` * Full working pipeline: load, chunk, embed, query, answer ## Architecture **Indexing:** Load documents, split into overlapping chunks, embed with contextualized embeddings, store in memory. **Query:** Embed the user question, compute cosine similarity, retrieve top-k chunks, generate an answer with the Agent API. Contextualized embeddings produce higher-quality representations than standard embeddings for document chunks because the model understands that chunks belong to the same document. ## Installation ```bash theme={null} pip install perplexityai numpy ``` ```bash theme={null} export PERPLEXITY_API_KEY="your_api_key_here" ``` ## Usage Save the full code below to `document_qa.py` and run: ```bash theme={null} python document_qa.py ``` For interactive mode: ```bash theme={null} python document_qa.py --interactive ``` ## Full Code ```python theme={null} import base64 import sys import numpy as np from perplexity import Perplexity client = Perplexity() # --- Chunking --- def chunk_text(text, chunk_size=300, overlap=50): """Split text into overlapping chunks by word count.""" words = text.split() chunks, start = [], 0 while start < len(words): chunks.append(" ".join(words[start : start + chunk_size])) start += chunk_size - overlap return chunks # --- Embedding helpers --- def decode_embedding(b64_string): """Decode a base64-encoded int8 embedding to float32.""" return np.frombuffer(base64.b64decode(b64_string), dtype=np.int8).astype(np.float32) def cosine_similarity(a, b): return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)) # --- Build index --- def build_index(documents, chunk_size=300, overlap=50): """Chunk documents and generate contextualized embeddings.""" all_doc_chunks, metadata = [], [] for doc in documents: chunks = chunk_text(doc["content"], chunk_size, overlap) all_doc_chunks.append(chunks) metadata.append({"title": doc["title"], "chunks": chunks}) print(f"Embedding {sum(len(c) for c in all_doc_chunks)} chunks...") response = client.contextualized_embeddings.create( input=all_doc_chunks, model="pplx-embed-context-v1-4b" ) index = [] for doc_obj in response.data: meta = metadata[doc_obj.index] for chunk_obj in doc_obj.data: index.append({ "text": meta["chunks"][chunk_obj.index], "embedding": decode_embedding(chunk_obj.embedding), "doc_title": meta["title"], }) print(f"Index built: {len(index)} chunks.") return index # --- Retrieve --- def retrieve(index, query_text, top_k=3): """Embed the query and return the top-k most similar chunks.""" qr = client.contextualized_embeddings.create( input=[[query_text]], model="pplx-embed-context-v1-4b" ) q_emb = decode_embedding(qr.data[0].data[0].embedding) scored = sorted( [{**item, "score": float(cosine_similarity(q_emb, item["embedding"]))} for item in index], key=lambda x: x["score"], reverse=True, ) return scored[:top_k] # --- Generate answer --- def generate_answer(query_text, chunks): """Send retrieved context to the Agent API for answer generation.""" context = "\n\n".join( f"[Source {i}: {c['doc_title']}]\n{c['text']}" for i, c in enumerate(chunks, 1) ) response = client.responses.create( model="anthropic/claude-sonnet-4-6", input=[{ "role": "user", "content": ( f"Answer the following question based ONLY on the provided context. " f"If the context does not contain enough information, say so.\n\n" f"Context:\n{context}\n\nQuestion: {query_text}" ), }], instructions=( "You are a precise document Q&A assistant. Answer using only the " "provided context. Cite source numbers. Be concise." ), max_output_tokens=1024, ) return response.output_text # --- Full pipeline --- def query(index, query_text, top_k=3): print(f"\nQuery: {query_text}") retrieved = retrieve(index, query_text, top_k) for r in retrieved: print(f" [{r['doc_title']}] score={r['score']:.4f}: {r['text'][:70]}...") return generate_answer(query_text, retrieved) # --- Sample documents --- sample_documents = [ { "title": "Introduction to Transformers", "content": ( "The Transformer architecture was introduced in the paper Attention Is All " "You Need by Vaswani et al. in 2017. It replaced recurrent layers with " "self-attention mechanisms, enabling parallel processing of input sequences. " "The key innovation is multi-head attention, which allows the model to attend " "to information from different representation subspaces. Transformers consist " "of an encoder and decoder with stacked layers of multi-head attention and " "feed-forward sub-layers. The architecture has become the foundation for " "modern language models including BERT, GPT, and T5." ), }, { "title": "Retrieval-Augmented Generation", "content": ( "Retrieval-Augmented Generation (RAG) combines information retrieval with " "text generation. Instead of relying solely on knowledge stored in model " "parameters, RAG systems retrieve relevant documents from an external " "knowledge base and use them as context. This reduces hallucination because " "the model grounds its responses in retrieved evidence. A typical RAG " "pipeline has three stages: indexing, retrieval, and generation. During " "indexing, documents are chunked and embedded into a vector store. At query " "time, the question is embedded and compared against stored vectors. The " "most relevant chunks are prepended to the prompt for answer generation." ), }, ] if __name__ == "__main__": index = build_index(sample_documents) if "--interactive" in sys.argv: print("\nInteractive mode. Type 'quit' to exit.\n") while True: q = input("Question: ").strip() if q.lower() in ("quit", "exit", "q"): break if q: print(f"\nAnswer:\n{query(index, q)}\n") else: answer = query(index, "How does RAG reduce hallucination?") print(f"\nAnswer:\n{answer}") ``` ## Example Output ``` Embedding 4 chunks across 2 documents... Index built: 4 chunks. Query: How does RAG reduce hallucination? [Retrieval-Augmented Generation] score=0.8432: Retrieval-Augmented Generation (RAG) combines information retrieval w... [Retrieval-Augmented Generation] score=0.7891: most relevant chunks are prepended to the prompt for answer generatio... [Introduction to Transformers] score=0.6104: The Transformer architecture was introduced in the paper Attention Is... Answer: RAG reduces hallucination by grounding the model's responses in retrieved evidence rather than relying solely on knowledge stored in model parameters [Source 1]. The most relevant document chunks are prepended to the prompt, so the language model bases its answers on concrete textual evidence from the knowledge base [Source 2]. ``` For production workloads, replace the in-memory numpy index with a dedicated vector database such as Pinecone, Weaviate, or Qdrant. The embedding and retrieval logic remains the same. Contextualized embeddings require that chunks within each document are sent in their original sequential order. Shuffling chunks will degrade embedding quality. ## Limitations * The in-memory store is suitable for prototyping but will not scale to large collections. Use a vector database for production. * Chunk size and overlap may need tuning for your documents. Shorter chunks improve precision; longer chunks preserve context. * The `pplx-embed-context-v1-4b` model has a 32K token context window per document. * Answer quality depends on retrieval quality. If the wrong chunks are retrieved, the answer will reflect that. # Equity Research Brief Source: https://docs.perplexity.ai/docs/cookbook/examples/equity-research-brief/README Generate institutional-grade equity research briefs from any public ticker using the Perplexity Agent API and the built-in finance_search tool. # Equity Research Brief A command-line tool that generates a structured equity research brief for any public ticker using Perplexity's [Agent API](https://docs.perplexity.ai/docs/agent-api/quickstart) and the built-in [`finance_search`](https://docs.perplexity.ai/docs/agent-api/tools/finance-search) tool. `finance_search` returns structured market data — quotes, financials, earnings transcripts, peer comparisons, analyst estimates — so the model can compose a report grounded in numbers, not just narrative. The tool is purpose-built for agentic investor workflows. ## Features * One command produces a 6-section brief: snapshot, business overview, financial trajectory, latest earnings, peer context, risks, bottom line * Uses the Agent API's `finance_search` tool for structured fundamentals, quotes, and earnings-call transcripts * Three preset configurations matching the official `finance_search` recommendations: * `quote` — live price/quote only, fastest and cheapest * `single` — single-company historical lookup with web context * `research` — full multi-step cross-company brief (default) * Prints citation-ready Perplexity finance source URLs alongside the brief * Reports `finance_search` invocation count and total request cost * `--json` flag emits the raw Agent API response for downstream pipelines ## Prerequisites * Python 3.9+ * A Perplexity API key with Agent API access. `finance_search` is currently in beta — see the [Finance Search docs](https://docs.perplexity.ai/docs/agent-api/tools/finance-search) for availability. ## Installation ```bash theme={null} cd docs/examples/equity-research-brief pip install -r requirements.txt chmod +x equity_research_brief.py ``` ## API Key Setup ```bash theme={null} export PERPLEXITY_API_KEY="your-api-key-here" ``` You can also pass the key via `--api-key`, or place it in a `.pplx_api_key` file in the working directory. ## Quick Start Generate a full research brief on NVIDIA: ```bash theme={null} ./equity_research_brief.py NVDA ``` ## Usage ```bash theme={null} ./equity_research_brief.py TICKER [--config {quote,single,research}] [--json] [--api-key KEY] ``` ### Just a live quote (cheapest, \~1 tool call) ```bash theme={null} ./equity_research_brief.py AAPL --config quote ``` ### Single-company historical lookup with web context ```bash theme={null} ./equity_research_brief.py MSFT --config single ``` ### Full multi-step research brief (default) ```bash theme={null} ./equity_research_brief.py NVDA --config research ``` ### Emit raw Agent API JSON ```bash theme={null} ./equity_research_brief.py TSLA --json | jq '.usage.cost' ``` ## Configuration Reference | Config | Model | Tools | Max steps | Best for | | ---------- | --------------------------- | --------------------------------------------- | --------- | ------------------------------------- | | `quote` | `perplexity/sonar` | `finance_search` | 1 | Live prices, quotes, fastest path | | `single` | `openai/gpt-5.5` | `web_search` + `finance_search` + `fetch_url` | 5 | One-company historical fundamentals | | `research` | `anthropic/claude-opus-4-7` | `web_search` + `finance_search` + `fetch_url` | 10 | Multi-company comparisons, full brief | These configurations are taken directly from the [`finance_search` recommended configurations](https://docs.perplexity.ai/docs/agent-api/tools/finance-search). ## Example Output (truncated) ``` ## 1. Snapshot - **Price:** $200.23 (as of 2026-05-01 14:10 UTC) - **Market cap:** $4.87T - **P/E (TTM):** 40.86 - **52-week range:** $110.82 – $216.83 ## 2. Business overview NVIDIA designs accelerated computing platforms — GPUs, networking, and full-stack software — used in AI training and inference, gaming, professional visualization, and automotive. Data Center is the dominant revenue line. ## 3. Financial trajectory | FY | Revenue | Operating margin | Net income | | ----- | ----------- | ---------------- | ---------- | | FY25 | $130.5B | 62.4% | $72.9B | ... --- finance_search: 4 invocation(s) across categories [earnings_history, financials, profile, quote] Finance sources: - https://www.perplexity.ai/finance/NVDA - https://www.perplexity.ai/finance/NVDA/earnings?eventId=409967 - ... Cost: 0.2817 USD ``` ## Code Walkthrough The script does three things: **1. Issue a single Agent API call with `finance_search` enabled.** ```python theme={null} from perplexity import Perplexity client = Perplexity() response = client.responses.create( model="anthropic/claude-opus-4-7", instructions=SYSTEM_PROMPT, input=BRIEF_TEMPLATE.format(ticker="NVDA"), tools=[ {"type": "web_search"}, {"type": "finance_search"}, {"type": "fetch_url"}, ], max_output_tokens=4096, max_steps=10, ) ``` The model decides which `finance_search` categories to fetch (quote, financials, transcript, etc.) based on the prompt. You don't need to hand-pick fields. **2. Walk `response.output` to extract both the assistant text and the structured `finance_results` blocks.** ```python theme={null} for item in response.output: if item.type == "finance_results": for r in item.results: print(r.category, r.tickers, r.sources) elif item.type == "message": for block in item.content: if block.type == "output_text": print(block.text) ``` **3. Surface cost and finance source URLs alongside the prose.** The Perplexity finance pages returned in `result.sources` are stable, citation-ready links — useful when the brief is consumed by humans or by a downstream RAG pipeline. ## Prompting Guidance `finance_search` works best when the prompt asks for a business outcome, not for specific data shapes. The system prompt instructs the model to: * be quantitative and attribute numbers to the right period (e.g. `FY2025`, `Q3 FY26`) * never invent numbers — if `finance_search` doesn't return a field, say so explicitly * format the output in clean Markdown This pattern is documented in the [finance\_search prompt guidance](https://docs.perplexity.ai/docs/agent-api/tools/finance-search#prompt-guidance). ## Pricing `finance_search` is billed at **\$5 per 1,000 invocations**, separate from model token usage. Each preset has different cost characteristics: * `quote`: typically 1 invocation, \~\$0.007 per brief * `single`: 1–3 invocations + GPT-5.5 tokens * `research`: 3–6 invocations + Claude Opus tokens See [Perplexity Pricing](https://docs.perplexity.ai/docs/getting-started/pricing) for current rates. ## Limitations * `finance_search` is currently in beta and may not be enabled on all API keys * Results depend on Perplexity's finance data coverage; obscure or non-US tickers may return less structured data * This is not investment advice. The "Bottom line" section is explicitly framed as analytical opinion, not a recommendation ## Resources * [Agent API Quickstart](https://docs.perplexity.ai/docs/agent-api/quickstart) * [Finance Search Tool](https://docs.perplexity.ai/docs/agent-api/tools/finance-search) * [Web Search Tool](https://docs.perplexity.ai/docs/agent-api/tools/web-search) * [Perplexity Python SDK](https://pypi.org/project/perplexityai/) # Fact Checker CLI Source: https://docs.perplexity.ai/docs/cookbook/examples/fact-checker-cli/README A command-line tool that identifies false or misleading claims in articles or statements using Perplexity's Sonar API # Fact Checker CLI A command-line tool that identifies false or misleading claims in articles or statements using Perplexity's Sonar API for web research. ## Features * Analyze claims or entire articles for factual accuracy * Identify false, misleading, or unverifiable claims * Provide explanations and corrections for inaccurate information * Output results in human-readable format or structured JSON * Cite reliable sources for fact-checking assessments * Leverages Perplexity's structured outputs for reliable JSON parsing (for Tier 3+ users) ## Installation ### 1. Install required dependencies ```bash theme={null} # Install from requirements file (recommended) pip install -r requirements.txt # Or install manually pip install requests pydantic newspaper3k ``` ### 2. Make the script executable ```bash theme={null} chmod +x fact_checker.py ``` ## API Key Setup The tool requires a Perplexity API key to function. You can provide it in one of these ways: ### 1. As a command-line argument ```bash theme={null} ./fact_checker.py --api-key YOUR_API_KEY ``` ### 2. As an environment variable ```bash theme={null} export PPLX_API_KEY=YOUR_API_KEY ``` ### 3. In a file Create a file named `pplx_api_key` or `.pplx_api_key` in the same directory as the script: ```bash theme={null} echo "YOUR_API_KEY" > .pplx_api_key chmod 600 .pplx_api_key ``` **Note:** If you're using the structured outputs feature, you'll need a Perplexity API account with Tier 3 or higher access level. ## Quick Start Run the following command immediately after setup: ```bash theme={null} ./fact_checker.py -t "The Earth is flat and NASA is hiding the truth." ``` This will analyze the claim, research it using Perplexity's Sonar API, and return a detailed fact check with ratings, explanations, and sources. ## Usage ### Check a claim ```bash theme={null} ./fact_checker.py --text "The Earth is flat and NASA is hiding the truth." ``` ### Check an article from a file ```bash theme={null} ./fact_checker.py --file article.txt ``` ### Check an article from a URL ```bash theme={null} ./fact_checker.py --url https://www.example.com/news/article-to-check ``` ### Specify a different model ```bash theme={null} ./fact_checker.py --text "Global temperatures have decreased over the past century." --model "sonar-pro" ``` ### Output results as JSON ```bash theme={null} ./fact_checker.py --text "Mars has a breathable atmosphere." --json ``` ### Use a custom prompt file ```bash theme={null} ./fact_checker.py --text "The first human heart transplant was performed in the United States." --prompt-file custom_prompt.md ``` ### Enable structured outputs (for Tier 3+ users) Structured output is disabled by default. To enable it, pass the `--structured-output` flag: ```bash theme={null} ./fact_checker.py --text "Vaccines cause autism." --structured-output ``` ### Get help ```bash theme={null} ./fact_checker.py --help ``` ## Output Format The tool provides output including: * **Overall Rating**: MOSTLY\_TRUE, MIXED, or MOSTLY\_FALSE * **Summary**: A brief overview of the fact-checking findings * **Claims Analysis**: A list of specific claims with individual ratings: * TRUE: Factually accurate and supported by evidence * FALSE: Contradicted by evidence * MISLEADING: Contains some truth but could lead to incorrect conclusions * UNVERIFIABLE: Cannot be conclusively verified with available information * **Explanations**: Detailed reasoning for each claim * **Sources**: Citations and URLs used for verification ## Example Run the following command: ```bash theme={null} ./fact_checker.py -t "The Great Wall of China is visible from the moon." ``` Example output: ``` Fact checking in progress... 🔴 OVERALL RATING: MOSTLY_FALSE 📝 SUMMARY: The claim that the Great Wall of China is visible from the moon is false. This is a common misconception that has been debunked by NASA astronauts and scientific evidence. 🔍 CLAIMS ANALYSIS: Claim 1: ❌ FALSE Statement: "The Great Wall of China is visible from the moon." Explanation: The Great Wall of China is not visible from the moon with the naked eye. NASA astronauts have confirmed this, including Neil Armstrong who stated he could not see the Wall from lunar orbit. The Wall is too narrow and is similar in color to its surroundings when viewed from such a distance. Sources: - NASA.gov - Scientific American - National Geographic ``` ## Limitations * The accuracy of fact-checking depends on the quality of information available through the Perplexity Sonar API. * Like all language models, the underlying AI may have limitations in certain specialized domains. * The structured outputs feature requires a Tier 3 or higher Perplexity API account. * The tool does not replace professional fact-checking services for highly sensitive or complex content. # Document Q&A Source: https://docs.perplexity.ai/docs/cookbook/examples/file-attachment-qa/README Load documents and ask questions about them via the Agent API — text extraction, web search enrichment, multi-turn Q&A, and structured output extraction # Document Q\&A Load a document and ask questions about it using the Agent API. This example shows how to read document content, combine it with web search for enriched answers, and build a multi-turn Q\&A session over document content. ## Features * Load documents (text, CSV, JSON, markdown) and pass content to the Agent API * Ask questions grounded in the document content * Optionally combine with `web_search` for context enrichment * Multi-turn conversation to drill into specific sections * Structured output extraction from document content Pass document content directly in the `input` parameter of the Agent API. For text-based formats, read the file and include it in the prompt. Combine with `web_search` for context enrichment beyond the document. ## Installation ```bash theme={null} pip install perplexityai ``` ```bash theme={null} export PERPLEXITY_API_KEY="your_api_key_here" ``` ## Usage Save the full code below to `doc_qa.py` and run: ```bash theme={null} python doc_qa.py report.txt "What are the key findings in this report?" ``` For interactive mode: ```bash theme={null} python doc_qa.py report.txt --interactive ``` ## Full Code ```python Python theme={null} import sys import json import argparse from pathlib import Path from perplexity import Perplexity client = Perplexity() MAX_CONTENT_CHARS = 50000 # Truncate very large files to stay within token limits def read_document(file_path: str) -> str: """Read a document file and return its text content.""" path = Path(file_path) content = path.read_text(errors="replace") if len(content) > MAX_CONTENT_CHARS: content = content[:MAX_CONTENT_CHARS] + "\n\n[... truncated ...]" return content def ask_about_document( file_path: str, question: str, use_web_search: bool = False, conversation_history: list = None, ) -> dict: """Ask a question about a document's content.""" doc_content = read_document(file_path) filename = Path(file_path).name # Build the input with document content and question full_input = ( f"Document: {filename}\n" f"{'='*60}\n" f"{doc_content}\n" f"{'='*60}\n\n" f"Question: {question}" ) # Include conversation history for multi-turn if conversation_history: messages = conversation_history + [{"role": "user", "content": full_input}] response = client.responses.create( model="openai/gpt-5.4", input=messages, tools=[{"type": "web_search"}] if use_web_search else [], instructions="Answer questions based on the provided document content. Be specific and cite sections when possible.", ) else: response = client.responses.create( model="openai/gpt-5.4", input=full_input, tools=[{"type": "web_search"}] if use_web_search else [], instructions="Answer questions based on the provided document content. Be specific and cite sections when possible.", ) usage = response.usage return { "answer": response.output_text, "model": response.model, "tokens": { "input": usage.input_tokens if usage else 0, "output": usage.output_tokens if usage else 0, }, } def extract_structured_data(file_path: str, schema_name: str, schema: dict) -> dict: """Extract structured data from a document using a JSON schema.""" doc_content = read_document(file_path) response = client.responses.create( model="openai/gpt-5.4", input=f"Extract the requested structured data from this document:\n\n{doc_content}", response_format={ "type": "json_schema", "json_schema": {"name": schema_name, "schema": schema}, }, ) return json.loads(response.output_text) def interactive_session(file_path: str, use_web_search: bool = False): """Run an interactive Q&A session over a document.""" print(f"Document loaded: {file_path}") print(f"Web search: {'enabled' if use_web_search else 'disabled'}") print("Type 'quit' to exit.\n") history = [] while True: question = input("Question: ").strip() if question.lower() in ("quit", "exit", "q"): break if not question: continue result = ask_about_document(file_path, question, use_web_search, history) print(f"\nAnswer:\n{result['answer']}\n") print(f"({result['tokens']['input']}+{result['tokens']['output']} tokens)\n") # Add to conversation history for multi-turn history.append({"role": "user", "content": question}) history.append({"role": "assistant", "content": result["answer"]}) def main(): parser = argparse.ArgumentParser(description="Document Q&A") parser.add_argument("file", help="Path to the document file") parser.add_argument("question", nargs="?", help="Question to ask") parser.add_argument("--interactive", action="store_true", help="Interactive mode") parser.add_argument("--web-search", action="store_true", help="Enable web search") args = parser.parse_args() if not Path(args.file).exists(): print(f"Error: File not found: {args.file}", file=sys.stderr) sys.exit(1) if args.interactive: interactive_session(args.file, args.web_search) elif args.question: result = ask_about_document(args.file, args.question, args.web_search) print(result["answer"]) else: print("Error: Provide a question or use --interactive.", file=sys.stderr) sys.exit(1) if __name__ == "__main__": main() ``` ## Example Output ```bash theme={null} python doc_qa.py quarterly_report.txt "What was the total revenue for Q3?" ``` ``` Based on the quarterly report, total revenue for Q3 was $4.2 billion, representing a 15% year-over-year increase. The report attributes this growth primarily to the cloud services division, which grew 28% compared to the same period last year (see Section 3, Financial Highlights). ``` With web search enrichment: ```bash theme={null} python doc_qa.py quarterly_report.txt "How does this compare to industry benchmarks?" --web-search ``` ``` According to the report, Q3 revenue was $4.2 billion (from document). For comparison, the industry average revenue growth for cloud-focused companies in Q3 2025 was approximately 12% year-over-year (from web search: Bloomberg industry analysis). This places the company above the industry benchmark by roughly 3 percentage points. ``` ## Structured Data Extraction from Documents Extract specific fields from a document into a typed JSON structure: ```python theme={null} # Extract key metrics from a financial report schema = { "type": "object", "properties": { "company_name": {"type": "string"}, "quarter": {"type": "string"}, "total_revenue": {"type": "string"}, "net_income": {"type": "string"}, "revenue_growth_yoy": {"type": "string"}, "key_highlights": {"type": "array", "items": {"type": "string"}}, }, "required": ["company_name", "quarter", "total_revenue", "net_income", "revenue_growth_yoy", "key_highlights"], "additionalProperties": false, } data = extract_structured_data("quarterly_report.txt", "financial_summary", schema) print(json.dumps(data, indent=2)) ``` Combine document content with structured outputs to build reliable document processing pipelines. The JSON schema ensures consistent output regardless of document format variations. Very large documents are truncated to stay within token limits. For large files, consider splitting into sections and processing each separately. ## Limitations * Very large documents are truncated. The default limit is 50,000 characters (\~12,500 tokens). * Text-based formats (.txt, .csv, .md, .json) work best. For PDFs, use a library like `pdfplumber` or `PyPDF2` to extract text first. * Conversation history for multi-turn sessions does not re-read the file — the document content is included in the first message. # Financial News Tracker Source: https://docs.perplexity.ai/docs/cookbook/examples/financial-news-tracker/README A real-time financial news monitoring tool that fetches and analyzes market news using Perplexity's Sonar API # Financial News Tracker A command-line tool that fetches and analyzes real-time financial news using Perplexity's Sonar API. Get comprehensive market insights, news summaries, and investment analysis for any financial topic. ## Features * Real-time financial news aggregation from multiple sources * Market sentiment analysis (Bullish/Bearish/Neutral) * Impact assessment for news items (High/Medium/Low) * Sector and company-specific analysis * Investment insights and recommendations * Customizable time ranges (24h to 1 year) * Structured JSON output support * Beautiful emoji-enhanced CLI output ## Installation ### 1. Install required dependencies ```bash theme={null} # Install from requirements file (recommended) pip install -r requirements.txt # Or install manually pip install requests pydantic ``` ### 2. Make the script executable ```bash theme={null} chmod +x financial_news_tracker.py ``` ## API Key Setup The tool requires a Perplexity API key. You can provide it in one of these ways: ### 1. As an environment variable (recommended) ```bash theme={null} export PPLX_API_KEY=YOUR_API_KEY ``` ### 2. As a command-line argument ```bash theme={null} ./financial_news_tracker.py "tech stocks" --api-key YOUR_API_KEY ``` ### 3. In a file Create a file named `pplx_api_key` or `.pplx_api_key` in the same directory: ```bash theme={null} echo "YOUR_API_KEY" > .pplx_api_key chmod 600 .pplx_api_key ``` ## Quick Start Get the latest tech stock news: ```bash theme={null} ./financial_news_tracker.py "tech stocks" ``` This will fetch recent financial news about tech stocks, analyze market sentiment, and provide actionable insights. ## Usage Examples ### Basic usage - Get news for a specific topic ```bash theme={null} ./financial_news_tracker.py "S&P 500" ``` ### Get cryptocurrency news from the past week ```bash theme={null} ./financial_news_tracker.py "cryptocurrency" --time-range 1w ``` ### Track specific company news ```bash theme={null} ./financial_news_tracker.py "AAPL Apple stock" ``` ### Get news about market sectors ```bash theme={null} ./financial_news_tracker.py "energy sector oil prices" ``` ### Output as JSON for programmatic use ```bash theme={null} ./financial_news_tracker.py "inflation rates" --json ``` ### Use a different model ```bash theme={null} ./financial_news_tracker.py "Federal Reserve interest rates" --model sonar ``` ### Enable structured output (requires Tier 3+ API access) ```bash theme={null} ./financial_news_tracker.py "tech earnings" --structured-output ``` ## Time Range Options * `24h` - Last 24 hours (default) * `1w` - Last week * `1m` - Last month * `3m` - Last 3 months * `1y` - Last year ## Output Format The tool provides comprehensive financial analysis including: ### 1. Executive Summary A brief overview of the key financial developments ### 2. Market Analysis * **Market Sentiment**: Overall market mood (🐂 Bullish, 🐻 Bearish, ⚖️ Neutral) * **Key Drivers**: Factors influencing the market * **Risks**: Current market risks and concerns * **Opportunities**: Potential investment opportunities ### 3. News Items Each news item includes: * **Headline**: The main news title * **Impact**: Market impact level (🔴 High, 🟡 Medium, 🟢 Low) * **Summary**: Brief description of the news * **Affected Sectors**: Industries or companies impacted * **Source**: News source attribution ### 4. Investment Insights Actionable recommendations and analysis based on the news ## Example Output ``` 📊 FINANCIAL NEWS REPORT: tech stocks 📅 Period: Last 24 hours 📝 EXECUTIVE SUMMARY: Tech stocks showed mixed performance today as AI-related companies surged while semiconductor stocks faced pressure from supply chain concerns... 📈 MARKET ANALYSIS: Sentiment: 🐂 BULLISH Key Drivers: • Strong Q4 earnings from major tech companies • AI sector momentum continues • Federal Reserve signals potential rate cuts ⚠️ Risks: • Semiconductor supply chain disruptions • Regulatory scrutiny on big tech • Valuation concerns in AI sector 💡 Opportunities: • Cloud computing growth • AI infrastructure plays • Cybersecurity demand surge 📰 KEY NEWS ITEMS: 1. Microsoft Hits All-Time High on AI Growth Impact: 🔴 HIGH Summary: Microsoft stock reached record levels following strong Azure AI revenue... Sectors: Cloud Computing, AI, Software Source: Bloomberg 💼 INSIGHTS & RECOMMENDATIONS: • Consider diversifying within tech sector • AI infrastructure companies show strong momentum • Monitor semiconductor sector for buying opportunities ``` ## Advanced Features ### Custom Queries You can combine multiple topics for comprehensive analysis: ```bash theme={null} # Get news about multiple related topics ./financial_news_tracker.py "NVIDIA AMD semiconductor AI chips" # Track geopolitical impacts on markets ./financial_news_tracker.py "oil prices Middle East geopolitics" # Monitor economic indicators ./financial_news_tracker.py "inflation CPI unemployment Federal Reserve" ``` ### JSON Output For integration with other tools or scripts: ```bash theme={null} ./financial_news_tracker.py "bitcoin" --json | jq '.market_analysis.market_sentiment' ``` ## Tips for Best Results 1. **Be Specific**: Include company tickers, sector names, or specific events 2. **Combine Topics**: Mix company names with relevant themes (e.g., "TSLA electric vehicles") 3. **Use Time Ranges**: Match the time range to your investment horizon 4. **Regular Monitoring**: Set up cron jobs for daily market updates ## Limitations * Results depend on available public information * Not financial advice - always do your own research * Historical data may be limited for very recent events * Structured output requires Tier 3+ Perplexity API access ## Error Handling The tool includes comprehensive error handling for: * Invalid API keys * Network connectivity issues * API rate limits * Invalid queries * Parsing errors ## Integration Examples ### Daily Market Report Create a script for daily updates: ```bash theme={null} #!/bin/bash # daily_market_report.sh echo "=== Daily Market Report ===" > market_report.txt echo "Date: $(date)" >> market_report.txt echo "" >> market_report.txt ./financial_news_tracker.py "S&P 500 market overview" >> market_report.txt ./financial_news_tracker.py "top gaining stocks" >> market_report.txt ./financial_news_tracker.py "cryptocurrency bitcoin ethereum" >> market_report.txt ``` ### Python Integration ```python theme={null} import subprocess import json def get_financial_news(query, time_range="24h"): result = subprocess.run( ["./financial_news_tracker.py", query, "--time-range", time_range, "--json"], capture_output=True, text=True ) if result.returncode == 0: return json.loads(result.stdout) else: raise Exception(f"Error fetching news: {result.stderr}") # Example usage news = get_financial_news("tech stocks", "1w") print(f"Market sentiment: {news['market_analysis']['market_sentiment']}") ``` # Image Analysis Source: https://docs.perplexity.ai/docs/cookbook/examples/image-analysis/README Vision-powered image analysis with web search for context-enriched results using the Perplexity Agent API # Image Analysis Analyze images using vision models through the Perplexity Agent API, then enrich the analysis with web search to provide real-world context. This example combines image understanding with live information retrieval in a two-step pipeline: identify what is in the image, then research the identified subjects. ## Features * Upload images via base64 encoding or public HTTPS URL * Analyze images with vision-capable models like `openai/gpt-5.4` through the Agent API * Combine image analysis with web search for context enrichment * Two-step pipeline: identify, then research * Support for PNG, JPEG, WEBP, and GIF formats ## Installation ```bash Python theme={null} pip install perplexityai ``` ```bash TypeScript theme={null} npm install @perplexity-ai/perplexity_ai ``` ```bash theme={null} export PERPLEXITY_API_KEY="your_api_key_here" ``` ## Usage ```bash Python theme={null} python image_analysis.py path/to/photo.jpg python image_analysis.py https://example.com/photo.jpg ``` ```bash TypeScript theme={null} npx tsx image_analysis.ts path/to/photo.jpg npx tsx image_analysis.ts https://example.com/photo.jpg ``` ## Full Code ```python Python theme={null} import sys import base64 from perplexity import Perplexity client = Perplexity() def encode_image(image_path): """Read a local image and return a base64 data URI.""" with open(image_path, "rb") as f: encoded = base64.b64encode(f.read()).decode("utf-8") ext = image_path.rsplit(".", 1)[-1].lower() mime = {"png": "image/png", "jpg": "image/jpeg", "jpeg": "image/jpeg", "webp": "image/webp", "gif": "image/gif"}.get(ext, "image/png") return f"data:{mime};base64,{encoded}" def identify_image(image_source): """Step 1: Identify objects and subjects in an image.""" image_url = image_source if image_source.startswith("http") else encode_image(image_source) response = client.responses.create( model="openai/gpt-5.4", input=[{ "role": "user", "content": [ { "type": "input_text", "text": ( "Analyze this image in detail. Identify all notable objects, " "people, landmarks, species, or text. For each, provide a " "concise label and brief description. Format as a numbered list." ), }, {"type": "input_image", "image_url": image_url}, ], }], max_output_tokens=1024, ) return response.output_text def research_subjects(identification_text): """Step 2: Research identified subjects with web search.""" response = client.responses.create( model="openai/gpt-5.4", input=( f"The following subjects were identified in an image:\n\n" f"{identification_text}\n\n" f"Research each subject. For each, provide:\n" f"- What it is and why it is notable\n" f"- Key facts or recent news\n" f"- Historical or cultural significance if applicable\n\n" f"Combine the analysis into a comprehensive report." ), tools=[{"type": "web_search"}], instructions="You are an image research assistant. Provide accurate, up-to-date information. Synthesize image observations with research.", ) return response.output_text def analyze(image_source): """Full pipeline: identify then research.""" print(f"Analyzing: {image_source}\n") print("Step 1: Identifying subjects...") identification = identify_image(image_source) print(f"\n{identification}\n") print("Step 2: Researching subjects...") report = research_subjects(identification) print(f"\n{report}") if __name__ == "__main__": if len(sys.argv) < 2: print("Usage: python image_analysis.py ") sys.exit(1) analyze(sys.argv[1]) ``` ```typescript TypeScript theme={null} import Perplexity from "@perplexity-ai/perplexity_ai"; import * as fs from "fs"; import * as path from "path"; const client = new Perplexity(); function encodeImage(imagePath: string): string { const encoded = fs.readFileSync(imagePath).toString("base64"); const ext = path.extname(imagePath).slice(1).toLowerCase(); const mime: Record = { png: "image/png", jpg: "image/jpeg", jpeg: "image/jpeg", webp: "image/webp", gif: "image/gif", }; return `data:${mime[ext] || "image/png"};base64,${encoded}`; } async function identifyImage(imageSource: string): Promise { const imageUrl = imageSource.startsWith("http") ? imageSource : encodeImage(imageSource); const response = await client.responses.create({ model: "openai/gpt-5.4", input: [{ role: "user", content: [ { type: "input_text", text: "Analyze this image in detail. Identify all notable objects, " + "people, landmarks, species, or text. For each, provide a " + "concise label and brief description. Format as a numbered list.", }, { type: "input_image", image_url: imageUrl }, ], }], max_output_tokens: 1024, }); return response.output_text; } async function researchSubjects(identificationText: string): Promise { const response = await client.responses.create({ model: "openai/gpt-5.4", input: `The following subjects were identified in an image:\n\n` + `${identificationText}\n\n` + `Research each subject. For each, provide:\n` + `- What it is and why it is notable\n` + `- Key facts or recent news\n` + `- Historical or cultural significance if applicable\n\n` + `Combine the analysis into a comprehensive report.`, tools: [{ type: "web_search" }], instructions: "You are an image research assistant. Provide accurate, up-to-date information. Synthesize image observations with research.", }); return response.output_text; } async function analyze(imageSource: string): Promise { console.log(`Analyzing: ${imageSource}\n`); console.log("Step 1: Identifying subjects..."); const identification = await identifyImage(imageSource); console.log(`\n${identification}\n`); console.log("Step 2: Researching subjects..."); const report = await researchSubjects(identification); console.log(`\n${report}`); } const arg = process.argv[2]; if (!arg) { console.log("Usage: npx tsx image_analysis.ts "); process.exit(1); } analyze(arg); ``` ## Example Output ``` Analyzing: golden_gate.jpg Step 1: Identifying subjects... 1. Golden Gate Bridge - Iconic red-orange suspension bridge spanning the Golden Gate strait in San Francisco, California. 2. San Francisco Bay - Body of water beneath the bridge, connecting to the Pacific Ocean. 3. Marin Headlands - Hilly terrain on the far side, part of the Golden Gate National Recreation Area. 4. Fog bank - Low-lying cloud formation rolling in from the Pacific. Step 2: Researching subjects... ## Golden Gate Bridge - Comprehensive Analysis ### The Bridge The Golden Gate Bridge is a suspension bridge spanning the one-mile-wide strait connecting San Francisco Bay to the Pacific Ocean. Completed in 1937, it held the record for the longest suspension bridge span at 4,200 feet until 1964. Its "International Orange" color was chosen for fog visibility and aesthetic harmony. ### San Francisco Bay San Francisco Bay is a shallow estuary encompassing approximately 1,600 square miles of watershed, one of the largest natural harbors on the Pacific coast. ### Marin Headlands Part of the Golden Gate National Recreation Area, offering hiking trails with panoramic views of the bridge and city skyline. ### Fog Patterns Summer fog through the Golden Gate is a defining feature of San Francisco's microclimate, formed when warm inland air draws cool Pacific air through the strait. ``` Base64-encoded images count toward input token usage. A 1024x768 image consumes approximately 1,048 tokens. The maximum file size for base64 images is 50 MB. Vision input is supported on the Agent API via the `input_image` content type. Use a vision-capable model like `openai/gpt-5.4`. Check the [Agent API Image Attachments docs](/docs/agent-api/image-attachments) for supported formats and size limits. ## Limitations * Image analysis requires a vision-capable model (e.g., `openai/gpt-5.4`). Not all models support `input_image`. * Web search quality in Step 2 depends on identification accuracy in Step 1. * Only publicly accessible HTTPS URLs work for URL-based input. Private URLs will fail. * Animated GIFs are supported but only the first frame is analyzed. # Multi-Provider Model Comparison Source: https://docs.perplexity.ai/docs/cookbook/examples/model-comparison/README A CLI tool that sends the same prompt to multiple AI models via Perplexity's Agent API and compares response quality, latency, and cost # Multi-Provider Model Comparison A command-line tool that sends the same prompt to multiple AI models through Perplexity's Agent API and produces a side-by-side comparison of response quality, latency, and cost. Useful for evaluating which model best fits your use case. ## Features * Send identical prompts to 5 models across different providers in a single run * Measure response latency using wall-clock timing * Extract per-request cost from the `response.usage.cost.total_cost` field * Tabulated output comparing response length, latency, and cost * Model fallback chain support using the `models=[...]` parameter for high-availability workflows * Configurable prompt input via command-line argument or file ## Supported Models The default comparison set spans five providers: | Model | Provider | | ------------------------------ | ---------- | | `openai/gpt-5.4` | OpenAI | | `anthropic/claude-sonnet-4-6` | Anthropic | | `google/gemini-3.1-flash-lite` | Google | | `xai/grok-4.20-non-reasoning` | xAI | | `perplexity/sonar` | Perplexity | ## Installation ```bash theme={null} pip install perplexityai ``` ## API Key Setup Set your Perplexity API key as an environment variable. The SDK reads it automatically: ```bash theme={null} export PERPLEXITY_API_KEY="your_api_key_here" ``` Perplexity's Agent API provides access to models from multiple providers through a single API key. You do not need separate API keys for OpenAI, Anthropic, Google, or xAI. ## Usage ### Compare models with a prompt ```bash theme={null} python model_comparison.py "Explain the CAP theorem in distributed systems" ``` ### Read the prompt from a file ```bash theme={null} python model_comparison.py --file prompt.txt ``` ### Use a custom set of models ```bash theme={null} python model_comparison.py "What is quantum entanglement?" \ --models openai/gpt-5.4 anthropic/claude-sonnet-4-6 perplexity/sonar ``` ### Export results as JSON ```bash theme={null} python model_comparison.py "Summarize recent AI safety research" --json > results.json ``` ### Use model fallback chain Instead of comparing models, you can test the fallback chain feature. The API tries each model in order until one succeeds: ```bash theme={null} python model_comparison.py "Latest AI news" --fallback ``` ## How It Works 1. The CLI accepts a prompt and an optional list of models. 2. For each model, the tool records a start timestamp, calls `client.responses.create(model=..., input=...)`, and records the end timestamp. 3. From each response, it extracts `response.usage.cost.total_cost` for the request cost and computes latency as the elapsed wall-clock time. 4. Results are collected and displayed in a comparison table sorted by latency. 5. In fallback mode, the tool sends a single request with `models=[...]` and reports which model was ultimately used. The `response.usage.cost` object includes `input_cost`, `output_cost`, and `total_cost` in USD. This makes it straightforward to compare the true cost of each model for your specific prompt. ## Full Code ```python Python theme={null} import sys import json import time import argparse from typing import List, Optional from perplexity import Perplexity DEFAULT_MODELS = [ "openai/gpt-5.4", "anthropic/claude-sonnet-4-6", "google/gemini-3.1-flash-lite", "xai/grok-4.20-non-reasoning", "perplexity/sonar", ] def compare_models(prompt: str, models: List[str]) -> List[dict]: """Send the same prompt to each model and collect metrics.""" client = Perplexity() results = [] for model in models: print(f" Querying {model}...") try: start = time.time() response = client.responses.create( model=model, input=prompt, max_output_tokens=1024, ) elapsed = time.time() - start output_text = response.output_text total_cost = response.usage.cost.total_cost input_tokens = response.usage.input_tokens output_tokens = response.usage.output_tokens results.append({ "model": model, "status": "success", "latency_s": round(elapsed, 2), "response_length": len(output_text), "input_tokens": input_tokens, "output_tokens": output_tokens, "cost_usd": total_cost, "preview": output_text[:120].replace("\n", " "), }) except Exception as e: results.append({ "model": model, "status": "error", "error": str(e), "latency_s": None, "response_length": 0, "input_tokens": 0, "output_tokens": 0, "cost_usd": None, "preview": "", }) return results def run_fallback(prompt: str, models: List[str]) -> dict: """Send a single request with a model fallback chain.""" client = Perplexity() print(f" Sending request with fallback chain: {models}") start = time.time() response = client.responses.create( models=models, input=prompt, max_output_tokens=1024, ) elapsed = time.time() - start return { "requested_models": models, "model_used": response.model, "latency_s": round(elapsed, 2), "response_length": len(response.output_text), "cost_usd": response.usage.cost.total_cost, "preview": response.output_text[:200].replace("\n", " "), } def format_table(results: List[dict]) -> str: """Format comparison results as a text table.""" # Sort by latency (successful responses first) successful = [r for r in results if r["status"] == "success"] failed = [r for r in results if r["status"] != "success"] successful.sort(key=lambda r: r["latency_s"]) lines = [] header = f"{'Model':<42} {'Latency':>8} {'Length':>8} {'Tokens':>8} {'Cost':>10}" lines.append(header) lines.append("-" * len(header)) for r in successful: tokens = f"{r['input_tokens']}+{r['output_tokens']}" cost = f"${r['cost_usd']:.5f}" lines.append( f"{r['model']:<42} {r['latency_s']:>7.2f}s {r['response_length']:>8} {tokens:>8} {cost:>10}" ) for r in failed: lines.append(f"{r['model']:<42} {'FAILED':>8} {'-':>8} {'-':>8} {'-':>10}") return "\n".join(lines) def main(): parser = argparse.ArgumentParser( description="Multi-Provider Model Comparison" ) parser.add_argument("prompt", nargs="?", help="The prompt to send") parser.add_argument("--file", help="Read prompt from a file") parser.add_argument( "--models", nargs="+", default=DEFAULT_MODELS, help="Models to compare", ) parser.add_argument( "--fallback", action="store_true", help="Use model fallback chain instead of comparing", ) parser.add_argument( "--json", action="store_true", help="Output results as JSON" ) args = parser.parse_args() # Resolve prompt if args.file: with open(args.file, "r") as f: prompt = f.read().strip() elif args.prompt: prompt = args.prompt else: print("Error: Provide a prompt or use --file.", file=sys.stderr) sys.exit(1) print(f"Prompt: {prompt[:80]}{'...' if len(prompt) > 80 else ''}\n") if args.fallback: print("Running model fallback chain...\n") result = run_fallback(prompt, args.models) if args.json: print(json.dumps(result, indent=2)) else: print(f"Fallback chain: {' -> '.join(result['requested_models'])}") print(f"Model used: {result['model_used']}") print(f"Latency: {result['latency_s']}s") print(f"Response length: {result['response_length']} chars") print(f"Cost: ${result['cost_usd']:.5f}") print(f"\nPreview: {result['preview']}") else: print(f"Comparing {len(args.models)} models...\n") results = compare_models(prompt, args.models) if args.json: print(json.dumps(results, indent=2)) else: print(format_table(results)) print(f"\nComparison complete. {len(results)} models evaluated.") if __name__ == "__main__": main() ``` ## Example Output Running the comparison: ```bash theme={null} python model_comparison.py "Explain the CAP theorem in distributed systems" ``` Produces output like: ``` Prompt: Explain the CAP theorem in distributed systems Comparing 5 models... Querying openai/gpt-5.4... Querying anthropic/claude-sonnet-4-6... Querying google/gemini-3.1-flash-lite... Querying xai/grok-4.20-non-reasoning... Querying perplexity/sonar... Model Latency Length Tokens Cost ------------------------------------------------------------------------------ xai/grok-4.20-non-reasoning 1.24s 1842 18+312 $0.00048 google/gemini-3.1-flash-lite 1.87s 2105 18+356 $0.00031 perplexity/sonar 2.13s 1654 18+280 $0.00034 openai/gpt-5.4 3.41s 2487 18+421 $0.00438 anthropic/claude-sonnet-4-6 3.78s 2301 18+389 $0.00527 Comparison complete. 5 models evaluated. ``` Running with the fallback chain: ```bash theme={null} python model_comparison.py "Latest AI news" --fallback ``` ``` Prompt: Latest AI news Running model fallback chain... Sending request with fallback chain: ['openai/gpt-5.4', ...] Fallback chain: openai/gpt-5.4 -> anthropic/claude-sonnet-4-6 -> google/gemini-3.1-flash-lite -> xai/grok-4.20-non-reasoning -> perplexity/sonar Model used: openai/gpt-5.4 Latency: 3.12s Response length: 2034 chars Cost: $0.00415 Preview: The AI landscape continues to evolve rapidly in 2025... ``` Model fallback is useful for production systems where availability matters more than model selection. The API tries each model in the `models` array in order and returns the first successful response. See the [model fallback guide](/docs/agent-api/model-fallback) for details. ## Tips for Meaningful Comparisons 1. **Use the same `max_output_tokens`** across all models to keep output lengths comparable. 2. **Run multiple trials** and average the results, since latency can vary between requests due to load. 3. **Test with representative prompts** for your actual use case rather than generic questions. 4. **Consider cost per token** in addition to total cost, especially for high-volume applications. ## Limitations * Latency measurements reflect end-to-end wall-clock time including network round trips, not pure model inference time. * Cost values come from the API response and reflect per-request pricing at the time of the call. * Response quality is subjective and not captured by quantitative metrics alone. Review the actual output text for qualitative evaluation. * Rate limits vary by model and provider. Sequential comparison requests may be affected by rate limiting on high-demand models. # Academic Research Finder CLI Source: https://docs.perplexity.ai/docs/cookbook/examples/research-finder/README A command-line tool that uses Perplexity's Sonar API to find and summarize academic literature # Academic Research Finder CLI A command-line tool that uses Perplexity's Sonar API to find and summarize academic literature (research papers, articles, etc.) related to a given question or topic. ## Features * Takes a natural language question or topic as input, ideally suited for academic inquiry. * Leverages Perplexity Sonar API, guided by a specialized prompt to prioritize scholarly sources (e.g., journals, conference proceedings, academic databases). * Outputs a concise summary based on the findings from academic literature. * Lists the primary academic sources used, aiming to include details like authors, year, title, publication, and DOI/link when possible. * Supports different Perplexity models (defaults to `sonar-pro`). * Allows results to be output in JSON format. ## Installation ### 1. Install required dependencies Ensure you are using the Python environment you intend to run the script with (e.g., `python3.10` if that's your target). ```bash theme={null} # Install from requirements file (recommended) pip install -r requirements.txt # Or install manually pip install requests ``` ### 2. Make the script executable (Optional) ```bash theme={null} chmod +x research_finder.py ``` Alternatively, you can run the script using `python3 research_finder.py ...`. ## API Key Setup The tool requires a Perplexity API key (`PPLX_API_KEY`) to function. You can provide it in one of these ways (checked in this order): 1. **As a command-line argument:** ```bash theme={null} python3 research_finder.py "Your query" --api-key YOUR_API_KEY ``` 2. **As an environment variable:** ```bash theme={null} export PPLX_API_KEY=YOUR_API_KEY python3 research_finder.py "Your query" ``` 3. **In a file:** Create a file named `pplx_api_key`, `.pplx_api_key`, `PPLX_API_KEY`, or `.PPLX_API_KEY` in the *same directory as the script* or in the *current working directory* containing just your API key. ```bash theme={null} echo "YOUR_API_KEY" > .pplx_api_key chmod 600 .pplx_api_key # Optional: restrict permissions python3 research_finder.py "Your query" ``` ## Usage Run the script from the `sonar-use-cases/research_finder` directory or provide the full path. ```bash theme={null} # Basic usage python3 research_finder.py "What are the latest advancements in quantum computing?" # Using a specific model python3 research_finder.py "Explain the concept of Large Language Models" --model sonar-small-online # Getting output as JSON python3 research_finder.py "Summarize the plot of Dune Part Two" --json # Using a custom system prompt file python3 research_finder.py "Benefits of renewable energy" --prompt-file /path/to/your/custom_prompt.md # Using an API key via argument python3 research_finder.py "Who won the last FIFA World Cup?" --api-key sk-... # Using the executable (if chmod +x was used) ./research_finder.py "Latest news about Mars exploration" ``` ### Arguments * `query`: (Required) The research question or topic (enclose in quotes if it contains spaces). * `-m`, `--model`: Specify the Perplexity model (default: `sonar-pro`). * `-k`, `--api-key`: Provide the API key directly. * `-p`, `--prompt-file`: Path to a custom system prompt file. * `-j`, `--json`: Output the results in JSON format. ## Example Output (Human-Readable - *Note: Actual output depends heavily on the query and API results*) ``` Initializing research assistant for query: "Recent studies on transformer models in NLP"... Researching in progress... ✅ Research Complete! 📝 SUMMARY: Recent studies on transformer models in Natural Language Processing (NLP) continue to explore architectural improvements, efficiency optimizations, and new applications. Key areas include modifications to the attention mechanism (e.g., sparse attention, linear attention) to handle longer sequences more efficiently, techniques for model compression and knowledge distillation, and applications beyond text, such as in computer vision and multimodal tasks. Research also focuses on understanding the internal workings and limitations of large transformer models. 🔗 SOURCES: 1. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems, 30. (arXiv:1706.03762) 2. Tay, Y., Dehghani, M., Bahri, D., & Metzler, D. (2020). Efficient transformers: A survey. arXiv preprint arXiv:2009.06732. 3. Beltagy, I., Peters, M. E., & Cohan, A. (2020). Longformer: The long-document transformer. arXiv preprint arXiv:2004.05150. 4. Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A primer in bertology: What we know about how bert works. Transactions of the Association for Computational Linguistics, 8, 842-866. (arXiv:2002.12327) ``` ## Limitations * The ability of the Sonar API to consistently prioritize and access specific academic databases or extract detailed citation information (like DOIs) may vary. The quality depends on the API's search capabilities and the structure of the source websites. * The script performs basic parsing to separate summary and sources; complex or unusual API responses might not be parsed perfectly. Check the raw response in case of issues. * Queries that are too broad or not well-suited for academic search might yield less relevant results. * Error handling for API rate limits or specific API errors could be more granular. # Search News Monitor Source: https://docs.perplexity.ai/docs/cookbook/examples/search-news-monitor/README A CLI tool that uses Perplexity's Search API to monitor real-time news across multiple configurable topics with domain and recency filtering # Search News Monitor A command-line tool that uses Perplexity's Search API (`client.search.create(...)`) to monitor real-time news across multiple topics. Configure topics, domain filters, and recency windows to build a continuous news monitoring pipeline. ## Features * Monitor multiple topics in a single run using the Search API * Filter results by domain with `search_domain_filter` (allowlist or denylist) * Control recency with `search_recency_filter` (day, week, month, year) * Access structured result fields: `title`, `url`, `snippet`, `date` * Configurable polling interval for continuous monitoring * Output as formatted text or JSON for downstream processing ## Installation ```bash Python theme={null} pip install perplexityai ``` ```bash TypeScript theme={null} npm install @perplexity-ai/perplexity_ai ``` ## API Key Setup Set your Perplexity API key as an environment variable. The SDK reads it automatically: ```bash theme={null} export PERPLEXITY_API_KEY="your_api_key_here" ``` ## Usage ```bash theme={null} # Monitor default topics python news_monitor.py # Specify custom topics python news_monitor.py --topics "artificial intelligence" "climate policy" "space exploration" # Filter to specific domains (allowlist) python news_monitor.py --topics "AI regulation" --domains reuters.com apnews.com bbc.co.uk # Exclude domains (denylist) python news_monitor.py --topics "technology" --exclude-domains pinterest.com reddit.com # Set recency filter python news_monitor.py --topics "semiconductor industry" --recency day # Run in continuous monitoring mode python news_monitor.py --topics "cybersecurity" --watch --interval 300 # Export as JSON python news_monitor.py --topics "renewable energy" --json > news.json ``` ## How It Works 1. The CLI accepts a list of topics and optional filtering parameters. 2. For each topic, it calls `client.search.create(query=..., max_results=...)` with the configured domain and recency filters. 3. Each search result contains `title`, `url`, `snippet`, and `date` fields, which are extracted and formatted. 4. In watch mode, the tool repeats the search at a configurable interval, displaying only new results since the last poll. Use the `search_recency_filter` parameter with values like `"day"`, `"week"`, `"month"`, or `"year"` to focus on recent news. This is simpler than specifying exact date ranges and works well for monitoring workflows. Domain filters operate in either allowlist or denylist mode, not both simultaneously. Use `--domains` for allowlist or `--exclude-domains` for denylist, but not both in the same request. A maximum of 20 domains can be specified. ## Full Code ```python Python theme={null} import sys import json import time import argparse from typing import List, Optional from perplexity import Perplexity DEFAULT_TOPICS = ["artificial intelligence", "climate change", "cybersecurity"] def search_topic( client, topic, max_results=5, domains=None, exclude_domains=None, recency=None, ): """Search for news on a single topic and return structured results.""" params = {"query": topic, "max_results": max_results} if domains: params["search_domain_filter"] = domains elif exclude_domains: params["search_domain_filter"] = [f"-{d}" for d in exclude_domains] if recency: params["search_recency_filter"] = recency search = client.search.create(**params) return [ { "title": item.title, "url": item.url, "snippet": item.snippet[:200] if item.snippet else "", "date": item.date if hasattr(item, "date") else None, } for item in search.results ] def monitor_once(client, topics, max_results, domains, exclude_domains, recency): """Run a single monitoring pass across all topics.""" return { topic: search_topic(client, topic, max_results, domains, exclude_domains, recency) for topic in topics } def format_results(all_results): """Format monitoring results as human-readable text.""" lines = [f"NEWS MONITOR - {time.strftime('%Y-%m-%d %H:%M:%S')}", "=" * 60] for topic, results in all_results.items(): lines.append(f"\n[ {topic.upper()} ] - {len(results)} results") lines.append("-" * 40) if not results: lines.append(" No results found.") continue for i, r in enumerate(results, 1): lines.append(f" {i}. {r['title']}") lines.append(f" {r['url']}") if r["date"]: lines.append(f" Published: {r['date']}") if r["snippet"]: lines.append(f" {r['snippet']}...") lines.append("") return "\n".join(lines) def main(): parser = argparse.ArgumentParser(description="Search News Monitor") parser.add_argument("--topics", nargs="+", default=DEFAULT_TOPICS) parser.add_argument("--max-results", type=int, default=5) parser.add_argument("--domains", nargs="+", help="Allowlist domains") parser.add_argument("--exclude-domains", nargs="+", help="Denylist domains") parser.add_argument("--recency", choices=["day", "week", "month", "year"], default="week") parser.add_argument("--watch", action="store_true", help="Continuous monitoring") parser.add_argument("--interval", type=int, default=300, help="Poll interval (seconds)") parser.add_argument("--json", action="store_true", help="Output as JSON") args = parser.parse_args() if args.domains and args.exclude_domains: print("Error: Use --domains or --exclude-domains, not both.", file=sys.stderr) sys.exit(1) client = Perplexity() print(f"Monitoring {len(args.topics)} topics: {', '.join(args.topics)}") print(f"Recency: {args.recency} | Max results per topic: {args.max_results}\n") seen_urls = set() while True: all_results = monitor_once( client, args.topics, args.max_results, args.domains, args.exclude_domains, args.recency, ) if args.watch: filtered = {} for topic, results in all_results.items(): new = [r for r in results if r["url"] not in seen_urls] seen_urls.update(r["url"] for r in new) filtered[topic] = new all_results = filtered print(json.dumps(all_results, indent=2) if args.json else format_results(all_results)) if not args.watch: break print(f"\nNext check in {args.interval} seconds... (Ctrl+C to stop)\n") time.sleep(args.interval) if __name__ == "__main__": main() ``` ```typescript TypeScript theme={null} import Perplexity from "@perplexity-ai/perplexity_ai"; const DEFAULT_TOPICS = ["artificial intelligence", "climate change", "cybersecurity"]; interface SearchResult { title: string; url: string; snippet: string; date: string | null; } async function searchTopic( client: InstanceType, topic: string, maxResults: number, domains?: string[], recency?: string ): Promise { const params: Record = { query: topic, max_results: maxResults }; if (domains) params.search_domain_filter = domains; if (recency) params.search_recency_filter = recency; const search = await client.search.create(params as any); return (search as any).results.map((item: any) => ({ title: item.title, url: item.url, snippet: item.snippet ? item.snippet.substring(0, 200) : "", date: item.date || null, })); } function formatResults(allResults: Record): string { const lines: string[] = []; const ts = new Date().toISOString().replace("T", " ").substring(0, 19); lines.push(`NEWS MONITOR - ${ts}`, "=".repeat(60)); for (const [topic, results] of Object.entries(allResults)) { lines.push(`\n[ ${topic.toUpperCase()} ] - ${results.length} results`, "-".repeat(40)); if (!results.length) { lines.push(" No results found."); continue; } results.forEach((r, i) => { lines.push(` ${i + 1}. ${r.title}`, ` ${r.url}`); if (r.date) lines.push(` Published: ${r.date}`); if (r.snippet) lines.push(` ${r.snippet}...`); lines.push(""); }); } return lines.join("\n"); } async function main() { const args = process.argv.slice(2); const topicsIdx = args.indexOf("--topics"); let topics = DEFAULT_TOPICS; if (topicsIdx !== -1) { topics = []; for (let i = topicsIdx + 1; i < args.length && !args[i].startsWith("--"); i++) topics.push(args[i]); } const recencyIdx = args.indexOf("--recency"); const recency = recencyIdx !== -1 ? args[recencyIdx + 1] : "week"; const maxIdx = args.indexOf("--max-results"); const maxResults = maxIdx !== -1 ? parseInt(args[maxIdx + 1]) : 5; const watch = args.includes("--watch"); const intervalIdx = args.indexOf("--interval"); const interval = intervalIdx !== -1 ? parseInt(args[intervalIdx + 1]) : 300; const outputJson = args.includes("--json"); const client = new Perplexity(); console.log(`Monitoring ${topics.length} topics: ${topics.join(", ")}`); console.log(`Recency: ${recency} | Max results per topic: ${maxResults}\n`); const seenUrls = new Set(); while (true) { const allResults: Record = {}; for (const topic of topics) allResults[topic] = await searchTopic(client, topic, maxResults, undefined, recency); let display = allResults; if (watch) { display = {}; for (const [topic, results] of Object.entries(allResults)) { const fresh = results.filter((r) => !seenUrls.has(r.url)); fresh.forEach((r) => seenUrls.add(r.url)); display[topic] = fresh; } } console.log(outputJson ? JSON.stringify(display, null, 2) : formatResults(display)); if (!watch) break; console.log(`\nNext check in ${interval} seconds... (Ctrl+C to stop)\n`); await new Promise((r) => setTimeout(r, interval * 1000)); } } main(); ``` ## Example Output ```bash theme={null} python news_monitor.py --topics "artificial intelligence" "climate policy" --recency day ``` ``` Monitoring 2 topics: artificial intelligence, climate policy Recency: day | Max results per topic: 5 NEWS MONITOR - 2026-02-26 14:30:00 ============================================================ [ ARTIFICIAL INTELLIGENCE ] - 5 results ---------------------------------------- 1. OpenAI Announces New Enterprise AI Safety Framework https://www.reuters.com/technology/openai-enterprise-safety-2026 Published: 2026-02-26 OpenAI introduced a comprehensive safety framework for enterprise deployments, addressing concerns about autonomous agent behavior... 2. EU AI Act Enforcement Begins for High-Risk Systems https://www.bbc.co.uk/news/technology-ai-act-enforcement Published: 2026-02-26 The European Union has started enforcing new requirements for high-risk AI systems under the AI Act... [ CLIMATE POLICY ] - 3 results ---------------------------------------- 1. G7 Nations Agree on Carbon Border Adjustment Timeline https://www.reuters.com/sustainability/g7-carbon-border-2026 Published: 2026-02-26 G7 leaders reached consensus on implementing coordinated carbon border adjustment mechanisms by 2028... ``` The Search API returns structured results with `title`, `url`, `snippet`, and `date` fields. Unlike the Agent API or Sonar API, it does not generate AI summaries -- it returns raw ranked web results. See the [Search API quickstart](/docs/search/quickstart) for full details. ## Continuous Monitoring Tips 1. **Set appropriate intervals.** For breaking news, use 60-120 second intervals. For general topic monitoring, 300-600 seconds is sufficient. 2. **Combine recency with domain filters.** Use `--recency day` with trusted news domains for a curated news feed. 3. **Pipe JSON output to other tools.** Use `--json` with tools like `jq` or downstream scripts for alerting and aggregation. 4. **Track seen URLs.** The watch mode automatically deduplicates results across polling cycles using URL tracking. ## Limitations * The Search API charges per request. Frequent polling across many topics will increase costs. * The `search_recency_filter` is relative to the current time and cannot specify exact date ranges. For precise date filtering, use `search_after_date_filter` and `search_before_date_filter` instead. * Search result availability depends on web indexing. Very recent content (within minutes) may not appear immediately. * The `snippet` field length varies by result and may be truncated for long pages. # SEC Filing Search Source: https://docs.perplexity.ai/docs/cookbook/examples/sec-filing-search/README Use search domain filtering to query SEC.gov and EDGAR for financial filings, extract key financial data, and build structured financial summaries # SEC Filing Search Search SEC.gov and EDGAR for financial filings using domain filtering, extract key financial metrics, and produce structured summaries of public company filings. This example demonstrates practical financial data extraction using the Agent API with targeted domain filters. ## Features * Search SEC.gov and EDGAR exclusively using `search_domain_filter` * Extract key metrics from 10-K, 10-Q, and 8-K filings * Structured JSON output for financial data * Compare filings across companies or time periods * Combine SEC data with broader market context via web search This example uses the Agent API's `web_search` tool with domain filtering to target SEC.gov specifically. The search is grounded in actual SEC filings rather than secondary reporting. ## Installation ```bash theme={null} pip install perplexityai ``` ```bash theme={null} export PERPLEXITY_API_KEY="your_api_key_here" ``` ## Usage Save the full code below to `sec_search.py` and run: ```bash theme={null} python sec_search.py "Apple 10-K 2025 revenue and operating income" ``` Compare companies: ```bash theme={null} python sec_search.py --compare AAPL MSFT GOOGL --metric revenue ``` ## Full Code ```python Python theme={null} import sys import json import argparse from perplexity import Perplexity client = Perplexity() SEC_DOMAINS = [ "sec.gov", "edgar.sec.gov", "efts.sec.gov", ] def search_sec_filings(query: str) -> dict: """Search SEC filings with domain-filtered web search.""" response = client.responses.create( model="openai/gpt-5.4", input=query, tools=[{ "type": "web_search", "filters": { "search_domain_filter": SEC_DOMAINS, }, }], instructions=( "You are a financial analyst assistant. Search SEC filings for the requested " "information. Cite specific filing types (10-K, 10-Q, 8-K) and dates. " "Report exact numbers from the filings, not estimates." ), max_output_tokens=2048, ) return { "query": query, "answer": response.output_text, "model": response.model, "cost": response.usage.cost.total_cost, } def extract_financial_metrics(company: str, filing_type: str = "10-K") -> dict: """Extract structured financial metrics from SEC filings.""" response = client.responses.create( model="openai/gpt-5.4", input=f"Find the most recent {filing_type} filing for {company} on SEC EDGAR and extract key financial metrics.", tools=[{ "type": "web_search", "filters": { "search_domain_filter": SEC_DOMAINS, }, }], instructions=( f"Search SEC EDGAR for {company}'s most recent {filing_type} filing. " "Extract the exact financial figures reported. Use numbers directly from the filing." ), response_format={ "type": "json_schema", "json_schema": { "name": "sec_financials", "schema": { "type": "object", "properties": { "company": {"type": "string"}, "ticker": {"type": "string"}, "filing_type": {"type": "string"}, "filing_period": {"type": "string"}, "filing_date": {"type": "string"}, "total_revenue": {"type": "string"}, "net_income": {"type": "string"}, "total_assets": {"type": "string"}, "total_debt": {"type": "string"}, "operating_income": {"type": "string"}, "eps_diluted": {"type": "string"}, "cash_and_equivalents": {"type": "string"}, }, "required": [ "company", "ticker", "filing_type", "filing_period", "filing_date", "total_revenue", "net_income", "total_assets", "total_debt", "operating_income", "eps_diluted", "cash_and_equivalents", ], "additionalProperties": False, }, }, }, ) return json.loads(response.output_text) def compare_companies(tickers: list[str], metric: str = "revenue") -> list[dict]: """Compare a specific financial metric across multiple companies.""" results = [] for ticker in tickers: print(f" Searching {ticker}...") try: data = extract_financial_metrics(ticker) results.append(data) except Exception as e: results.append({"ticker": ticker, "error": str(e)}) return results def search_filing_changes(company: str) -> str: """Search for material changes or risk factors in recent filings.""" response = client.responses.create( model="openai/gpt-5.4", input=( f"What are the key risk factors and material changes disclosed in {company}'s " f"most recent 10-K or 10-Q filing on SEC EDGAR?" ), tools=[{ "type": "web_search", "filters": { "search_domain_filter": SEC_DOMAINS, }, }], instructions=( "Focus on risk factors (Item 1A) and material changes. " "Cite specific sections and filing dates." ), max_output_tokens=2048, ) return response.output_text def main(): parser = argparse.ArgumentParser(description="SEC Filing Search") parser.add_argument("query", nargs="?", help="Search query for SEC filings") parser.add_argument("--extract", help="Extract financial metrics for a company ticker") parser.add_argument("--compare", nargs="+", help="Compare companies by ticker") parser.add_argument("--metric", default="revenue", help="Metric to compare") parser.add_argument("--risks", help="Search risk factors for a company") parser.add_argument("--json", action="store_true", help="Output as JSON") args = parser.parse_args() if args.compare: print(f"Comparing {len(args.compare)} companies...\n") results = compare_companies(args.compare, args.metric) if args.json: print(json.dumps(results, indent=2)) else: for r in results: if "error" in r: print(f" {r['ticker']}: ERROR - {r['error']}") else: print(f" {r['company']} ({r['ticker']}) — {r['filing_type']} ({r['filing_period']})") print(f" Revenue: {r['total_revenue']}") print(f" Net Income: {r['net_income']}") print(f" Operating Income: {r['operating_income']}") print(f" EPS: {r['eps_diluted']}") print() elif args.extract: print(f"Extracting financials for {args.extract}...\n") data = extract_financial_metrics(args.extract) if args.json: print(json.dumps(data, indent=2)) else: print(f"{data['company']} ({data['ticker']})") print(f"Filing: {data['filing_type']} for {data['filing_period']} (filed {data['filing_date']})") print(f" Revenue: {data['total_revenue']}") print(f" Net Income: {data['net_income']}") print(f" Operating Income: {data['operating_income']}") print(f" Total Assets: {data['total_assets']}") print(f" Total Debt: {data['total_debt']}") print(f" Cash: {data['cash_and_equivalents']}") print(f" EPS (diluted): {data['eps_diluted']}") elif args.risks: print(f"Searching risk factors for {args.risks}...\n") print(search_filing_changes(args.risks)) elif args.query: result = search_sec_filings(args.query) if args.json: print(json.dumps(result, indent=2)) else: print(result["answer"]) else: print("Error: Provide a query, --extract TICKER, --compare TICKERS, or --risks COMPANY", file=sys.stderr) sys.exit(1) if __name__ == "__main__": main() ``` ## Example Output ### Basic Filing Search ```bash theme={null} python sec_search.py "Apple 10-K 2025 revenue breakdown by segment" ``` ``` According to Apple's FY2025 10-K filing (filed October 2025), total net revenue was $394.3 billion. Revenue by segment: - iPhone: $200.6B (50.8%) - Services: $96.2B (24.4%) - Mac: $29.4B (7.5%) - iPad: $28.3B (7.2%) - Wearables, Home and Accessories: $39.8B (10.1%) Services revenue grew 14% year-over-year, continuing to be the fastest- growing segment. (Source: Apple Inc. 10-K, SEC EDGAR) ``` ### Structured Extraction ```bash theme={null} python sec_search.py --extract AAPL --json ``` ```json theme={null} { "company": "Apple Inc.", "ticker": "AAPL", "filing_type": "10-K", "filing_period": "FY2025 (ending September 2025)", "filing_date": "2025-10-31", "total_revenue": "$394.3 billion", "net_income": "$101.2 billion", "total_assets": "$352.6 billion", "total_debt": "$98.3 billion", "operating_income": "$123.4 billion", "eps_diluted": "$6.72", "cash_and_equivalents": "$29.9 billion" } ``` ### Company Comparison ```bash theme={null} python sec_search.py --compare AAPL MSFT GOOGL ``` ``` Comparing 3 companies... Apple Inc. (AAPL) — 10-K (FY2025) Revenue: $394.3 billion Net Income: $101.2 billion Operating Income: $123.4 billion EPS: $6.72 Microsoft Corporation (MSFT) — 10-K (FY2025) Revenue: $254.2 billion Net Income: $89.4 billion Operating Income: $115.6 billion EPS: $12.01 Alphabet Inc. (GOOGL) — 10-K (FY2025) Revenue: $348.2 billion Net Income: $86.7 billion Operating Income: $108.3 billion EPS: $7.02 ``` SEC EDGAR contains the official, audited financial data for all US public companies. By restricting search to `sec.gov` and `edgar.sec.gov`, you ensure your financial data comes from primary source filings rather than secondary reporting. Financial data extracted by the model should be verified against the original filing before use in official reports or investment decisions. The model may occasionally misparse tables or footnotes. ## Limitations * The search is limited to what SEC EDGAR makes publicly available and indexable. * Very recent filings may not yet be indexed by the search engine. * Complex financial tables (multi-year comparisons, segment breakdowns with footnotes) may be summarized rather than fully extracted. * The model provides data as-is from filings. It does not adjust for accounting method changes between periods. # TypeScript Agent CLI Source: https://docs.perplexity.ai/docs/cookbook/examples/typescript-agent-cli/README An interactive TypeScript CLI with streaming responses, model selection, and web search using the Perplexity Agent API # TypeScript Agent CLI A TypeScript-first interactive command-line interface that connects to the Perplexity Agent API with streaming responses, runtime model selection, and integrated web search. ## Features * Interactive REPL with streaming token output * Runtime model selection from a curated list * Web search integration via the `web_search` tool * TypeScript-specific patterns: type narrowing, const assertions, typed error classes * Conversation history for multi-turn interactions * Graceful error handling and clean shutdown ## Installation ```bash theme={null} mkdir typescript-agent-cli && cd typescript-agent-cli npm init -y npm install @perplexity-ai/perplexity_ai npm install -D typescript @types/node tsx ``` ```bash theme={null} export PERPLEXITY_API_KEY="your_api_key_here" ``` ## Usage ```bash theme={null} npx tsx src/cli.ts ``` The CLI prompts you to select a model, then enters an interactive loop: ``` Available models: 1. OpenAI GPT-5.1 (openai/gpt-5.4) 2. Google Gemini 3 Flash (google/gemini-3.1-flash-lite) Select a model (1-2): 1 > What is the current state of quantum computing? ``` Commands: `/search` (enable web search), `/nosearch` (disable), `/clear` (reset history), `/quit` (exit). ## Full Code Save as `src/cli.ts`: ```typescript theme={null} import Perplexity from "@perplexity-ai/perplexity_ai"; import * as readline from "readline"; // --- Configuration --- const AVAILABLE_MODELS = [ { name: "openai/gpt-5.4", label: "OpenAI GPT-5.1" }, { name: "google/gemini-3.1-flash-lite", label: "Google Gemini 3 Flash" }, ] as const; type ModelName = (typeof AVAILABLE_MODELS)[number]["name"]; interface Message { role: "user" | "assistant"; content: string; } // --- Helpers --- function createRL(): readline.Interface { return readline.createInterface({ input: process.stdin, output: process.stdout }); } function ask(rl: readline.Interface, prompt: string): Promise { return new Promise((resolve) => rl.question(prompt, (a) => resolve(a.trim()))); } // --- Model selection --- async function selectModel(rl: readline.Interface): Promise { console.log("\nAvailable models:"); AVAILABLE_MODELS.forEach((m, i) => console.log(` ${i + 1}. ${m.label} (${m.name})`) ); while (true) { const idx = parseInt(await ask(rl, `\nSelect a model (1-${AVAILABLE_MODELS.length}): `), 10) - 1; if (idx >= 0 && idx < AVAILABLE_MODELS.length) { console.log(`Using model: ${AVAILABLE_MODELS[idx].name}\n`); return AVAILABLE_MODELS[idx].name; } console.log("Invalid selection."); } } // --- Streaming query --- async function streamQuery( client: Perplexity, model: ModelName, history: Message[], userMessage: string, useWebSearch: boolean ): Promise { const input = [ ...history.map((m) => ({ role: m.role as "user" | "assistant", content: m.content })), { role: "user" as const, content: userMessage }, ]; const tools: Array<{ type: "web_search" }> = useWebSearch ? [{ type: "web_search" as const }] : []; const stream = await client.responses.create({ model, input, tools, stream: true, instructions: "You are a helpful assistant. Use web search when available for current events. Be concise.", max_output_tokens: 2048, }); let fullResponse = ""; for await (const chunk of stream) { if (chunk.type === "response.output_text.delta") { const delta = (chunk as any).delta as string; process.stdout.write(delta); fullResponse += delta; } if (chunk.type === "response.output_item.added") { const item = (chunk as any).item; if (item?.type === "search_results") { process.stdout.write("\n[Searching the web...]\n"); } } if (chunk.type === "response.completed") { const usage = (chunk as any).response?.usage; if (usage) { process.stdout.write(`\n\n[Tokens: ${usage.input_tokens} in / ${usage.output_tokens} out]`); } } } process.stdout.write("\n"); return fullResponse; } // --- Command handling --- function handleCommand(cmd: string, state: { webSearch: boolean; history: Message[] }): boolean { switch (cmd.toLowerCase()) { case "/quit": case "/exit": console.log("Goodbye."); return true; case "/search": state.webSearch = true; console.log("Web search enabled."); return false; case "/nosearch": state.webSearch = false; console.log("Web search disabled."); return false; case "/clear": state.history = []; console.log("History cleared."); return false; case "/help": console.log("\n /search /nosearch /clear /quit /help\n"); return false; default: console.log(`Unknown command: ${cmd}. Type /help.`); return false; } } // --- Main --- async function main(): Promise { const client = new Perplexity(); const rl = createRL(); const model = await selectModel(rl); const state = { webSearch: true, history: [] as Message[] }; console.log("Type a message to chat, or /help for commands. Web search is ON.\n"); process.on("SIGINT", () => { console.log("\nGoodbye."); rl.close(); process.exit(0); }); while (true) { const input = await ask(rl, "> "); if (!input) continue; if (input.startsWith("/")) { if (handleCommand(input, state)) break; continue; } try { const response = await streamQuery(client, model, state.history, input, state.webSearch); state.history.push({ role: "user", content: input }); state.history.push({ role: "assistant", content: response }); if (state.history.length > 20) state.history = state.history.slice(-20); } catch (error: unknown) { if (error instanceof Perplexity.APIConnectionError) { console.error("\nConnection error. Check your network."); } else if (error instanceof Perplexity.RateLimitError) { console.error("\nRate limit exceeded. Wait and retry."); } else if (error instanceof Perplexity.APIStatusError) { console.error(`\nAPI error: ${(error as any).message}`); } else { console.error("\nUnexpected error:", error); } } console.log(); } rl.close(); } main(); ``` ## Example Session ``` Available models: 1. OpenAI GPT-5.1 (openai/gpt-5.4) 2. Google Gemini 3 Flash (google/gemini-3.1-flash-lite) Select a model (1-2): 1 Using model: openai/gpt-5.4 Type a message to chat, or /help for commands. Web search is ON. > What were the major AI announcements this week? [Searching the web...] This week saw several notable AI developments: 1. Anthropic released Claude 4 with improved reasoning... 2. Google DeepMind published new protein folding results... 3. OpenAI announced enterprise partnerships for GPT-5... [Tokens: 1420 in / 287 out] > /nosearch Web search disabled. > Explain transformers in simple terms A transformer is a neural network architecture that processes all parts of an input simultaneously rather than sequentially... [Tokens: 2580 in / 195 out] > /quit Goodbye. ``` ## Key TypeScript Patterns ### Const Assertions for Tool Types Use `as const` to narrow tool type literals: ```typescript theme={null} const tools = [{ type: "web_search" as const }]; ``` ### Streaming Event Type Narrowing Check `chunk.type` before accessing event-specific fields: ```typescript theme={null} for await (const chunk of stream) { if (chunk.type === "response.output_text.delta") { process.stdout.write((chunk as any).delta); } } ``` ### Typed Error Handling ```typescript theme={null} try { // API call } catch (error) { if (error instanceof Perplexity.APIConnectionError) { // Handle network issues } else if (error instanceof Perplexity.RateLimitError) { // Handle rate limits } } ``` Conversation history is preserved across turns, so the model can reference earlier messages. Use `/clear` to start a fresh conversation without restarting the CLI. ## Limitations * The CLI uses Node.js `readline`, which does not support arrow-key history navigation. For a richer experience, consider `inquirer` or `prompts`. * Conversation history is in-memory only and lost when the process exits. * Streaming events may vary by model provider. The `response.output_text.delta` event is consistent across all models. # Perplexity API Cookbook Source: https://docs.perplexity.ai/docs/cookbook/index Practical guides, runnable examples, and integration patterns for building with every Perplexity API A collection of practical guides, runnable examples, and integration patterns for building with [**Perplexity's API Platform**](https://docs.perplexity.ai/) — covering the Agent API, Search API, Embeddings API, and Sonar API. ## How to Use This Cookbook Practical deep-dives on patterns that go beyond the docs — structured outputs, function calling, RAG pipelines, and cross-cutting best practices. Runnable projects covering every API. From research assistants and news monitors to document Q\&A and image analysis — with Python and TypeScript. Connect Perplexity with external frameworks like the OpenAI Agents SDK, LangChain memory systems, and persistent storage. Real-world applications built by the community — see what others are building with the API Platform. ## Quick Start To use the Perplexity API Platform, you'll need an API key. If you don't have one yet: Navigate to the **API Keys** tab in the API Portal and generate a new key. The Perplexity API SDK is available in Python and TypeScript. Install the package for your preferred language: ```bash Python theme={null} pip install perplexityai ``` ```bash TypeScript theme={null} npm install @perplexity-ai/perplexity_ai ``` The Perplexity API Platform supports a wide range of use cases across its different APIs. Here are some recommended starting points based on your goals: * New to the platform? Refer to our [Quick Start Guide](/docs/getting-started/quickstart) * Want to build something? Take a look at our [Examples](/docs/cookbook/examples/README) ## What's Covered | API | Guides | Examples | | ------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | **Agent API** | [Multi-Provider Orchestration](/docs/cookbook/articles/multi-provider-orchestration/README), [Function Calling End-to-End](/docs/cookbook/articles/function-calling-e2e/README), [OpenAI Agents Integration](/docs/cookbook/articles/openai-agents-integration/README) | [Research Assistant](/docs/cookbook/examples/agent-research-assistant/README), [Model Comparison](/docs/cookbook/examples/model-comparison/README), [TypeScript CLI](/docs/cookbook/examples/typescript-agent-cli/README), [Image Analysis](/docs/cookbook/examples/image-analysis/README), [File Attachment Q\&A](/docs/cookbook/examples/file-attachment-qa/README) | | **Search API** | [Search Domain Filtering](/docs/cookbook/articles/search-domain-filtering/README) | [News Monitor](/docs/cookbook/examples/search-news-monitor/README), [SEC Filing Search](/docs/cookbook/examples/sec-filing-search/README) | | **Embeddings API** | [RAG Pipeline](/docs/cookbook/articles/embeddings-rag/README) | [Document Q\&A](/docs/cookbook/examples/document-qa/README) | | **Sonar API** | [Streaming Citations](/docs/cookbook/articles/streaming-citations/README), [Academic Search](/docs/cookbook/articles/academic-search/README), [Async Deep Research](/docs/cookbook/articles/async-deep-research/README) | — | # 4Point Hoops | AI Basketball Analytics Platform Source: https://docs.perplexity.ai/docs/cookbook/showcase/4point-Hoops Advanced NBA analytics platform that combines live Basketball-Reference data with Perplexity Sonar to deliver deep-dive player stats, cross-season comparisons and expert-grade AI explanations ![4Point Hoops Dashboard](https://d112y698adiu2z.cloudfront.net/photos/production/software_photos/003/442/047/datas/original.png) **4Point Hoops** is an advanced NBA analytics platform that turns raw basketball statistics into actionable, narrative-driven insights. By scraping Basketball-Reference in real time and routing context-rich prompts to Perplexity's Sonar Pro model, it helps fans, analysts, and fantasy players understand the "why" and "what's next" – not just the numbers.