# Chat Completions

Source: https://docs.perplexity.ai/api-reference/chat-completions

post /chat/completions

Generates a model's response for the given chat conversation.

# Changelog

Source: https://docs.perplexity.ai/changelog/changelog

Please note that as of February 22, 2025, several models and model name aliases will no longer be accessible. The following model names will no longer be available via API:

* `llama-3.1-sonar-small-128k-online`
* `llama-3.1-sonar-large-128k-online`
* `llama-3.1-sonar-huge-128k-online`

We recommend updating your applications to use our recently released Sonar or Sonar Pro models – you can learn more about them here. Thank you for being a Perplexity API user.

We are expanding our API offerings with the most efficient and cost-effective search solutions available: **Sonar** and **Sonar Pro**.

* **Sonar** gives you fast, straightforward answers.
* **Sonar Pro** tackles complex questions that need deeper research and provides more sources.

Both models offer built-in citations, automated scaling of rate limits, and public access to advanced features like structured outputs and search domain filters. And don't worry, we never train on your data. Your information stays yours.

You can learn more about our new APIs here: [http://sonar.perplexity.ai/](http://sonar.perplexity.ai/)

We are excited to announce the public availability of citations in the Perplexity API. In addition, we have also increased our default rate limit for the sonar online models to 50 requests/min for all users. Effective immediately, all API users will see citations returned as part of their requests by default. This is not a breaking change. The **return\_citations** parameter will no longer have any effect.
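As a minimal sketch of what default citations look like to client code, the snippet below extracts the answer text and the `citations` list from a sample chat-completions response body. The payload shown is illustrative, not a live API response; the `choices`/`citations` field names reflect the response shape this announcement describes.

```python
# Illustrative response payload; a real one would come from
# POST https://api.perplexity.ai/chat/completions
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "The Milky Way is estimated to contain between 100 and 400 billion stars [1].",
            }
        }
    ],
    # Citations are returned by default; no return_citations flag is needed.
    "citations": [
        "https://en.wikipedia.org/wiki/Milky_Way",
    ],
}

answer = sample_response["choices"][0]["message"]["content"]
sources = sample_response.get("citations", [])

print(answer)
for i, url in enumerate(sources, start=1):
    print(f"[{i}] {url}")
```

The bracketed markers in the answer text (e.g. `[1]`) index into the `citations` array.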
If you have any questions or need assistance, feel free to reach out to our team at [api@perplexity.ai](mailto:api@perplexity.ai).

We are excited to announce the launch of our latest Perplexity Sonar models:

**Online Models**

* `llama-3.1-sonar-small-128k-online`
* `llama-3.1-sonar-large-128k-online`

**Chat Models**

* `llama-3.1-sonar-small-128k-chat`
* `llama-3.1-sonar-large-128k-chat`

These new additions surpass the performance of the previous iteration. For detailed information on our supported models, please visit our model card documentation.

**\[Action Required] Model Deprecation Notice**

Please note that several models will no longer be accessible effective 8/12/2024. We recommend updating your applications to use models in the Llama-3.1 family immediately. The following model names will no longer be available via API:

* `llama-3-sonar-small-32k-online`
* `llama-3-sonar-large-32k-online`
* `llama-3-sonar-small-32k-chat`
* `llama-3-sonar-large-32k-chat`
* `llama-3-8b-instruct`
* `llama-3-70b-instruct`
* `mistral-7b-instruct`
* `mixtral-8x7b-instruct`

We recommend switching to models in the Llama-3.1 family:

**Online Models**

* `llama-3.1-sonar-small-128k-online`
* `llama-3.1-sonar-large-128k-online`

**Chat Models**

* `llama-3.1-sonar-small-128k-chat`
* `llama-3.1-sonar-large-128k-chat`

**Instruct Models**

* `llama-3.1-70b-instruct`
* `llama-3.1-8b-instruct`

If you have any questions, please email [support@perplexity.ai](mailto:support@perplexity.ai). Thank you for being a Perplexity API user.

Stay curious,

Team Perplexity

Please note that as of May 14, several models and model name aliases will no longer be accessible. We recommend updating your applications to use models in the Llama-3 family immediately.
The following model names will no longer be available via API:

* `codellama-70b-instruct`
* `mistral-7b-instruct`
* `mixtral-8x22b-instruct`
* `pplx-7b-chat`
* `pplx-7b-online`
* `sonar-small-chat`
* `sonar-small-online`
* `pplx-70b-chat`
* `pplx-70b-online`
* `pplx-8x7b-chat`
* `pplx-8x7b-online`
* `sonar-medium-chat`
* `sonar-medium-online`

In lieu of the above, we recommend switching to models from the Llama 3 family:

* `llama-3-sonar-small-32k-chat`
* `llama-3-sonar-small-32k-online`
* `llama-3-sonar-large-32k-chat`
* `llama-3-sonar-large-32k-online`
* `llama-3-8b-instruct`
* `llama-3-70b-instruct`

Effective immediately, input and output tokens are now charged at the same price. Previously, output tokens were more expensive than input tokens. Prices have generally gone down as a result.

**Announcing Our Newest Models**

We are excited to announce the launch of our latest Perplexity models: `sonar-small-chat` and `sonar-medium-chat`, along with their search-enhanced versions, `sonar-small-online` and `sonar-medium-online`. These new additions surpass our earlier models in cost-efficiency, speed, and performance. For detailed information on our supported models, please visit our model card documentation.

**Expanded Context Windows**

The context window length for several models has been doubled from 8k to 16k, including `mixtral-8x7b-instruct` and all Perplexity models. 4k tokens are reserved for search results in online models.

**Model Deprecation Notice**

Please note that as of March 15, the `pplx-70b-chat`, `pplx-70b-online`, `llama-2-70b-chat`, and `codellama-34b-instruct` models will no longer be available through the Perplexity API. We will gradually phase out less frequently used models in favor of our newer and more performant offerings.

**Revised Pricing Structure for 8x7b Models**

The pricing for the `mixtral-8x7b-instruct` model will be adjusted. Previously charged at \$0.14 / \$0.58 per million input and output tokens, the rates will change to \$0.60 / \$1.80 per million input and output tokens moving forward.
**Increased Public Rate Limits**

Public rate limits for all models have increased by \~2x. Find the current rate limits here.

**Access to Citations and Elevated Rate Limits**

Responding to popular demand in our API discussion forum, we are introducing URL citation access for our Online LLMs to approved users. For access to citations, or to request a rate limit increase, please complete this form.

**Terms of Service and Data Processing Addendum**

We wish to reiterate our commitment to data privacy for commercial application developers using the Perplexity API. The updated Terms of Service and Data Processing Addendum can be found here. Thank you for being a Perplexity API user.

We're excited to announce that pplx-api is now serving the latest open-source mixture-of-experts model, `mixtral-8x7b-instruct`, at the blazingly fast speed of inference you are accustomed to.

We're excited to share two new PPLX models: `pplx-7b-online` and `pplx-70b-online`. These first-of-a-kind models are integrated with our in-house search technology for factual grounding. Read our blog post for more information: [https://blog.perplexity.ai/blog/introducing-pplx-online-llms](https://blog.perplexity.ai/blog/introducing-pplx-online-llms)

We're also announcing general availability for our API. We've rolled out usage-based pricing, which enables us to gradually relax the rate limits on our models. Follow the updated steps for getting started.

We have removed support for `replit-code-v1.5-3b` and `openhermes-2-mistral-7b`. There are no immediate plans to add these models back. If you were a user who enjoyed `openhermes-2-mistral-7b`, try our in-house models `pplx-7b-chat` and `pplx-70b-chat` instead!

The Perplexity AI API is currently in beta release v0. Clients are not protected from backwards-incompatible changes and cannot specify their desired API version. Examples of backwards-incompatible changes include:

* Removing support for a given model.
* Renaming a response field.
* Removing a response field.
* Adding a required request parameter.

Backwards-incompatible changes will be documented here.

Generally, the API is currently designed to be compatible with OpenAI client libraries. This means that, given the same request body, swapping the API base URL and adding your Perplexity API key will yield a response that can be parsed in the same way as the response OpenAI would yield, except for certain explicitly unsupported body parameters documented in the reference.

# Forum

Source: https://docs.perplexity.ai/discussions/discussions

We host our discussions on GitHub. You can access our GitHub discussion forum [here](https://github.com/ppl-ai/api-discussion/discussions).

# Frequently Asked Questions

Source: https://docs.perplexity.ai/faq/faq

1. The API uses the same search system as the UI, with differences in configuration, so their outputs may differ.
2. The underlying AI model might differ between the API and the UI for a given query.
3. We give users the power to tune the API to their respective use cases using sampling parameters like `presence_penalty`, `top_p`, etc., and custom tuning to specific use cases might lead to less generalization of the API and different results vs. the UI. We set optimized defaults and recommend not explicitly providing sampling parameters in your API requests.

We collect the following types of information:

**API Usage Data:** We collect billable usage metadata such as the number of requests and tokens. You can view your own usage in the [Perplexity API dashboard](https://perplexity.ai/settings/api).

**User Account Information:** When you create an account with us, we collect your name, email address, and other relevant contact information.

We do not retain any query data sent through the API and do not train on any of your data.

Yes, the [Sonar Models](https://docs.perplexity.ai/guides/model-cards) leverage information from Perplexity's search index and the public internet.
You can find our [rate limits here](https://docs.perplexity.ai/guides/usage-tiers).

We email users about new developments and also post in the [changelog](/changelog.mdx).

401 error codes indicate that the provided API key is invalid, deleted, or belongs to an account which has run out of credits. You likely need to purchase more credits in the [Perplexity API dashboard](https://perplexity.ai/settings/api). You can avoid this issue by configuring auto-top-up.

Currently, we do not support fine-tuning.

Please reach out to [api@perplexity.ai](mailto:api@perplexity.ai) or [support@perplexity.ai](mailto:support@perplexity.ai) for other API inquiries. You can also post on our [discussion forum](https://github.com/ppl-ai/api-discussion/discussions) and we will get back to you.

We do not guarantee this at the moment. The models are hosted in the US and we do not train on any of your data. And no, your data is not going to China.

Yes, our reasoning APIs that use DeepSeek's models are uncensored and on par with the other APIs in terms of content moderation.

Yes, you will have access to the model's CoT.

# Perplexity Crawlers

Source: https://docs.perplexity.ai/guides/bots

We strive to improve our service every day by delivering the best search experience possible. To achieve this, we collect data using web crawlers ("robots") and user agents that gather and index information from the internet, operating either automatically or in response to user requests.

Webmasters can use the following robots.txt tags to manage how their sites and content interact with Perplexity. Each setting works independently, and it may take up to 24 hours for our systems to reflect changes.
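As a hypothetical illustration of such robots.txt tags (the paths are invented; only the user-agent tokens are Perplexity's), a site that wants to stay in search results while keeping one directory out of both crawling and user-triggered fetches might use:

```txt
# Allow Perplexity's search indexer everywhere except a private area
User-agent: PerplexityBot
Allow: /
Disallow: /private/

# Apply the same carve-out to user-triggered fetches
User-agent: Perplexity-User
Disallow: /private/
```

Each user-agent group is evaluated on its own, matching the note above that each setting works independently.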
| User Agent | Description |
| :-------------- | :---------- |
| `PerplexityBot` | `PerplexityBot` is designed to surface and link websites in search results on Perplexity. It is not used to crawl content for AI foundation models. To ensure your site appears in search results, we recommend allowing `PerplexityBot` in your site's `robots.txt` file and permitting requests from our published IP ranges listed below.<br /><br />Full user-agent string: `Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)`<br /><br />Published IP addresses: [https://www.perplexity.com/perplexitybot.json](https://www.perplexity.com/perplexitybot.json) |
| `Perplexity-User` | `Perplexity-User` supports user actions within Perplexity. When users ask Perplexity a question, it might visit a web page to help provide an accurate answer and include a link to the page in its response. `Perplexity-User` controls which sites these user requests can access. It is not used for web crawling or to collect content for training AI foundation models.<br /><br />Full user-agent string: `Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Perplexity-User/1.0; +https://perplexity.ai/perplexity-user)`<br /><br />Published IP addresses: [https://www.perplexity.com/perplexity-user.json](https://www.perplexity.com/perplexity-user.json)<br /><br />Since a user requested the fetch, this fetcher generally ignores robots.txt rules. |

# Initial Setup

Source: https://docs.perplexity.ai/guides/getting-started

Register and make a successful API request.

## Registration

* Start by visiting the [API Settings page](https://www.perplexity.ai/pplx-api)
* Register your credit card to get started

This step will not charge your credit card. It just stores payment information for later API usage.

## Generate an API key

* Every API call needs a valid API key

The API key is a long-lived access token that can be used until it is manually refreshed or deleted. Send the API key as a bearer token in the `Authorization` header with each API request.

When you run out of credits, your API keys will be blocked until you add to your credit balance. You can avoid this by configuring "Automatic Top Up", which refreshes your balance whenever you drop below \$2.

## Make your API call

* The API is conveniently OpenAI client-compatible for easy integration with existing applications.

```cURL cURL
curl --location 'https://api.perplexity.ai/chat/completions' \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--header 'Authorization: Bearer {API_KEY}' \
--data '{
  "model": "sonar-pro",
  "messages": [
    {
      "role": "system",
      "content": "Be precise and concise."
    },
    {
      "role": "user",
      "content": "How many stars are there in our galaxy?"
    }
  ]
}'
```

```python python
from openai import OpenAI

YOUR_API_KEY = "INSERT API KEY HERE"

messages = [
    {
        "role": "system",
        "content": (
            "You are an artificial intelligence assistant and you need to "
            "engage in a helpful, detailed, polite conversation with a user."
        ),
    },
    {
        "role": "user",
        "content": (
            "How many stars are in the universe?"
        ),
    },
]

client = OpenAI(api_key=YOUR_API_KEY, base_url="https://api.perplexity.ai")

# chat completion without streaming
response = client.chat.completions.create(
    model="sonar-pro",
    messages=messages,
)
print(response)

# chat completion with streaming
response_stream = client.chat.completions.create(
    model="sonar-pro",
    messages=messages,
    stream=True,
)
for response in response_stream:
    print(response)
```

# Supported Models

Source: https://docs.perplexity.ai/guides/model-cards

| Model                 | Context Length | Model Type      |
| --------------------- | -------------- | --------------- |
| `sonar-reasoning-pro` | 127k           | Chat Completion |
| `sonar-reasoning`     | 127k           | Chat Completion |
| `sonar-pro`           | 200k           | Chat Completion |
| `sonar`               | 127k           | Chat Completion |

1. `sonar-reasoning-pro` and `sonar-pro` have a max output token limit of 8k
2. The reasoning models output CoTs in their responses as well

## Legacy Models

These models will be deprecated and will no longer be available to use after 2/22/2025.

| Model                               | Context Length | Model Type      |
| ----------------------------------- | -------------- | --------------- |
| `llama-3.1-sonar-small-128k-online` | 127k           | Chat Completion |
| `llama-3.1-sonar-large-128k-online` | 127k           | Chat Completion |
| `llama-3.1-sonar-huge-128k-online`  | 127k           | Chat Completion |

# Pricing

Source: https://docs.perplexity.ai/guides/pricing

## Sonar Reasoning

| Model                 | Input Tokens (Per Million Tokens) | Output Tokens (Per Million Tokens) | Price per 1000 searches |
| --------------------- | --------------------------------- | ---------------------------------- | ----------------------- |
| `sonar-reasoning-pro` | \$2                               | \$8                                | \$5                     |
| `sonar-reasoning`     | \$1                               | \$5                                | \$5                     |
| `sonar-pro`           | \$3                               | \$15                               | \$5                     |
| `sonar`               | \$1                               | \$1                                | \$5                     |

**Detailed Pricing Breakdown for Sonar Reasoning Pro and Sonar Pro**

**Input Tokens**

1. Input tokens are priced at \$3/1M tokens
2.
   Input tokens comprise Prompt tokens (the user prompt) + Citation tokens (processed tokens from running searches)

**Search Queries**

1. To give detailed answers, both Pro APIs also run multiple searches on top of the user prompt where necessary for more exhaustive information retrieval
2. Searches are priced at \$5/1000 searches
3. A request that does 3 searches will cost \$0.015 in this step

**Output Tokens**

1. Output tokens (completion tokens) are priced at \$15/1M tokens

**Total Price**

Your total price per request is the sum of the above 3 components.

**Sonar Reasoning and Sonar**

Citation tokens are not counted as input tokens, and each request performs 1 search for Sonar Reasoning and Sonar.

## Legacy Models

These models will be deprecated and will no longer be available to use after 2/22/2025.

| Model                               | Input Tokens (Per Million Tokens) | Output Tokens (Per Million Tokens) | Price per 1000 requests |
| ----------------------------------- | --------------------------------- | ---------------------------------- | ----------------------- |
| `llama-3.1-sonar-small-128k-online` | \$0.2                             | \$0.2                              | \$5                     |
| `llama-3.1-sonar-large-128k-online` | \$1                               | \$1                                | \$5                     |
| `llama-3.1-sonar-huge-128k-online`  | \$5                               | \$5                                | \$5                     |

# Prompt Guide

Source: https://docs.perplexity.ai/guides/prompt-guide

## System Prompt

You can use the system prompt to provide instructions related to the style, tone, and language of the response. The real-time search component of our models does not attend to the system prompt.

**Example of a system prompt**

```
You are a helpful AI assistant.

Rules:
1. Provide only the final answer. It is important that you do not include any explanation on the steps below.
2. Do not show the intermediate steps information.

Steps:
1. Decide if the answer should be a brief sentence or a list of suggestions.
2. If it is a list of suggestions, first, write a brief and natural introduction based on the original query.
3.
Followed by a list of suggestions, each suggestion should be split by two newlines.
```

## User Prompt

You should use the user prompt to pass in the actual query for which you need an answer. The user prompt will be used to kick off a real-time web search to make sure the answer has the latest and the most relevant information needed.

**Example of a user prompt**

```
What are the best sushi restaurants in the world currently?
```

# Structured Outputs Guide

Source: https://docs.perplexity.ai/guides/structured-outputs

Structured outputs is currently a beta feature and is only available to users in Tier-3.

## Overview

We currently support two types of structured outputs: **JSON Schema** and **Regex**. LLM responses will match the specified format, except in the following case:

* The output exceeds `max_tokens`

Enable structured outputs by adding a `response_format` field to the request:

**JSON Schema**

* `response_format: { type: "json_schema", json_schema: {"schema": object} }`
* The schema should be a valid JSON schema object.

**Regex** (only available for `sonar` right now)

* `response_format: { type: "regex", regex: {"regex": str} }`
* The regex is a regular expression string.

We recommend giving the LLM some hints about the output format in the prompts.

The first request with a new JSON Schema or Regex is expected to incur a delay on the first token. Typically, it takes 10 to 30 seconds to prepare a new schema. Once the schema has been prepared, subsequent requests will not see such a delay.

## Examples

### 1. Get a response in JSON format

**Request**

```python python
import requests
from pydantic import BaseModel

class AnswerFormat(BaseModel):
    first_name: str
    last_name: str
    year_of_birth: int
    num_seasons_in_nba: int

url = "https://api.perplexity.ai/chat/completions"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
payload = {
    "model": "sonar",
    "messages": [
        {"role": "system", "content": "Be precise and concise."},
        {"role": "user", "content": (
            "Tell me about Michael Jordan. "
            "Please output a JSON object containing the following fields: "
            "first_name, last_name, year_of_birth, num_seasons_in_nba. "
        )},
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"schema": AnswerFormat.model_json_schema()},
    },
}
response = requests.post(url, headers=headers, json=payload).json()
print(response["choices"][0]["message"]["content"])
```

**Response**

```
{"first_name":"Michael","last_name":"Jordan","year_of_birth":1963,"num_seasons_in_nba":15}
```

### 2. Use a regex to output the format

**Request**

```python python
import requests

url = "https://api.perplexity.ai/chat/completions"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
payload = {
    "model": "sonar",
    "messages": [
        {"role": "system", "content": "Be precise and concise."},
        {"role": "user", "content": "What is the IPv4 address of OpenDNS DNS server?"},
    ],
    "response_format": {
        "type": "regex",
        "regex": {"regex": r"(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"},
    },
}
response = requests.post(url, headers=headers, json=payload).json()
print(response["choices"][0]["message"]["content"])
```

**Response**

```
208.67.222.222
```

## Best Practices

### Generating responses in a JSON Format

For Python users, we recommend using the Pydantic library to [generate JSON schema](https://docs.pydantic.dev/latest/api/base_model/#pydantic.BaseModel.model_json_schema).

**Unsupported JSON Schemas**

Recursive JSON schema is not supported.
As a result, unconstrained objects are not supported either. Here are a few examples of unsupported schemas:

```
# UNSUPPORTED!
from typing import Any

from pydantic import BaseModel

class UnconstrainedDict(BaseModel):
    unconstrained: dict[str, Any]

class RecursiveJson(BaseModel):
    value: str
    child: list["RecursiveJson"]
```

### Generating responses using a regex

**Supported Regex**

* Characters: `\d`, `\w`, `\s`, `.`
* Character classes: `[0-9A-Fa-f]`, `[^x]`
* Quantifiers: `*`, `?`, `+`, `{3}`, `{2,4}`, `{3,}`
* Alternation: `|`
* Group: `( ... )`
* Non-capturing group: `(?: ... )`
* Positive lookahead: `(?= ... )`
* Negative lookahead: `(?! ... )`

**Unsupported Regex**

* Backreferences to the contents of a group: `\1`
* Anchors: `^`, `$`, `\b`
* Positive look-behind: `(?<= ... )`
* Negative look-behind: `(?<! ... )`

# Usage Tiers

Source: https://docs.perplexity.ai/guides/usage-tiers

1. The tiers are based on cumulative purchases on a given account
2. The beta features accumulate as you progress through the tiers

# Home

Source: https://docs.perplexity.ai/home
Welcome to Sonar by Perplexity

Power your products with unparalleled real-time, web-wide research and Q\&A capabilities.


# Application Status

Source: https://docs.perplexity.ai/system-status/system-status

You can check the status of our services [here](https://status.perplexity.com/). If you experience any issues or have any questions, please contact us at [api@perplexity.ai](mailto:api@perplexity.ai) or file a bug report in our [Discord](https://discord.com/invite/perplexity-ai) channel.