Why isn't the `response_format` parameter working for reasoning models?
The `sonar-reasoning-pro` model is designed to output a `<think>` section containing reasoning tokens, immediately followed by a valid JSON object. As a result, the `response_format` parameter does not strip these reasoning tokens from the output. We recommend using a custom parser to extract the valid JSON portion. An example implementation can be found here.
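A minimal sketch of such a parser (the `extract_json` helper name is hypothetical, and it assumes the reasoning block is delimited by `<think>...</think>` as described above):

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Strip the <think>...</think> reasoning block, then parse the
    remaining text as JSON."""
    # Remove the reasoning section if present.
    cleaned = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL).strip()
    # The model may emit stray text around the JSON; grab the outermost {...} span.
    start, end = cleaned.find("{"), cleaned.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    return json.loads(cleaned[start:end + 1])
```

For example, `extract_json('<think>step 1...</think>{"answer": 42}')` returns the parsed dictionary `{"answer": 42}`.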
Does the API use content filtering or SafeSearch?
Yes, for the API, content filtering in the form of SafeSearch is turned on by default. This helps filter out potentially offensive and inappropriate content, including pornography, from search results. SafeSearch is an automated filter that works across search results to provide a safer experience. You can learn more about SafeSearch on the official Wikipedia page.
How do I file a bug report and what happens afterward?
To file a bug report, please open an issue in our GitHub repository. Once you’ve submitted your report, we kindly ask that you share the link to the issue with us via email at api@perplexity.ai so we can track it on our end.
We truly appreciate your patience, and we’ll get back to you as soon as possible. Due to the current volume of reports, it may take a little time for us to respond—but rest assured, we’re on it.
Where are Perplexity's language models hosted?
Our compute is hosted via Amazon Web Services in North America. By default, the API has a zero-day retention policy for user prompt data, which is never used for AI training.
How can I upgrade to the next usage tier?
The only way for an account to be upgraded to the next usage tier is through cumulative (all-time) credit purchases.
Here are the spending criteria associated with each tier:
| Tier | Credit Purchase (all time) |
|---|---|
| Tier 0 | - |
| Tier 1 | $50 |
| Tier 2 | $250 |
| Tier 3 | $500 |
| Tier 4 | $1000 |
| Tier 5 | $5000 |
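The thresholds above can be expressed as a simple lookup (a sketch using only the figures from the table; the function name is illustrative):

```python
# Thresholds in descending order, taken from the tier table above.
TIER_THRESHOLDS = [(5000, 5), (1000, 4), (500, 3), (250, 2), (50, 1)]

def usage_tier(total_credits_purchased: float) -> int:
    """Return the usage tier for an all-time credit purchase amount (USD)."""
    for threshold, tier in TIER_THRESHOLDS:
        if total_credits_purchased >= threshold:
            return tier
    return 0  # below $50 all-time: Tier 0
```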
How can I track my spend/usage per API key?
We offer a way to track your billing per API key. You can do this by navigating to the following location:
Settings > View Dashboard > Invoice history > Invoices
Then click on any invoice; each line item in the total bill has a code at the end (e.g., `pro (743S)`). Those four characters are the last four characters of the corresponding API key.
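To match invoice line items to keys programmatically, a small sketch (assuming line items follow the `pro (743S)` pattern shown above; the helper name is hypothetical):

```python
import re

def matches_key(invoice_item: str, api_key: str) -> bool:
    """Check whether an invoice line item (e.g. 'pro (743S)') belongs to
    the given API key by comparing the trailing 4-character code."""
    m = re.search(r"\(([A-Za-z0-9]{4})\)\s*$", invoice_item)
    return bool(m) and api_key.endswith(m.group(1))
```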
How do I request a new feature?
A feature request is a suggestion to improve or add new functionality to the Perplexity Sonar API.
If you have such a suggestion, please submit a feature request here: GitHub Feature requests
Why are the results from the API different from the UI?
Differences typically come from sampling parameters such as `presence_penalty`, `top_p`, etc. Custom tuning to specific use cases might lead to less generalization compared to the UI. We set optimized defaults and recommend not explicitly providing sampling parameters in your API requests.
Will user data submitted through the API be used for model training or other purposes?
We collect the following types of information:
API Usage Data: We collect billable usage metadata such as the number of requests and tokens. You can view your own usage in the Perplexity API dashboard.
User Account Information: When you create an account with us, we collect your name, email address, and other relevant contact information.
We do not retain any query data sent through the API and do not train on any of your data.
Does the API currently support web browsing?
Yes, the Sonar Models leverage information from Perplexity’s search index and the public internet.
What are the limitations to the number of API calls?
You can find our rate limits here.
What's the best way to stay up to date with API updates?
We email users about new developments and also post in the changelog.
How should I respond to 401: Authorization errors?
A 401 error indicates that authentication failed, typically because the API key is missing, invalid, or revoked. Verify that you are sending your current key in the `Authorization` header as a Bearer token.
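A 401 usually means the request lacked a valid `Authorization` header. A minimal sketch of building one correctly (the endpoint URL and `sonar` model name are assumptions based on Perplexity's public docs):

```python
import requests

def auth_headers(api_key: str) -> dict:
    """Build the Authorization header the API expects: a Bearer token.
    A missing or malformed header is the most common cause of 401s."""
    if not api_key:
        raise ValueError("missing API key")
    return {"Authorization": f"Bearer {api_key}"}

# Example request (uncomment and supply a real key to run):
# resp = requests.post(
#     "https://api.perplexity.ai/chat/completions",
#     headers=auth_headers("pplx-..."),
#     json={"model": "sonar",
#           "messages": [{"role": "user", "content": "Hello"}]},
#     timeout=30,
# )
# if resp.status_code == 401:
#     raise SystemExit("401 Unauthorized: check that your API key is current")
```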
Do you support fine-tuning?
Currently, we do not support fine-tuning.
I have another question or an issue
Please reach out to api@perplexity.ai or support@perplexity.ai for other API inquiries. You can also post on our discussion forum and we will get back to you.
Does Perplexity provide service quality assurances such as service uptime, frequency of failures, and target recovery time in the event of a failure?
We do not currently offer formal guarantees for these service levels.
Where are your DeepSeek reasoning models behind Sonar Reasoning and Sonar Reasoning Pro hosted? Is my data going to China?
The models are hosted in the US and we do not train on any of your data. And no, your data is not going to China.
Are your reasoning APIs that use DeepSeek uncensored?
Yes, our reasoning APIs that use DeepSeek’s models are uncensored and on par with the other APIs in terms of content moderation.
Do you expose CoTs if I use your reasoning APIs or Deep Research API?
We expose the CoTs for Sonar Reasoning Pro and Sonar Reasoning. We don’t currently expose the CoTs for Deep Research.
Does R1-1776 search the web?
R1-1776 is an offline chat model that does not search the web. So this model might not have the most up-to-date information beyond its training cutoff date—which should be the same as R1.
Are the reasoning tokens in Deep Research same as CoTs in the answer?
Reasoning tokens in Deep Research are a bit different than the CoTs in the answer—these tokens are used to reason through the research material before generating the final output via the CoTs.
Is the internet data access provided by the API identical to that of Perplexity's web interface?
Yes, the API offers exactly the same internet data access as Perplexity’s web platform.
To what extent is the API OpenAI compatible?
The Perplexity API is designed to be broadly compatible with OpenAI’s chat completions endpoint. It adopts a similar structure, including fields such as `id`, `model`, and `usage`, and supports analogous parameters like `model`, `messages`, and `stream`.
Key differences from the standard OpenAI response include:
Response object structure: OpenAI returns an `object` value of `"chat.completion"` and a `created` timestamp, whereas our response uses `object: "response"` and a `created_at` field. Instead of a `choices` array, our response content is provided under an `output` array that contains detailed message objects.
Message details: Each message object includes a `type` (usually `"message"`), a unique `id`, and a `status`. The message text itself lives in a `content` array that contains objects with `type`, `text`, and an `annotations` array for additional context.
Additional fields: Our response includes fields (such as `status`, `error`, `incomplete_details`, `instructions`, and `max_output_tokens`) that are not present in standard OpenAI responses. The `usage` field also differs, offering detailed breakdowns of input and output tokens (including fields like `input_tokens_details` and `output_tokens_details`).
These differences are intended to provide enhanced functionality and additional context while maintaining broad compatibility with OpenAI’s API design.