Why isn't the `response_format` parameter working for reasoning models?
The `sonar-reasoning-pro` model is designed to output a `<think>` section containing reasoning tokens, immediately followed by a valid JSON object. As a result, the `response_format` parameter does not remove these reasoning tokens from the output. We recommend using a custom parser to extract the valid JSON portion. An example implementation can be found here.
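As an illustration, a parser along the following lines can recover the JSON payload. The `extract_json` helper is our own sketch, assuming the reasoning block is delimited by `<think>...</think>` tags as described above:

```python
import json
import re

def extract_json(response_text: str) -> dict:
    """Strip the <think>...</think> reasoning block, then parse the
    JSON object that follows it."""
    stripped = re.sub(r"<think>.*?</think>", "", response_text, flags=re.DOTALL)
    start = stripped.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode tolerates any trailing text after the JSON object.
    obj, _ = json.JSONDecoder().raw_decode(stripped[start:])
    return obj

# Example: extract_json('<think>reasoning...</think>{"answer": 42}')
# returns {'answer': 42}
```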
Does the API use content filtering or SafeSearch?
How do I file a bug report and what happens afterward?
Where are Perplexity's language models hosted?
How can I upgrade to the next usage tier?
| Tier | Credit Purchase (all time) |
| --- | --- |
| Tier 0 | - |
| Tier 1 | $50 |
| Tier 2 | $250 |
| Tier 3 | $500 |
| Tier 4 | $1000 |
| Tier 5 | $5000 |
How can I track my spend/usage per API key?
How do I request a new feature?
Why are the results from the API different from the UI?
The API exposes sampling parameters such as `presence_penalty`, `top_p`, etc. Custom tuning to specific use cases might lead to less generalization compared to the UI. We set optimized defaults and recommend not explicitly providing sampling parameters in your API requests.
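For example, a minimal request that omits all sampling parameters and relies on the server-side defaults could look like the following sketch (it assumes the OpenAI Python client pointed at the documented `https://api.perplexity.ai` base URL; the model name is illustrative):

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.perplexity.ai",  # OpenAI-compatible endpoint
)

# Only model and messages are set; presence_penalty, top_p, etc. are
# deliberately omitted so the optimized server-side defaults apply.
response = client.chat.completions.create(
    model="sonar",  # illustrative model name
    messages=[{"role": "user", "content": "How many moons does Mars have?"}],
)
print(response.choices[0].message.content)
```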
Will user data submitted through the API be used for model training or other purposes?
Does the API currently support web browsing?
What are the limitations to the number of API calls?
What's the best way to stay up to date with API updates?
How should I respond to 401: Authorization errors?
Do you support fine-tuning?
I have another question or an issue
Does Perplexity provide service quality assurances such as service uptime, frequency of failures, and target recovery time in the event of a failure?
Where are your DeepSeek reasoning models behind Sonar Reasoning and Sonar Reasoning Pro hosted? Is my data going to China?
Are your reasoning APIs that use DeepSeek uncensored?
Do you expose CoTs if I use your reasoning APIs or Deep Research API?
Are the reasoning tokens in Deep Research same as CoTs in the answer?
Is the internet data access provided by the API identical to that of Perplexity's web interface?
To what extent is the API OpenAI compatible?
The response includes familiar fields, such as `id`, `model`, and `usage`, and supports analogous parameters like `model`, `messages`, and `stream`. Key differences from the standard OpenAI response include:

- The standard response has an `object` value of `"chat.completion"` and a `created` timestamp, whereas our response uses `object: "response"` and a `created_at` field.
- Instead of a `choices` array, our response content is provided under an `output` array that contains detailed message objects. Each message object carries a `type` (usually `"message"`), a unique `id`, a `status`, and a `content` array that contains objects with `type`, `text`, and an `annotations` array for additional context.
- Our response includes additional fields (such as `status`, `error`, `incomplete_details`, `instructions`, and `max_output_tokens`) that are not present in standard OpenAI responses.
- The `usage` field also differs, offering detailed breakdowns of input and output tokens (including fields like `input_tokens_details` and `output_tokens_details`).
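Based only on the field names listed above, a client that needs to handle both shapes can branch on the `object` value. The `extract_text` helper below is a hypothetical sketch, not part of any SDK, and assumes the response has already been deserialized to a `dict`:

```python
def extract_text(resp: dict) -> str:
    # Standard OpenAI shape: answer text lives at choices[0].message.content.
    if resp.get("object") == "chat.completion":
        return resp["choices"][0]["message"]["content"]
    # Our shape (object: "response"): text lives under the output array,
    # inside each message object's content blocks.
    parts = []
    for item in resp.get("output", []):
        if item.get("type") == "message":
            for block in item.get("content", []):
                if "text" in block:
                    parts.append(block["text"])
    return "".join(parts)
```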