Why isn't the `response_format` parameter working for reasoning models?

The `sonar-reasoning-pro` model is designed to output a `<think>` section containing reasoning tokens, immediately followed by a valid JSON object. As a result, the `response_format` parameter does not remove these reasoning tokens from the output. We recommend using a custom parser to extract the valid JSON portion. An example implementation can be found here.
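For example, a minimal parser (a sketch, not the linked reference implementation) might strip the `<think>` section and parse the JSON that follows:

```python
import json
import re

def extract_json(response_text: str) -> dict:
    """Strip the <think>...</think> reasoning section and parse the JSON after it."""
    # Drop the reasoning block; the valid JSON object immediately follows it.
    cleaned = re.sub(r"<think>.*?</think>", "", response_text, flags=re.DOTALL).strip()
    # Locate the start of the JSON object in case any extra text remains.
    start = cleaned.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    return json.loads(cleaned[start:])

# Example: a response with reasoning tokens followed by JSON.
raw = '<think>weighing the options...</think>{"answer": 42}'
print(extract_json(raw))  # -> {'answer': 42}
```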
Does the API use content filtering or SafeSearch?
How do I file a bug report and what happens afterward?
Where are Perplexity's language models hosted?
How can I upgrade to the next usage tier?
Usage tiers are based on your all-time credit purchases:

| Tier | Credit Purchase (all time) |
| --- | --- |
| Tier 0 | - |
| Tier 1 | $50 |
| Tier 2 | $250 |
| Tier 3 | $500 |
| Tier 4 | $1000 |
| Tier 5 | $5000 |
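As a rough illustration (our assumption, not an official calculation), the tier for a given purchase total could be computed like this:

```python
# Hypothetical helper mapping all-time credit purchases (USD) to a usage tier,
# following the thresholds in the table above.
TIER_THRESHOLDS = [(5000, 5), (1000, 4), (500, 3), (250, 2), (50, 1)]

def usage_tier(total_purchased_usd: float) -> int:
    """Return the usage tier for a given all-time credit purchase amount."""
    for threshold, tier in TIER_THRESHOLDS:
        if total_purchased_usd >= threshold:
            return tier
    return 0  # Tier 0: below the first purchase threshold

print(usage_tier(300))  # -> 2
```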
How can I track my spend/usage per API key?
How do I request a new feature?
- Requesting support for a new model or capability (e.g., image processing, fine-tuning options)
- Asking for new API parameters (e.g., additional filters, search options)
- Suggesting performance improvements (e.g., faster response times, better citation handling)
- Enhancing existing API features (e.g., improving streaming reliability, adding new output formats)
Why are the results from the API different from the UI?
- The API uses the same search system as the UI, but with differences in configuration, so their outputs may differ.
- The underlying AI model might differ between the API and the UI for a given query.
- We give users the power to tune the API to their respective use cases using sampling parameters like `presence_penalty`, `top_p`, etc. Custom tuning to specific use cases might lead to less generalization compared to the UI. We set optimized defaults and recommend not explicitly providing sampling parameters in your API requests (see the sketch after this list).
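For example, a minimal request that leaves the sampling defaults untouched might look like this (a sketch; the `sonar` model name and the response handling are assumptions to adapt to your setup):

```python
import os
import requests

# Minimal chat completion request relying on the server-side sampling defaults:
# note the deliberate absence of presence_penalty, top_p, temperature, etc.
response = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}"},
    json={
        "model": "sonar",  # assumed model name; substitute the one you use
        "messages": [
            {"role": "user", "content": "How many moons does Jupiter have?"}
        ],
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())  # the response structure is discussed in the OpenAI-compatibility question below
```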
Will user data submitted through the API be used for model training or other purposes?
Does the API currently support web browsing?
What are the limitations to the number of API calls?
What's the best way to stay up to date with API updates?
How should I respond to 401: Authorization errors?
Do you support fine-tuning?
I have another question or an issue
Does Perplexity provide service quality assurances such as service uptime, frequency of failures, and target recovery time in the event of a failure?
Do you expose CoTs if I use your reasoning APIs or Deep Research API?
Are the reasoning tokens in Deep Research same as CoTs in the answer?
Is the internet data access provided by the API identical to that of Perplexity's web interface?
To what extent is the API OpenAI compatible?
The response includes familiar top-level fields, such as `id`, `model`, and `usage`, and supports analogous parameters like `model`, `messages`, and `stream`.

Key differences from the standard OpenAI response include:

- Response Object Structure:
  - OpenAI responses typically have an `object` value of `"chat.completion"` and a `created` timestamp, whereas our response uses `object: "response"` and a `created_at` field.
  - Instead of a `choices` array, our response content is provided under an `output` array that contains detailed message objects.
- Message Details:
  - Each message in our output includes a `type` (usually `"message"`), a unique `id`, and a `status`.
  - The actual text is nested within a `content` array that contains objects with `type`, `text`, and an `annotations` array for additional context.
- Additional Fields:
  - Our API response provides extra meta-information (such as `status`, `error`, `incomplete_details`, `instructions`, and `max_output_tokens`) that is not present in standard OpenAI responses.
  - The `usage` field also differs, offering detailed breakdowns of input and output tokens (including fields like `input_tokens_details` and `output_tokens_details`).
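Given that structure, a defensive parser (a sketch assuming exactly the field names above; the `"output_text"` content type in the stub is illustrative) could pull the message text out like this:

```python
def extract_text(response: dict) -> str:
    """Collect the text parts from a response shaped as described above:
    an `output` array of message objects, each holding a `content` array
    of objects with `type`, `text`, and `annotations` fields."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") == "message":
            for block in item.get("content", []):
                if "text" in block:
                    parts.append(block["text"])
    return "\n".join(parts)

# Minimal response stub mirroring the documented fields.
stub = {
    "object": "response",
    "created_at": 1700000000,
    "output": [
        {
            "type": "message",
            "id": "msg_1",
            "status": "completed",
            "content": [{"type": "output_text", "text": "Hello!", "annotations": []}],
        }
    ],
}
print(extract_text(stub))  # -> Hello!
```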