Changelog
We are excited to announce the public availability of citations in the Perplexity API. We have also increased our default rate limit for the Sonar online models to 50 requests/min for all users.
Effective immediately, all API users will see citations returned as part of their responses by default. This is not a breaking change. The return_citations parameter will no longer have any effect.
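For illustration, here is a minimal sketch of reading citations from a response. It assumes the OpenAI-compatible /chat/completions endpoint and a top-level citations field; verify exact field names against the API reference.

```python
import requests

# Hedged sketch: the payload follows the OpenAI-compatible chat
# completions format; the top-level "citations" field is assumed
# based on this announcement -- confirm against the API reference.
resp = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "llama-3.1-sonar-small-128k-online",
        "messages": [{"role": "user", "content": "How many moons does Jupiter have?"}],
    },
)
data = resp.json()
print(data["choices"][0]["message"]["content"])

# Citations are returned by default; no return_citations flag is needed.
for url in data.get("citations", []):
    print(url)
```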
If you have any questions or need assistance, feel free to reach out to our team at api@perplexity.ai.
We are excited to announce the launch of our latest Perplexity Sonar models:
Online Models:
- llama-3.1-sonar-small-128k-online
- llama-3.1-sonar-large-128k-online
Chat Models:
- llama-3.1-sonar-small-128k-chat
- llama-3.1-sonar-large-128k-chat
These new additions surpass the performance of the previous iteration. For detailed information on our supported models, please visit our model card documentation.
[Action Required] Model Deprecation Notice
Please note that several models will no longer be accessible effective 8/12/2024. We recommend updating your applications to use models in the Llama-3.1 family immediately.
The following model names will no longer be available via API:
- llama-3-sonar-small-32k-online
- llama-3-sonar-large-32k-online
- llama-3-sonar-small-32k-chat
- llama-3-sonar-large-32k-chat
- llama-3-8b-instruct
- llama-3-70b-instruct
- mistral-7b-instruct
- mixtral-8x7b-instruct
We recommend switching to models in the Llama-3.1 family (a minimal migration sketch follows the list):
Online Models:
- llama-3.1-sonar-small-128k-online
- llama-3.1-sonar-large-128k-online
Chat Models:
- llama-3.1-sonar-small-128k-chat
- llama-3.1-sonar-large-128k-chat
Instruct Models:
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
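In most cases, migrating only requires swapping the model name in the request body. A minimal sketch, assuming the OpenAI-compatible chat completions endpoint:

```python
import requests

# Before 8/12/2024 (deprecated):
#   "model": "llama-3-sonar-small-32k-online"
# After (recommended Llama-3.1 replacement):
payload = {
    "model": "llama-3.1-sonar-small-128k-online",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
}
resp = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
)
print(resp.json()["choices"][0]["message"]["content"])
```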
If you have any questions, please email support@perplexity.ai. Thank you for being a Perplexity API user.
Stay curious,
Team Perplexity
Please note that as of May 14, several models and model name aliases will no longer be accessible. We recommend updating your applications to use models in the Llama-3 family immediately. The following model names will no longer be available via API:
- codellama-70b-instruct
- mistral-7b-instruct
- mixtral-8x22b-instruct
- pplx-7b-chat
- pplx-7b-online
- sonar-small-chat
- sonar-small-online
- pplx-70b-chat
- pplx-70b-online
- pplx-8x7b-chat
- pplx-8x7b-online
- sonar-medium-chat
- sonar-medium-online
In lieu of the above, we recommend switching to models from the Llama 3 family:
- llama-3-sonar-small-32k-chat
- llama-3-sonar-small-32k-online
- llama-3-sonar-large-32k-chat
- llama-3-sonar-large-32k-online
- llama-3-8b-instruct
- llama-3-70b-instruct
Effective immediately, input and output tokens are charged at the same rate. Previously, output tokens were more expensive than input tokens; prices have generally decreased as a result.
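As a quick illustration of the flat-rate arithmetic (the rate below is hypothetical, chosen for the example rather than taken from a published price list):

```python
# With a single per-token rate, cost depends only on the total token count.
PRICE_PER_MILLION = 0.60  # hypothetical flat rate in USD, for illustration only

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD when input and output tokens share one rate."""
    return (input_tokens + output_tokens) * PRICE_PER_MILLION / 1_000_000

print(request_cost(1_200, 300))  # 1,500 total tokens -> $0.0009
```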
Announcing Our Newest Models
We are excited to announce the launch of our latest Perplexity models: sonar-small-chat and sonar-medium-chat, along with their search-enhanced versions, sonar-small-online and sonar-medium-online. These new additions surpass our earlier models in cost-efficiency, speed, and performance. For detailed information on our supported models, please visit our model card documentation.
Expanded Context Windows
The context window length for several models has been doubled from 8k to 16k tokens, including mixtral-8x7b-instruct and all Perplexity models. In online models, 4k of those tokens are reserved for search results, leaving roughly 12k for the prompt and completion.
Model Deprecation Notice
Please note that as of March 15, the pplx-70b-chat, pplx-70b-online, llama-2-70b-chat, and codellama-34b-instruct models will no longer be available through the Perplexity API. We will gradually phase out less frequently used models in favor of our newer and more performant offerings.
Revised Pricing Structure for 8x7b Models
The pricing for the mixtral-8x7b-instruct model will be adjusted. Previously charged at $0.14/$0.58 per million input/output tokens, the rates will change to $0.60/$1.80 per million input/output tokens moving forward.
Increased Public Rate Limits
Public limits for all models have increased by ~2x. Find the current rate limits here.
Access to Citations and Elevated Rate Limits
Responding to popular demand in our API discussion forum, we are introducing URL citations for our Online LLMs to approved users. To request access to citations, or to request a rate limit increase, please complete this form.
Terms of Service and Data Processing Addendum
We wish to reiterate our commitment to data privacy for commercial application developers using the Perplexity API. The updated Terms of Service and Data Processing Addendum can be found here. Thank you for being a Perplexity API user.
We’re excited to announce that pplx-api is now serving the latest open-source mixture-of-experts model, mixtral-8x7b-instruct, at the blazingly fast inference speed you are accustomed to.
We’re excited to share two new PPLX models: pplx-7b-online and pplx-70b-online. These first-of-a-kind models are integrated with our in-house search technology for factual grounding. Read our blog post for more information: https://blog.perplexity.ai/blog/introducing-pplx-online-llms
We’re also announcing general availability for our API. We’ve rolled out usage-based pricing, which enables us to gradually relax the rate limits on our models. Follow the updated steps for getting started.
We have removed support for replit-code-v1.5-3b and openhermes-2-mistral-7b. There are no immediate plans to add these models back. If you were a user who enjoyed openhermes-2-mistral-7b, try our in-house models pplx-7b-chat and pplx-70b-chat instead!
The Perplexity AI API is currently in beta release v0. Clients are not protected from backwards-incompatible changes and cannot specify their desired API version. Examples of backwards-incompatible changes include:
- Removing support for a given model
- Renaming a response field
- Removing a response field
- Adding a required request parameter
Backwards-incompatible changes will be documented here.
Generally, the API is designed to be compatible with OpenAI client libraries: given the same request body, swapping the API base URL and supplying your Perplexity API key yields a response that can be parsed the same way as a response from OpenAI, except for certain explicitly unsupported body parameters documented in the API reference.
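For example, a minimal sketch using the OpenAI Python SDK pointed at the Perplexity base URL; the model name is taken from the current lineup, and supported request parameters should be verified against the reference:

```python
from openai import OpenAI

# Same client library, different base URL and key: the response parses
# the same way as OpenAI's, modulo the unsupported parameters noted above.
client = OpenAI(
    api_key="YOUR_PERPLEXITY_API_KEY",
    base_url="https://api.perplexity.ai",
)

response = client.chat.completions.create(
    model="llama-3.1-sonar-small-128k-online",
    messages=[{"role": "user", "content": "Name the planets in the solar system."}],
)
print(response.choices[0].message.content)
```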