API Updates February 2024
Announcing Our Newest Models
We are excited to announce the launch of our latest Perplexity models: sonar-small-chat and sonar-medium-chat, along with their search-enhanced versions, sonar-small-online and sonar-medium-online. These new additions surpass our earlier models in cost-efficiency, speed, and performance. For detailed information on our supported models, please visit our model card documentation.
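The new models are called the same way as our existing chat models. Below is a minimal Python sketch of a chat completions request to sonar-medium-online; it assumes the requests library and an API key stored in a PERPLEXITY_API_KEY environment variable, and any of the other new model names can be swapped in.

```python
import os
import requests

# Minimal sketch: call one of the new models through the chat completions
# endpoint. Assumes an API key is set in the PERPLEXITY_API_KEY env var.
API_URL = "https://api.perplexity.ai/chat/completions"

payload = {
    "model": "sonar-medium-online",  # or sonar-small-chat, sonar-medium-chat, sonar-small-online
    "messages": [
        {"role": "system", "content": "Be precise and concise."},
        {"role": "user", "content": "What were the major AI announcements this week?"},
    ],
}
headers = {
    "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
    "Content-Type": "application/json",
}

response = requests.post(API_URL, json=payload, headers=headers, timeout=60)
response.raise_for_status()
# The response follows an OpenAI-style chat completions schema.
print(response.json()["choices"][0]["message"]["content"])
```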
Expanded Context Windows
The context window length for several models, including mixtral-8x7b-instruct and all Perplexity models, has been doubled from 8k to 16k. For online models, 4k tokens are reserved for search results.
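As a rough illustration of what the larger window means in practice for online models, the sketch below budgets prompt tokens after setting aside the 4k search-result reservation and a desired completion length. The completion length is an illustrative assumption, not a tokenizer-exact figure.

```python
# Illustrative token budgeting for the expanded 16k context window.
# For online models, 4k tokens are reserved for search results, so the
# space left for prompt + completion is roughly 16k - 4k.
CONTEXT_WINDOW = 16_000
SEARCH_RESERVED = 4_000   # applies to *-online models only
MAX_COMPLETION = 1_000    # example completion length you plan to request

available_for_prompt = CONTEXT_WINDOW - SEARCH_RESERVED - MAX_COMPLETION
print(f"Prompt budget for an online model: ~{available_for_prompt} tokens")
```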
Model Deprecation Notice
Please note that as of March 15, the pplx-70b-chat, pplx-70b-online, llama-2-70b-chat, and codellama-34b-instruct models will no longer be available through the Perplexity API. We will gradually phase out less frequently used models in favor of our newer and more performant offerings.
Revised Pricing Structure for 8x7b Models
Pricing for the mixtral-8x7b-instruct model is being adjusted. The previous rates of $0.14 / $0.58 per million input / output tokens will change to $0.60 / $1.80 per million input / output tokens moving forward.
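For a back-of-the-envelope sense of the change, the sketch below compares estimated costs under the old and new rates for example token volumes; the volumes are purely illustrative.

```python
# Cost comparison for mixtral-8x7b-instruct under the old and new rates
# (USD per million tokens). Token counts below are example values.
OLD_RATES = {"input": 0.14, "output": 0.58}
NEW_RATES = {"input": 0.60, "output": 1.80}

def cost(rates, input_tokens, output_tokens):
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000

# Example: 2M input tokens and 500k output tokens in a month.
print(f"Old pricing: ${cost(OLD_RATES, 2_000_000, 500_000):.2f}")  # $0.57
print(f"New pricing: ${cost(NEW_RATES, 2_000_000, 500_000):.2f}")  # $2.10
```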
Increased Public Rate Limits
Public rate limits for all models have increased by roughly 2x. You can find the current rate limits here.
Access to Citations and Elevated Rate Limits
In response to popular demand in our API discussion forum, we are introducing URL citation access for our Online LLMs to approved users. To request access to citations or a rate limit increase, please complete this form.
Terms of Service and Data Processing Addendum
We wish to reiterate our commitment to data privacy for commercial application developers using the Perplexity API. The updated Terms of Service and Data Processing Addendum can be found here.
Thank you for being a Perplexity API user.