API Updates February 2024

Announcing Our Newest Model

We are excited to announce the launch of our latest Perplexity models: sonar-small-chat and sonar-medium-chat, along with their search-enhanced versions, sonar-small-online and sonar-medium-online. These new additions surpass our earlier models in cost-efficiency, speed, and performance. For detailed information on our supported models, please visit our model card documentation.
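To get started with one of the new models, a request can be sketched as follows. This is a minimal illustration assuming the OpenAI-style chat-completions request format; the endpoint URL, parameter names, and system prompt here are assumptions for illustration, not details stated in this announcement.

```python
import json

# Assumed endpoint (OpenAI-compatible chat-completions format).
API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(model: str, user_message: str) -> dict:
    """Construct a chat-completion request body for one of the new models."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Be precise and concise."},
            {"role": "user", "content": user_message},
        ],
    }

# The search-enhanced variants use the -online suffix.
payload = build_request("sonar-medium-online", "What is the capital of France?")
print(json.dumps(payload, indent=2))
```

Sending the payload additionally requires an API key in the Authorization header; see the API documentation for authentication details.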

Expanded Context Windows

The context window for several models, including mixtral-8x7b-instruct and all Perplexity models, has been doubled from 8k to 16k tokens. In online models, 4k of those tokens are reserved for search results.
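The practical effect on prompt budgeting can be sketched as below. This assumes "16k" means 16,384 tokens and the 4k search reservation means 4,096 tokens; the announcement does not state the exact counts.

```python
# Assumed exact token counts for the "16k" window and "4k" reservation.
CONTEXT_WINDOW = 16_384
SEARCH_RESERVED = 4_096

def prompt_budget(online: bool) -> int:
    """Tokens available for prompt + completion after any search reservation."""
    return CONTEXT_WINDOW - (SEARCH_RESERVED if online else 0)

print(prompt_budget(online=True))   # online models -> 12288
print(prompt_budget(online=False))  # chat models   -> 16384
```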

Model Deprecation Notice

Please note that as of March 15, the pplx-70b-chat, pplx-70b-online, llama-2-70b-chat, and codellama-34b-instruct models will no longer be available through the Perplexity API. We will gradually phase out less frequently used models in favor of our newer and more performant offerings.

Revised Pricing Structure for 8x7b Models

The pricing for the mixtral-8x7b-instruct model is increasing. Previously charged at $0.14 / $0.58 per million input and output tokens, the rates will be $0.60 / $1.80 per million input and output tokens moving forward.
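For estimating spend at the new rates, a simple cost calculation looks like this. The rates are taken from this notice; the token counts in the example are illustrative.

```python
# New mixtral-8x7b-instruct rates from this notice (USD per 1M tokens).
INPUT_RATE = 0.60
OUTPUT_RATE = 1.80

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request at the new rates."""
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000

# Example: a request with 2,000 input tokens and 500 output tokens.
print(f"${request_cost(2_000, 500):.4f}")  # -> $0.0021
```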

Increased Public Rate Limits

Public rate limits for all models have roughly doubled. Find the current rate limits here.
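Even with higher limits, clients should still handle rate-limit responses gracefully. The sketch below shows a generic exponential-backoff retry loop; the RateLimitError class and simulated endpoint are stand-ins for illustration and are not part of the Perplexity API.

```python
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 (rate limited) response."""

def with_backoff(call, max_retries: int = 5, base_delay: float = 0.01):
    """Retry `call` with exponential backoff when it raises RateLimitError."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Simulated endpoint that is rate-limited on its first two calls.
calls = {"n": 0}
def call_api():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("HTTP 429")
    return "ok"

result = with_backoff(call_api)
print(result, calls["n"])  # -> ok 3
```

In production, honoring a Retry-After header (when the server sends one) is preferable to a fixed backoff schedule.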

Access to Citations and Elevated Rate Limits

In response to popular demand on our API discussion forum, we are making URL citations from our Online LLMs available to approved users. To request access to citations, or to request a rate limit increase, please complete this form.

Terms of Service and Data Processing Addendum

We wish to reiterate our commitment to data privacy for commercial application developers using the Perplexity API. The updated Terms of Service and Data Processing Addendum can be found here.
Thank you for being a Perplexity API user.