Perplexity Sonar Models

ModelParameter CountContext LengthModel Type
llama-3.1-sonar-small-128k-online8B127,072Chat Completion
llama-3.1-sonar-large-128k-online70B127,072Chat Completion
llama-3.1-sonar-huge-128k-online405B127,072Chat Completion

The search subsystem of the Online LLMs do not attend to the system prompt. You can use the system prompt to provide instructions related to style, tone, and language of the response.

Beta Access: The Online LLMs have some features in closed beta - to request access to them, please fill out this form.

Perplexity Chat Models

ModelParameter CountContext LengthModel Type
llama-3.1-sonar-small-128k-chat8B127,072Chat Completion
llama-3.1-sonar-large-128k-chat70B127,072Chat Completion

Open-Source Models

Where possible, we try to match the Hugging Face implementation.

ModelParameter CountContext LengthModel Type
llama-3.1-8b-instruct8B131,072Chat Completion
llama-3.1-70b-instruct70B131,072Chat Completion

Special Tokens

We do not raise any exceptions if your chat inputs contain messages with special tokens. If avoiding prompt injections is a concern for your use case, it is recommended that you check for special tokens prior to calling the API. For more details, read Meta’s recommendations for Llama.