Supported Models
Perplexity Models
Model | Parameter Count | Context Length | Model Type |
---|---|---|---|
llama-3-sonar-small-32k-chat | 7B | 32768 | Chat Completion |
llama-3-sonar-small-32k-online | 7B | 28000 | Chat Completion |
llama-3-sonar-large-32k-chat | 70B | 32768 | Chat Completion |
llama-3-sonar-large-32k-online | 70B | 28000 | Chat Completion |
Open-Source Models
Where possible, we try to match the Hugging Face implementation.
Model | Parameter Count | Context Length | Model Type |
---|---|---|---|
llama-3-8b-instruct | 8B | 8192 | Chat Completion |
llama-3-70b-instruct | 70B | 8192 | Chat Completion |
mixtral-8x7b-instruct | 8x7B | 16384 | Chat Completion |
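All of the chat models above can be called through the chat completions endpoint. A minimal sketch using only the Python standard library, assuming the `https://api.perplexity.ai` base URL and an API key in the `PERPLEXITY_API_KEY` environment variable:

```python
import json
import os
import urllib.request

API_URL = "https://api.perplexity.ai/chat/completions"

def build_chat_request(model, messages):
    """Assemble the JSON body for a chat completions call."""
    return {"model": model, "messages": messages}

def chat(model, messages):
    """POST a chat completion request and return the assistant's reply text."""
    body = json.dumps(build_chat_request(model, messages)).encode()
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]
```

For example, `chat("llama-3-8b-instruct", [{"role": "user", "content": "How many moons does Mars have?"}])` returns the model's answer as a string.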
Special Tokens
We do not raise any exceptions if your chat inputs contain messages with special tokens. If prompt injection is a concern for your use case, we recommend checking for special tokens before calling the API. For more details, read Meta's recommendations for Llama.
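Since the API does not reject inputs containing special tokens, a simple client-side guard can scan each message before sending. A sketch using common Llama 3 special tokens (the token list here is illustrative, not exhaustive; consult Meta's documentation for the full set):

```python
# Llama 3 special tokens that could be abused for prompt injection.
# Illustrative subset; see Meta's Llama documentation for the complete list.
LLAMA3_SPECIAL_TOKENS = (
    "<|begin_of_text|>",
    "<|end_of_text|>",
    "<|start_header_id|>",
    "<|end_header_id|>",
    "<|eot_id|>",
)

def find_special_tokens(messages):
    """Return (message_index, token) pairs for any special tokens found."""
    hits = []
    for i, msg in enumerate(messages):
        for tok in LLAMA3_SPECIAL_TOKENS:
            if tok in msg.get("content", ""):
                hits.append((i, tok))
    return hits

def assert_no_special_tokens(messages):
    """Raise ValueError, before calling the API, if special tokens are present."""
    hits = find_special_tokens(messages)
    if hits:
        raise ValueError(f"special tokens found in input: {hits}")
```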
Online LLMs
Note that the search subsystem of the Online LLMs does not attend to the system prompt.
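A practical consequence: anything the search step should see must go in the user message, while the system prompt is suited only to guidance the model applies when writing its answer. A sketch of structuring messages along those lines (the helper name is ours, not part of the API):

```python
def build_online_messages(style_instructions, query):
    """Build a message list for an Online LLM call.

    Style/formatting guidance goes in the system prompt (seen by the model
    but not by search); everything search-relevant goes in the user message.
    """
    return [
        {"role": "system", "content": style_instructions},
        {"role": "user", "content": query},
    ]

messages = build_online_messages(
    "Answer concisely, in bullet points.",
    "What were the key announcements at the latest WWDC?",
)
```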
Access to citations via API is in closed beta. To request access to citations, fill out this form and send an email describing your use case to [email protected].