ModelContext LengthModel Type
codellama-34b-instruct16384Chat Completion
llama-2-70b-chat4096Chat Completion
mistral-7b-instruct \[2\]4096 [1]Chat Completion
mixtral-8x7b-instruct4096 [1]Chat Completion
pplx-7b-chat8192Chat Completion
pplx-70b-chat4096Chat Completion
pplx-7b-online4096Chat Completion
pplx-70b-online4096Chat Completion

[1] We will be increasing the context length of mistral-7b-instruct to 32k tokens (see roadmap).

[2] This model refers to the v0.2 release of mistral-7b-instruct.

Special Tokens

We do not raise any exceptions if your chat inputs contain messages with special tokens. If avoiding prompt injections is a concern for your use case, it is recommended that you check for special tokens prior to calling the API. For more details, read Meta’s recommendations for Llama.

Online LLMs

It is recommended to use only single-turn conversations for the online LLMs (pplx-7b-online and pplx-70b-online). Any system messages given in the request will additionally be ignored.