Context Management
Learn how to use threading and context management with Perplexity’s sonar-pro model to maintain conversation continuity across multiple API calls.
Context Management enables you to maintain conversation context across multiple API calls using Perplexity’s advanced threading system. This feature is exclusively available for the sonar-pro model and allows for natural follow-up questions and contextual conversations.
Threading and context management features are only available with the sonar-pro model. Other models do not support these parameters.
When you enable threading with use_threads: true, Perplexity creates a conversation thread that maintains context between API calls. You can then reference this thread in subsequent requests using the thread_id parameter, allowing the model to understand follow-up questions and maintain conversational continuity.
Begin a conversation with threading enabled by setting use_threads: true:
curl --request POST \
  --url https://api.perplexity.ai/chat/completions \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --data '{
    "model": "sonar-pro",
    "use_threads": true,
    "messages": [
      {
        "role": "user",
        "content": "Which movie won the Cannes film festival special jury prize in 1996?"
      }
    ],
    "stream": false,
    "web_search_options": {
      "search_type": "pro"
    }
  }'
{
  "id": "12345678-1234-1234-1234-123456789012",
  "object": "chat.completion",
  "created": 1641234567,
  "model": "sonar-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The movie that won the Cannes Film Festival Special Jury Prize in 1996 was \"Breaking the Waves\" directed by Lars von Trier. This Danish drama film starred Emily Watson and was notable for being von Trier's breakthrough film that helped establish him as a major international filmmaker..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 128,
    "total_tokens": 173
  },
  "thread_id": "3947452d-7cc0-48b3-afe7-d053c1083b78"
}
The response includes a thread_id field that you’ll use for follow-up questions in the same conversational context.
Notice that the follow-up question “What else has that director made?” doesn’t specify which director, but the model understands from the thread context that you’re referring to Lars von Trier.
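A follow-up turn can be expressed as a request body like the following sketch. The `thread_id` value is the one returned in the example response above, and the body is POSTed to the same `/chat/completions` endpoint as the first request; `build_followup` is a hypothetical helper, not part of any SDK:

```python
def build_followup(thread_id, question):
    """Request body for a follow-up turn in an existing thread. Only the
    new user message is sent; the thread supplies the earlier context."""
    return {
        "model": "sonar-pro",
        "use_threads": True,
        "thread_id": thread_id,  # returned by the previous response
        "messages": [{"role": "user", "content": question}],
        "stream": False,
        "web_search_options": {"search_type": "pro"},
    }

payload = build_followup(
    "3947452d-7cc0-48b3-afe7-d053c1083b78",
    "What else has that director made?",
)
```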
Thread IDs are sensitive identifiers that provide access to conversation history. Store them securely and associate them with the appropriate users in your application.
# Example: Storing thread IDs with user sessions
user_threads = {}

def get_user_thread(user_id):
    return user_threads.get(user_id)

def set_user_thread(user_id, thread_id):
    user_threads[user_id] = thread_id
Handle Thread Expiration
Threads may expire after extended periods of inactivity. Always handle cases where a thread_id becomes invalid:
def resilient_chat(user_id, message):
    thread_id = get_user_thread(user_id)
    response = safe_threaded_request(thread_id, message)
    # Update stored thread ID if we got a new one
    if not thread_id or response.thread_id != thread_id:
        set_user_thread(user_id, response.thread_id)
    return response
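One possible shape for a `safe_threaded_request` helper is sketched below. It is hypothetical, not part of any SDK, and is parameterized by a `send` callable (for example bound with `functools.partial` around your API client) so the fallback logic can be exercised without a live connection; the exact exception raised for an expired thread depends on your HTTP client:

```python
def safe_threaded_request(thread_id, message, send):
    """Try the stored thread first; fall back to a fresh thread on failure.
    `send` is any callable(message, thread_id) that performs the API call."""
    if thread_id:
        try:
            return send(message, thread_id)
        except Exception:
            pass  # thread may have expired; fall through to a fresh thread
    return send(message, None)
```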
Optimize for Context Length
While threads maintain context automatically, be mindful of context limits. For very long conversations, consider summarizing early parts of the conversation.
Threads automatically manage context, but extremely long conversations may hit context limits. The model will intelligently summarize or truncate older context as needed.
Build a research assistant that maintains context across multiple queries:
from openai import OpenAI

class ResearchAssistant:
    def __init__(self, api_key):
        self.client = OpenAI(
            api_key=api_key,
            base_url="https://api.perplexity.ai"
        )
        self.current_thread = None

    def start_research(self, topic):
        """Start a new research session"""
        response = self.client.chat.completions.create(
            model="sonar-pro",
            use_threads=True,
            messages=[
                {
                    "role": "user",
                    "content": f"I'm researching {topic}. Can you provide an overview of the current state and recent developments?"
                }
            ],
            web_search_options={"search_type": "pro"}
        )
        self.current_thread = response.thread_id
        return response.choices[0].message.content

    def ask_followup(self, question):
        """Ask a follow-up question in the current research context"""
        if not self.current_thread:
            raise ValueError("No active research session. Call start_research() first.")
        response = self.client.chat.completions.create(
            model="sonar-pro",
            use_threads=True,
            thread_id=self.current_thread,
            messages=[{"role": "user", "content": question}],
            web_search_options={"search_type": "pro"}
        )
        return response.choices[0].message.content

    def deep_dive(self, aspect):
        """Deep dive into a specific aspect of the research topic"""
        return self.ask_followup(
            f"Can you provide more detailed information about {aspect}? "
            f"Include recent studies, key players, and future outlook."
        )

# Example usage
assistant = ResearchAssistant("YOUR_API_KEY")

# Start research
overview = assistant.start_research("quantum computing applications in cryptography")
print("Research Overview:", overview)

# Ask follow-up questions
companies = assistant.ask_followup("Which companies are leading in this area?")
print("Leading Companies:", companies)

challenges = assistant.ask_followup("What are the main technical challenges?")
print("Challenges:", challenges)

# Deep dive into specific aspects
timeline = assistant.deep_dive("implementation timeline and milestones")
print("Timeline Analysis:", timeline)
Create an educational assistant that builds on previous explanations:
from openai import OpenAI

class EducationalAssistant:
    def __init__(self, api_key):
        self.client = OpenAI(
            api_key=api_key,
            base_url="https://api.perplexity.ai"
        )
        self.learning_threads = {}

    def start_lesson(self, student_id, subject):
        """Start a new learning session for a subject"""
        response = self.client.chat.completions.create(
            model="sonar-pro",
            use_threads=True,
            messages=[
                {
                    "role": "user",
                    "content": f"I want to learn about {subject}. Please start with the fundamentals and explain concepts clearly for a beginner."
                }
            ],
            web_search_options={"search_type": "pro"}
        )
        self.learning_threads[student_id] = response.thread_id
        return response.choices[0].message.content

    def ask_question(self, student_id, question):
        """Ask a question in the context of the ongoing lesson"""
        thread_id = self.learning_threads.get(student_id)
        if not thread_id:
            raise ValueError("No active lesson for this student.")
        response = self.client.chat.completions.create(
            model="sonar-pro",
            use_threads=True,
            thread_id=thread_id,
            messages=[{"role": "user", "content": question}],
            web_search_options={"search_type": "pro"}
        )
        return response.choices[0].message.content

    def request_examples(self, student_id, concept):
        """Request examples for a specific concept"""
        return self.ask_question(
            student_id,
            f"Can you provide practical examples of {concept} that we've been discussing? "
            f"Please relate them to what we've already covered."
        )

    def check_understanding(self, student_id):
        """Ask the AI to check student's understanding based on the conversation"""
        return self.ask_question(
            student_id,
            "Based on our conversation so far, can you create a few questions to test my understanding? "
            "Please make them progressively more challenging."
        )

# Example usage
teacher = EducationalAssistant("YOUR_API_KEY")

# Start learning about machine learning
lesson_start = teacher.start_lesson("student_123", "machine learning basics")
print("Lesson Introduction:", lesson_start)

# Ask clarifying questions
clarification = teacher.ask_question("student_123", "What's the difference between supervised and unsupervised learning?")
print("Clarification:", clarification)

# Request examples
examples = teacher.request_examples("student_123", "supervised learning algorithms")
print("Examples:", examples)

# Check understanding
quiz = teacher.check_understanding("student_123")
print("Understanding Check:", quiz)
Problem: Getting a “thread not found” error when using a previously valid thread ID.
Solution: Threads may expire after extended periods of inactivity. Implement fallback logic to start a new thread.
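One way to recover is sketched below, pairing the per-user session store from earlier with a retry on a fresh thread. The `send` callable and the caught exception type are placeholders for your actual client and its error class:

```python
user_threads = {}  # user_id -> thread_id, as in the session-store example

def ask_with_fallback(user_id, message, send):
    """On a failed threaded call (e.g. an expired thread), drop the stale ID,
    retry on a fresh thread, and store the thread_id from the response.
    `send` is any callable(message, thread_id) returning a response dict."""
    thread_id = user_threads.get(user_id)
    try:
        response = send(message, thread_id)
    except Exception:  # e.g. a "thread not found" API error
        user_threads.pop(user_id, None)
        response = send(message, None)
    user_threads[user_id] = response["thread_id"]
    return response
```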
Problem: Threading parameters are ignored or cause errors with other models.
Solution: Threading is exclusively available for sonar-pro. Ensure you’re using the correct model.
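A cheap client-side guard can catch this before the request is sent. The helper below is a sketch (not part of any SDK) that rejects payloads combining threading parameters with an unsupported model:

```python
THREADING_MODELS = {"sonar-pro"}  # per the docs, only sonar-pro supports threads

def check_threading_params(payload):
    """Raise before sending if threading parameters are combined with a
    model that does not support them."""
    uses_threading = payload.get("use_threads") or "thread_id" in payload
    if uses_threading and payload.get("model") not in THREADING_MODELS:
        raise ValueError(
            f"Threading requires sonar-pro; got model={payload.get('model')!r}"
        )
    return payload
```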
Problem: Very long conversations may hit context limits.
Solution: The model automatically manages context, but you can implement conversation summarization for extremely long threads:
def summarize_conversation(thread_id):
    """Request a summary of the conversation so far"""
    # `client` is an OpenAI client configured with
    # base_url="https://api.perplexity.ai", as in the examples above
    response = client.chat.completions.create(
        model="sonar-pro",
        use_threads=True,
        thread_id=thread_id,
        messages=[
            {
                "role": "user",
                "content": "Can you provide a concise summary of our conversation so far, highlighting the key points and conclusions?"
            }
        ]
    )
    return response.choices[0].message.content
Always validate thread IDs before using them in production applications. Invalid or expired thread IDs will cause API errors that should be handled gracefully.
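A minimal format check, assuming thread IDs follow the UUID shape seen in the responses above (this only catches malformed IDs; expired-but-well-formed IDs still need the runtime error handling described earlier):

```python
import uuid

def is_valid_thread_id(thread_id):
    """Return True if thread_id at least parses as a UUID, so obviously
    corrupt stored values never reach the API."""
    try:
        uuid.UUID(str(thread_id))
        return True
    except ValueError:
        return False
```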