The Sonar API provides powerful features for building production-ready applications. This guide covers two core capabilities: streaming responses for real-time output and structured outputs for consistent data formats. For prompting guidance, see the Prompt Guide.
Streaming allows you to receive partial responses from the Sonar API as they are generated, rather than waiting for the complete response. This is particularly useful for real-time user experiences, long responses, and interactive applications.
```python
from perplexity import Perplexity

client = Perplexity()

# Create streaming completion
stream = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "What is the latest in AI research?"}],
    stream=True,
)

# Process streaming response
content = ""
for chunk in stream:
    if chunk.choices[0].delta.content:
        content_piece = chunk.choices[0].delta.content
        content += content_piece
        print(content_piece, end="", flush=True)

    # Collect metadata from final chunks
    if hasattr(chunk, 'search_results') and chunk.search_results:
        search_results = chunk.search_results
    if hasattr(chunk, 'usage') and chunk.usage:
        usage_info = chunk.usage
```
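If the same loop is needed in several places, it can be wrapped in a small helper. The following is a minimal sketch, not part of the SDK: `collect_stream` and `_fake_chunk` are hypothetical names, and the stand-in chunks only mimic the attributes used in the example above (`choices[0].delta.content`, `search_results`, `usage`) so the helper can be tried without a network call.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Accumulate text and the last-seen metadata from an iterable of chunks."""
    parts = []
    search_results = None
    usage = None
    for chunk in chunks:
        # Guard against chunks with no choices or an empty delta.
        choices = getattr(chunk, "choices", None) or []
        if choices:
            piece = getattr(choices[0].delta, "content", None)
            if piece:
                parts.append(piece)
        # Keep the latest non-empty metadata value seen on any chunk.
        search_results = getattr(chunk, "search_results", None) or search_results
        usage = getattr(chunk, "usage", None) or usage
    return "".join(parts), search_results, usage


# Stand-in chunks (hypothetical data) so the helper can be exercised offline.
def _fake_chunk(text=None, **meta):
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)], **meta)


demo_text, demo_results, demo_usage = collect_stream([
    _fake_chunk("Hello, "),
    _fake_chunk("world"),
    _fake_chunk(None, search_results=[{"url": "https://example.com"}], usage={"total_tokens": 7}),
])
```

With a real client, the `stream` object from the example above would be passed in place of the list of stand-in chunks. Note that this variant returns the text at the end rather than printing each piece as it arrives.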
Structured outputs let you enforce a specific response format from Perplexity’s models, so that responses arrive as consistent, machine-readable data your application can consume without manual parsing.

We support JSON Schema structured outputs. To enable them, add a response_format field to your request that specifies your JSON Schema.
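As a rough illustration of attaching a schema to a request, the sketch below builds a request body in plain Python. The envelope shape `{"type": "json_schema", "json_schema": {"schema": ...}}` is an assumption about the expected structure and should be checked against the API reference; `build_response_format` and `answer_schema` are hypothetical names used only for this example.

```python
import json


def build_response_format(schema):
    """Wrap a JSON Schema dict in the (assumed) response_format envelope."""
    return {"type": "json_schema", "json_schema": {"schema": schema}}


# Hypothetical schema used only for illustration.
answer_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "year": {"type": "integer"},
    },
    "required": ["title", "year"],
}

request_body = {
    "model": "sonar",
    "messages": [{"role": "user", "content": "Name one landmark AI paper."}],
    "response_format": build_response_format(answer_schema),
}

# The body must be JSON-serializable before it can be sent.
encoded = json.dumps(request_body)
```

Keeping the schema as an ordinary dict makes it easy to reuse the same definition when validating the parsed response on the client side.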
Improve Schema Compliance: Give the LLM hints about the output format in your prompts to improve adherence to the structured format. Include phrases like “Please return the data as a JSON object with the following structure…”
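One way to apply this tip is to append the hint and the schema to the user prompt programmatically. This is a minimal sketch; `with_format_hint` is a hypothetical helper and the schema is illustrative only.

```python
import json


def with_format_hint(prompt, schema):
    """Append a plain-language format hint plus the schema to a user prompt."""
    hint = "Please return the data as a JSON object with the following structure:"
    return f"{prompt}\n\n{hint}\n{json.dumps(schema, indent=2)}"


# Hypothetical schema for illustration only.
schema = {
    "type": "object",
    "properties": {"summary": {"type": "string"}},
    "required": ["summary"],
}

messages = [
    {"role": "user", "content": with_format_hint("Summarize today's AI news.", schema)}
]
```

The hint complements the response_format field rather than replacing it: the schema constrains the output, and the prompt text tells the model what shape to aim for.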
The first request with a new JSON Schema may incur a delay on the first token (typically 10-30 seconds) as the schema is prepared. Subsequent requests will not see this delay.
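A client can allow for this one-time delay by using a longer timeout the first time it sends a given schema. The sketch below is one possible approach under stated assumptions: `schema_key` and `timeout_for` are hypothetical helpers, the timeout values are placeholders, and tracking is local to the current process, so it does not know whether the schema was already prepared by an earlier run.

```python
import hashlib
import json

_seen_schemas = set()


def schema_key(schema):
    """Stable fingerprint: equal schema content gives the same key regardless of key order."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def timeout_for(schema, first_use=60.0, later=30.0):
    """Return a longer timeout (seconds) the first time a schema is seen in this process."""
    key = schema_key(schema)
    if key in _seen_schemas:
        return later
    # Marked as seen when first requested, not when the request succeeds.
    _seen_schemas.add(key)
    return first_use
```

How the returned value is passed to the HTTP or SDK client depends on that client's own timeout option.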
Links in JSON Responses: Requesting links as part of a JSON response may not always work reliably. Use the links returned in the citations or search_results fields from the API response instead.
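To follow this advice, links can be read from the response metadata rather than from the model's JSON. The sketch below assumes the response has been converted to a dict, that `citations` is a list of URL strings, and that each `search_results` entry is a dict with a `url` key; these shapes are assumptions, and `extract_links` is a hypothetical helper.

```python
def extract_links(response):
    """Collect unique URLs from `citations` and `search_results`, preserving order."""
    urls = []
    for url in response.get("citations") or []:
        if url not in urls:
            urls.append(url)
    for item in response.get("search_results") or []:
        url = item.get("url")
        if url and url not in urls:
            urls.append(url)
    return urls


# Hypothetical response payload, shaped per the assumptions above.
sample = {
    "citations": ["https://example.com/a"],
    "search_results": [
        {"title": "A", "url": "https://example.com/a"},
        {"title": "B", "url": "https://example.com/b"},
    ],
}
links = extract_links(sample)
```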