Overview

Sonar models can analyze PDF documents that you reference by URL in a request. You can ask questions about the PDF content, get summaries, extract information, and run detailed analysis of the document.

PDF files must be accessible via a public URL.
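
Before sending a request, it can help to confirm that the URL really serves a PDF and not an HTML landing page. The sketch below assumes the server reports a Content-Type header; `is_pdf_content_type` and `check_pdf_url` are hypothetical helper names, not part of any Perplexity tooling.

```shell
# Return 0 if a Content-Type header value denotes a PDF, 1 otherwise.
is_pdf_content_type() {
    local ctype
    # Drop parameters such as "; charset=binary", strip CR and spaces, lowercase.
    ctype=$(printf '%s' "$1" | tr -d '\r' | cut -d';' -f1 | tr -d ' ' | tr '[:upper:]' '[:lower:]')
    [ "$ctype" = "application/pdf" ]
}

# Request only the headers of a URL and test the reported Content-Type.
check_pdf_url() {
    local url=$1 ctype
    # -s silent, -I HEAD request, -L follow redirects, -w print the final Content-Type.
    ctype=$(curl -sIL -o /dev/null -w '%{content_type}' "$url") || return 1
    is_pdf_content_type "$ctype"
}

# Usage (performs a network request):
#   check_pdf_url "https://example.com/document.pdf" && echo "Looks like a PDF"
```

Some servers answer HEAD requests differently from GET requests, so a failed check is a hint to inspect the URL manually, not proof that the API will reject it.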

Supported Features

  • Document Summarization: Get concise summaries of PDF content
  • Question Answering: Ask specific questions about the document
  • Content Extraction: Extract key information, data, and insights
  • Multi-language Support: Analyze PDFs in various languages
  • Large Document Handling: Process lengthy documents efficiently

Basic Usage

Simple PDF Analysis

curl -X POST "https://api.perplexity.ai/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "content": [
          {
            "type": "text",
            "text": "Summarize this document"
          },
          {
            "type": "file_url",
            "file_url": {
              "url": "https://example.com/document.pdf"
            }
          }
        ],
        "role": "user"
      }
    ],
    "model": "sonar-pro"
  }'

PDF Analysis with Web Search

curl -X POST "https://api.perplexity.ai/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "content": [
          {
            "type": "text",
            "text": "What are the key findings in this research paper? Provide additional context from recent studies."
          },
          {
            "type": "file_url",
            "file_url": {
              "url": "https://example.com/research-paper.pdf"
            }
          }
        ],
        "role": "user"
      }
    ],
    "model": "sonar-pro",
    "web_search_options": {"search_type": "pro"}
  }'

File Requirements

Format Support

  • PDF files (.pdf extension)
  • Text-based PDFs (not scanned images)
  • Password-protected PDFs (if publicly accessible)

Size Limits

  • Recommended: Under 50MB
  • Maximum processing time: 60 seconds
  • Large files may take longer to analyze
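
The recommended limit can be checked before a request is sent. This is a minimal sketch that assumes "50MB" means 50 × 1024 × 1024 bytes and that the server reports a Content-Length header; both helper names are hypothetical.

```shell
# 50 MB expressed in bytes (50 * 1024 * 1024). Assumption: binary megabytes.
MAX_PDF_BYTES=52428800

# Return 0 when a byte count is within the recommended limit, 1 otherwise.
within_size_limit() {
    local bytes=$1
    # Reject empty or non-numeric input, e.g. when no Content-Length was sent.
    case $bytes in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$bytes" -le "$MAX_PDF_BYTES" ]
}

# Print the size a server reports for a URL (empty if it reports none).
# With redirects, the Content-Length of the final response wins.
remote_size() {
    curl -sIL "$1" | tr -d '\r' |
        awk 'tolower($1) == "content-length:" { size = $2 } END { print size }'
}

# Usage (performs a network request):
#   within_size_limit "$(remote_size "https://example.com/document.pdf")" || echo "Too large or unknown size"
```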

Common Use Cases

Academic Research

question = "What methodology was used in this study and what were the main conclusions?"

Legal Documents

question = "Extract the key terms and conditions from this contract"

Financial Reports

question = "What are the revenue trends and key financial metrics mentioned?"

Technical Documentation

question = "Explain the implementation details and provide a step-by-step guide"
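
In a shell script, the example questions above can be kept in one place and selected by document category. This is an illustrative sketch; `question_for` and its category keys are hypothetical names chosen for this example.

```shell
# Print the example question for a document category.
# Returns 1 and prints nothing for an unknown category.
question_for() {
    case $1 in
        academic)  echo "What methodology was used in this study and what were the main conclusions?" ;;
        legal)     echo "Extract the key terms and conditions from this contract" ;;
        financial) echo "What are the revenue trends and key financial metrics mentioned?" ;;
        technical) echo "Explain the implementation details and provide a step-by-step guide" ;;
        *)         return 1 ;;
    esac
}

# Usage:
#   question=$(question_for financial) || echo "Unknown category"
```

The selected question then goes into the "text" entry of the request's content array, next to the "file_url" entry.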

Best Practices

Error Handling

Common Issues

Error              | Cause                  | Solution
Invalid URL        | URL not accessible     | Verify URL returns PDF directly
File too large     | PDF exceeds size limits | Compress or split the document
Processing timeout | Document too complex   | Simplify question or use smaller sections

Example Error Handling

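# Basic error handling with curl
# -s silences progress output; -w appends the HTTP status code on its own line.
response=$(curl -s -w "\n%{http_code}" -X POST "https://api.perplexity.ai/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"content": [
      {"type": "text", "text": "Analyze this PDF"},
      {"type": "file_url", "file_url": {"url": "https://example.com/document.pdf"}}
    ], "role": "user"}],
    "model": "sonar-pro"
  }')

# The last line is the status code; everything before it is the response body.
# sed '$d' drops the last line and, unlike head -n -1, also works with BSD/macOS tools.
http_code=$(printf '%s\n' "$response" | tail -n1)
content=$(printf '%s\n' "$response" | sed '$d')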

if [ "$http_code" -eq 200 ]; then
    echo "Success: $content"
else
    echo "Error $http_code: $content"
fi
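
Transient failures such as timeouts or rate limiting are often worth retrying, while client errors such as a bad URL are not. The sketch below uses general HTTP conventions to separate the two; which status codes the API actually returns for each issue is an assumption, and `classify_status`, `with_retries`, and `RETRY_DELAY` are hypothetical names.

```shell
# Classify an HTTP status code as "ok", "retry", or "fail".
# curl reports 000 when no HTTP response was received at all.
classify_status() {
    case $1 in
        2??)             echo ok ;;
        000|408|429|5??) echo retry ;;
        *)               echo fail ;;
    esac
}

# Run a command that prints an HTTP status code, retrying with exponential backoff.
# Usage: with_retries MAX_ATTEMPTS COMMAND [ARGS...]
with_retries() {
    local max=$1 attempt=1 delay=${RETRY_DELAY:-1} code
    shift
    while :; do
        code=$("$@")
        case $(classify_status "$code") in
            ok)   return 0 ;;
            fail) return 1 ;;
        esac
        # Give up once the attempt budget is spent.
        [ "$attempt" -ge "$max" ] && return 1
        sleep "$delay"
        attempt=$((attempt + 1))
        delay=$((delay * 2))
    done
}

# Usage (send_request is a function you define; it should print only the status code,
# for example via: curl -s -o response.json -w '%{http_code}' ...):
#   with_retries 3 send_request || echo "Request failed"
```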

Integration Examples

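# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Newlines and other control characters are not handled.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Function to analyze PDF (as a shell function)
analyze_pdf() {
    local pdf_url question
    # Escape both inputs; a raw double quote would otherwise break the JSON body.
    pdf_url=$(json_escape "$1")
    question=$(json_escape "$2")

    curl -X POST "https://api.perplexity.ai/chat/completions" \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d "{
        \"messages\": [{
          \"content\": [
            {\"type\": \"text\", \"text\": \"$question\"},
            {\"type\": \"file_url\", \"file_url\": {\"url\": \"$pdf_url\"}}
          ],
          \"role\": \"user\"
        }],
        \"model\": \"sonar-pro\"
      }"
}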

# Usage
analyze_pdf "https://example.com/report.pdf" "What are the main recommendations?"

Pricing

PDF analysis follows standard Sonar pricing based on:

  • Input tokens (document content + question)
  • Output tokens (AI response)
  • Web search usage (if enabled)

Large PDFs consume more input tokens. Consider the document size when estimating costs.
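
For a rough estimate, the input token count can be approximated from the amount of text in the document. The sketch below assumes about 4 characters per token, a common heuristic for English text and not the API's actual tokenizer, and takes the price per million input tokens as a parameter so that no real price is hard-coded. Both function names are hypothetical.

```shell
# Rough token estimate: about 4 characters per token, rounded up.
# This is a heuristic, not the tokenizer the API uses.
estimate_tokens() {
    local chars=$1
    echo $(( (chars + 3) / 4 ))
}

# Estimated input cost in USD.
# $1: number of characters of document text plus question
# $2: price per million input tokens (look this up on the current pricing page)
estimate_input_cost() {
    local tokens
    tokens=$(estimate_tokens "$1")
    awk -v t="$tokens" -v p="$2" 'BEGIN { printf "%.4f\n", t * p / 1000000 }'
}

# Usage, with a placeholder price of 3 USD per million input tokens:
#   estimate_input_cost 400000 3
```

If the PDF is available locally, a tool such as pdftotext can supply the character count, for example with `pdftotext report.pdf - | wc -c`.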

Next Steps