The Media Classifier detects when a query would benefit from visual content and adds relevant images or videos to the response. It must be explicitly enabled; once on, it analyzes each question and attaches media only when the query is inherently visual or would be better answered with media.

Overview

When enabled, the Media Classifier transforms text-only interactions into rich, multimedia experiences by:
  • Intelligently detecting queries that benefit from visual content
  • Including relevant images for visual concepts, objects, and places
  • Adding videos for processes, demonstrations, and dynamic content
  • Enhancing educational and informational responses with media
The Media Classifier is available only with the sonar-pro model and must be explicitly enabled with the media_response.enable_media_classifier parameter.

How It Works

The Media Classifier applies intent detection to each query to decide whether visual content would add value, then selects media based on the query's context and intent.

Intelligent Classification

  • Intent-based detection: Analyzes query content to identify visual needs
  • Context-aware media selection: Automatically chooses images, videos, or both based on query type
  • Configurable control: Enable/disable and override media types as needed
  • Seamless integration: Media appears as structured arrays in the response
  • Quality filtering: Only includes high-quality, relevant visual content

Types of Queries That Trigger Media

The classifier identifies several categories of queries that benefit from visual content:
  • Visual Concepts
  • Geographic Locations
  • Processes & Demonstrations
  • Educational Content
Example queries that trigger media:
  • “What does a quasar look like?”
  • “Show me examples of Art Deco architecture”
  • “What are the different types of cloud formations?”
  • “How do you identify poisonous mushrooms?”

Request Format

To use the Media Classifier, you need to explicitly enable it using the media_response parameter:
curl -X POST "https://api.perplexity.ai/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar-pro",
    "web_search_options": {
      "search_type": "pro"
    },
    "media_response": {
      "enable_media_classifier": true
    },
    "messages": [
      {
        "role": "user", 
        "content": "Show me photos of the Golden Gate Bridge."
      }
    ],
    "stream": false,
    "max_tokens": 10000
  }'
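
The same request can be sent from JavaScript. The sketch below mirrors the curl call using the standard fetch API. buildChatRequest and askWithMedia are hypothetical helper names rather than part of any SDK, and the API key is assumed to come from your own configuration.

```javascript
// Build the URL and fetch() options for a Media Classifier request.
// buildChatRequest is an illustrative helper, not part of an official SDK.
function buildChatRequest(apiKey, question) {
  const body = {
    model: "sonar-pro", // the classifier is only available with sonar-pro
    web_search_options: { search_type: "pro" },
    media_response: { enable_media_classifier: true },
    messages: [{ role: "user", content: question }],
    stream: false,
  };
  return {
    url: "https://api.perplexity.ai/chat/completions",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Send the request and return the parsed JSON response.
async function askWithMedia(apiKey, question) {
  const { url, init } = buildChatRequest(apiKey, question);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}
```

Keeping the payload construction separate from the network call makes it possible to inspect or log the request body without sending it.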

Media Response Parameters

The media_response object controls the Media Classifier behavior:
  • enable_media_classifier (boolean, required): Set to true to enable the Media Classifier for this request.
  • overrides (object, optional): Overrides that control specific media types.

Advanced Configuration

You can override the automatic media selection using the overrides parameter:
curl -X POST "https://api.perplexity.ai/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar-pro",
    "web_search_options": {
      "search_type": "pro"
    },
    "media_response": {
      "enable_media_classifier": true,
      "overrides": {
        "return_videos": true,
        "return_images": true
      }
    },
    "messages": [
      {
        "role": "user", 
        "content": "How do you tie a bowline knot?"
      }
    ]
  }'
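
In application code, the media_response object can be assembled from user preferences. The following is a minimal sketch: buildMediaResponse is a hypothetical helper, and it assumes that setting an override to false suppresses that media type, which the examples here do not confirm since they only show overrides set to true.

```javascript
// Build a media_response object from optional per-type preferences.
// Assumption: an override set to false suppresses that media type; the
// examples in this guide only show overrides set to true.
function buildMediaResponse({ images, videos } = {}) {
  const overrides = {};
  if (typeof images === "boolean") overrides.return_images = images;
  if (typeof videos === "boolean") overrides.return_videos = videos;

  const mediaResponse = { enable_media_classifier: true };
  // Only send overrides when the caller asked for one, so the classifier
  // keeps choosing media types automatically otherwise.
  if (Object.keys(overrides).length > 0) mediaResponse.overrides = overrides;
  return mediaResponse;
}

console.log(JSON.stringify(buildMediaResponse({ videos: true })));
// prints {"enable_media_classifier":true,"overrides":{"return_videos":true}}
```

Omitting the overrides key entirely when nothing is requested leaves the automatic selection untouched.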

Response Format

When the Media Classifier is enabled, the response includes additional images and/or videos arrays alongside the standard chat completion response. The classifier intelligently decides which media types to include based on your query.
Example response with an images array:
{
  "id": "b27600e5-2f83-420b-9d27-83b677ccc600",
  "model": "sonar",
  "created": 1760635673,
  "usage": {
    "prompt_tokens": 1,
    "completion_tokens": 230,
    "total_tokens": 231,
    "search_context_size": "low",
    "cost": {
      "input_tokens_cost": 0.0,
      "output_tokens_cost": 0.0,
      "request_cost": 0.005,
      "total_cost": 0.005
    }
  },
  "citations": [
    "https://en.wikipedia.org/wiki/Hello_(Adele_song)",
    "https://open.spotify.com/track/1Yk0cQdMLx5RzzFTYwmuld",
    "https://www.youtube.com/watch?v=mHONNcZbwDY",
    "https://www.hellomagazine.com/us/",
    "https://www.hello-products.com",
    "https://en.wikipedia.org/wiki/Hello",
    "https://supersimple.com/song/hello/"
  ],
  "search_results": [
    {
      "title": "Hello (Adele song) - Wikipedia",
      "url": "https://en.wikipedia.org/wiki/Hello_(Adele_song)",
      "date": "2015-10-22",
      "last_updated": "2025-10-16",
      "snippet": "",
      "source": "web"
    },
    {
      "title": "Hello - song and lyrics by Adele - Spotify",
      "url": "https://open.spotify.com/track/1Yk0cQdMLx5RzzFTYwmuld",
      "date": "2025-06-01",
      "last_updated": null,
      "snippet": "",
      "source": "web"
    },
    {
      "title": "HELLO! US Edition - Latest news and Photos",
      "url": "https://www.hellomagazine.com/us/",
      "date": "2025-10-16",
      "last_updated": "2025-10-16",
      "snippet": "",
      "source": "web"
    }
  ],
  "images": [
    {
      "image_url": "https://c8.alamy.com/comp/2CHK43R/hello!-magazine-front-cover-of-their-edition-dated-july-26-2010-2CHK43R.jpg",
      "origin_url": "https://www.alamy.com/stock-photo/hello-magazine-front-cover.html",
      "height": 1390,
      "width": 975,
      "title": "Hello magazine front cover hi-res stock photography and images - Alamy"
    },
    {
      "image_url": "https://58v76y8z87lo.hellomagazine.com/horizon/original_aspect_ratio/efd1d865f6f2-hello1911coverukpennyhires.jpg",
      "origin_url": "https://www.hellomagazine.com/magazine/",
      "height": 1024,
      "width": 785,
      "title": "Magazine | HELLO!"
    }
  ],
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "The term \"hello\" can refer to several different things depending on the context:\n\n1. **Greeting**: \"Hello\" is a common English greeting used to acknowledge someone's presence or to initiate a conversation. It has been in use since the early 19th century[6].\n\n2. **Songs**:\n   - **Adele's \"Hello\"**: Released in 2015, this song by Adele became a huge success, debuting at number one on the Billboard Hot 100 and setting several records for digital sales and streaming[1].\n   - **Lionel Richie's \"Hello\"**: A classic hit from 1984, known for its catchy melody and memorable lyrics[3].\n\n3. **Media and Products**:\n   - **HELLO Magazine**: A popular celebrity news magazine that covers entertainment and lifestyle news[4].\n   - **Hello Products**: A brand that aims to make everyday moments more enjoyable with a touch of magic[5].\n\n4. **Computing**: In programming, \"Hello, World!\" is a traditional first program written by beginners to test their setup and learn basic syntax[6]."
      },
      "delta": {
        "role": "assistant",
        "content": ""
      }
    }
  ]
}

Response Fields

When media is included, the response contains additional arrays with structured media data:

Images Array

  • images (array): Image objects, present when visual content is included. In the sample response each object carries image_url, origin_url, height, width, and title.

Videos Array

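  • videos (array): Video objects, present when video content is included. The React handler in this guide reads url, thumbnail_url, thumbnail_width, and thumbnail_height from each object.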
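
Because either array may be absent, it helps to normalize the response before using it. The sketch below is one way to do that; extractMedia is a hypothetical helper, and the field names are taken from the sample response and the React handler in this guide rather than from a formal schema.

```javascript
// Pull text and media out of a chat completion response.
// Field names follow the sample response and React handler in this guide;
// entries without a usable URL are dropped defensively.
function extractMedia(response) {
  const images = Array.isArray(response?.images) ? response.images : [];
  const videos = Array.isArray(response?.videos) ? response.videos : [];
  return {
    text: response?.choices?.[0]?.message?.content ?? "",
    images: images.filter((img) => typeof img?.image_url === "string"),
    videos: videos.filter((vid) => typeof vid?.url === "string"),
  };
}
```

Callers can then iterate over images and videos without checking for missing keys on every response.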

Media Types and Sources

The Media Classifier draws from various high-quality sources to provide relevant visual content:

Image Sources

  • Educational databases: Scientific diagrams, historical photos, reference images
  • Geographic content: Maps, satellite imagery, landmark photos
  • Illustrative content: Concept visualizations, process diagrams
  • Cultural content: Art, architecture, cultural artifacts

Video Sources

  • Educational videos: How-to demonstrations, scientific processes
  • Documentary content: Nature footage, historical events
  • Instructional material: Step-by-step tutorials, skill demonstrations
The classifier prioritizes educational and informational content over entertainment or commercial media to ensure responses remain focused and valuable.

Best Practices

Crafting Media-Friendly Queries

To get the best results from the Media Classifier, consider these query patterns:
1. Be specific about visual elements
   Instead of “Tell me about butterflies,” try “What do monarch butterfly wings look like and how do they migrate?”

2. Ask about processes or demonstrations
   “How do you perform CPR?” or “What’s the proper technique for meditation?” are more likely to include helpful video content.

3. Include location or geographic context
   “What does the Aurora Borealis look like from Iceland?” provides geographic context that enhances media selection.

4. Request comparisons or examples
   “Show me different types of cloud formations” or “Compare Gothic and Romanesque architecture” encourage visual comparisons.

Example Effective Queries

What are the key identifying features of venomous snakes found in North America?

Integration Examples

Handling Media in Applications

When building applications that use the Media Classifier, handle the images and videos arrays from the response:
React Media Handler
function MessageRenderer({ response }) {
  // Default every field so a response without media (or choices) still renders
  const { choices = [], images = [], videos = [] } = response;
  const content = choices[0]?.message?.content || '';
  
  return (
    <div className="message-container">
      {/* Display the text content */}
      <div className="text-content">
        {content}
      </div>
      
      {/* Display images if present */}
      {images.length > 0 && (
        <div className="images-section">
          <h4>Related Images</h4>
          <div className="images-grid">
            {images.map((image, index) => (
              <div key={index} className="image-item">
                <img 
                  src={image.image_url} 
                  alt={image.title}
                  width={image.width}
                  height={image.height}
                  loading="lazy"
                />
                <div className="image-caption">
                  <p>{image.title}</p>
                  <a href={image.origin_url} target="_blank" rel="noopener">
                    Source
                  </a>
                </div>
              </div>
            ))}
          </div>
        </div>
      )}
      
      {/* Display videos if present */}
      {videos.length > 0 && (
        <div className="videos-section">
          <h4>Related Videos</h4>
          <div className="videos-grid">
            {videos.map((video, index) => (
              <div key={index} className="video-item">
                <a href={video.url} target="_blank" rel="noopener">
                  <img 
                    src={video.thumbnail_url}
                    alt="Video thumbnail"
                    width={video.thumbnail_width}
                    height={video.thumbnail_height}
                  />
                </a>
              </div>
            ))}
          </div>
        </div>
      )}
    </div>
  );
}

Media Content Policies

Always implement appropriate content filtering and user controls when displaying media content from API responses in production applications.
Consider implementing:
  • Content moderation: Filter inappropriate or sensitive media
  • Loading states: Handle media loading gracefully
  • Accessibility: Provide alt text and video descriptions
  • Performance: Implement lazy loading for images and videos
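
As a starting point for the content moderation item, the sketch below filters images before rendering. It is only an illustration under stated assumptions: filterImages and blockedHosts are hypothetical, application-defined names, and a production system would likely combine this with a dedicated moderation service.

```javascript
// Keep only images served over HTTPS whose hosts are not on a blocklist.
// blockedHosts is a hypothetical, application-defined list of hostnames.
function filterImages(images, blockedHosts = []) {
  const blocked = new Set(blockedHosts.map((host) => host.toLowerCase()));
  return images.filter((image) => {
    try {
      const media = new URL(image.image_url);
      const origin = new URL(image.origin_url);
      return (
        media.protocol === "https:" &&
        !blocked.has(media.hostname) &&
        !blocked.has(origin.hostname)
      );
    } catch {
      return false; // malformed or missing URL: do not render
    }
  });
}
```

Rejecting entries whose URLs fail to parse keeps malformed data out of the page rather than leaving the browser to handle it.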

Use Cases

Educational Applications

The Media Classifier is particularly valuable for educational platforms because it automatically includes relevant visual content when queries relate to learning topics.

Travel and Tourism

Travel applications that need rich visual content are another good fit. Example query:
What are the must-see attractions in Kyoto during cherry blossom season?

Limitations and Considerations

Current Limitations

  • Model dependency: Only available with the sonar-pro model
  • Manual activation: Must be explicitly enabled via enable_media_classifier parameter
  • Content availability: Media inclusion depends on query relevance and source availability
  • Format constraints: Media is returned as structured images and videos arrays in the response; rendering is left to your application

Content Considerations

The Media Classifier prioritizes educational, informational, and factual visual content. Entertainment or promotional media is generally not included.
  • Educational focus: Prioritizes learning and information over entertainment
  • Quality standards: Only includes high-quality, relevant media
  • Source reliability: Media comes from reputable educational and informational sources
  • Content safety: Automated filtering removes inappropriate or sensitive content

Performance Impact

Including media in responses may affect:
  • Response time: Slight increase due to media selection and processing
  • Response size: Larger payloads when media URLs are included
  • Bandwidth usage: Consider data usage when displaying returned media
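
If response size matters, an application can trim the media arrays before forwarding them to a client. This is a sketch under assumptions: capMedia and payloadBytes are hypothetical helpers, and the default limits are arbitrary values chosen for illustration.

```javascript
// Trim media arrays to a fixed budget before sending a response to a client.
// maxImages and maxVideos are hypothetical application-level limits.
function capMedia(response, { maxImages = 4, maxVideos = 2 } = {}) {
  return {
    ...response,
    images: (response.images ?? []).slice(0, maxImages),
    videos: (response.videos ?? []).slice(0, maxVideos),
  };
}

// Size in bytes of the response once serialized as UTF-8 JSON.
function payloadBytes(response) {
  return new TextEncoder().encode(JSON.stringify(response)).length;
}
```

Comparing payloadBytes before and after capMedia gives a rough measure of how much the media arrays contribute to the payload.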

Advanced Features

Context-Aware Media Selection

The classifier considers conversation context when selecting media. For example, in a conversation about marine biology, asking “What do coral reefs look like?” will prioritize educational coral reef images and videos of reef ecosystems based on the established context.
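
The sketch below shows one way to send such a follow-up. It assumes conversation context is conveyed through earlier turns in the messages array, as in the request examples; appendTurn and buildFollowUpRequest are hypothetical helpers, and the assistant turn is a placeholder for the text returned by the previous response.

```javascript
// Append a turn to a conversation without mutating the existing history.
function appendTurn(history, role, content) {
  return [...history, { role, content }];
}

// Build a request body whose messages include the earlier exchange, so the
// follow-up question is sent together with that context.
function buildFollowUpRequest(history, question) {
  return {
    model: "sonar-pro",
    media_response: { enable_media_classifier: true },
    messages: appendTurn(history, "user", question),
  };
}

let conversation = [];
conversation = appendTurn(conversation, "user", "How do marine biologists study reef ecosystems?");
conversation = appendTurn(conversation, "assistant", "(text of the previous answer)");
const followUp = buildFollowUpRequest(conversation, "What do coral reefs look like?");
```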

Multi-Modal Learning Enhancement

The Media Classifier works especially well for multi-modal learning scenarios:
  • Visual learners: Automatic inclusion of diagrams and illustrations
  • Kinesthetic learners: Process videos and step-by-step visual guides
  • Comprehensive understanding: Text combined with visual reinforcement
The Media Classifier is designed to enhance understanding through visual content. When enabled, it intelligently selects media when visual elements would significantly improve the educational or informational value of the response.