Overview
When enabled, the Media Classifier transforms text-only interactions into rich, multimedia experiences by:- Intelligently detecting queries that benefit from visual content
- Including relevant images for visual concepts, objects, and places
- Adding videos for processes, demonstrations, and dynamic content
- Enhancing educational and informational responses with media
Media Classifier is exclusively available with the
sonar-pro model and must be explicitly enabled using the media_response.enable_media_classifier parameter.How It Works
The Media Classifier uses advanced intent detection to analyze your queries and determine when visual content would be valuable. When enabled, the system intelligently includes media based on query context and intent.Intelligent Classification
- Intent-based detection: Analyzes query content to identify visual needs
- Smart media selection: Automatically decides between images, videos, or both based on query type
- Context-aware selection: Chooses appropriate media types (images vs videos)
- Configurable control: Enable/disable and override media types as needed
- Seamless integration: Media appears as structured arrays in the response
- Quality filtering: Only includes high-quality, relevant visual content
Types of Queries That Trigger Media
The classifier identifies several categories of queries that benefit from visual content:- Visual Concepts
- Geographic Locations
- Processes & Demonstrations
- Educational Content
- “What does a quasar look like?”
- “Show me examples of Art Deco architecture”
- “What are the different types of cloud formations?”
- “How do you identify poisonous mushrooms?”
Request Format
To use the Media Classifier, you need to explicitly enable it using themedia_response parameter:
Media Response Parameters
Themedia_response object controls the Media Classifier behavior:
Set to
true to enable the Media Classifier for this requestOptional overrides to control specific media types
Advanced Configuration
You can override the automatic media selection using theoverrides parameter:
Response Format
When the Media Classifier is enabled, the response includes additionalimages and/or videos arrays alongside the standard chat completion response. The classifier intelligently decides which media types to include based on your query.
- Image Response
- Video Response
Response Fields
When media is included, the response contains additional arrays with structured media data:Images Array
Array of image objects when visual content is included
Videos Array
Array of video objects when video content is included
Media Types and Sources
The Media Classifier draws from various high-quality sources to provide relevant visual content:Image Sources
- Educational databases: Scientific diagrams, historical photos, reference images
- Geographic content: Maps, satellite imagery, landmark photos
- Illustrative content: Concept visualizations, process diagrams
- Cultural content: Art, architecture, cultural artifacts
Video Sources
- Educational videos: How-to demonstrations, scientific processes
- Documentary content: Nature footage, historical events
- Instructional material: Step-by-step tutorials, skill demonstrations
Best Practices
Crafting Media-Friendly Queries
To get the best results from the Media Classifier, consider these query patterns:1
Be specific about visual elements
Instead of “Tell me about butterflies,” try “What do monarch butterfly wings look like and how do they migrate?”
2
Ask about processes or demonstrations
“How do you perform CPR?” or “What’s the proper technique for meditation?” are more likely to include helpful video content.
3
Include location or geographic context
“What does the Aurora Borealis look like from Iceland?” provides geographic context that enhances media selection.
4
Request comparisons or examples
“Show me different types of cloud formations” or “Compare Gothic and Romanesque architecture” encourage visual comparisons.
Example Effective Queries
Integration Examples
Handling Media in Applications
When building applications that use the Media Classifier, handle theimages and videos arrays from the response:
React Media Handler
Media Content Policies
Consider implementing:- Content moderation: Filter inappropriate or sensitive media
- Loading states: Handle media loading gracefully
- Accessibility: Provide alt text and video descriptions
- Performance: Implement lazy loading for images and videos
Use Cases
Educational Applications
The Media Classifier is particularly valuable for educational platforms by automatically including relevant visual content when queries relate to learning topics.Travel and Tourism
Perfect for travel applications that need rich visual content:Limitations and Considerations
Current Limitations
- Model dependency: Only available with
sonar-promodel - Manual activation: Must be explicitly enabled via
enable_media_classifierparameter - Content availability: Media inclusion depends on query relevance and source availability
- Format constraints: Media appears in specific markdown/HTML formats
Content Considerations
The Media Classifier prioritizes educational, informational, and factual visual content. Entertainment or promotional media is generally not included.
- Educational focus: Prioritizes learning and information over entertainment
- Quality standards: Only includes high-quality, relevant media
- Source reliability: Media comes from reputable educational and informational sources
- Content safety: Automated filtering removes inappropriate or sensitive content
Performance Impact
Including media in responses may affect:- Response time: Slight increase due to media selection and processing
- Response size: Larger payloads when media URLs are included
- Bandwidth usage: Consider data usage when displaying returned media
Advanced Features
Context-Aware Media Selection
The classifier considers conversation context when selecting media. For example, in a conversation about marine biology, asking “What do coral reefs look like?” will prioritize educational coral reef images and videos of reef ecosystems based on the established context.Multi-Modal Learning Enhancement
The Media Classifier works especially well for multi-modal learning scenarios:- Visual learners: Automatic inclusion of diagrams and illustrations
- Kinesthetic learners: Process videos and step-by-step visual guides
- Comprehensive understanding: Text combined with visual reinforcement
The Media Classifier is designed to enhance understanding through visual content. When enabled, it intelligently selects media when visual elements would significantly improve the educational or informational value of the response.