
Visual and audio clarity

Last updated: Aug-26-2025

Making content distinguishable means ensuring that users can perceive and differentiate important information regardless of their visual or auditory abilities. This includes providing sufficient color contrast, not relying solely on color to convey information, controlling audio levels, and allowing customization of visual and audio elements.

Users with color blindness, low vision, hearing impairments, or various sensitivities need content that can be perceived clearly in different ways. This section covers Cloudinary's tools for creating high-contrast visuals, assisting color blind users, managing audio levels, customizing text presentation, and adapting content for different viewing modes and environments.

Visual and audio clarity considerations

Each consideration below is listed with its related Cloudinary image and video techniques and its WCAG reference.

Consider users who can't distinguish colors. If you're using color to convey important information, think about adding patterns, shapes, or text labels so everyone can understand the message.
  • Techniques: 🔧 Assist people with color blind conditions
  • WCAG reference: 1.4.1 Use of color

Think about users who may be startled or distracted by unexpected audio. If your content plays sound automatically, consider giving users controls to pause, stop, or adjust the volume.
  • Techniques: 🔧 Adjust audio volume, 🔧 Cloudinary Video Player
  • WCAG reference: 1.4.2 Audio control

Consider users with visual impairments who may have difficulty reading text with poor contrast. They'll need sufficient color contrast between text and backgrounds to read your content comfortably.
  • Techniques: 🔧 Customizable caption styling, 🔧 Text overlays on images and videos, 🔧 Adjust contrast on images and videos, 🔧 Replacing colors for light/dark themes
  • WCAG reference: 1.4.3 Contrast (minimum), 1.4.6 Contrast (enhanced)

Consider whether users can resize, customize, or access your text content. Actual text is generally more flexible and accessible than text embedded in images.
  • Techniques: 🔧 Customize text overlays in images, 🔧 OCR text detection and extraction
  • WCAG reference: 1.4.5 Images of text

Think about users who have difficulty separating speech from background noise. They may need clear audio where the main content stands out from any background sounds.
  • Techniques: 🔧 Mixing audio tracks
  • WCAG reference: 1.4.7 Low or no background audio

Assist people with color blind conditions

People with color blindness may have difficulty distinguishing between certain colors, particularly red and green. Cloudinary provides tools to help make your media more accessible by using both color and pattern to convey information.

[Images: original; X-ray mode; striped overlays]

Simulate color blind conditions

You can experience how your images look to people with different color blind conditions. Apply the e_simulate_colorblind effect with parameters like deuteranopia, protanopia, tritanopia, or cone_monochromacy to preview your content (see all the options).

Color palette with different simulated colorblind conditions
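As a minimal sketch of previewing one image under each condition, the snippet below builds one delivery URL per simulation. The `demo` cloud name and `docs/piechart.png` asset come from the demo URL later on this page. The helper name `simulate_url` is hypothetical, and the colon syntax (`e_simulate_colorblind:<condition>`) is assumed to mirror the `e_assist_colorblind:20` form used in this section.

```python
# Delivery base for the "demo" cloud used throughout this page.
BASE_URL = "https://res.cloudinary.com/demo/image/upload"

# Conditions named in the text; the full list is in the Cloudinary reference.
CONDITIONS = ("deuteranopia", "protanopia", "tritanopia", "cone_monochromacy")


def simulate_url(public_id: str, condition: str) -> str:
    """Return a URL that previews the image under a simulated condition."""
    if condition not in CONDITIONS:
        raise ValueError(f"unsupported condition in this sketch: {condition}")
    return f"{BASE_URL}/e_simulate_colorblind:{condition}/{public_id}"


# One preview URL per condition, e.g. for a side-by-side review page.
previews = {c: simulate_url("docs/piechart.png", c) for c in CONDITIONS}
for condition, url in previews.items():
    print(condition, url)
```

Opening the URLs side by side gives a quick visual check before deciding whether an image needs one of the assist effects.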

Analyze color accessibility

For a more objective approach to assessing the accessibility of your images, use Accessibility analysis (currently available to paid accounts only).

  1. Upload your images with the accessibility_analysis parameter set to true:

  2. See the accessibility results in the response:

For more information see Accessibility analysis, and for an example of using the results, watch this video tutorial.

Apply stripes

Consider a chart that uses red and green colors to convey information. For someone with red-green color blindness, this information would be inaccessible.

[Images: pie chart, original; pie chart with simulated deuteranopia]


By adding patterns or symbols alongside colors, you can ensure the information is conveyed regardless of color perception.

[Images: pie chart with e_assist_colorblind:20; the same chart as seen by someone with deuteranopia, using e_assist_colorblind:20/e_simulate_colorblind]


To add the stripes, apply the assist_colorblind effect with a stripe strength from 1 to 100, e.g. e_assist_colorblind:20:

Apply color shifts

For an image where the problematic colors aren't isolated, it can be even harder to distinguish the content of the image.

[Images: flower and grasshopper, original; flower and grasshopper with simulated deuteranopia]


By shifting the colors, you can ensure the image is clear regardless of color perception.

[Images: flower and grasshopper with e_assist_colorblind:xray; the same image with simulated deuteranopia after using e_assist_colorblind:xray]


To shift the colors, apply the e_assist_colorblind effect with the xray value, e.g. e_assist_colorblind:xray:

Consider implementing a toggle that adds the assist colorblind effects to your images on demand.
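One way to back such a toggle is a small helper that returns the delivery URL for the selected mode. This is a sketch only: the function name `assist_url` and the mode names are hypothetical, while the `e_assist_colorblind:<strength>` and `e_assist_colorblind:xray` forms and the 1 to 100 stripe range come from the text above.

```python
BASE_URL = "https://res.cloudinary.com/demo/image/upload"


def assist_url(public_id: str, mode: str = "off", stripe_strength: int = 20) -> str:
    """Return the delivery URL for the chosen assist mode.

    mode is "off" (original image), "stripes", or "xray" (hypothetical names).
    """
    if mode == "off":
        return f"{BASE_URL}/{public_id}"
    if mode == "xray":
        return f"{BASE_URL}/e_assist_colorblind:xray/{public_id}"
    if mode == "stripes":
        # The text above gives a stripe strength range of 1 to 100.
        if not 1 <= stripe_strength <= 100:
            raise ValueError("stripe_strength must be between 1 and 100")
        return f"{BASE_URL}/e_assist_colorblind:{stripe_strength}/{public_id}"
    raise ValueError(f"unknown mode: {mode}")


# A toggle handler would swap the image source between these URLs.
print(assist_url("docs/piechart.png"))
print(assist_url("docs/piechart.png", mode="stripes"))
print(assist_url("docs/piechart.png", mode="xray"))
```

Because each mode is just a different URL, the toggle needs no image processing on the client; it only changes which URL the image element points to.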

Interactive color blind accessibility demo

Use the controls below to test different color blind assistance techniques and simulate various color blind conditions. This helps you understand which techniques work best for different types of color vision deficiency.

Demo image showing color blind accessibility techniques

Current Transformation URL:

https://res.cloudinary.com/demo/image/upload/w_400/bo_1px_solid_black/docs/piechart.png

Tips for Testing:

  • Pie Chart: Notice how stripes help distinguish sections that may look similar with color blindness
  • Red Flower: X-ray mode shifts colors to make the flower more visible against the green background
  • Compare: Try different combinations to see which techniques work best for each condition

Adjust audio volume

For people with hearing impairments or those in different listening environments, providing volume control options ensures your audio and video content is accessible. WCAG specifies that if audio plays automatically for more than 3 seconds, users must have a mechanism to pause or stop it, or to control its volume independently of the overall system volume.

With Cloudinary, you can implement this mechanism both programmatically and using the Cloudinary Video Player.

Programmatic volume adjustment

Programmatically adjust the volume directly in your media transformations using the volume effect (e_volume). This allows you to give control to your users via external controls (as shown in the demo).

For example, to reduce the volume to 50% (e_volume:-50):


To increase the volume by 150% (e_volume:150):


You can also mute audio completely by setting the volume to mute:
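The three settings described above can be sketched as delivery URLs. The helper `volume_url` is a hypothetical name; the `e_volume` values and the `docs/grocery-store.mp4` asset are the ones that appear on this page.

```python
from typing import Optional, Union

BASE_URL = "https://res.cloudinary.com/demo/video/upload"


def volume_url(public_id: str, volume: Optional[Union[int, str]] = None) -> str:
    """Return a delivery URL with an optional e_volume effect.

    volume is None (original level), a percentage change such as -50 or 150,
    or the string "mute".
    """
    if volume is None:
        return f"{BASE_URL}/{public_id}"
    if volume == "mute" or isinstance(volume, int):
        return f"{BASE_URL}/e_volume:{volume}/{public_id}"
    raise ValueError(f"unsupported volume value: {volume!r}")


video = "docs/grocery-store.mp4"
print(volume_url(video, -50))     # quieter
print(volume_url(video, 150))     # louder
print(volume_url(video, "mute"))  # no audio
```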

Note
You can also adjust volume programmatically using the HTMLMediaElement volume property:

Demo: External volume controls using transformations

For users with restricted movement or motor disabilities, you can create larger, more accessible volume controls outside the video player. These external controls use Cloudinary's volume transformations to deliver videos at different volume levels, making them easier to interact with than the built-in player controls. You can see the delivery URL change when you choose a different volume.

Volume: Normal (100%)
Current transformation URL:
https://res.cloudinary.com/demo/video/upload/docs/grocery-store.mp4

Video Player volume controls

The Cloudinary Video Player provides built-in volume controls that users can adjust according to their needs. The player includes both a volume button and a volume slider for precise control.

You can customize the volume controls and set default volume levels in your JavaScript:

Customizable caption styling

Captions and subtitles must meet specific contrast requirements to be accessible to people with visual impairments. The WCAG guidelines specify minimum contrast ratios between text and background colors to ensure readability.

Understanding contrast ratios

A contrast ratio measures the difference in brightness between text and its background, expressed as a ratio like 4.5:1 or 7:1. The higher the number, the more contrast there is.

WCAG Requirements:

  • Level AA (minimum): 4.5:1 contrast ratio for normal text
  • Level AAA (enhanced): 7:1 contrast ratio for normal text
  • Large text (18pt+ or 14pt+ bold): Lower ratios of 3:1 (AA) or 4.5:1 (AAA)

How to measure contrast ratios

You can measure contrast ratios using online tools such as WebAIM Contrast Checker.

How it works:

  1. Pick your colors: Select the text color and background color
  2. Get the ratio: The tool calculates the mathematical contrast ratio
  3. Check compliance: See if it meets WCAG AA (4.5:1) or AAA (7:1) standards

Example measurements:

  • Black text (#000000) on white background (#FFFFFF) = 21:1 (excellent)
  • White text (#FFFFFF) on blue background (#0066CC) = 5.57:1 (passes AA)
  • Light gray text (#CCCCCC) on white background (#FFFFFF) = 1.61:1 (fails - too low)
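To check colors in code rather than in an online tool, the ratio can be computed from the WCAG relative-luminance definition. The sketch below is not part of any Cloudinary API, and the function names are illustrative.

```python
def _linearize(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05)."""
    lum_a, lum_b = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(color_a: str, color_b: str, large_text: bool = False) -> bool:
    """Level AA: 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(color_a, color_b) >= (3.0 if large_text else 4.5)


print(round(contrast_ratio("#000000", "#FFFFFF"), 2))  # 21.0
print(round(contrast_ratio("#CCCCCC", "#FFFFFF"), 2))  # 1.61
print(meets_aa("#FFFFFF", "#0066CC"))                  # True
```

A check like this can run in a build step to flag caption or overlay color pairs that fall below the chosen level.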

Implementing accessible caption styling

The Cloudinary Video Player allows you to customize caption appearance to meet contrast requirements. The recommended approach is to use the built-in theme options which provide predefined backgrounds and styling.

The built-in themes are described in this table:

Theme | Description | Best for
default | None | High contrast videos only
videojs-default | High contrast theme with a dark background and white text | General accessibility
yellow-outlined | Yellow text with a dark outline for visibility | Videos with varied backgrounds
player-colors | Uses the video player's custom color scheme for the text and background | Brand consistency + accessibility
3d | Text with a 3D shadow effect | Stylistic preference

The example at the top of this section uses the videojs-default theme. Note that you can also override elements of the theme, for example, by setting the font size. Here's the Video Player configuration:


To set custom colors for the font and background you can use the player-colors theme. This theme uses the colors that you configure when customizing your Video Player.


Text overlays on images and videos

Before creating text overlays embedded in images or videos, consider whether the text could instead be placed in your HTML and visually positioned over the media using CSS. HTML text is inherently more accessible because it can be announced by screen readers, restyled by users, translated automatically, and scales with user preferences—all without requiring additional accessibility techniques.

When you do need embedded text overlays in images and videos, it's crucial to ensure sufficient contrast between the text and background for readability. People with visual impairments or those viewing content in bright environments need clear, high-contrast text. Adding background colors or effects to text overlays helps meet WCAG contrast requirements and improves accessibility for everyone.

Text overlays on images with background

Without proper contrast, text overlays can be difficult or impossible to read. Here's how to add accessible text overlays with background colors:

[Images: white text overlay without a background (poor contrast, hard to read); white text on a black background (high contrast, accessible)]


The accessible version places white text (co_white) on a semi-transparent black background (b_rgb:00000080) to achieve strong contrast:
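As a rough sketch of such a URL, the helper below follows the layer pattern of the text overlay demo URL later on this page (`l_text:…,co_…` followed by `fl_layer_apply,g_center`). The function name, the sample text, and the choice of `docs/groceryshop.jpg` as the base image are assumptions for illustration. The text is assumed to contain no commas or slashes, which need extra escaping.

```python
from urllib.parse import quote

BASE_URL = "https://res.cloudinary.com/demo/image/upload"


def text_overlay_url(
    public_id: str,
    text: str,
    font_spec: str = "Arial_50",
    color: str = "white",
    background: str = "rgb:00000080",
    gravity: str = "center",
) -> str:
    """Build a URL with a text layer that has its own background color."""
    encoded = quote(text, safe="")  # spaces become %20
    layer = f"l_text:{font_spec}:{encoded},co_{color},b_{background}"
    return f"{BASE_URL}/{layer}/fl_layer_apply,g_{gravity}/{public_id}"


print(text_overlay_url("docs/groceryshop.jpg", "Sample Text"))
```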

Text overlays on videos with background

Video text overlays face additional challenges as backgrounds change throughout the video. Consistent background colors ensure text remains readable regardless of the video content. This video uses white text (co_white) on a semi-transparent blue background (b_rgb:0000cc90) to create an overlay that remains visible throughout the video.

Adjust contrast on images and videos

Proper contrast, brightness, and saturation adjustments are essential for making images and videos accessible to people with visual impairments, low vision, or those viewing content in challenging lighting conditions. These adjustments can help ensure content remains visible and legible across different viewing environments and for users with varying visual needs.

Contrast adjustments for images

Contrast adjustments can dramatically improve the readability and accessibility of images. Here are examples showing how different contrast levels affect image visibility:

[Images: low contrast (-80); original (0); high contrast (+80)]


Use the contrast effect with a value between -100 and 100:

Interactive contrast, brightness, and saturation demo

In addition to contrast, you can also alter brightness and saturation to help improve image visibility.

Use the controls below to see how contrast, brightness, and saturation adjustments affect image accessibility in real-time. Notice how the transformation URL changes as you adjust the settings:

Demo image for contrast adjustments
Current transformation URL:
https://res.cloudinary.com/demo/image/upload/c_scale,w_500/f_auto/q_auto/docs/groceryshop.jpg

Video visual adjustments

Video content can also benefit from contrast, brightness, and saturation adjustments. These are especially important for users with visual impairments who may struggle with low-contrast video content.

This video uses enhanced contrast (e_contrast:50), increased brightness (e_brightness:10) and saturation (e_saturation:20) to improve visibility and accessibility.
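The same adjustments can be expressed as a chain of URL components, one effect per component. The sketch below uses a hypothetical helper, `adjust_url`, and uses `docs/grocery-store.mp4` from earlier on this page as a stand-in asset; only the contrast range (-100 to 100) is validated because that is the range stated above.

```python
def adjust_url(
    resource_type: str,
    public_id: str,
    contrast: int = 0,
    brightness: int = 0,
    saturation: int = 0,
) -> str:
    """Chain contrast, brightness, and saturation effects into one URL."""
    if not -100 <= contrast <= 100:
        raise ValueError("contrast must be between -100 and 100")
    effects = []
    if contrast:
        effects.append(f"e_contrast:{contrast}")
    if brightness:
        effects.append(f"e_brightness:{brightness}")
    if saturation:
        effects.append(f"e_saturation:{saturation}")
    # Each effect is its own slash-separated component.
    path = "/".join(effects + [public_id])
    return f"https://res.cloudinary.com/demo/{resource_type}/upload/{path}"


# The video settings described above.
print(adjust_url("video", "docs/grocery-store.mp4", 50, 10, 20))
# The high-contrast image example.
print(adjust_url("image", "docs/groceryshop.jpg", contrast=80))
```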

Replacing colors for light/dark themes

For users who navigate websites with light and dark themes, consistency in visual presentation is crucial for both usability and accessibility. Light and dark themes can significantly impact users with visual sensitivities, light sensitivity conditions, or those who simply prefer one theme over another for better readability. Cloudinary provides powerful tools to automatically adapt image colors to match your application's theme, ensuring a cohesive visual experience.

Understanding the accessibility need

Different users have varying preferences and needs when it comes to visual themes:

  • Light sensitivity: Users with photophobia, migraines, or certain medical conditions may find dark themes more comfortable
  • Visual impairments: Some users with low vision find better contrast in a specific theme
  • Environmental factors: Dark themes can be easier on the eyes in low-light environments
  • Battery conservation: On OLED displays, dark themes can help conserve battery life
  • Personal preference: Users may simply prefer one theme for better readability

Dynamic color replacement with replace_color

The replace_color effect allows you to dynamically swap colors in images based on the user's theme preference. This is particularly useful for logos, icons, and graphics that need to maintain brand consistency while adapting to different backgrounds. Try changing the theme at the top right of this page, and you'll see how the different icons look with light and dark themes.

[Images: original logo on the light theme; logo adapted for the dark theme]


This example replaces the predominant color with light gray, using a tolerance of 50 so that similar shades are also replaced (e_replace_color:e6e6e6:50):

Using the theme effect for comprehensive adaptation

For more sophisticated theme adaptation, use the theme effect which applies comprehensive color adjustments to the image based on a specific background color.

For example, change the screen capture to a dark theme with increased sensitivity to photographic elements (e_theme:color_black:photosensitivity_110):

[Images: original Cloudinary website screenshot (light theme); dark theme adaptation using e_theme:color_black:photosensitivity_110]


The effect applies an algorithm that intelligently adjusts the color of illustrations, such as backgrounds, designs, texts, and logos, while keeping photographic elements in their original colors.
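A minimal sketch of choosing between the two approaches at render time is shown below. The helper name `themed_url` and the strategy names are hypothetical; the two effect strings are the ones quoted in this section, and `cloudinary_icon.png` is the asset from the demo URL on this page.

```python
BASE_URL = "https://res.cloudinary.com/demo/image/upload"

# Effect strings quoted in this section, keyed by a hypothetical strategy name.
DARK_EFFECTS = {
    "replace_color": "e_replace_color:e6e6e6:50",
    "theme": "e_theme:color_black:photosensitivity_110",
}


def themed_url(public_id: str, theme: str, strategy: str = "replace_color") -> str:
    """Return the original URL for the light theme, or an adapted one for dark."""
    if theme == "light":
        return f"{BASE_URL}/{public_id}"
    if theme != "dark":
        raise ValueError(f"unknown theme: {theme}")
    return f"{BASE_URL}/{DARK_EFFECTS[strategy]}/{public_id}"


print(themed_url("cloudinary_icon.png", "light"))
print(themed_url("cloudinary_icon.png", "dark"))
print(themed_url("cloudinary_icon.png", "dark", strategy="theme"))
```

In a web page, the theme value would typically come from the user's stored preference or the system color scheme setting.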

Interactive theme adaptation demo

Experience how Cloudinary can automatically adapt images for different themes. This demo shows how the same image can be dynamically modified to suit both light and dark themes using the replace_color transformation, in addition to smart color replacement using the theme effect:

Demo image for theme adaptation
Current transformation URL:
https://res.cloudinary.com/demo/image/upload/c_scale,w_400/f_auto/q_auto/cloudinary_icon.png


Customize text overlays in images

Customizable text overlays are essential for accessibility because they allow you to adapt text presentation to meet diverse user needs. Users with visual impairments, dyslexia, or reading difficulties often benefit from specific font styles, sizes, and spacing adjustments. By providing flexibility in text overlay styling, you ensure your content remains accessible across different abilities and preferences.

The WCAG guidelines emphasize that text should be customizable to support users who need larger fonts, different font families, or modified spacing for better readability. Cloudinary's text overlay system provides extensive customization options that help you meet these accessibility requirements while maintaining visual appeal.

Standard text

Large bold (low vision)

Letter spacing (dyslexia)


Understanding text overlay parameters

Cloudinary's text overlay transformation (l_text) supports numerous styling parameters that can be combined to create accessible and visually appealing text:

Core Parameters (Required):

  • Font: Any universally available font or custom font (e.g., Arial, Helvetica, Times)
  • Size: Text size in pixels (e.g., 50, 100)

Styling Parameters (Optional):

  • Weight: Font thickness (normal, bold, thin, light)
  • Style: Font appearance (normal, italic)
  • Decoration: Text decoration (normal, underline, strikethrough)
  • Alignment: Text positioning (left, center, right, justify)
  • Stroke: Text outline (none, stroke)
  • Letter spacing: Space between characters (letter_spacing_<value>)
  • Line spacing: Space between lines (line_spacing_<value>)

Visual Enhancement Parameters:

  • Color: Text color (co_<color>)
  • Background: Background color (b_<color>)
  • Border: Outline styling (bo_<border>)
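To show how the font, size, and styling parameters combine, the sketch below assembles the style portion of an `l_text` layer by joining the values with underscores, as in `Arial_60_bold`. The helper names are hypothetical, and the ordering of optional values is an assumption based on the examples on this page.

```python
from typing import Optional
from urllib.parse import quote


def text_style(
    font: str,
    size: int,
    weight: str = "normal",
    style: str = "normal",
    decoration: str = "normal",
    letter_spacing: Optional[int] = None,
    line_spacing: Optional[int] = None,
) -> str:
    """Join font, size, and optional styling values with underscores."""
    parts = [font, str(size)]
    # "normal" is the default for each of these, so it is left out.
    parts += [value for value in (weight, style, decoration) if value != "normal"]
    if letter_spacing is not None:
        parts.append(f"letter_spacing_{letter_spacing}")
    if line_spacing is not None:
        parts.append(f"line_spacing_{line_spacing}")
    return "_".join(parts)


def text_layer(style_spec: str, text: str, color: str = "black") -> str:
    """Build the l_text component; text is assumed to have no commas or slashes."""
    return f"l_text:{style_spec}:{quote(text, safe='')},co_{color}"


print(text_style("Arial", 50))                    # Arial_50
print(text_style("Arial", 60, weight="bold"))     # Arial_60_bold
print(text_style("Arial", 50, letter_spacing=5))  # Arial_50_letter_spacing_5
print(text_layer(text_style("Arial", 50), "Sample Text"))
```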

Best practices for accessible text overlays
  • Font Size: Use sizes of at least 16px for body text, larger for headers. Users with low vision may need even larger text.
  • Font Choice: Sans-serif fonts like Arial and Helvetica are often easier to read, especially for users with dyslexia.
  • Letter Spacing: Additional spacing between letters can improve readability for users with dyslexia or visual processing difficulties.
  • Color Contrast: Ensure sufficient contrast between text and background colors (minimum 4.5:1 ratio for normal text).
  • Background: Use solid background colors behind text when overlaying on complex images to ensure readability.
  • Font Weight: Bold text can improve readability, but avoid fonts that are too thin (like light or thin weights) for important content.

Interactive text overlay customization demo

Use the controls below to experiment with different text styling parameters and see how they affect accessibility and readability. Notice how the transformation URL updates as you adjust the settings:

Demo image with customizable text overlay
Current transformation URL:
https://res.cloudinary.com/demo/image/upload/c_fit,l_text:Arial_50:Sample%20Text,co_black,w_1800/fl_layer_apply,g_center/c_scale,w_600/f_auto/q_auto/docs/white-texture.jpg

Video text overlays

The same customization principles apply to video text overlays. Here's an example of accessible text styling on video content:


This example uses large, bold white text (Arial_60_bold) with a semi-transparent black background (b_rgb:000000cc) to ensure high contrast and readability across the entire video.

OCR text detection and extraction

For images containing text content, Optical Character Recognition (OCR) technology can extract that text and make it accessible to screen readers and other assistive technologies. This is particularly important for images of documents, signs, menus, handwritten notes, or any visual content where text is embedded within the image rather than provided as separate HTML text.

Cloudinary's OCR Text Detection and Extraction add-on can automatically extract text from images during upload, making the content available for accessibility purposes.

Here's an example showing an Italian restaurant menu and the text that Cloudinary's OCR add-on automatically extracted from it:

Italian restaurant menu showing three menu options with prices.

Extracted Text Content (Available to Screen Readers):

MENU 1
INSALATA VERDE
PIZZA CAPRESE
18.50

MENU 2
BRUSCHETTA DELLA CASA
INSALATA DI POLLO
19.50

MENU 3
BRUSCHETTA DELLA CASA
CANNELLONI DI CARNE
AL FORNO 21.50

This text content was automatically extracted using OCR and can be read by screen readers, making the Italian menu accessible to users with visual impairments. Note that the OCR detected the language as Italian (locale: "it") and extracted all menu items with their prices.

To extract text from an image:

  1. Subscribe to the OCR add-on: Enable the OCR Text Detection and Extraction add-on in your Cloudinary account.

  2. Extract text during upload: When uploading images that contain text, use the ocr parameter to extract the text content:

  3. Use extracted text for accessibility: The OCR results are returned in the upload response and can be used to provide accessible alternatives:

    Here's an example in React using the Italian restaurant menu response:

Notes
  • You can invoke the OCR Text Detection and Extraction add-on for images already in your product environment using the Admin API update method.
  • You can retrieve the response at a later date using the Admin API resource method.
  • Consider using contextual or structured metadata to store the text.
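As a language-neutral illustration of step 3, the sketch below pulls the full extracted text and detected locale out of an upload response. The nested shape used here (`info.ocr.adv_ocr.data[0].textAnnotations[0]`) is an assumption about the add-on's response format and should be checked against a real response. The sample dictionary is hand-written from the menu text above, not captured output.

```python
from typing import Any, Dict, Optional, Tuple


def extract_ocr_text(upload_response: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Return (text, locale) from an OCR upload response, or ("", None).

    Assumes the first text annotation holds the full detected text.
    """
    try:
        first = upload_response["info"]["ocr"]["adv_ocr"]["data"][0]["textAnnotations"][0]
    except (KeyError, IndexError, TypeError):
        return "", None
    return first.get("description", "").strip(), first.get("locale")


# Hand-written sample in the assumed shape, using text from the menu above.
sample_response = {
    "info": {"ocr": {"adv_ocr": {"status": "complete", "data": [
        {"textAnnotations": [
            {"locale": "it", "description": "MENU 1\nINSALATA VERDE\nPIZZA CAPRESE\n18.50\n"},
        ]},
    ]}}},
}

text, locale = extract_ocr_text(sample_response)
print(locale)
print(text)
```

The returned text could then be stored in metadata or rendered as a visually hidden description next to the image, with the locale used for the `lang` attribute.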

Mixing audio tracks

For users with hearing difficulties or auditory processing disorders, the ability to control the balance between foreground speech and background audio is crucial for accessibility. The WCAG guidelines specify that background sounds should be at least 20 decibels lower than foreground speech content, or users should have the ability to turn off background sounds entirely.

Cloudinary's audio mixing capabilities allow you to layer multiple audio tracks and control their relative volumes, ensuring your content meets accessibility requirements while maintaining audio richness.

To control the volume of different audio tracks, use the volume effect in each of the audio layers. In this example, the narration is set to a volume of 3dB higher than the original asset (e_volume:3dB), and the background wind noise is set to a volume of 18dB lower than the original asset (e_volume:-18dB):
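The layering described above can be sketched as a URL builder plus a simple separation check. The helper names are hypothetical; the asset names and component order follow the demo URL shown later in this section. The check assumes both tracks start from the same baseline loudness.

```python
BASE_URL = "https://res.cloudinary.com/demo/video/upload"


def mix_url(
    narration_db: int,
    background_db: int,
    narration_id: str = "docs/nanotech_norm.mp3",
    background_layer: str = "docs:wind_norm",
) -> str:
    """Build a URL that layers a background track under the narration."""
    return (
        f"{BASE_URL}/e_volume:{narration_db}dB"    # volume of the base track
        f"/l_audio:{background_layer}"             # open the audio layer
        f"/e_volume:{background_db}dB"             # volume of the layer
        f"/fl_layer_apply/{narration_id}"          # close the layer
    )


def separation_db(narration_db: int, background_db: int) -> int:
    """Difference between the two adjustments, in dB."""
    return narration_db - background_db


def meets_20db(narration_db: int, background_db: int) -> bool:
    """True when the background is at least 20 dB below the narration."""
    return separation_db(narration_db, background_db) >= 20


print(mix_url(3, -18))
print(separation_db(3, -18))  # 21
print(meets_20db(3, -18))     # True
```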

Audio normalization for consistent levels

Before mixing audio tracks, it helps to normalize them to consistent baseline levels. Different audio recordings often have varying baseline volumes, which can make it difficult to achieve predictable dB differences for accessibility compliance.

To normalize your audio files before uploading them to Cloudinary, you can use audio processing tools, such as FFmpeg.

For example, normalize the audio file nanotech.mp3 to -16 LUFS:

With every track normalized to the same loudness, a -20dB or -25dB adjustment in Cloudinary gives a predictable separation between the tracks, which makes it easier to meet the WCAG requirement.
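One possible way to script the normalization is to assemble an FFmpeg command that uses the `loudnorm` filter with an integrated loudness target. The sketch below only builds and prints the command; the helper name and the file names are placeholders, and the other `loudnorm` options are left at FFmpeg's defaults.

```python
import shlex
from typing import List


def build_normalize_cmd(src: str, dst: str, target_lufs: float = -16.0) -> List[str]:
    """Return an ffmpeg argument list that normalizes src to target_lufs."""
    # loudnorm's I option sets the integrated loudness target in LUFS.
    audio_filter = f"loudnorm=I={target_lufs:g}"
    return ["ffmpeg", "-i", src, "-af", audio_filter, dst]


# Placeholder file names; pass the list to subprocess.run(cmd, check=True) to execute.
cmd = build_normalize_cmd("input.mp3", "input_norm.mp3")
print(shlex.join(cmd))
```

Running the same command on every track before upload keeps the later `e_volume` adjustments comparable across assets.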

Interactive audio mixing demo

This demo shows how Cloudinary can mix a primary audio track (nanotechnology narration) with a background audio layer (wind sounds). Use the controls to adjust the volume levels and observe how the dB difference affects accessibility:

🎙️ Narration (Foreground)

Range: -20 dB to +20 dB

🌬️ Wind (Background)

Range: -50 dB to 0 dB
dB Difference: 21 dB
WCAG Compliant: Background is 21 dB lower than foreground (exceeds 20 dB requirement)
Current transformation URL:
https://res.cloudinary.com/demo/video/upload/e_volume:3dB/l_audio:docs:wind_norm/e_volume:-18dB/fl_layer_apply/docs/nanotech_norm.mp3

User-controlled audio track levels

Similar to the above demo, you could provide controls in your application to let the user decide on the levels of each track to meet their needs. Here's some example React code that you could use:

Best practices for accessible audio mixing
  • Always provide a no-background option: Some users need complete silence behind speech
  • Maintain 20+ dB separation: When background audio is present, ensure it's at least 20 dB lower
  • Test with real users: Audio perception varies greatly between individuals
  • Consider frequency content: Low-frequency background sounds tend to be less distracting than mid-range frequencies
  • Provide visual indicators: Show users the current dB levels and compliance status
  • Use consistent levels: Maintain the same audio balance throughout your content
