Programmable Media

Placing layers on videos

Last updated: Apr-05-2024

Cloudinary lets you dynamically add layers at specific locations within your videos. New layers are added over the base video as overlays and can be transformed to suit your needs. A layer can be an asset uploaded to Cloudinary, a remote asset, or a text string.

Video layers can also be added as underlays instead, and there are special layer applications for using layers in combination with other Cloudinary transformations.

Here are examples of some popular use cases that you can accomplish using layers (combined with other transformations):

Add video overlays
Add image overlays
Add text overlays

Layer transformation syntax

In its simplest form, adding a layer over the base video takes the following URL syntax:

The layer parameter is in its own URL component and starts the overlay definition (similar to an open bracket). The layer_apply flag is in a separate component that closes the definition (similar to a closing bracket) and instructs Cloudinary to place it.

Replace any forward slashes in the public ID of the overlay with colons.

You can enhance your layer both by controlling where and how it is placed on the base video using gravity, offset and other placement qualifiers, and by applying transformations to the layered asset, using the following general URL syntax.
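As a rough illustration of the bracket-like structure described above, the following Python sketch assembles the components for one layer using plain string handling. The helper name layer_chain and its parameters are hypothetical and are not part of any Cloudinary SDK; the sketch only mirrors the rules stated in this section (layer component first, optional transformation components next, fl_layer_apply with placement qualifiers last, and slashes in the public ID replaced with colons).

```python
def layer_chain(public_id, transformations=(), placement=(), kind=""):
    """Return the URL components that open, transform, and close one layer.

    Hypothetical helper for illustration only (not a Cloudinary SDK function).
    """
    # Slashes in the layer's public ID are replaced with colons.
    layer_id = public_id.replace("/", ":")
    # kind is "" for images (the default layer type), or e.g. "video".
    prefix = f"l_{kind}:" if kind else "l_"
    opening = prefix + layer_id
    # Placement qualifiers share the component that holds fl_layer_apply.
    closing = ",".join(["fl_layer_apply", *placement])
    return "/".join([opening, *transformations, closing])


print(layer_chain("docs/logo-semi-opaque"))
print(layer_chain("cloudinary_icon_blue",
                  ["c_scale,w_0.5", "o_70", "e_brightness:50"],
                  ["g_north_east"]))
```

The two calls reproduce transformation strings that appear elsewhere on this page; the resulting string goes between the delivery type and the public ID of the base video in the delivery URL.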

Authenticated or private layers

You can add asset overlays that are set to authenticated or private by modifying the syntax:

Image overlays:

  • For private image layers: l_private:<public_id of layer>
  • For authenticated image layers: l_authenticated:<public_id of layer>

Video overlays:

  • For private video layers: l_video:private:<public_id of layer>
  • For authenticated video layers: l_video:authenticated:<public_id of layer>

Audio overlays:

  • For private audio layers: l_audio:private:<public_id of layer>
  • For authenticated audio layers: l_audio:authenticated:<public_id of layer>


You can only add overlays that are set to authenticated or private if you also sign the whole URL (no separate signature is required for the overlay part). See the Media access control documentation for more details on delivering private and authenticated assets.
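The prefixes listed above follow a simple pattern, which the Python sketch below captures. The function name restricted_layer is hypothetical; the sketch only builds the layer component and does not sign the URL, which you still need to do as noted above.

```python
def restricted_layer(public_id, delivery_type, asset_type="image"):
    """Build the layer component for a private or authenticated asset.

    Hypothetical helper; it does not sign the resulting URL.
    """
    if delivery_type not in ("private", "authenticated"):
        raise ValueError("delivery_type must be 'private' or 'authenticated'")
    if asset_type not in ("image", "video", "audio"):
        raise ValueError("asset_type must be 'image', 'video' or 'audio'")
    layer_id = public_id.replace("/", ":")
    if asset_type == "image":
        # Image layers have no explicit type segment: l_private:<id>
        parts = ["l_" + delivery_type]
    else:
        # Video and audio layers: l_video:private:<id>, l_audio:private:<id>
        parts = ["l_" + asset_type, delivery_type]
    return ":".join(parts + [layer_id])


print(restricted_layer("logo", "private"))
print(restricted_layer("clips/intro", "authenticated", "video"))
```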

Image overlays

The default overlay type is an image. For example, adding an overlay of a logo to a base video (l_docs:logo-semi-opaque/fl_layer_apply):


If the public ID of an image includes slashes (e.g., the public ID of an image is animals/dog), replace the slashes with colons when using the image as an overlay (e.g., the public ID of the overlay image becomes animals:dog when used as an overlay).

See also: Fade overlays in and out.

Remote image overlays

Use a remote image (an image not stored in your Cloudinary product environment) as an overlay by adding the fetch (or url for some SDKs) property of the layer parameter (l_fetch: in URLs) and the base64-encoded URL of the remote image. The general URL syntax for adding a remote image as an overlay takes the following form:

For example, adding an overlay of the remote image to the base video.

The Cloudinary SDKs automatically generate a base64 encoded URL for a given HTTP/S URL, but if you generate the delivery URL in your own code, then you'll need to supply the fetch URL in base64 encoding with padding.
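If you build the delivery URL yourself, the encoding step might look like the Python sketch below. It uses the standard library's URL-safe base64 encoder, which keeps the padding characters; the choice of the URL-safe alphabet is an assumption made here so that the encoded value cannot contain a slash, and the function name and the example.com address are placeholders.

```python
import base64


def fetch_layer(remote_url):
    """Return a remote-image overlay as 'l_fetch:<base64 URL>/fl_layer_apply'.

    Hypothetical helper. urlsafe_b64encode keeps the trailing '=' padding.
    """
    encoded = base64.urlsafe_b64encode(remote_url.encode("utf-8")).decode("ascii")
    return f"l_fetch:{encoded}/fl_layer_apply"


# example.com is a placeholder, not a real asset location.
print(fetch_layer("https://example.com/logo.png"))
```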

Video overlays

Add another video as an overlay over the base video by using the overlay video parameter (l_video: in URLs) and the public ID of a previously uploaded video (e.g. l_video:dog for a video with the public ID of dog).

You can control when the video overlay is displayed by using any combination of the 3 offset parameters (see Trimming videos for more information on the parameters and their possible values). You can also control where and how it is placed on the base video using gravity, offset and other placement qualifiers, and apply transformations to the layered asset, with the following general syntax.

For example, adding an overlay of a video that fades in after 2.5 seconds and fades out 7.5 seconds later. The overlay is also positioned with a gravity of 'south east' and scaled to a width of 500 pixels.

  • If the public ID of a video includes slashes (e.g., the public ID of a video is animals/cat), replace the slashes with colons when using the video as an overlay (e.g., the public ID of the video becomes animals:cat when used as an overlay).
  • When delivering public videos, you can only add other public videos as overlays. See the Media access control documentation for more details on delivering private and authenticated assets.

Audio overlays

Add an additional audio track over the existing video using the audio overlay parameter (l_audio in URLs) and the public ID of a video or audio file stored in your product environment.

If you supply a video as the overlay, only the audio track will be overlaid. The visual part is discarded.

You can add audio tracks alongside the existing audio, such as adding a voiceover or music. You can also control which parts of the layered audio file to play using the 3 timing offset parameters (see Trimming videos). You can also control the volume using the volume effect.

Here's an example of adding an MP3 music file with the public ID electronic to the hourglass_timer base video:

It's also possible to remove or replace the existing audio track from the base video. To remove the existing audio track, set the audio codec to none (ac_none in URLs). You can then replace this with your own audio by adding it as an audio overlay as shown above.

Here's an example of replacing the original audio of the video with the electronic MP3 file and reducing the volume by 50%:
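A minimal Python sketch of the replacement case, under stated assumptions: the existing track is removed with ac_none, the audio layer is opened with l_audio, and the volume change is expressed as a chained e_volume component placed before fl_layer_apply. The helper name, the e_volume:-50 value format, and the position of the volume component are assumptions for illustration; check the volume effect in the Transformation Reference for the exact form.

```python
def replace_audio(audio_public_id, volume_change=None):
    """Build components that drop the base audio and overlay another track.

    Hypothetical helper; the e_volume value format is an assumption.
    """
    components = [
        "ac_none",  # remove the existing audio track of the base video
        "l_audio:" + audio_public_id.replace("/", ":"),  # open the audio layer
    ]
    if volume_change is not None:
        # Chained components before fl_layer_apply act on the layer.
        components.append(f"e_volume:{volume_change}")
    components.append("fl_layer_apply")  # close the layer
    return "/".join(components)


print(replace_audio("electronic", -50))
```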

See also: Mixing audio tracks

Text overlays

Add a text overlay over the base video with the text property of the layer parameter (l_text: in URLs). The parameter also requires specifying font family and size (separated with an underscore and followed by a colon), and the text string to display. The general URL syntax for adding a text layer takes the following form:

In addition to the required font and size styling values, you can optionally specify a variety of other CSS-like styling parameters, and further customize your text layers by specifying text color, adding line breaks, emojis and other special characters, and other text layer options.

Cloudinary first generates an image from the text definition and then overlays it like any other image overlay, and thus supports all the same transformations that any image overlay supports.

For example, to overlay the text string "Coffee" in the Arial font with a size of 80 pixels (l_text:Arial_80:Coffee/fl_layer_apply):


You can add subtitles to videos as an embedded layer based on SRT or WebVTT files that were previously uploaded to Cloudinary as raw files. To do this, use the subtitles property of the overlay parameter (l_subtitles: in URLs), followed by the public ID (including the .srt or .vtt extension).

For example, to overlay the subtitles in the outdoors.vtt file:

You can optionally include font and size settings (separated with an underscore) before the subtitles file name. For example, to overlay the subtitles in the outdoors.vtt file using the Arial font with a size of 20 pixels:

You can also control the color of the text by adding the color property (co in URLs) to change the fill color of the text, and adding the background property (b in URLs) to change the color of the text background. Colors can be set as an RGB hex triplet (e.g. b_rgb:3e2222), a 3 character RGB hex (e.g. co_rgb:777), a named color (e.g. b_green), or an RGB hex quadruplet, where the 4th value represents the opacity (e.g. b_rgb:3e222266).

For example, to overlay the subtitles in the outdoors.vtt file using the Arial font with a size of 40 pixels, in yellow text (FFFF00) with a black background:
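The subtitle options described above can be combined as in the following Python sketch. The helper name subtitles_layer is hypothetical, and placing the co and b qualifiers in the same component as the layer is an assumption based on how color is attached to text layers elsewhere on this page.

```python
def subtitles_layer(file_id, font=None, size=None, color=None, background=None):
    """Build a subtitles layer such as 'l_subtitles:Arial_40:outdoors.vtt'.

    Hypothetical helper; qualifier placement is an assumption.
    """
    if not file_id.endswith((".srt", ".vtt")):
        raise ValueError("the public ID must include the .srt or .vtt extension")
    if (font is None) != (size is None):
        raise ValueError("font and size must be given together")
    qualifiers = []
    if color:
        qualifiers.append("co_" + color)       # fill color of the text
    if background:
        qualifiers.append("b_" + background)   # color behind the text
    layer = "l_subtitles:"
    if font:
        layer += f"{font}_{size}:"             # font and size joined by '_'
    layer += file_id.replace("/", ":")
    return ",".join(qualifiers + [layer]) + "/fl_layer_apply"


print(subtitles_layer("outdoors.vtt"))
print(subtitles_layer("outdoors.vtt", "Arial", 40, "rgb:FFFF00", "black"))
```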

Layer placement

The fl_layer_apply component not only acts as the closing bracket for the layer, but is also used to include any options that will control how the layer is placed on the base video, and any details regarding the relationship between the overlay element and the base video.

Some Cloudinary SDKs use layer apply flags as described, and any placement qualifiers must also be in the last component of the layer transformation. However, some of the Cloudinary SDKs do not use a specific layer apply flag. Instead, when the SDK generates the transformation URL from your code, it automatically adds the fl_layer_apply flag together with placement qualifiers based on the transformation hierarchy in your SDK code.

Positioning layers with gravity

To determine the position of the new layer, you can add the gravity parameter to define the location to place the layer within the base video ('center' by default). The gravity parameter is added in the same component as the layer_apply flag.

For example, to add an image overlay to the base video with gravity set to west (l_lotus_layer/fl_layer_apply,g_west):

To fine tune the exact location of the layer, you can offset the overlay from the focus of gravity by also adding the x and y coordinate offset parameters. These parameters accept either integer values representing the number of pixels to adjust the overlay position in the horizontal or vertical directions, or decimal values representing a percentage-based offset relative to the containing video (e.g., 0.2 for an offset of 20%).

For example, to place a text overlay at a vertical distance of 5% from the top of the video (l_text:Roboto_400:Paradise/fl_layer_apply,g_north,y_0.05):
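To make the placement rules concrete, here is a small Python sketch that builds the closing component from a gravity value and optional offsets. The function name placement is hypothetical. Integer offsets are written as pixel values and decimal offsets as fractions of the video size, as described above.

```python
def placement(gravity="center", x=None, y=None):
    """Build the closing component: fl_layer_apply plus placement qualifiers.

    Hypothetical helper. Integers are pixel offsets; floats such as 0.05
    are percentage-based offsets relative to the base video.
    """
    parts = ["fl_layer_apply", f"g_{gravity}"]
    for name, value in (("x", x), ("y", y)):
        if value is not None:
            parts.append(f"{name}_{value}")
    return ",".join(parts)


# 5% from the top edge, matching the text overlay example above.
print("l_text:Roboto_400:Paradise/" + placement("north", y=0.05))
# 20 pixels in from the left edge.
print("l_lotus_layer/" + placement("west", x=20))
```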

See the Positioning with Gravity interactive demo to experiment with gravity and coordinate parameters.

Tracking layers

Video tracking layers provide a mechanism to overlay images or text on a video, with the overlay tracking a detected person throughout, using the track_person gravity value. You can use fashion object detection (using the Cloudinary AI Content Analysis add-on) to determine whether to include the overlay based on the presence of the object on the person. Some use cases for this functionality might be:

  • Overlay relevant prices when specific items of clothing are detected in a product video.
  • Add supplementary information for the components of an outfit, such as the brand.
  • Highlight names of people featured in a video based on clothing characteristics.

To add a tracking layer, apply the g_track_person gravity value in the transformation component after your layer definition, ensuring that you complete the layer using fl_layer_apply, for example:

You can use fashion object detection to ensure that the layer tracks all instances of people based on the defined object. To do this, add the obj parameter with your object value separated from the gravity with a colon (:).

You may also want to control the position of the layer and apply adaptive sizing to ensure the layer maintains a consistent size relative to the person and doesn't overlap any other tracked people. To do this, add the position and adaptivesize parameters, separated from the rest of the transformation by a semicolon (;).

In this example, the dress object is detected, the layer is positioned above the people so that it is visible for both, and the layer size adapts to stay at 50% relative to the tracked people:

You can also apply tracking for text layers, for example:

  • Only one tracked layer can be applied at a time.
  • The maximum video duration that tracked layers can be applied to is 3 minutes.
  • When requesting your video on the fly, you will receive a 423 response until the video has been processed. Once processed, subsequent transformations will be applied synchronously.
  • You can apply transformations to the layer, such as controlling duration, by adding those into the layer definition component (e.g. l_price_tag,du_3)
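The separators described in this section (a colon after the gravity value, semicolons between the remaining options) can be sketched in Python as follows. The helper name, the key_value form of each option (for example obj_dress), and the position value top are assumptions for illustration; confirm the exact option names and values in the Transformation Reference.

```python
def track_person_gravity(obj=None, position=None, adaptive_size=None):
    """Build a tracking gravity value such as 'g_track_person:obj_dress'.

    Hypothetical helper; the option spellings are assumptions.
    """
    options = []
    if obj:
        options.append(f"obj_{obj}")                      # fashion object to detect
    if position:
        options.append(f"position_{position}")            # where the layer sits
    if adaptive_size is not None:
        options.append(f"adaptivesize_{adaptive_size}")   # size relative to person
    value = "g_track_person"
    if options:
        # A colon separates the gravity from its options; semicolons
        # separate the options from each other.
        value += ":" + ";".join(options)
    return value


gravity = track_person_gravity("dress", "top", 50)
print(f"l_price_tag/fl_layer_apply,{gravity}")
```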

Layer transformations

You can apply resizing and other transformation options on your overlays just like any other asset delivered from Cloudinary. You can apply multiple (chained) transformations to overlays by adding them in separate components before the layer_apply component. All chained transformations, until a transformation component that includes the layer_apply flag, are applied on the last added overlay or underlay instead of applying them on the base asset (the layer_apply flag closes the layer, similar to a closing bracket).

For example:

  1. Adding a logo overlay scaled down to 50% of its original width and made into a watermark by reducing the opacity to 70% and increasing the brightness to 50% using the brightness effect. The transformed image is then placed as a layer in the top-right corner of the base video (l_cloudinary_icon_blue/c_scale,w_0.5/o_70/e_brightness:50/fl_layer_apply,g_north_east):

  2. The base video is scaled to a width of 500 pixels before adding an image overlay, where the overlay is automatically cropped to a 150px thumbnail including only the detected face and placed in the top left corner (c_scale,w_500/l_docs:model/c_thumb,g_face,w_150/fl_layer_apply,g_north_west):

You cannot use object aware cropping in layers.

Multiple overlays

Multiple overlays can easily be added as chained transformations to an asset. The following example adds both image and text overlays to the base video as follows:

  1. An overlay of an image called umbrella, scaled to a width of 300px and placed in the top left corner (l_umbrella/c_scale,w_300/fl_layer_apply,g_north_west).
  2. An overlay of an image called cloudinary_icon_white with a relative width of 50% of the base video and an opacity of 50% and a brightness of 100 (l_cloudinary_icon_white/c_scale,fl_relative,w_0.5/o_50/e_brightness:100/fl_layer_apply).
  3. The white text string "London" in bold 'Roboto' font with a size of 80 pixels, placed at a distance of 20 pixels from the bottom of the base video (co_white,l_text:roboto_80_bold:London/fl_layer_apply,g_south,y_20).
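The three layers listed above can be chained into a single delivery URL, as in this Python sketch. The cloud name demo and the base video public ID sample.mp4 are placeholders, and the helper name video_url is hypothetical; the layer strings themselves are copied from the list above.

```python
LAYERS = [
    "l_umbrella/c_scale,w_300/fl_layer_apply,g_north_west",
    "l_cloudinary_icon_white/c_scale,fl_relative,w_0.5/o_50/e_brightness:100/fl_layer_apply",
    "co_white,l_text:roboto_80_bold:London/fl_layer_apply,g_south,y_20",
]


def video_url(cloud_name, public_id, layers):
    """Chain complete layer definitions into one video delivery URL.

    Hypothetical helper; cloud_name and public_id are placeholders.
    """
    for chain in layers:
        # Every layer must be closed by a component containing fl_layer_apply.
        if "fl_layer_apply" not in chain.split("/")[-1].split(","):
            raise ValueError(f"layer is not closed: {chain}")
    transformation = "/".join(layers)
    return f"https://res.cloudinary.com/{cloud_name}/video/upload/{transformation}/{public_id}"


print(video_url("demo", "sample.mp4", LAYERS))
```

Because each layer is fully closed before the next one opens, the order of the list only determines the stacking order of the overlays.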

Nesting layers

Layers can be nested within layers. Each layer must have its own layer and layer_apply components, and the inner layer must be closed before the outer one, like with any nested programming statement.

For example, adding text to the moon overlay:

The first image layer has a transformation that changes its size, and the second layer is a text layer configured with a font and size. The second layer is closed and placed by the first (inner) fl_layer_apply. Since no gravity was specified for that layer, it's placed in the center of the first overlay. Then the outer fl_layer_apply closes and places the entire layer (including its nested layer) and positions it in the northeast corner.
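One way to picture the nesting rule is as a tree that is rendered recursively, so each inner layer is always closed before its parent. The Python sketch below does this with a hypothetical Layer class; the public ID moon, the width of 300, and the Arial_40 text settings are illustrative values and are not taken from the original example.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """Hypothetical model of one layer and the layers nested inside it."""
    definition: str                                   # e.g. "l_moon"
    transformations: list = field(default_factory=list)
    children: list = field(default_factory=list)      # nested Layer objects
    placement: list = field(default_factory=list)     # e.g. ["g_north_east"]

    def render(self):
        parts = [self.definition, *self.transformations]
        # Each child renders its own fl_layer_apply, so it closes first.
        parts += [child.render() for child in self.children]
        parts.append(",".join(["fl_layer_apply", *self.placement]))
        return "/".join(parts)


text = Layer("l_text:Arial_40:Moon")                  # no gravity: centered
moon = Layer("l_moon", ["c_scale,w_300"], [text], ["g_north_east"])
print(moon.render())
```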

Relative layer sizing

By default, whenever you apply relative resize transformations to your overlay, the overlay is resized relative to its own original size. However, you can add the relative flag (fl_relative in URLs) to specify that percentage-based width and height parameters of overlays (e.g., w_0.5) are relative to the size of the base video instead of to the original size of the overlaying asset itself.

For example, to add an overlay of the image called cloudinary_icon_blue, where the overlay is resized to 80% of the width of the base video (l_cloudinary_icon_blue/c_scale,fl_relative,w_0.8/fl_layer_apply):

Text layer options

Text layers can be customized in a variety of ways, such as applying CSS-like styles, adding line breaks, applying special characters, custom fonts, and more.

Styling parameters

In addition to the required font family and font size values of the text layer, a variety of optional CSS-like styles are supported, such as decoration, alignment, letter spacing, line spacing and more. For a full list, see the Styling parameters table in the reference guide.

The Cloudinary SDK helper methods support supplying the values as an array of mapped values or as a serialized string of values. For example, in Ruby (other frameworks use similar syntax):
overlay: { text: 'Hello World', font_family: 'Arial', font_size: 18, font_weight: 'bold', font_style: 'italic', letter_spacing: 4 }

For example, to overlay the text string "Style" in Verdana bold with a size of 50 pixels, underlined, and with 14 pixels spacing between the letters: l_text:Verdana_50_bold_underline_letter_spacing_14:Style

Text color

You can control the color of the text overlay by adding the color property (co in URLs).

Opaque colors can be set as an RGB hex triplet (e.g., co_rgb:3e2222), a 3-digit RGB hex (e.g., co_rgb:777) or a named color (e.g., co_green). By default, if the color property is omitted, the text has a black color.

For example, adding the text string "Style" in Times bold with a size of 90 pixels at a distance of 20 pixels from the bottom of the base video, in yellow text (FFFF00):

You can also use a 4-digit or 8-digit RGBA hex quadruplet for the color, where the 4th hex value represents the alpha (opacity) value (e.g., co_rgb:3e222240 results in 25% opacity).

The example below uses the same text string "Style" in Times bold with a size of 90 pixels at a distance of 20 pixels from the bottom of the base video, in yellow text, but this time with an opacity of 50% (FFFF0080):
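The relationship between an opacity percentage and the trailing alpha byte can be worked out as in this Python sketch. The function name rgba_color is hypothetical; the conversion is ordinary 8-bit scaling, where the opacity is multiplied by 255 and written as two hex digits.

```python
def rgba_color(hex_rgb, opacity):
    """Return a co_rgb value with an alpha byte, e.g. 'co_rgb:FFFF0080'.

    Hypothetical helper. opacity is a fraction between 0 and 1.
    """
    if len(hex_rgb) != 6 or any(c not in "0123456789abcdefABCDEF" for c in hex_rgb):
        raise ValueError("hex_rgb must be a 6-digit RGB hex triplet")
    if not 0 <= opacity <= 1:
        raise ValueError("opacity must be between 0 and 1")
    # Scale to 0..255 and round to the nearest integer.
    alpha = int(opacity * 255 + 0.5)
    return f"co_rgb:{hex_rgb}{alpha:02X}"


print(rgba_color("FFFF00", 0.5))    # yellow at 50% opacity
print(rgba_color("3e2222", 0.25))   # 25% opacity
```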

Multi-line text

You can manually break lines of text by separating each line of text with the newline character (%0A). For example, adding the text string "Pretty Flowers" in Verdana bold with a size of 50 pixels at a distance of 10 pixels from the left border of the base video, where each word appears on a new line with line spacing of -15 pixels:

Auto-line breaks

Cloudinary can also automatically wrap your text into multiple lines based on a specified maximum width for the text string. To do this, apply the fit crop mode to the text layer and specify the width to use for word wrapping. This setting tells Cloudinary to automatically wrap the actual text content onto a new line once the width is reached.

c_fit (called textFit in the latest major version of some SDKs) is the only 'resize' option that can be used as a qualifier of text overlays.

For example, to add a long text string in bold Neucha font with a size of 26 pixels to the base video that wraps at a width of 400 pixels:

When using the fit (textFit in some SDKs) crop mode, you must specify a width for your text overlay, but height is optional. Line breaks are applied as needed to achieve the requested width and/or height rectangle.

The specified font size of your overlay stays as is, even if the resulting text overlay height exceeds the height of its hosting video. So, if you don't limit the overlay height, the height of the image expands to accommodate large texts:

If you do limit the height of your overlay, any text that does not fit within the space defined is cut and an ellipsis (...) is added to the end of the text string to indicate that the text was truncated.

To define a maximum height for the multi-line text add the height parameter in addition to width in the 'resize' transformation of your text layer:

You can also set text alignment and line spacing values to further control the text's appearance. Other resize parameters can be applied as an action over the entire overlay (before the fl_layer_apply) to resize the resulting text-image overlay as a whole after it's created.

For example, to add a long text string in center-aligned bold Times font with a size of 14 pixels to the base video, that wraps at a width of 200 pixels and is limited to a height of 150 pixels; and then rotate the text by 9 degrees and place it 30 pixels from the north border to better align with the underlying video:
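A hedged Python sketch of how such a wrapped text layer might be assembled: c_fit and the width (and optional height) are added as qualifiers in the same component as the text layer, and anything that should act on the finished text image, such as rotation, goes in later components before fl_layer_apply. The helper name, the a_9 rotation component, and the ordering of qualifiers inside the component are assumptions for illustration. Characters that need double-escaping are rejected rather than handled here.

```python
from urllib.parse import quote


def wrapped_text_layer(text, font, size, width, height=None, style=""):
    """Build a text layer component that wraps at a given width.

    Hypothetical helper; qualifier order is an assumption.
    """
    if any(ch in ",/%" for ch in text):
        raise ValueError("text needs double-escaping, which this sketch omits")
    spec = f"{font}_{size}" + (f"_{style}" if style else "")
    qualifiers = ["c_fit", f"w_{width}"]          # width is required with c_fit
    if height is not None:
        qualifiers.append(f"h_{height}")          # optional height limit
    # Percent-encode spaces and other URL-unsafe characters in the text.
    layer = f"l_text:{spec}:{quote(text, safe='')}"
    return ",".join(qualifiers + [layer])


layer = wrapped_text_layer("Lorem ipsum dolor", "Times", 14, 200, 150, "bold_center")
print("/".join([layer, "a_9", "fl_layer_apply,g_north,y_30"]))
```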

Special characters

Text strings containing special characters need to be modified (escaped) for use with the text overlay feature. This is relevant for any special characters that would not be allowed "as is" in a valid URL path, as well as other special Unicode characters. These text strings should be escaped using %-based UTF-8 encoding to ensure the text string is valid (for example, replace ? with %3F and use %20 for spaces between words). This encoding is done automatically when embedding images using the Cloudinary SDK helper methods and only needs to be done when manually building the asset delivery URL.

Additionally, to include a comma (,), forward slash (/), percent sign (%) or an emoji character in a text overlay, you must double-escape the % sign within those codes. For example:

  • Add a comma to a text overlay as %252C (and not just %2C).

  • The escaped URL code for the flower emoji is %E2%9D%80. To include this emoji in a text overlay, you must also escape each of the % signs in the escaped code: l_text:Arial_80:Comfort%25E2%259D%2580:
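When building URLs by hand, the escaping rules above could be applied with a helper along the lines of this Python sketch. The function name escape_overlay_text is hypothetical, and treating every non-ASCII character the same way as an emoji is an assumption made to keep the sketch short.

```python
from urllib.parse import quote

# Characters whose percent-codes must themselves be escaped.
DOUBLE_ESCAPE = set(",/%")


def escape_overlay_text(text):
    """Percent-encode text for a text overlay, double-escaping where needed.

    Hypothetical helper; non-ASCII characters are all treated like emojis.
    """
    out = []
    for ch in text:
        encoded = quote(ch, safe="")             # UTF-8 percent-encoding
        if ch in DOUBLE_ESCAPE or ord(ch) > 127:
            # Escape each '%' of the code again: %2C becomes %252C.
            encoded = encoded.replace("%", "%25")
        out.append(encoded)
    return "".join(out)


print("l_text:Arial_80:" + escape_overlay_text("Comfort\u2740"))
print(escape_overlay_text("Why not?"))
print(escape_overlay_text("salt, pepper"))
```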

Custom fonts

By default, only universally available fonts are supported for text overlays. However, if you want to use a non-standard font, you can upload it to Cloudinary as a raw, authenticated file and then specify the font's full public_id (including extension) as the font for your overlay:

Custom font guidelines

  • .ttf, .otf and .woff2 font types are supported.
  • Custom fonts must be uploaded as raw, authenticated files.

    You can upload custom fonts via the Media Library by creating (or using an existing) signed upload preset where the Delivery type option in the preset is set as Authenticated. You can use this upload preset when uploading files to the Media Library by configuring it as the default Media library upload preset for Raw files.

    Alternatively, you can select the signed upload preset you create for custom fonts in the Media Library upload widget's Advanced settings while uploading assets, if that option is enabled for your account.

  • If your custom font's public ID includes slashes, specify the public ID path using colons as separators. For example: path1:path2:myfont.ttf.

  • Make sure to include the file extension when referencing the public_id of the raw file. The extension must be specified in lower-case letters.

  • To make use of bold or italic font styles, upload separate font files for each emphasis style and specify the relevant file in the overlay transformation.

  • A custom font is available only to the specific product environment where it was uploaded.

  • Underscores are not supported in custom font names. When uploading the font as a raw file, make sure the public_id does not include an underscore.

  • As with any asset you upload to Cloudinary, it is your responsibility to make sure you have the necessary license and redistribution rights for any custom fonts you use.
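Several of the guidelines above can be checked before a URL is built. The Python sketch below validates a custom font's public ID against those rules and then forms the text layer; the helper name and the fonts/AlexBrush-Regular.ttf public ID are hypothetical, and the text is assumed to be already escaped.

```python
def custom_font_text_layer(font_public_id, size, text):
    """Build 'l_text:<font public ID>_<size>:<text>' for a custom font.

    Hypothetical helper; text must already be URL-escaped.
    """
    _, dot, ext = font_public_id.rpartition(".")
    # The extension is required and must be lower case.
    if not dot or ext not in ("ttf", "otf", "woff2"):
        raise ValueError("public ID must end in .ttf, .otf or .woff2 (lower case)")
    # Underscores are not supported in custom font names.
    if "_" in font_public_id:
        raise ValueError("custom font public IDs must not contain underscores")
    # Folder separators become colons, as with other layer public IDs.
    font = font_public_id.replace("/", ":")
    return f"l_text:{font}_{size}:{text}"


print(custom_font_text_layer("fonts/AlexBrush-Regular.ttf", 100, "Hello"))
```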

Predefined text templates

Instead of specifying the styling parameters every time you need to dynamically add a text overlay to an asset, you can use the public ID of a text image created with the text method of the upload API. The same styles that were used to create the text image will also be dynamically applied to the text overlay. The default text string of the text image is also used unless you provide a new text string, which can be useful if you don't want the text string to appear in the URL, or if the text string is very long.

For example, adding the text string "Stylish text" using the same styling applied in creating the text image named sample_text_style (Roboto font, 82 size, bold and red):

Text layer flags

The text content for text layers is often supplied in real time by your application users or another external source. You may want to use the following flags to help handle these scenarios:

  • fl_disallow_overflow: As described in layer overflow behavior, you can control whether large image or text layers will result in expanding the size of the delivered asset using the fl_no_overflow flag.

    However, for text overlays, if you don't want long text to impact the expected delivery asset size, but an unexpected trim might risk cutting off essential text, you can apply the fl_disallow_overflow flag, which will cause URLs with overflowing text layers to fail and return a 400 (bad request) error that you can check for and handle in your application.

    For more details and examples, see fl_no_overflow and fl_disallow_overflow in the Transformation Reference.

  • fl_text_no_trim: By default, text layers are tightly trimmed on all sides. In some cases, especially if you add a border around the text, or you are using a gravity for your text layer that might place the text too close to the edge of the layer behind it, you can use the fl_text_no_trim flag to add a small amount of padding around the text overlay string. For example:

When placing a background behind text overlays (e.g., l_text:Arial_100:Flowers,b_green), Cloudinary automatically adds this padding, so this padding flag isn't necessary.

Image underlays

Add an underlay image under a partially-transparent base video with the underlay parameter (u in URLs) and the public ID of a previously uploaded image (e.g., u_background for an image with the public ID of background), with the following general syntax.

You can determine the dimension of the underlay using width and height, and adjust the location of the base video over the underlay using the gravity parameter and the x and y parameters. The underlay can also be further transformed like any other image uploaded to Cloudinary, and the underlay parameter supports the same features as for overlays as described above.

For example, add an underlay of an image called site_bg to the base video. The underlay and base video are both resized to the same width and height, and the brightness is increased to 100 using the brightness effect (c_fill,h_200,w_200/u_site_bg/c_scale,h_200,w_200/e_brightness:100/fl_layer_apply):

If the public ID of an image includes slashes (e.g., the public ID of an image is layers/blue), replace the slashes with colons when using the image as an underlay (e.g., the public ID of the image becomes layers:blue when used as an underlay).


You can use a standard image layer for the purpose of applying a watermark to any delivered video. Opacity and/or brightness transformations are often applied to image layers when they are used as a watermark.

You can also use the smart anti-removal effect with your layer transformation in order to achieve your watermark requirements.

Smart anti-removal

You can use the anti-removal effect (e_anti_removal in URLs) to slightly modify image overlays in a random manner, thus making them much harder to remove (e.g., adding your logo as a watermark to assets). In most cases, the default level of modification is designed to be visually hard to notice, but still difficult to remove cleanly. If needed, you can optionally control the level of distortion that this transformation applies by adding a colon followed by an integer (the higher the number, the more the image is distorted). The anti_removal effect is added in the same component as the layer_apply flag.

For example, adding the anti-removal effect (with a high level of 90 for demonstration purposes) to an overlay of the image called cloudinary_icon_blue added to the north-east corner of the base asset, with the overlay's opacity set to 50% and scaled to a width of 150 pixels (c_scale,w_500/l_cloudinary_icon_blue/c_scale,w_150/o_50/e_anti_removal:90,fl_layer_apply,g_north_east):
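The closing component in that example can be sketched in Python as follows. The function name anti_removal_apply is hypothetical; the sketch only shows that the effect, with or without a level, shares a component with fl_layer_apply and any placement qualifiers.

```python
def anti_removal_apply(level=None, gravity=None):
    """Build the closing component with the anti-removal effect included.

    Hypothetical helper. level=None uses the default modification level.
    """
    if level is not None and not isinstance(level, int):
        raise ValueError("level must be an integer")
    effect = "e_anti_removal" if level is None else f"e_anti_removal:{level}"
    parts = [effect, "fl_layer_apply"]
    if gravity:
        parts.append(f"g_{gravity}")
    return ",".join(parts)


chain = ["c_scale,w_500", "l_cloudinary_icon_blue", "c_scale,w_150", "o_50",
         anti_removal_apply(90, "north_east")]
print("/".join(chain))
```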

Adding this effect generates a different result for each derived video, even if the transformation is the same. For example, every transformation URL including an overlay and the anti-removal effect, where only the public_id of the base video is changed, will result in a slightly different overlay.

See full syntax: e_anti_removal in the Transformation Reference.

Special layer applications

In addition to the primary use of layers for placing other assets or text on the base video, some transformation features make use of the layer option to specify a public ID that will be applied to the base video in order to achieve a desired effect. The following features make use of the layer transformation parameter in a special way:

Feature | Description
3D LUTs | 3D lookup tables (3D LUTs) are used to map the color space in a LUT layer to the color space in a base video.
Concatenating media | Concatenate a video layer to the beginning or end of the base video instead of layering it as a video in video effect.
