Video generation (Beta)

Last updated: Dec-20-2022

Important
Video generation is currently in Beta. There may be minor changes to parameter names or other implementation details before the general access release. We invite you to try it out. We would appreciate any feedback via our support team.

Overview

Using Cloudinary's video generation feature you can programmatically create videos at scale. Create a manifest in JSON format to provide a flexible and reusable template allowing you to generate multiple videos using different assets, colors and text.

To create a video you need to supply a manifest_json to the create_video endpoint of the Upload API.

This is the manifest_json that was used to create the video above:


You can use the same manifest as a template from which to create new videos:

Try create_video in Postman

There is no SDK support yet for create_video so you need to call the REST API directly. To help you, you can use our Postman collection.

Key concepts

The key concepts are based on common video editing tools.

Figure: Example video editing tool showing tracks on a timeline with multiple clips in each track.

Tracks

You can think of a track as a layer of your video. A track can contain a number of clips, which play sequentially in time. You can control the position of a track in space and time using keyframes.

Clips

Clips are the media elements that sit inside tracks. Clips can be of type image, video, text or color. You can apply effects to clips, and transitions between clips.

Keyframes

Keyframes are specific moments in time within a track. For each keyframe you can specify an x,y position of the track, and its opacity. Using opacity, you can make a track visible or invisible, so it only appears at a particular time in the video.

Transitions

Video creation supports GL transitions, which specify how one clip blends into the next. Transitions can be used at the start and end of tracks as well, by using color clips with a transparent color.

Effects

Effects can be added to any clip. Currently, only the Ken Burns effect is supported, which allows you to zoom in and pan from a full frame to a new center, or zoom out to full frame.

Nested tracks

Tracks can be defined and nested as clips within a parent track. This allows you to apply transitions between tracks as if they were clips, and also to apply effects to tracks.

Create a manifest

To create a manifest you need to have an idea of what your end video will look like, then plan out what layers (tracks) you'll need.

The following instructions walk through how to create the video in the Overview section. To follow along, create a blank .json file and paste in the code snippets. Or, you could build on the blank template in the Side-by-side editor.

Specify the global video parameters

The global parameters specify the dimensions of the overall video, its duration in seconds and the frame rate:

{
  "type": "video",
  "width": 1280,
  "height": 720,
  "duration": 10,
  "fps": 30
}
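
If you build manifests in code rather than by hand, a JSON serializer helps avoid syntax slips such as trailing commas. The following is a minimal Python sketch, not part of any Cloudinary SDK. The `total_frames` figure is only illustrative arithmetic derived from the duration and frame rate.

```python
import json

# Global parameters from the step above.
manifest = {
    "type": "video",
    "width": 1280,
    "height": 720,
    "duration": 10,  # seconds
    "fps": 30,
}

# json.dumps always emits strict JSON (no trailing commas, no comments).
manifest_json = json.dumps(manifest, indent=2)

# Illustrative arithmetic only: number of frames = duration * fps.
total_frames = manifest["duration"] * manifest["fps"]

print(manifest_json)
print(total_frames)
```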

Add the first track

The first track is made up of one clip, a colored rectangle. This takes on the dimensions of the overall video, and is the bottom layer of the video.

{
  "type": "video",
  "width": 1280,
  "height": 720,
  "duration": 10,
  "fps": 30,

  "tracks": [
    {
      "clips": [
        {
          "color": "#8e998a",
          "type": "color"
        }
      ]
    }
  ]
}

This is how this track looks - just a plain background playing for 10 seconds:

Add the second track

The second track contains an image clip. This image is stored in our demo product environment, so if using the Side-by-side editor, ensure that in the Settings you set the Cloud Name to demo. Otherwise, you can upload the image to your own product environment, and make sure the public ID in the media array matches the one you've uploaded (i.e. replace docs/retirement with the public ID of the image in your product environment).

The image is resized using the fill cropping action to fill the dimensions of the track, and is positioned in the top left corner of the overall video. As its height is the same as the overall video, but its width is less, it appears to take up the area on the left side of the video, overlaid on track 1. The duration of the clip is set in the clipDefaults section for reasons that are explained in the next step.

{
  // Global parameters
  ...

  "tracks": [
    {
      // Track one
      ...
    },
    {
      "width": 920,
      "height": 720,
      "x": 0,
      "y": 0,
      "clipDefaults": {
        "clipDuration": 10000
      },
      "clips": [
        {
          "media": [
            "docs/retirement",
            "image",
            "upload"
          ],
          "type": "image",
          "transformation": "c_fill"
        }
      ]
    }
  ]
}

Here's how the two tracks look together:

Add an effect to the clip

To make the image appear to zoom in and out, duplicate the clip and change the duration of each clip to four seconds. This is why the duration is set in clipDefaults: it doesn't need to be repeated in each clip. Add a clipEffect to clipDefaults that alternately zooms in and out for each clip.

{
  // Global parameters
  ...

  "tracks": [
    {
      // Track one
      ...
    },
    {
      "width": 920,
      "height": 720,
      "x": 0,
      "y": 0,
      "clipDefaults": {
        "clipDuration": 4000,
        "clipEffect": {
          "name": "KenBurns",
          "easing": "linear",
          "zoom": 1.2,
          "center": [
            0.5,
            0.5
          ],
          "direction": "alternate"
        }
      },
      "clips": [
        {
          "media": [
            "docs/retirement",
            "image",
            "upload"
          ],
          "type": "image",
          "transformation": "c_fill"
        },
        {
          "media": [
            "docs/retirement",
            "image",
            "upload"
          ],
          "type": "image",
          "transformation": "c_fill"
        }
      ]
    }
  ]
}

Here's the video with the clip effect:

Add the third track

The third track is the rectangle that starts to appear after two seconds, and moves down for one second to its final position. Keyframes are defined to enable this animation.

{
  // Global parameters
  ...

  "tracks": [
    {
      // Track one
      ...
    },
    {
      // Track two
      ...
    },
    {
      "keyframes": {
        "0": {
          "y": 430,
          "opacity": 0
        },
        "2000": {
          "y": 430,
          "opacity": 0
        },
        "3000": {
          "y": 470,
          "opacity": 1
        }
      },
      "x": 950,
      "width": 300,
      "height": 30,
      "radius": "15%",
      "clips": [
        {
          "border": 3,
          "borderColor": "white",
          "type": "color"
        }
      ]
    }
  ]
}

Here's how the video looks now:

Add the fourth track

Next up is the text that goes inside the rectangle. This is animated in the same way as the third track, but five pixels lower.

{
  // Global parameters
  ...

  "tracks": [
    {
      // Track one
      ...
    },
    {
      // Track two
      ...
    },
    {
      // Track three
      ...
    },
    {
      "keyframes": {
        "0": {
          "y": 435,
          "opacity": 0
        },
        "2000": {
          "y": 435,
          "opacity": 0
        },
        "3000": {
          "y": 475,
          "opacity": 1
        }
      },
      "x": 950,
      "width": 300,
      "clipDefaults": {
        "fontSize": 20,
        "fontColor": "white",
        "textMaxWidth": 300,
        "textAlign": "center"
      },
      "clips": [
        {
          "text": "Read more",
          "type": "text"
        }
      ]
    }
  ]
}

Here's the video:

Add the fifth track

The fifth track is the text that says "Elder Care & Nursing Home". Again, this uses keyframes to make the text appear after one second and drift up for one second.

{
  // Global parameters
  ...

  "tracks": [
    {
      // Track one
      ...
    },
    {
      // Track two
      ...
    },
    {
      // Track three
      ...
    },
    {
      // Track four
      ...
    },
    {
      "x": 950,
      "keyframes": {
        "0": {
          "y": 290,
          "opacity": 0
        },
        "1000": {
          "y": 290,
          "opacity": 0
        },
        "2000": {
          "y": 280,
          "opacity": 1
        }
      },
      "clipDefaults": {
        "fontSize": 14,
        "fontColor": "white"
      },
      "clips": [
        {
          "text": "Elder Care & Nursing Home",
          "type": "text"
        }
      ]
    }
  ]
}

Here are the five tracks in the video:

Add the final track

The final track is the main block of text. For this, a textArea clip is used.

{
  // Global parameters
  ...

  "tracks": [
    {
      // Track one
      ...
    },
    {
      // Track two
      ...
    },
    {
      // Track three
      ...
    },
    {
      // Track four
      ...
    },
    {
      // Track five
      ...
    },
    {
      "x": 950,
      "width": 300,
      "height": 300,
      "keyframes": {
        "0": {
          "y": 290
        },
        "1000": {
          "y": 300
        }
      },
      "clipDefaults": {
        "textAlign": "left",
        "fontSize": 32,
        "fontType": "Merriweather",
        "fontColor": "white"
      },
      "clips": [
        {
          "text": "The more we care, the more beautiful life becomes.",
          "type": "textArea"
        }
      ]
    }
  ]
}

Here's the whole video with all six tracks:
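
The snippets above elide earlier tracks with "...", so the complete manifest never appears in one piece. As a convenience, here is a Python sketch that assembles all six tracks using only the values from the steps above and serializes them to strict JSON. The helper name `animated` is hypothetical and exists only to avoid repeating the keyframe pattern shared by tracks three to five.

```python
import json

def animated(y_start, y_end, t_start, t_end):
    """Keyframes that hide a track until t_start, then fade it in while it moves."""
    return {
        "0": {"y": y_start, "opacity": 0},
        str(t_start): {"y": y_start, "opacity": 0},
        str(t_end): {"y": y_end, "opacity": 1},
    }

image_clip = {
    "media": ["docs/retirement", "image", "upload"],
    "type": "image",
    "transformation": "c_fill",
}

manifest = {
    "type": "video", "width": 1280, "height": 720, "duration": 10, "fps": 30,
    "tracks": [
        # Track one: plain background
        {"clips": [{"color": "#8e998a", "type": "color"}]},
        # Track two: image shown twice with an alternating Ken Burns effect
        {
            "width": 920, "height": 720, "x": 0, "y": 0,
            "clipDefaults": {
                "clipDuration": 4000,
                "clipEffect": {"name": "KenBurns", "easing": "linear", "zoom": 1.2,
                               "center": [0.5, 0.5], "direction": "alternate"},
            },
            "clips": [dict(image_clip), dict(image_clip)],
        },
        # Track three: outlined rectangle
        {
            "keyframes": animated(430, 470, 2000, 3000),
            "x": 950, "width": 300, "height": 30, "radius": "15%",
            "clips": [{"border": 3, "borderColor": "white", "type": "color"}],
        },
        # Track four: text inside the rectangle
        {
            "keyframes": animated(435, 475, 2000, 3000),
            "x": 950, "width": 300,
            "clipDefaults": {"fontSize": 20, "fontColor": "white",
                             "textMaxWidth": 300, "textAlign": "center"},
            "clips": [{"text": "Read more", "type": "text"}],
        },
        # Track five: small heading
        {
            "x": 950,
            "keyframes": animated(290, 280, 1000, 2000),
            "clipDefaults": {"fontSize": 14, "fontColor": "white"},
            "clips": [{"text": "Elder Care & Nursing Home", "type": "text"}],
        },
        # Track six: main block of text
        {
            "x": 950, "width": 300, "height": 300,
            "keyframes": {"0": {"y": 290}, "1000": {"y": 300}},
            "clipDefaults": {"textAlign": "left", "fontSize": 32,
                             "fontType": "Merriweather", "fontColor": "white"},
            "clips": [{"text": "The more we care, the more beautiful life becomes.",
                       "type": "textArea"}],
        },
    ],
}

manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```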

Define variables

The video is now complete, but the manifest cannot yet be used as a template from which to create more videos. In order to unleash the full potential of this manifest, add variables so that you can change different elements of the video.

Define these variables before the tracks:

{
  // Global parameters
  ...

  "vars": {
    "bgColor": "#8e998a",
    "imageUrl": "docs/retirement",
    "sponsoredText": "Elder Care & Nursing Home",
    "titleText": "The more we care, the more beautiful life becomes.",
    "ctaText": "Read more"
  },
  "tracks": [
    {
      // Track one
      ...
    },
    {
      // Track two
      ...
    },
    {
      // Track three
      ...
    },
    {
      // Track four
      ...
    },
    {
      // Track five
      ...
    },
    {
      // Track six
      ...
    }
  ]
}

Now replace the relevant values in the tracks with variables. For example, use bgColor in the first track:

{
  // Global parameters
  ...

  "vars": {
    "bgColor": "#8e998a",
    "imageUrl": "docs/retirement",
    "sponsoredText": "Elder Care & Nursing Home",
    "titleText": "The more we care, the more beautiful life becomes.",
    "ctaText": "Read more"
  },
  "tracks": [
    {
      "clips": [
        {
          "color": "{{bgColor}}",
          "type": "color"
        }
      ]
    },
    {
      // Track two
      ...
    },
    {
      // Track three
      ...
    },
    {
      // Track four
      ...
    },
    {
      // Track five
      ...
    },
    {
      // Track six
      ...
    }
  ]
}
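
To check locally what a template resolves to before uploading it, you can emulate the substitution yourself. The sketch below is an illustration of the double-curly-brace semantics described here, not Cloudinary's implementation. The function names `substitute` and `preview` are hypothetical, and how the service treats undefined variables is not covered by this page (this sketch raises a KeyError).

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

def substitute(node, values):
    """Recursively replace {{name}} inside every string of a parsed manifest."""
    if isinstance(node, str):
        return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), node)
    if isinstance(node, list):
        return [substitute(item, values) for item in node]
    if isinstance(node, dict):
        return {key: substitute(value, values) for key, value in node.items()}
    return node  # numbers, booleans and None are left untouched

def preview(manifest, overrides=None):
    """Resolve a template using its "vars" defaults, optionally overridden."""
    values = {**manifest.get("vars", {}), **(overrides or {})}
    body = {key: value for key, value in manifest.items() if key != "vars"}
    return substitute(body, values)

template = {
    "vars": {"bgColor": "#8e998a"},
    "tracks": [{"clips": [{"color": "{{bgColor}}", "type": "color"}]}],
}

default_video = preview(template)
purple_video = preview(template, {"bgColor": "mediumpurple"})
print(default_video["tracks"][0]["clips"][0]["color"])
print(purple_video["tracks"][0]["clips"][0]["color"])
```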

Now that you have your manifest, you can generate a video using the create_video endpoint of the Upload API. You can also create a CLT template file from your manifest and use the variables to generate different videos.

Generate a video using the create_video endpoint

You can generate a video using the create_video endpoint of the Upload API.

For example:

curl https://api.cloudinary.com/v1_1/demo/video/create_video -X POST --data 'public_id=test_video&resource_type=video&manifest_json={ "type": "video", "width": 1280, "height": 720, "duration": 10, "fps": 30, "tracks": [ { "clips": [ { "color": "%238e998a", "type": "color" } ] }, { "width": 920, "height": 720, "x": 0, "y": 0, "clipDefaults": { "clipDuration": 10000, "clipEffect": { "name": "KenBurns", "easing": "easeinout", "zoom": 1.2, "center": [ 0.5, 0.5 ], "direction": "backward" } }, "clips": [ { "media": [ "docs/retirement", "image", "upload" ], "type": "image", "transformation": "c_fill" } ] } ] }&timestamp=173719931&api_key=436464676&signature=a781d61f86a6f818af'

You can use an unsigned upload preset instead of signing the request.

Tip
You can see examples of signed and unsigned requests in the Postman collection.
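
As there is no SDK support, you may want to prepare the signed request body in code. The sketch below assumes that create_video follows the general signing scheme of Cloudinary's Upload API (parameters sorted by name, joined as key=value pairs with &, the API secret appended, then hashed with SHA-1, leaving file, cloud_name, resource_type and api_key out of the signed string). That assumption is not confirmed on this page, so check it against the Postman collection. The function names are hypothetical, and nothing is sent over the network.

```python
import hashlib
import json
import time
from urllib.parse import urlencode

def sign_params(params, api_secret):
    """SHA-1 signature over the sorted parameters (assumed Upload API scheme)."""
    excluded = {"file", "cloud_name", "resource_type", "api_key"}
    to_sign = "&".join(
        f"{name}={value}"
        for name, value in sorted(params.items())
        if name not in excluded and value not in (None, "")
    )
    return hashlib.sha1((to_sign + api_secret).encode("utf-8")).hexdigest()

def build_request(cloud_name, api_key, api_secret, manifest, public_id, timestamp=None):
    """Return the endpoint URL and a form-encoded body for create_video."""
    params = {
        "public_id": public_id,
        "manifest_json": json.dumps(manifest, separators=(",", ":")),
        "timestamp": int(time.time() if timestamp is None else timestamp),
    }
    signature = sign_params(params, api_secret)
    # urlencode percent-encodes reserved characters, e.g. "#" becomes "%23".
    body = urlencode({**params, "resource_type": "video",
                      "api_key": api_key, "signature": signature})
    url = f"https://api.cloudinary.com/v1_1/{cloud_name}/video/create_video"
    return url, body

# Placeholder credentials, for illustration only.
url, body = build_request(
    "demo", "my_api_key", "my_api_secret",
    {"type": "video", "width": 1280, "height": 720,
     "tracks": [{"clips": [{"color": "#8e998a", "type": "color"}]}]},
    public_id="test_video", timestamp=173719931,
)
print(url)
print(body)
```

The resulting body can be posted with any HTTP client, for example as the --data argument to curl.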

Create a CLT template file

To create a CLT template file:

  1. Copy your manifest_json into a file called CltManifest.json.
  2. Zip up the CltManifest.json file. You can give the ZIP file any name, e.g., care-template.zip.
  3. Upload the ZIP file to Cloudinary as a video or auto asset type. If using the Media Library to upload the ZIP file, it will automatically be recognized as a CLT file, and uploaded as a video asset type, making its delivery URL of the form:
    https://res.cloudinary.com/<cloud_name>/video/upload/<file_name>.clt.
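
Steps 1 and 2 can also be scripted. This is a minimal sketch using Python's standard zipfile module. The function name `build_clt_zip` and the sample manifest are placeholders, and uploading the resulting file (step 3) is left to whichever upload method you use.

```python
import io
import json
import zipfile

def build_clt_zip(manifest):
    """Return the bytes of a ZIP archive containing only CltManifest.json."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("CltManifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()

# Placeholder manifest; substitute the manifest you built above.
zip_bytes = build_clt_zip({
    "type": "video", "width": 1280, "height": 720, "duration": 10, "fps": 30,
    "vars": {"bgColor": "#8e998a"},
    "tracks": [{"clips": [{"color": "{{bgColor}}", "type": "color"}]}],
})

# To save the archive to disk:
# with open("care-template.zip", "wb") as f:
#     f.write(zip_bytes)
print(len(zip_bytes) > 0)
```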

Generate a video from a CLT template file

To generate a video from a CLT template file:

  1. Start with the delivery URL of the CLT template, e.g.: https://res.cloudinary.com/demo/video/upload/docs/care-template.clt
  2. Set the extension to a video format, such as .mp4, e.g.: https://res.cloudinary.com/demo/video/upload/docs/care-template.mp4
  3. Optionally set any global settings that you want to apply to the video and any variables that you want to change using the custom_function transformation parameter with the function type set to render (fn_render in URLs), e.g.: fn_render:fps_25;vars_(bgColor_mediumpurple;imageUrl_oldhands;titleText_Loving%20and%20caring%20always%20go%20hand%20in%20hand)

For example:
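
A URL of this shape can be assembled programmatically. The sketch below assumes the fn_render component sits between the delivery type and the public ID, as transformations do in other Cloudinary delivery URLs. It also percent-encodes variable values in full, which matches the %20 encoding in the example above but is untested for values that contain slashes. The function name `render_url` is hypothetical.

```python
from urllib.parse import quote

def render_url(cloud_name, public_id, video_format="mp4", settings=None, variables=None):
    """Build a delivery URL that renders a CLT template, optionally with fn_render."""
    parts = [f"{name}_{value}" for name, value in (settings or {}).items()]
    if variables:
        pairs = ";".join(
            f"{name}_{quote(str(value), safe='')}" for name, value in variables.items()
        )
        parts.append(f"vars_({pairs})")
    transformation = f"fn_render:{';'.join(parts)}/" if parts else ""
    return (f"https://res.cloudinary.com/{cloud_name}/video/upload/"
            f"{transformation}{public_id}.{video_format}")

plain = render_url("demo", "docs/care-template")
custom = render_url(
    "demo", "docs/care-template",
    settings={"fps": 25},
    variables={
        "bgColor": "mediumpurple",
        "imageUrl": "oldhands",
        "titleText": "Loving and caring always go hand in hand",
    },
)
print(plain)
print(custom)
```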

Side-by-side editor and viewer

You can try out some example templates and create your own, using the Side-by-side editor and viewer.

Tip
Ensure that in the Settings you set the Cloud Name to demo.

Reference

Global video parameters

Global video parameters apply to the video as a whole.

Parameter Description
width Required. The width of the video in pixels.
height Required. The height of the video in pixels.
duration The duration of the video in seconds. Default: 10.
frame_rate The frame rate (frames per second) of the video. Default: 20.

Example:

{
    "width": 1280,
    "height": 720,
    "duration": 12,
    "frame_rate": 30,
    ...
}

Track parameters

Track parameters specify the behavior of a track. Tracks can be thought of as layers. The first track in the manifest is the bottom layer, and the last track is the top layer.

Parameter Description
x The horizontal position of the track in pixels relative to the top left of the video.
y The vertical position of the track in pixels relative to the top left of the video.
width The width of the track in pixels.
height The height of the track in pixels.
startOffset The time at which the track starts relative to the whole video.
endOffset The time at which the track ends relative to the whole video.
radius A value indicating how much to round the corners of the track. Specify the number of pixels (e.g. 100) or a percentage (e.g. 50%).
keyframes See Keyframes.
clipDefaults The default parameters to be used by all clips in the track. Specify any of the relevant clip parameters.
clips See Clip parameters.

Example:

"tracks": [
  {
      "radius": "5%",
      "clips": [
          { 
              "color": "#639DD5", 
              "type": "color"  
          }
      ]
  },
  {
    "keyframes": {
        "0": {
            "y": 300,
            "opacity": 0
        },
        "2000": {
            "y": 270,
            "opacity": 1
        }
    },
    "x": 750,
    "clips": [
        {
            "text": "1917, Private collection",
            "type": "text",
            "startOffset": 0,
            "endOffset": 3500,
            "fontSize": 24,
            "fontColor": "#ffd"
        }
    ]
  }
]

Keyframes

Keyframes are specific moments in time within a track. Each keyframe is an object with a key specifying the time. Specify either:

  • The number of milliseconds from the start of the track (e.g. 2000).
  • A percentage relative to the track duration (e.g. 40%).

If specifying keyframes, you must specify one for time 0.
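
The two key formats and the time-0 rule can be checked before submitting a manifest. The following sketch converts both formats to milliseconds. The function name `keyframe_times_ms` is hypothetical, and the track duration has to be supplied by the caller because this page does not describe how the service derives it.

```python
def keyframe_times_ms(keyframes, track_duration_ms):
    """Map keyframe keys ("2000" or "40%") to milliseconds, sorted by time."""
    times = {}
    for key, values in keyframes.items():
        if key.endswith("%"):
            # Percentage of the track duration.
            time_ms = float(key[:-1]) * track_duration_ms / 100
        else:
            # Milliseconds from the start of the track.
            time_ms = float(key)
        times[time_ms] = values
    if 0 not in times:
        raise ValueError("keyframes must include an entry for time 0")
    return dict(sorted(times.items()))

normalized = keyframe_times_ms(
    {"40%": {"y": 270, "opacity": 1}, "0": {"y": 300, "opacity": 0}},
    track_duration_ms=5000,
)
print(normalized)
```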

The parameters of each key are:

Parameter Description
x The horizontal position of the track in pixels relative to the top left of the video.
y The vertical position of the track in pixels relative to the top left of the video.
opacity The level of opacity. 1 means opaque, while 0 is completely transparent. Range: 0 to 1.

Example:

"keyframes": {
  "0": {
      "y": 300,
      "opacity": 0
  },
  "2000": {
      "y": 270,
      "opacity": 1
  },
  "3000": {
      "y": 270,
      "opacity": 1
  },
  "4000": {
      "y": 300,
      "opacity": 0
  }
},
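
To reason about where a track is between two keyframes, it can help to compute intermediate values. This page does not state how values are interpolated, so the sketch below assumes simple linear interpolation and holds the last value after the final keyframe. Treat the numbers as an approximation of the rendered result. The function name `value_at` is hypothetical, and only millisecond keys are handled.

```python
def value_at(keyframes, time_ms, prop):
    """Linearly interpolate a keyframe property (assumed behavior) at time_ms."""
    points = sorted(
        (int(key), values[prop]) for key, values in keyframes.items() if prop in values
    )
    if time_ms <= points[0][0]:
        return points[0][1]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= time_ms <= t1:
            return v0 + (v1 - v0) * (time_ms - t0) / (t1 - t0)
    return points[-1][1]  # after the last keyframe, hold its value

# The keyframes from the example above.
keyframes = {
    "0": {"y": 300, "opacity": 0},
    "2000": {"y": 270, "opacity": 1},
    "3000": {"y": 270, "opacity": 1},
    "4000": {"y": 300, "opacity": 0},
}

print(value_at(keyframes, 1000, "y"))        # halfway through the fade-in
print(value_at(keyframes, 1000, "opacity"))
print(value_at(keyframes, 3500, "y"))        # halfway through the fade-out
```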

Clip parameters

Clips can be of type: image, video, text, textArea, color or nested.

The parameters to use in the clip depend on the type.

Image and video clip parameters

Use the image or video type for media clips.

Parameter Description
type Required. The type of clip. Possible values: image, video.
media Required. Identifies the image or video: its URL or public ID, the asset type (image or video), and the delivery type (e.g. upload or fetch). The examples on this page pass it either as an array of these three values or as a single string.
transformation A Cloudinary transformation in URL syntax.
clipDuration The duration of the clip in milliseconds.
transitionDuration The duration of the transition between clips in milliseconds.
transition The name of the transition (from GL transitions).
clipEffect An object of effect parameters.

Example:

"clipDefaults": {
  "clipDuration": 3000,
  "transitionDuration": 1000,
  "transition": "wind",
  "clipEffect": {
    "name": "KenBurns",
    "easing": "easein",
    "zoom": 0.9,
    "center": [1, 1],
    "direction": "forward"
  }
},
"clips": [
    {
        "media": "videogen/modigliani/Young-Woman",
        "type": "image"
    },
    {
        "media": "video/upload/dog",
        "type": "video"
    },
    {
        "media": "image/upload/videogen/modigliani/Gypsy-Woman",
        "type": "image"
    }
]

Text clip parameters

Use the text type for short, one-line texts.

Parameter Description
type Required. The type of clip. Possible values: text.
text The text to display.
textMaxWidth The maximum width of the text (it will squeeze the text to fit the width).
textAlign The alignment of the text within the maximum width. Possible values: left, right, center. Default: left (or center if invalid value).
fontSize The size of the font in pixels.
fontType The type of font. Google fonts are supported.
fontColor The color of the font. Specify a name or hex code (e.g. #ffd).
textBgColor The background color of the text. Specify a name or hex code (e.g. #ffd).
textBgRadius A value indicating how much to round the corners of the background color rectangle. Specify the number of pixels (e.g. 100) or a percentage (e.g. 50%).
textPadding The amount of padding to add to the background color rectangle. Specify the number of pixels.

Example:

"clips": [
  {
    "text": "1919, National Gallery of Art Washington, D.C.",
    "type": "text",
    "textMaxWidth": 200,
    "width": 400,
    "textAlign": "center",
    "fontSize": 32,
    "fontType": "Lato",
    "fontColor": "#ffd",
    "textBgColor": "black",
    "textBgRadius": "50%",
    "textPadding": 10
  }
]

Text area clip parameters

Use the textArea type for multi-line texts.

Parameter Description
type Required. The type of clip. Possible values: textArea.
textAlign The horizontal alignment of the text within the maximum width. Possible values: left, right, center. Default: left (or center if invalid value).
textVerticalAlign The vertical alignment of the text within the maximum height. Possible values: top, middle, bottom. Default: top (or middle if invalid value).
fontSize The size of the font in pixels.
fontType The type of font. Google fonts are supported.
fontType.family The same as fontType.
fontType.weight The font weight. Possible values: regular, bold, or a number from 100 to 900.
fontColor The color of the font. Specify a name or hex code (e.g. #ffd).

Example:

{
  "startOffset": 15000,
  "clipDefaults": {
    "textAlign": "center",
    "textVerticalAlign": "middle",
    "fontSize": 72,
    "fontColor": "white",
    "fontType": "Roboto"
  },
  "clips": [
    {
      "type": "textArea",
      "text": "1919, National Gallery of Art Washington, D.C."
    }
  ]
}

Color clip parameters

Use the color type to create a colored rectangle.

Parameter Description
type Required. The type of clip. Possible values: color.
color The color to fill the rectangle. Specify a name or hex code (e.g. #ffd). You can also specify transparent to make the rectangle unfilled.
border The width of the border in pixels.
borderColor The color of the border. Specify a name or hex code (e.g. #ffd).
radius A value indicating how much to round the corners of the rectangle. Specify the number of pixels (e.g. 100) or a percentage (e.g. 50%).

Example:

"clips": [
  { 
    "color": "#639DD5", 
    "type": "color",
    "radius": "5%"
  }
]

Nested clip parameters

Use the nested type to insert a track as a clip within a parent track. The nested track is defined at the top level of the JSON.

Parameter Description
type Required. The type of clip. Possible values: nested.
nested The name of the top-level track to insert.

Example:

"clips": [
  {
    "nested": "scene1",
    "type": "nested"
  },
  {
    "nested": "scene2",
    "type": "nested"
  },
  {
    "nested": "scene3",
    "type": "nested"
  }
],
"scene1": [
  {
    "clips": [
      {
        "media": [
          "sample",
          "video",
          "upload"
        ],
        "type": "video",
        "transformation": "du_6"
      }
    ]
  },
  ...
],
"scene2": [
  ...
],
"scene3": [
  ...
]
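
Because a nested clip refers to its track by name, a misspelled name is easy to miss. The sketch below lists nested names that have no matching top-level definition. It assumes, based on the example above, that each nested track is a top-level key holding a list of tracks and that the parent clips sit inside the usual "tracks" array. The function name `missing_nested_tracks` is hypothetical.

```python
def missing_nested_tracks(manifest):
    """Return nested clip names that are not defined at the top level."""
    missing, seen = [], set()

    def visit(tracks):
        for track in tracks:
            for clip in track.get("clips", []):
                if clip.get("type") != "nested":
                    continue
                name = clip.get("nested")
                if name in seen:
                    continue  # already checked; also guards against cycles
                seen.add(name)
                if isinstance(manifest.get(name), list):
                    visit(manifest[name])  # nested tracks may nest further
                else:
                    missing.append(name)

    visit(manifest.get("tracks", []))
    return missing

manifest = {
    "tracks": [
        {"clips": [
            {"nested": "scene1", "type": "nested"},
            {"nested": "scene2", "type": "nested"},
        ]}
    ],
    "scene1": [
        {"clips": [
            {"media": ["sample", "video", "upload"], "type": "video",
             "transformation": "du_6"}
        ]}
    ],
    # "scene2" is deliberately left undefined.
}

print(missing_nested_tracks(manifest))
```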

Effects

You can apply effects to clips.

Ken Burns

The Ken Burns effect allows you to zoom in and pan from a full frame to a new center, or zoom out to full frame. It is used on images.

Parameter Description
name The name of the effect: KenBurns.
easing The type of easing. Possible values: linear, ease, easein, easeout, easeinout.
zoom The level of zoom.
center The coordinates on which to center the zoom, as [x, y], where x and y are floats representing a fraction of the width and height respectively (e.g. [0.5,0.5] is the center of the image).
direction The direction of zoom. Possible values:

  • forward: zoom in
  • backward: zoom out

Example:

"clipEffect": {
  "name": "KenBurns",
  "easing": "easein",
  "zoom": 0.9,
  "center": [1, 1],
  "direction": "forward"
},

Template vars

Using the vars key, you can specify variables that can be used in templates. Variables allow you to replace elements of your video, such as text, colors and media, to dynamically create new videos through the transformation URL.

The values set in vars are used by default if not overwritten by the transformation URL.

Variables are embedded in the manifest using the variable name surrounded by double curly braces.

Example:

{
  "vars": {
    "someColor": "red",
    "someRadius": 5
  },
  "tracks": [
    {
      "type": "color",
      "color": "{{someColor}}",
      "radius": "{{someRadius}}"
    }
  ]
}
