
Automating Translations of Asset Metadata

Given the ever-growing volume of digital assets that organizations must create and manage, finding assets depends on accurate metadata. Generating that metadata, the source text of which is added manually, becomes dauntingly labor-intensive if you must also translate it into multiple languages for various geographical regions.

This tutorial walks you through the steps of creating an automated and streamlined translation workflow for speed and efficacy.

The process calls for creating a workflow that automates the translation. To ensure the accuracy of the result, you tag the translated assets for approval and regularly generate and send a report of the new translations to the approvers.

You need the following software for this tutorial:

The sections below describe the procedure step by step. For a summary, watch the Video Recap below.

Note:

Once the Media Library has added the metadata for an existing or newly uploaded asset, the system automatically translates that data and updates the asset in only a few minutes. In case approvals for translations are required, the system automatically notifies the approvers by email, who can make corrections, if necessary, before granting approval by deleting the corresponding tag.

In this tutorial, you will store asset metadata for two values, Description and Alt, and generate translations in English, Spanish, and Hebrew.

Follow the steps below:

  1. If you don’t have one yet, sign up for a free Cloudinary account, which offers you credits for transformations, storage, and bandwidth. Alternatively, upgrade to a paid account for more credits and other features.

  2. Add the metadata fields for your assets with either the Media Library administration tool or the Metadata API.

  3. Add three fields for each value and specify the language code in the external ID. For example, add description_en, description_es, and description_he for Description. For a list of the supported languages and their corresponding codes, see the related documentation of AWS Translate.

Structured metadata fields with corresponding external IDs
  4. Create a JSON configuration file so that you can update the workflow settings without revising the workflow itself. In the file, list the metadata fields for translation, the three languages you support, and the approvers for the translations. See the example below.
{
  "metadata_fields_to_translate": ["alt", "description"],
  "languages": ["he", "es", "en"],
  "approvers": [
    { "he": [ { "email": "hebrew_approvers@example.com", "name": "Hebrew Approver" } ] },
    { "es": [ { "email": "spanish_approvers@example.com", "name": "Spanish Approver" } ] },
    { "en": [ { "email": "english_approvers@example.com", "name": "English Approver" } ] }
  ]
}
  5. Upload the configuration file to your Cloudinary account with the public ID

    /config/media-flows/translation-settings.json.
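Before uploading the file, it can help to check that it lines up with the metadata fields you created. The sketch below is only an illustration: the helper names expectedExternalIds and languagesWithoutApprovers are hypothetical and not part of Cloudinary or MediaFlows. It derives the external IDs that the workflow would look for, following the field_language convention from step 3, and reports any language that has no approver entry.

```javascript
// Hypothetical helper: list the external IDs the workflow expects,
// one per (field, language) pair, following the <field>_<language> convention.
function expectedExternalIds(config) {
  const ids = [];
  for (const field of config.metadata_fields_to_translate) {
    for (const language of config.languages) {
      ids.push(`${field}_${language}`);
    }
  }
  return ids;
}

// Hypothetical helper: report the languages that have no approver entry.
function languagesWithoutApprovers(config) {
  const covered = new Set(config.approvers.flatMap((entry) => Object.keys(entry)));
  return config.languages.filter((language) => !covered.has(language));
}

// Same content as the example configuration file in this tutorial.
const config = {
  metadata_fields_to_translate: ["alt", "description"],
  languages: ["he", "es", "en"],
  approvers: [
    { he: [{ email: "hebrew_approvers@example.com", name: "Hebrew Approver" }] },
    { es: [{ email: "spanish_approvers@example.com", name: "Spanish Approver" }] },
    { en: [{ email: "english_approvers@example.com", name: "English Approver" }] },
  ],
};

console.log(expectedExternalIds(config));
console.log(languagesWithoutApprovers(config));
```

If the second list is not empty, the report flow would have nobody to notify for those languages.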

Next, set up three media flows to manage the workflow.

The Process Webhooks media flow can initiate all the workflows that you manage in MediaFlows. In this tutorial, Process Webhooks filters out and prepares the data of the webhooks sent from Cloudinary for translation.

Media flow for Process Webhooks

Add the first flow, which contains a Catch Webhook block, to your account with the share link. Accessing that link enables you to select the project to which to add the flow. Make sure that the project is linked to the Cloudinary account you configured above.

Once you have added the flow, edit the Catch Webhook block and copy the URL. Then go to your Cloudinary console, click the Settings icon (in the form of a gear) at the top and then the Upload tab. Under Notification URL, enter the URL so that all webhook notifications will be passed to the Process Webhooks media flow. Finally, remember to save your changes before navigating away from this page.

The notification URL, which matches the webhook URL of the Process Webhooks media flow
  1. The Media Function block loads the configuration file you created above.

    Before building a new flow with a MediaFlows Function block, which is off by default, contact support@cloudinary.com with a request to switch it on. No need to do that for this tutorial, however.

  2. The Filter block filters the notifications so that the flow sends only notifications that reflect metadata changes, i.e., the types upload and resource_metadata_changed. For details on Cloudinary’s notification types, see the related documentation.

    Since upload notifications are for just one asset, the flow simply passes the asset data along the workflow. On the other hand, resource_metadata_changed notifications could mean that multiple assets have been updated, so the flow processes them one by one by using the Fan Out block.

    For simplicity, the flow adds a block that prepares the asset metadata to each of the workflows. That way, you can obtain that data with event.data.current_payload in the block in step 3 below, no matter the workflow.

  3. The next block determines the required translations based on the configuration parameters and asset metadata. If at least one language variant of the field is populated and one or more language variants of the field are empty, the block adds the field to the list of fields to be translated. Below is the code snippet along with comments.

module.exports = function (event, cloudinary, callback) {
  // The configuration file as loaded in a previous block.
  const config = event.data.mf_qVkNNZn34P2u0ExZIiAb.config;
  // Load asset data from the previous block.
  const assetData = event.data.current_payload;
  // Check if the metadata has been set; if not, end the flow.
  if (assetData.metadata !== undefined) {
    // Prepare the response in the required format for the next flow.
    const response = {
      "public_id": assetData.public_id,
      "resource_type": assetData.resource_type,
      "type": assetData.type,
      "fields_to_translate": []
    };
    // Loop over each of the metadata fields defined in the configuration file.
    for (const field of config.metadata_fields_to_translate) {
      const translationSettings = { "metadataField": field };
      // Loop over each of the languages defined in the configuration file.
      for (const language of config.languages) {
        const currentLangField = assetData.metadata[`${field}_${language}`];
        if (currentLangField === undefined) {
          // This field hasn't been set so translate it.
          if (translationSettings.outputLanguages === undefined) {
            translationSettings.outputLanguages = language;
          } else {
            translationSettings.outputLanguages += `,${language}`;
          }
        } else if (translationSettings.sourceLanguage === undefined) {
          // This field has been set and is the first language so use it as the source.
          translationSettings.sourceText = currentLangField;
          translationSettings.sourceLanguage = language;
        }
      }
      // If you have a sourceLanguage setting and at least one field to translate,
      // add this field to the list of translations.
      if (translationSettings.sourceLanguage !== undefined && translationSettings.outputLanguages !== undefined) {
        response.fields_to_translate.push(translationSettings);
      }
    }
    // Continue the flow as long as there is one field to translate.
    if (response.fields_to_translate.length > 0) {
      callback(null, response);
    }
  }
};

A MediaFlows function that determines the fields to be translated
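To see what the function decides for a given asset, you can try the selection logic outside MediaFlows. The sketch below restates the per-field logic as a standalone helper. The name planField and the sample metadata are made up for illustration, and the helper returns null where the block would skip the field.

```javascript
// Standalone restatement of the per-field selection logic, for experimentation only.
// planField is a hypothetical name; it is not part of MediaFlows.
function planField(metadata, field, languages) {
  const plan = { metadataField: field };
  const missing = [];
  for (const language of languages) {
    const value = metadata[`${field}_${language}`];
    if (value === undefined) {
      // No text for this language yet, so it needs a translation.
      missing.push(language);
    } else if (plan.sourceLanguage === undefined) {
      // The first populated language becomes the source.
      plan.sourceText = value;
      plan.sourceLanguage = language;
    }
  }
  if (plan.sourceLanguage === undefined || missing.length === 0) {
    return null; // Nothing to translate from, or nothing left to translate.
  }
  plan.outputLanguages = missing.join(",");
  return plan;
}

const languages = ["he", "es", "en"];
// Sample asset metadata: Description exists only in English, Alt in all three languages.
const metadata = {
  description_en: "A table of breakfast food",
  alt_en: "Breakfast",
  alt_es: "Desayuno",
  alt_he: "ארוחת בוקר",
};

console.log(planField(metadata, "description", languages));
console.log(planField(metadata, "alt", languages));
```

With this sample, Description is queued for translation from English into Hebrew and Spanish, while Alt is skipped because every language variant is already populated.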

  4. The final block, Send HTTP Request, calls the next media flow, which translates and updates the metadata fields.
Important

Be sure to set the Request URI of the final block to the Catch webhook URL of the next media flow you’ll create.
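The filtering and fan-out described in step 2 above can be pictured with a short sketch. It assumes that each notification carries a notification_type field and that resource_metadata_changed notifications list the affected assets in a resources array. Check both assumptions against Cloudinary’s notification documentation. The function name assetsFromNotification is hypothetical.

```javascript
// Only these notification types can reflect metadata changes.
const TRANSLATABLE_TYPES = new Set(["upload", "resource_metadata_changed"]);

// Hypothetical helper: returns one payload per asset to process,
// or an empty list when the notification is filtered out.
function assetsFromNotification(notification) {
  if (!TRANSLATABLE_TYPES.has(notification.notification_type)) {
    return [];
  }
  if (notification.notification_type === "upload") {
    // An upload notification describes a single asset.
    return [notification];
  }
  // Assumption: bulk notifications carry the affected assets in `resources`.
  return Array.isArray(notification.resources) ? notification.resources : [];
}

const bulk = {
  notification_type: "resource_metadata_changed",
  resources: [{ public_id: "breakfast" }, { public_id: "lunch" }],
};

console.log(assetsFromNotification(bulk).map((asset) => asset.public_id));
```

Returning a list in every case keeps the rest of the flow uniform: it processes each entry the same way, whether the notification covered one asset or many.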

Now add a media flow—to be called by the one you just created—for processing translations. Each of the flow’s Translate metadata fields contains an asset’s metadata to be translated.

Add the new flow to your account with the share link. Be sure to update the previous flow so that it calls this new flow’s Catch Webhook block.

You’ll build a custom block for translating metadata later; see the section Creating an AWS Block.

Media flow without the custom block

The flow performs three steps:

  1. Loop over each of the metadata fields that require translation and call the custom block with sourceText, sourceLanguage, and outputLanguages.
  2. When all the translations are complete, group the updates together and update the asset.
  3. Add the tag approval_needed_<language> to each of the translations so that the approvers can search for and locate the assets that are up for review.
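Steps 2 and 3 above can be sketched as a single function that groups the translations into one metadata update and collects the approval tags. The shape of the results array is an assumption made for this example, since the tutorial does not show the custom block’s response, and buildAssetUpdate is a hypothetical name.

```javascript
// Hypothetical helper: merge per-field translation results into one asset update.
// Assumed input shape: [{ metadataField: "description", translations: { es: "...", he: "..." } }]
function buildAssetUpdate(results) {
  const metadata = {};
  const tags = new Set();
  for (const { metadataField, translations } of results) {
    for (const [language, text] of Object.entries(translations)) {
      // External IDs follow the <field>_<language> convention.
      metadata[`${metadataField}_${language}`] = text;
      // One approval tag per language, no matter how many fields were translated.
      tags.add(`approval_needed_${language}`);
    }
  }
  return { metadata, tags: [...tags] };
}

const update = buildAssetUpdate([
  { metadataField: "description", translations: { es: "Mesa de desayuno", he: "שולחן ארוחת בוקר" } },
  { metadataField: "alt", translations: { es: "Desayuno" } },
]);

console.log(update.metadata);
console.log(update.tags);
```

Grouping the changes this way means the asset is updated once per run rather than once per translated field.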

Next, set up the Translation Report media flow, which notifies you if any translations require approval and, if so, sends email requests to the approvers concerned. This flow regularly generates and sends reports to the approvers.

Relative to the other flows, Translation Report works slightly differently. Instead of triggering and running the process with an HTTP request, this flow does that according to the schedule set by a cron string with the Scheduler trigger. For this tutorial, set the schedule to every Friday with the cron expression 0 12 * * FRI.
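If cron strings are new to you, the sketch below splits the expression into its five fields. It assumes the standard five-field cron layout (minute, hour, day of month, month, day of week). The helper describeCron is hypothetical, and the time zone in which the Scheduler evaluates the expression is not covered here.

```javascript
// Hypothetical helper: label the five fields of a standard cron expression.
function describeCron(expression) {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== 5) {
    throw new Error(`Expected 5 cron fields but got ${parts.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = parts;
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}

const schedule = describeCron("0 12 * * FRI");
console.log(schedule);
```

Read this way, 0 12 * * FRI means minute 0 of hour 12, on any day of the month, in any month, whenever the day of the week is Friday.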

To help with testing, Translation Report includes a Catch Webhook block, Manual kick off, with which you can start the flow manually by loading the webhook URL in a browser. Feel free to delete that block in production.

As before, add this flow to your account with the share link.

Translation report Media flow

Translation Report starts by loading the configuration file. The flow then loops over each of the three languages and performs a search based on the approval tag, e.g., approval_needed_es for Spanish. If assets that require approval exist, the flow sends the information to a SendGrid block, which then dispatches template-based emails to the approvers concerned. At the bottom of each email is a Review Assets button, which takes the approver to the assets in question in the Media Library.

Notification email to approvers for Spanish translations
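The per-language loop of the report can be sketched as follows. The helper names are hypothetical, and the search expression format (tags=approval_needed_es) is an assumption about how the flow queries Cloudinary. Only the tag naming and the configuration layout come from this tutorial.

```javascript
// Hypothetical helper: collect the approvers configured for one language.
function approversFor(config, language) {
  return config.approvers.flatMap((entry) => entry[language] || []);
}

// Hypothetical helper: one report entry per language, with the search to run
// and the recipients to notify if that search returns any assets.
function reportPlan(config) {
  return config.languages.map((language) => ({
    language,
    searchExpression: `tags=approval_needed_${language}`,
    recipients: approversFor(config, language).map((approver) => approver.email),
  }));
}

// Same layout as the configuration file in this tutorial.
const reportConfig = {
  languages: ["he", "es", "en"],
  approvers: [
    { he: [{ email: "hebrew_approvers@example.com", name: "Hebrew Approver" }] },
    { es: [{ email: "spanish_approvers@example.com", name: "Spanish Approver" }] },
    { en: [{ email: "english_approvers@example.com", name: "English Approver" }] },
  ],
};

console.log(reportPlan(reportConfig));
```

Because the approvers are stored as a list of single-language objects, the lookup flattens the list instead of indexing into it directly.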

Next, create a block for translation tasks. MediaFlows does not offer such a block, so build one by calling a Lambda function that runs on your AWS account from a flow on the MediaFlows Developers Platform.

You must deploy the Lambda in the U.S. East 1 (us-east-1) region.

Do the following:

1. Create an AWS Lambda to perform the required translations. For reference, see the sample code on GitHub.

The Lambda processes sourceText, sourceLanguage, and outputLanguages, subsequently returning the translated text. The flow passes a callback_webhook parameter to return the response and control to MediaFlows. For details on how to set up the sample code, see the README on GitHub.
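As a rough picture of what the Lambda does with its inputs, the sketch below turns one event into a list of translation requests, one per output language. It is not the sample code from GitHub. The helper buildTranslateRequests is hypothetical, and the property names Text, SourceLanguageCode, and TargetLanguageCode are those of Amazon Translate’s TranslateText operation. Calling the service and posting the result to callback_webhook are left out.

```javascript
// Hypothetical helper: build one TranslateText request per output language.
function buildTranslateRequests({ sourceText, sourceLanguage, outputLanguages }) {
  return outputLanguages
    .split(",")
    .map((code) => code.trim())
    // Skip empty entries and avoid translating a language into itself.
    .filter((code) => code.length > 0 && code !== sourceLanguage)
    .map((code) => ({
      Text: sourceText,
      SourceLanguageCode: sourceLanguage,
      TargetLanguageCode: code,
    }));
}

const requests = buildTranslateRequests({
  sourceText: "Table of breakfast food, including savory and sweet plates",
  sourceLanguage: "en",
  outputLanguages: "es,he",
});

console.log(requests.map((request) => request.TargetLanguageCode));
```

In a real handler, each request object would be sent to Amazon Translate, and the collected translations would be returned through the callback webhook.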

2. Configure the Lambda in your MediaFlows account.

  • Grant permissions to MediaFlows to run the Lambda by following the procedure in the related documentation. Next, set up a role and permission for MediaFlows’ AWS account (201991434007) for invoking the Lambda. Below are the trust policy and permission policy. Be sure to update resource in the permission policy with the ARN for the Lambda.
{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Sid": "MediaFlowsTrustPolicy",
    "Principal": { "AWS": "201991434007" },
    "Action": "sts:AssumeRole"
  }
}
{
  "Version": "2012-10-17",
  "Statement": {
    "Sid": "MediaFlowsTranslateBlock",
    "Effect": "Allow",
    "Action": [
      "lambda:InvokeFunction",
      "lambda:InvokeAsync"
    ],
    "Resource": "arn:aws:lambda:<region>:<account_id>:function:<function_name>"
  }
}
  • Install the MediaFlows CLI tool with this command:

npm install -g @cloudinary/mediaflows-cli

  • Run this command to open a browser window in which to grant access to the CLI tool to your MediaFlows account:

mediaflows login

  • Configure the block by typing this command:

mediaflows init

Type in the values for the attributes that are displayed:

  • Block name: The name of the block in the MediaFlows UI
  • Block description: The description in the block’s dialog box for editing
  • Block icon: Optional. The URL of the block’s icon
  • Documentation link: Optional. The URL of block-related documentation
  • AWS resource name: The ARN for the Identity and Access Management (IAM) role you created
  • AWS Lambda function name: The name of the Lambda function

  • AWS Lambda invocation type: RequestResponse for synchronous invocation of the Lambda and Event for asynchronous invocation. For this tutorial, type Event.

3. Deploy the changes with this command:

mediaflows publish

Subsequently, after logging in to your Cloudinary account, you’ll see your block under MediaFlows Connect. Add the block to the flow and edit the block. The related dialog box is then displayed, showing a text field for Set input source.

Default fields for block

To make the block more flexible, i.e., to enable configuration of the data source, edit the .connect.json file that the mediaflows init command generated earlier and update the form_fields array, like this:

{
  "name": "AWS Translate",
  "description": "Translates the source text",
  "icon_url": "",
  "documentation_url": "",
  "function_data": {
    "role_arn": "<<The ARN for the IAM role>>",
    "aws_lambda_function_name": "<<The name of the AWS Lambda>>",
    "invocation_type": "Event"
  },
  "external_id": "<<The ID set by the MediaFlows CLI>>",
  "form_fields": [
    {
      "value": "en",
      "key": "sourceLanguage",
      "label": "The source language",
      "description": "The language code of the source text"
    },
    {
      "value": "Table of breakfast food, including savory and sweet plates",
      "key": "sourceText",
      "label": "The text to be translated",
      "description": "The source of the text to be translated"
    },
    {
      "value": "es,he",
      "key": "outputLanguages",
      "label": "The output languages",
      "description": "A comma-separated list of the languages to which the source text is to be translated"
    },
    {
      "value": "{}",
      "key": "dynamic_template_data",
      "label": "A name for the miscellaneous details",
      "description": "A description of the miscellaneous details to be included in the response",
      "type": "CODE",
      "options": {
        "height": "150px",
        "theme": "monokai",
        "showGutter": false,
        "width": "100%",
        "mode": "json"
      }
    }
  ]
}

Here are the attributes:

  • value: The default value of the block at creation
  • key: The name of the variable to be passed to the Lambda
  • label: The label of the block’s dialog box for editing
  • description: The Help text for the field that is displayed in the dialog box for editing
  • type: Optional. The field type, which defaults to String. For a list of the types, see the MediaFlows documentation.
  • options: Optional. The field’s other settings, which vary according to the field type. In the code above, where the field type is CODE, the options fields (height, theme, etc.) specify the size of the code editor.
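To make the roles of key and value concrete, the sketch below collapses a form_fields array into the set of default inputs the block would start with. How MediaFlows actually delivers these inputs to the Lambda is not covered in this tutorial, so treat the resulting object shape as an assumption. The helper defaultInputs is hypothetical.

```javascript
// Hypothetical helper: map each form field's key to its default value.
function defaultInputs(formFields) {
  const inputs = {};
  for (const field of formFields) {
    if (!field.key || !field.label) {
      // Every field in the example above has both a key and a label.
      throw new Error("Each form field needs a key and a label");
    }
    inputs[field.key] = field.value;
  }
  return inputs;
}

// A shortened version of the form_fields array shown above.
const formFields = [
  { value: "en", key: "sourceLanguage", label: "The source language" },
  { value: "A table of breakfast food", key: "sourceText", label: "The text to be translated" },
  { value: "es,he", key: "outputLanguages", label: "The output languages" },
];

console.log(defaultInputs(formFields));
```

The keys match the three parameters that the Lambda processes, which is why renaming a key in .connect.json would also require a change in the Lambda code.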

Next, perform three more steps:

  1. Republish the block with the command mediaflows publish. Republishing does not update the existing instances of the block, so manually add the block with the updated fields to the flow.
  2. Change the default values of the block’s extra fields to the values from the previous block, as shown in the screenshot below.

Updated block with data from the previous block

  3. Connect the newly created block to the flow and test the entire workflow.

Completed translation media flow

  • Automate the generation of the source text: Rather than manually entering the source text, automatically generate it with a third-party API like the one offered by CloudSight.
  • Update the Approval workflow: Simplify the approval process by adding more relevant details to the email requests. You could also link from the email to the approval UI, saving the approvers the login step.