The Portable Document Format (PDF) is a popular format developed by Adobe for delivering formatted text and images. A PDF file includes a complete description of all layout elements, including text, fonts, images and layers, ensuring that the file will look identical when opened on any device. The PDF format also has the big advantage of compressing high-quality files to a relatively small file size.
Cloudinary has built-in support for uploading, managing, converting, creating, analyzing and delivering PDF files, as well as third-party add-ons for creating PDFs from Office documents and extracting the text from PDFs. Furthermore, Cloudinary lets you manage your PDFs like any other image file, so they benefit from simple transformations, fast CDN delivery, and more.
The following sections give a brief summary and example of each of the features available for working with PDFs, as well as links to further information and details in the Cloudinary documentation.
Uploading PDFs to your Cloudinary account is straightforward. You can upload a PDF file using any of the options available for uploading files to your Cloudinary account. If you upload a PDF with multiple pages, the upload response will include a pages parameter indicating the number of pages within the file.
If you need to determine the number of pages in a PDF that has already been uploaded to your account, you can use the Admin API resource method and include the pages parameter set to ‘true’. For example:
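As a minimal sketch of what such a request might look like, the snippet below only builds the Admin API URL for the resource details call; it does not send it. The cloud name demo and the public ID multi_page_pdf are placeholders, the helper name is hypothetical, and the endpoint layout is assumed from the Admin API reference. A real request also has to be authenticated with your API key and secret, which the official SDKs normally handle for you.

```python
from urllib.parse import quote, urlencode


def resource_details_url(cloud_name: str, public_id: str) -> str:
    """Build the (assumed) Admin API URL for an uploaded image-type asset,
    asking for the page count with pages=true."""
    base = f"https://api.cloudinary.com/v1_1/{cloud_name}/resources/image/upload"
    query = urlencode({"pages": "true"})
    return f"{base}/{quote(public_id, safe='/')}?{query}"


# Placeholder account and asset names, not real credentials.
url = resource_details_url("demo", "multi_page_pdf")
print(url)
```

The pages value would then be read from the JSON response to this request.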
Cloudinary treats PDF files the same as any other image file. This is a great feature, as it means you can upload PDF files and then deliver them using a standard Cloudinary dynamic URL, for example:
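A delivery URL of this kind can be sketched as follows. The cloud name demo and the public ID multi_page_pdf are placeholders, and the helper is a hypothetical convenience that simply assembles the standard image delivery path.

```python
def pdf_delivery_url(cloud_name: str, public_id: str, extension: str = "pdf") -> str:
    """Assemble a delivery URL for an asset stored as an uploaded image."""
    return f"https://res.cloudinary.com/{cloud_name}/image/upload/{public_id}.{extension}"


# Placeholder names; substitute your own cloud name and public ID.
pdf_url = pdf_delivery_url("demo", "multi_page_pdf")
print(pdf_url)
```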
In our efforts to make sure no malicious files are distributed through Cloudinary, free customers can’t deliver files in the PDF format by default. Please contact support if you want to enable this feature for your account.
Delivering the PDF in a different image format is as simple as changing the file extension to the desired format. This will deliver the first page of the PDF by default, but you can use the page parameter (pg in URLs) to select a different page. You can also use any other image transformation to deliver the new file, bringing Cloudinary’s advanced image transformation functionality to your PDFs.
For example, cropping out a specific 300×300 square from the second page, with rounded corners and a black background:
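One way such a URL might be put together is sketched below. The crop offsets (x_50, y_100), the corner radius (r_30), the JPG output format, and the asset names are all illustrative assumptions rather than values from the original example.

```python
# Transformation parameters, combined into a single URL component.
transformation = ",".join([
    "pg_2",     # take the second page of the PDF
    "c_crop",   # crop mode: extract a region
    "w_300",    # region width in pixels
    "h_300",    # region height in pixels
    "x_50",     # assumed horizontal offset of the region
    "y_100",    # assumed vertical offset of the region
    "r_30",     # assumed corner radius for the rounded corners
    "b_black",  # fill the rounded-off corners with black
])

# Changing the extension to .jpg delivers the page as an image.
crop_url = f"https://res.cloudinary.com/demo/image/upload/{transformation}/multi_page_pdf.jpg"
print(crop_url)
```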
PDFs can also be a great way to organize your content and deliver a set of images in a single file. Cloudinary provides the multi method for creating a PDF file from images in your Cloudinary account that all have the same tag. All the images are then merged into a single multi-page PDF, where each image is a separate page, ordered alphanumerically by public ID. If you want the images to be included in a specific order, make sure to rename them accordingly (e.g., 01img, 02img, etc.). The multi method is also used for creating animated images, so make sure to specify PDF as the output format.
For example, to create a PDF file from all images that have the tag “animal”, and limit all images to a size of 400×600:
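The parameters for such a call could look roughly like the sketch below. It only prepares the request data and sends nothing. The endpoint path, the helper name, and the choice of the limit crop mode (c_limit) to express the 400×600 cap are assumptions; a real call must also be signed, which the SDKs do for you.

```python
def multi_pdf_request(cloud_name: str, tag: str, width: int, height: int):
    """Prepare the (assumed) endpoint and parameters for merging all images
    that share a tag into one multi-page PDF."""
    endpoint = f"https://api.cloudinary.com/v1_1/{cloud_name}/image/multi"
    params = {
        "tag": tag,
        "format": "pdf",  # without this, multi produces an animated image
        "transformation": f"c_limit,w_{width},h_{height}",  # cap each page's size
    }
    return endpoint, params


endpoint, params = multi_pdf_request("demo", "animal", 400, 600)
print(endpoint, params)
```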
If you need a way to automatically convert your Office documents, spreadsheets and presentations to PDF documents, Cloudinary offers the Aspose Document Conversion add-on.
You start by uploading the Office file to your Cloudinary account as a raw file and adding the raw_convert parameter with a value of ‘aspose’.
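A rough sketch of the upload request follows. It only assembles the endpoint and parameters; the file payload, signature, and HTTP call are left out. The public ID sample_doc.docx and the cloud name demo are placeholders, and the endpoint layout is an assumption based on the Upload API reference.

```python
def office_upload_request(cloud_name: str, public_id: str):
    """Prepare the (assumed) endpoint and parameters for uploading an Office
    file as a raw asset and asking the Aspose add-on to convert it to PDF."""
    endpoint = f"https://api.cloudinary.com/v1_1/{cloud_name}/raw/upload"
    params = {
        "public_id": public_id,
        "raw_convert": "aspose",  # triggers the conversion after upload
    }
    return endpoint, params


upload_endpoint, upload_params = office_upload_request("demo", "sample_doc.docx")
print(upload_endpoint, upload_params)
```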
Once the conversion has finished, a PDF with the same public ID is added to your Cloudinary account as type image (the original still exists as type raw). In the example above, a new image (PDF) is added called sample_doc.docx, and can be delivered and/or transformed as any other PDF uploaded to your account.
Analyzing the content of a PDF file can be useful in a number of scenarios. It may be as simple as knowing whether the file has any text at all, finding out what the text contains and tagging the file appropriately, or even redacting or blurring specific text in the file. Cloudinary provides the OCR Text Detection and Extraction add-on for extracting all the text from a PDF. The result includes a summary of the text, the coordinates of the captured text, and every individual text element.
To request the detected text, add the
ocr parameter with a value of
For example, when using the upload method:
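The upload parameters might be prepared along these lines. The value adv_ocr is taken from the OCR add-on's documentation as I recall it and should be treated as an assumption, as should the endpoint layout and the placeholder names; nothing is actually sent.

```python
def ocr_upload_request(cloud_name: str, public_id: str, ocr_value: str = "adv_ocr"):
    """Prepare the (assumed) endpoint and parameters for uploading a PDF
    and requesting text extraction in the same call."""
    endpoint = f"https://api.cloudinary.com/v1_1/{cloud_name}/image/upload"
    params = {
        "public_id": public_id,
        "ocr": ocr_value,  # assumed add-on value; check the add-on docs
    }
    return endpoint, params


ocr_endpoint, ocr_params = ocr_upload_request("demo", "multi_page_pdf")
print(ocr_endpoint, ocr_params)
```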
Rasterizing a PDF is useful if you still want to deliver a PDF file, but also want to resize the file or add an overlay. Rasterization reduces the PDF to a single flat, pixel-based layer (as opposed to multiple vector-based layers). Simply add the rasterize flag to the dynamic URL (fl_rasterize) and add any resizing or overlays as desired. The following example rasterizes a PDF, scales it down to a width of 800 pixels and adds the Cloudinary icon to the top right corner of each page:
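A URL for this could be assembled as in the sketch below. The asset names multi_page_pdf and cloudinary_icon and the cloud name demo are placeholders, and the overlay is written in the single-component form (l_<public_id>,g_north_east), which is an assumption about the syntax used in the original example.

```python
# Each list entry is one chained transformation component in the URL.
components = [
    "fl_rasterize,c_scale,w_800",      # flatten the PDF and scale it to 800 px wide
    "l_cloudinary_icon,g_north_east",  # overlay the icon in the top right corner
]

rasterized_url = (
    "https://res.cloudinary.com/demo/image/upload/"
    + "/".join(components)
    + "/multi_page_pdf.pdf"            # keep the .pdf extension to deliver a PDF
)
print(rasterized_url)
```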
Cloudinary has some great features that make working with PDF files simple, with multiple options for uploading PDFs to your account, extensive transformations, fast CDN delivery, and multiple methods for managing and extracting useful information from your files. And since it’s a simple matter to convert PDF files to other image formats, they can also benefit from the multitude of transformation effects available to all images. Check the Cloudinary documentation for more details, and if you don’t have a Cloudinary account, sign up for a free account now and try it out.