How to Capture Images from Videos Using OpenCV

Have you ever wanted to pull still images from a video? Whether you’re analyzing motion, building a dataset, or just capturing the perfect frame, you’re not alone. Almost 93% of internet users worldwide watch videos online, so processing that content is more important than ever.

One powerful tool for working with videos is OpenCV, a free, open-source library used for computer vision and machine learning. Originally written in C++, OpenCV now works with popular languages like Python and Java, making it accessible to developers, researchers, and hobbyists alike.

A common task in video processing is extracting individual frames from a video file. Extracted frames feed many computer vision tasks, such as object detection, motion tracking, and building datasets for machine learning. In this guide, you’ll learn how to capture images from videos using OpenCV. Let’s get into it!

In this article:

How Do You Capture Images from a Video?
Capture Images from Video Using OpenCV
Streamlining Your Workflow with Cloudinary
When to Use Cloudinary vs. OpenCV

How Do You Capture Images from a Video?

Capturing images from videos might sound technically challenging, but it’s simpler than you might think. A video is essentially a sequence of individual images (called “frames”) played back at a specific speed, measured in frames per second (FPS). By extracting these frames, you can capture a still image from any point in the video.

There are several ways to extract frames from a video, depending on your needs and the tools you have. For example, you can use a programming library like OpenCV in Python or media player software like VLC. In this guide, we’ll focus on how to capture images from videos using OpenCV.

Capture Images from Video Using OpenCV

Before we capture images from videos using OpenCV, we need to create a virtual environment (to isolate dependencies and avoid conflicts with other projects on your computer) and install the necessary dependencies. To create and activate a virtual environment, run the following commands:

# For Windows
python -m venv myenv
myenv\Scripts\activate

# For macOS/Linux
python3 -m venv myenv
source myenv/bin/activate

Next, install OpenCV:

pip install opencv-python

Step 1 – Set Up OpenCV

Here, we’ll import necessary libraries, set up the video file path, create a directory to store extracted frames, and initialize the video capture object while checking if the video opens successfully. Also, we’ll retrieve the video’s frame rate (FPS).

import cv2
import os

video_path = "swimmer.mp4"  # Replace with your actual video file

os.makedirs("extracted_frames", exist_ok=True)

cap = cv2.VideoCapture(video_path)

if not cap.isOpened():
    print("Error: Could not open video.")
    exit()

fps = cap.get(cv2.CAP_PROP_FPS)  # Get video frame rate (a float, e.g. 29.97)
if fps <= 0:
    print("Error: Could not determine the video frame rate.")
    exit()

The cv2.VideoCapture object from OpenCV is used to load a video file or capture live video from a camera. The code in this guide relies on the following methods and property identifier:

read(): Reads the next frame from the video.
isOpened(): Checks whether the video file or camera was opened successfully.
get(): Retrieves properties of the video, such as frame rate or total frame count.
cv2.CAP_PROP_FPS: A property identifier that you pass to get() to retrieve the video frame rate.
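
As a rough sketch of how get() can read more than the frame rate, the helper below collects a few properties and estimates the clip length. The names describe_video and duration_seconds are hypothetical (they are not part of OpenCV), and the sketch assumes the video container reports a reliable frame count, which is not always the case.

```python
def duration_seconds(frame_count, fps):
    """Estimate clip length in seconds; returns 0.0 if the FPS is unknown."""
    return frame_count / fps if fps > 0 else 0.0


def describe_video(path):
    """Collect basic properties of a video file (hypothetical helper)."""
    import cv2  # imported here so duration_seconds() works without OpenCV

    cap = cv2.VideoCapture(path)
    if not cap.isOpened():
        raise IOError(f"Could not open video: {path}")
    try:
        fps = cap.get(cv2.CAP_PROP_FPS)
        frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        return {
            "fps": fps,
            "frame_count": frame_count,
            "width": int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            "height": int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
            "duration_s": duration_seconds(frame_count, fps),
        }
    finally:
        cap.release()  # always free the capture, even if a lookup fails
```

Calling describe_video("swimmer.mp4") before extracting frames is one way to sanity-check how many images to expect.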

Step 2 – Extract Images as Frames and Save to Disk

Next, we’ll extract one frame for each second of the video and save it as an image in the extracted_frames directory.

frame_count = 0
second = 0

while True:
    cap.set(cv2.CAP_PROP_POS_FRAMES, round(second * fps))  # Jump to the frame at each second
    ret, frame = cap.read()
    if not ret:
        break

    frame_filename = os.path.join("extracted_frames", f"frame_{frame_count}.jpg")
    cv2.imwrite(frame_filename, frame)  # Save frame as an image
    frame_count += 1
    second += 1  # Move to the next second

cap.release()
cv2.destroyAllWindows()

print(f"{frame_count} frames extracted and saved successfully")

In this example, we’re extracting one frame per second. The total number of frames in a video depends on its length and frame rate. For example, a 20-second video recorded at 24 FPS has a total of 480 frames:

24 FPS * 20 seconds = 480 frames

However, capturing a single frame per second (that is, 1 FPS) from the same 20-second video yields:

1 FPS * 20 seconds = 20 frames
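
The same arithmetic can be written as a tiny helper. This is only an illustration; total_frames is a hypothetical name, and the rounding is an assumption to cope with fractional frame rates such as 29.97.

```python
def total_frames(fps, duration_s):
    """Number of frames produced at `fps` over `duration_s` seconds."""
    return round(fps * duration_s)


print(total_frames(24, 20))     # 480
print(total_frames(1, 20))      # 20
print(total_frames(29.97, 10))  # 300
```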

Capture Frames at Custom FPS

Let’s assume you want to capture all 480 frames in a 20-second video recorded at 24 FPS, or sample it at any other frame rate of your choice (up to the video’s own frame rate). To do that, we can modify our code to let us set a custom FPS for capturing the images:

import cv2
import os

video_path = "swimmer.mp4"  # Replace with your actual video file
output_folder = "extracted_frames"
custom_fps = 5  # Change this to your desired FPS

os.makedirs(output_folder, exist_ok=True)

cap = cv2.VideoCapture(video_path)

if not cap.isOpened():
    print("Error: Could not open the video file.")
    exit()

video_fps = cap.get(cv2.CAP_PROP_FPS)
# Save every Nth frame; at least 1 so the modulo below never divides by zero
frame_interval = max(1, int(video_fps / custom_fps))

frame_count = 0
saved_frames = 0

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Save frame at the specified interval
    if frame_count % frame_interval == 0:
        frame_filename = os.path.join(output_folder, f"frame_{saved_frames}.jpg")
        cv2.imwrite(frame_filename, frame)
        saved_frames += 1

    frame_count += 1

cap.release()
cv2.destroyAllWindows()
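
One caveat with an integer interval: it only hits the requested rate exactly when the video’s frame rate is a multiple of it. With a 30 FPS source and a 4 FPS target, for example, an interval of 7 saves 5 images per second rather than 4. A possible alternative, sketched below under the assumption that you know the total frame count, is to pick the frame that is on screen at each evenly spaced sample time. The function name frames_to_keep is hypothetical.

```python
def frames_to_keep(video_fps, total_frames, target_fps):
    """Return the indices of frames shown at evenly spaced sample times."""
    if video_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Never sample faster than the source: at most one image per frame
    target_fps = min(target_fps, video_fps)
    step = video_fps / target_fps  # source frames per saved image (may be fractional)
    indices = []
    k = 0
    while True:
        idx = int(k * step)  # frame on screen at the k-th sample time
        if idx >= total_frames:
            break
        indices.append(idx)
        k += 1
    return indices


print(frames_to_keep(30, 30, 4))  # [0, 7, 15, 22]
```

In the loop above, you could build keep = set(frames_to_keep(...)) once and replace the modulo check with frame_count in keep.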

Streamlining Your Workflow with Cloudinary

While OpenCV is an excellent choice for capturing frames from video clips, there are scenarios where it might not be the most efficient solution. For instance, if you’re working with a large dataset where performance and scalability are essential, a simple yet powerful solution like Cloudinary can be a better alternative.

With Cloudinary, you can extract images from as many videos as you want without worrying about local performance. In this section, we’ll show you how to use the Cloudinary Python SDK to capture images from videos in your Python application.

Step 1 – Setting Up Cloudinary

Before you begin, you’ll need to:

Sign up for a free Cloudinary account if you don’t have one already. After signing up, you’ll need your Cloudinary credentials (cloud name, API key, and API secret) for programmatic access to your product environment. You can follow this guide to get yours.
Install the Cloudinary Python SDK, along with requests, which the examples below use to download the frames:
```
pip install cloudinary requests
```

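Step 2 – Upload Video to Cloudinary and Extract Its Images

The script below uploads the video to Cloudinary, builds a delivery URL for one frame per second using the start_offset transformation, and downloads each frame:
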
import cloudinary
import cloudinary.uploader
import cloudinary.api
import requests
import os

cloudinary.config(
    cloud_name="your_cloud_name",
    api_key="your_api_key",
    api_secret="your_api_secret"
)

# Function to upload a video to Cloudinary
def upload_video(video_path):
    response = cloudinary.uploader.upload(
        video_path,
        resource_type="video"
    )
    return response["public_id"], float(response["duration"])  # Return the public ID of the uploaded video and its duration

# Function to extract frames at one frame per second
def extract_frames_one_per_second(public_id, output_folder, duration):
    for timestamp_sec in range(0, int(duration)):
        # Generate the frame URL using Cloudinary's transformation
        frame_url = cloudinary.CloudinaryImage(public_id).build_url(
            resource_type="video",
            format="png",
            transformation=[
                {"start_offset": timestamp_sec}  # Extract frame at the specified timestamp
            ]
        )

        # Download the frame (the timeout keeps a stalled request from hanging the script)
        response = requests.get(frame_url, timeout=30)
        if response.status_code == 200:
            output_path = os.path.join(output_folder, f"frame_at_{timestamp_sec}_sec.png")
            with open(output_path, "wb") as f:
                f.write(response.content)
        else:
            print(f"Error: Could not extract frame at {timestamp_sec} seconds.")

if __name__ == "__main__":
    video_path = "swimmer.mp4"
    output_folder = "extracted_frames"

    # Create the output folder if it doesn't exist
    os.makedirs(output_folder, exist_ok=True)

    public_id, duration = upload_video(video_path)
    extract_frames_one_per_second(public_id, output_folder, duration)
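
To make the transformation less opaque: start_offset corresponds to the so_ parameter in a Cloudinary delivery URL. The sketch below assembles such a URL by hand, purely as an illustration. sketch_frame_url is a hypothetical helper, the URL shape assumes the default res.cloudinary.com delivery host, and the URL the SDK actually returns may differ in details such as the scheme or a version segment. In real code, keep using build_url.

```python
def sketch_frame_url(cloud_name, public_id, second, fmt="png"):
    """Illustrative only: approximate shape of a frame-at-timestamp URL."""
    return (
        f"https://res.cloudinary.com/{cloud_name}"
        f"/video/upload/so_{second}/{public_id}.{fmt}"
    )


# "swimmer" stands in for whatever public ID the upload returned
print(sketch_frame_url("your_cloud_name", "swimmer", 3))
```

Requesting an image format (here, png) for a video asset is what tells Cloudinary to return a single frame instead of the video itself.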

Step 3 – Capture Images at a Custom FPS

The previous example shows how to capture one image per second from the uploaded video, but we can modify the code to allow us to set a custom FPS. Here, we define a function extract_frames_at_fps to extract images from the video at 2 FPS:

import cloudinary
import cloudinary.uploader
import cloudinary.api
import requests
import os

cloudinary.config(
    cloud_name="your_cloud_name",
    api_key="your_api_key",
    api_secret="your_api_secret"
)

# Function to upload a video to Cloudinary
def upload_video(video_path):
    response = cloudinary.uploader.upload(
        video_path,
        resource_type="video"
    )
    return response["public_id"], float(response["duration"])  # Return public ID and duration

# Function to extract frames at a custom FPS
def extract_frames_at_fps(public_id, output_folder, duration, fps):
    interval = 1 / fps  # Time interval between frames
    timestamps = [round(i * interval, 2) for i in range(int(duration * fps))]

    for timestamp_sec in timestamps:
        # Generate frame URL using Cloudinary's transformation
        frame_url = cloudinary.CloudinaryImage(public_id).build_url(
            resource_type="video",
            format="png",
            transformation=[
                {"start_offset": timestamp_sec}  # Extract frame at the specified timestamp
            ]
        )

        # Download the frame (the timeout keeps a stalled request from hanging the script)
        response = requests.get(frame_url, timeout=30)
        if response.status_code == 200:
            output_path = os.path.join(output_folder, f"frame_at_{timestamp_sec}_sec.png")
            with open(output_path, "wb") as f:
                f.write(response.content)
        else:
            print(f"Error: Could not extract frame at {timestamp_sec} seconds.")

if __name__ == "__main__":
    video_path = "swimmer.mp4"
    output_folder = "extracted_frames"
    fps = 2  # Set your desired FPS here

    # Create the output folder if it doesn't exist
    os.makedirs(output_folder, exist_ok=True)

    public_id, duration = upload_video(video_path)
    extract_frames_at_fps(public_id, output_folder, duration, fps)
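
To check which timestamps a given FPS produces before making any requests, the timestamp calculation can be pulled out into a standalone function. This is a small sketch that mirrors the list comprehension in extract_frames_at_fps; the name frame_timestamps is hypothetical.

```python
def frame_timestamps(duration, fps):
    """Timestamps (in seconds) at which frames would be requested."""
    interval = 1 / fps  # time between consecutive frames
    return [round(i * interval, 2) for i in range(int(duration * fps))]


print(frame_timestamps(2.0, 2))        # [0.0, 0.5, 1.0, 1.5]
print(len(frame_timestamps(20.0, 5)))  # 100
```

Because timestamps are rounded to two decimals, rates above 100 FPS would produce duplicate timestamps (and therefore duplicate file names), so this approach is best kept to modest sampling rates.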

When to Use Cloudinary vs. OpenCV

Choosing between OpenCV and Cloudinary depends on your specific use case, the scale of your project, and your performance requirements. Here’s a breakdown of the capabilities of each tool to help you decide which is best for your needs.

When to Use OpenCV

When Working with Small Datasets: OpenCV is well suited to small-scale projects where the number of videos or frames you’re extracting is manageable. For instance, if you’re processing a few videos for a personal project, OpenCV’s local processing capabilities are sufficient.
Local Testing: If you’re working in a test environment, or processing videos on a local machine or in an offline environment, OpenCV is a great choice. It lets you work directly with video files without an internet connection or external services, which saves cost and resources.

When to Use Cloudinary

For Large-Scale Video Processing: If your project involves processing hundreds or thousands of videos, Cloudinary’s cloud-based infrastructure can handle the load efficiently without overwhelming your local system resources. Cloudinary offers a global content delivery network (CDN) that ensures quick access to media files, and its infrastructure is designed to handle high traffic and large workloads.
When You Want to Offload Processing to the Cloud: Cloudinary eliminates the need for local processing, which can be resource-intensive. By offloading video processing to the cloud, you can save time, reduce hardware requirements, and focus on building your application rather than managing infrastructure.

Conclusion

In computer vision and other machine learning tasks, frame extraction is a common technique with many use cases, including model training, object detection, motion analysis, and more. OpenCV is a popular and powerful library, usable from Python, that simplifies image and video processing through its collection of optimized algorithms. In this guide, we explored how to use OpenCV and Cloudinary, a cloud-based media management platform, to capture images from videos.

OpenCV is well suited to smaller, simpler projects where local processing is sufficient. For large-scale, complex projects requiring high performance, scalability, and automated asset management, Cloudinary provides a more robust and efficient cloud-based solution. Ultimately, your choice between the two tools will depend on your specific use cases and requirements.

QUICK TIPS

Colby Fayock

In my experience, here are tips that can help you better capture images from videos using OpenCV and Cloudinary:

Use scene change detection to avoid redundant frames
Instead of capturing frames at fixed intervals, use techniques like histogram comparison or structural similarity (SSIM) to detect significant visual changes between frames. This captures only unique or meaningful frames.
Optimize for color space before saving frames
Convert frames from BGR to other color spaces like HSV or LAB depending on your analysis goal. This can improve downstream tasks like object detection or clustering.
Leverage multi-threading for real-time capture
If you’re processing high-FPS video or capturing in real-time, use multi-threading with queues to decouple frame capture from frame processing. This avoids dropped frames.
Store frame metadata alongside images
Save timestamp, frame number, and video source info in a JSON or database. This makes frame provenance traceable, especially in datasets used for ML training.
Use OpenCV’s CAP_PROP_POS_MSEC for precise timing
Instead of jumping by frame index, use milliseconds to position the capture. This is particularly useful for non-integer frame rates or time-based annotations.
Apply image preprocessing before saving
Normalize lighting, apply histogram equalization, or denoise frames before saving. This enhances quality and consistency for future analysis or labeling.
Use GPU-accelerated libraries where possible
Integrate OpenCV with CUDA or use PyTorch/TensorFlow for preprocessing-heavy pipelines to dramatically improve frame extraction speed on compatible hardware.
Batch-download frames from Cloudinary using async requests
When extracting many frames from Cloudinary, use aiohttp or requests-futures to parallelize downloads. This cuts network wait time significantly.
Automate frame curation with clustering
After extraction, use clustering algorithms (e.g., k-means, DBSCAN) to group visually similar frames. This helps reduce dataset redundancy and improves training efficiency.
Embed watermark or ID in frames during extraction
For traceability or compliance, use OpenCV’s cv2.putText() to embed project IDs, timestamps, or watermarks into the frames during capture.
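
As one illustration of the metadata tip above, the sketch below writes a JSON file describing each saved frame. The helper name write_frame_metadata, the frames.json file name, and the record fields are all assumptions made for this example; adapt them to whatever your pipeline needs.

```python
import json
import os


def write_frame_metadata(output_folder, video_source, fps, frame_numbers):
    """Write one JSON file describing every saved frame (hypothetical layout)."""
    records = [
        {
            "file": f"frame_{i}.jpg",        # matches the naming used in this guide
            "frame_number": frame_number,    # index of the frame in the source video
            "timestamp_s": round(frame_number / fps, 3),
            "source": video_source,
        }
        for i, frame_number in enumerate(frame_numbers)
    ]
    os.makedirs(output_folder, exist_ok=True)
    path = os.path.join(output_folder, "frames.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2)
    return path
```

For a 24 FPS video sampled once per second, you might call write_frame_metadata("extracted_frames", "swimmer.mp4", 24, [0, 24, 48]) after the extraction loop finishes.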

Last updated: Mar 22, 2025