You are iterating quickly on a project and want a clean, reliable way to duplicate files or whole folders as part of your build or backup script. Here’s a common question developers pose, along with a ready-to-use answer for your code.
How to copy a file in Python so that it works on Windows, macOS, and Linux? I need to handle both single files and entire directories, preserve timestamps and permissions where possible, avoid accidental overwrites, and keep performance in mind for large files. What are the right functions from the standard library, any pitfalls to watch for, and can you share some minimal examples?
Python’s standard library provides everything you need for robust file and directory copying. The shutil and pathlib modules cover most use cases, and they keep your code portable across platforms.
- shutil.copy(src, dst): Copies file data and permission bits.
- shutil.copy2(src, dst): Like copy, plus attempts to preserve metadata such as timestamps.
- shutil.copyfile(src, dst): Copies file data only. Fast, but does not preserve permissions or metadata.
- shutil.copytree(src, dst, …): Recursively copies directories.
# 1) Copy a single file and preserve metadata
from pathlib import Path
import shutil
src = Path("data/report.csv")
dst = Path("backup/report.csv")
dst.parent.mkdir(parents=True, exist_ok=True) # ensure destination folder exists
shutil.copy2(src, dst) # preserves timestamps where supported
# 2) Copy a file without overwriting accidentally
from pathlib import Path
import shutil
src = Path("data/report.csv")
dst = Path("backup/report.csv")
if dst.exists():
raise FileExistsError(f"{dst} already exists")
shutil.copy2(src, dst)
# 3) Copy an entire directory tree (Python 3.8+)
import shutil
shutil.copytree("assets", "backup/assets", dirs_exist_ok=True)
# Ignore patterns (e.g., .git and *.tmp):
# shutil.copytree("assets", "backup/assets", dirs_exist_ok=True,
# ignore=shutil.ignore_patterns(".git", "*.tmp"))
# 4) Efficient copy for large files using a buffer
import shutil
with open("big.iso", "rb") as fsrc, open("backup/big.iso", "wb") as fdst:
shutil.copyfileobj(fsrc, fdst, length=1024 * 1024) # 1 MB chunks
# 5) Atomic-like replace to avoid partial files on failures
from pathlib import Path
import shutil
src = Path("data/report.csv")
dst = Path("backup/report.csv")
tmp = dst.with_suffix(dst.suffix + ".tmp")
shutil.copy2(src, tmp)
tmp.replace(dst) # os.replace under the hood; atomic on many filesystemsCode language: PHP (php)
- Use shutil.copy2 for backups where you care about timestamps and permissions.
- Use shutil.copyfile when you need speed and do not care about metadata.
- Use shutil.copytree for directories, optionally with ignore patterns and symlink handling (symlinks=True).
- Use pathlib.Path for safe, cross-platform path handling.
- Cross-platform paths: Always build paths with pathlib instead of string concatenation.
- Permissions: Metadata preservation varies by OS and filesystem; copy2 does its best but is not guaranteed on all platforms.
- Existing targets: Decide whether to overwrite or fail fast. copyfile and copy2 will overwrite by default.
- Concurrency: For multi-process copy flows, use a temp file then replace to avoid readers seeing partial content.
- Very large folders: Consider filtering with ignore_patterns to skip unnecessary files.
If you are copying images or videos mainly to create resized or reformatted variants, a media platform can help avoid local duplication altogether. For example, you can upload a master asset once, then deliver many variants from a single canonical source using transformation parameters in the asset URL.
# Example: upload once, then deliver variants via URL without making file copies
import cloudinary
import cloudinary.uploader
cloudinary.config(
cloud_name="demo",
api_key="1234567890",
api_secret="a_very_secret_key",
secure=True
)
# Upload original
res = cloudinary.uploader.upload(
"assets/photo.jpg",
folder="backups",
public_id="photo-v1",
overwrite=False
)
# Later, request a resized or compressed variant by URL (no new file needed):
https://res.cloudinary.com//image/upload/f_auto,q_auto,w_1024/backups/photo-v1.jpgCode language: PHP (php)
This pattern reduces storage and I/O, centralizes asset management, and helps enforce a consistent file management system. Since variants are created on the fly using transformation parameters, the image URL itself becomes the API for delivery.
- Use shutil.copy2 for copying single files with metadata, and shutil.copytree for directories.
- Guard against accidental overwrites and prefer atomic replace patterns for reliability.
- Pathlib keeps your paths portable and readable.
- For media, upload once and generate variants via URL to avoid local file copies and save storage.
Ready to streamline how you store, transform, and deliver media while keeping your Python code simple? Create a free Cloudinary account and start optimizing your workflow today.