Skip to content

Instantly share code, notes, and snippets.

@Humboldt-Penguin
Created July 30, 2023 15:19
Show Gist options
  • Select an option

  • Save Humboldt-Penguin/363f5e9bfe42ad4596f135217e8d1bee to your computer and use it in GitHub Desktop.

Select an option

Save Humboldt-Penguin/363f5e9bfe42ad4596f135217e8d1bee to your computer and use it in GitHub Desktop.

Question:

I have a folder with nested subdirectories that contain many .npy files. How can I create a copy of this folder but the .npy files are converted to text files?


Answer:

To copy a directory structure and convert files of a certain type to another type, you can use Python's built-in libraries, such as os and shutil. Here is a step-by-step plan:

  1. Iterate over the directory structure using os.walk(). This function generates the file names in a directory tree by walking the tree either top-down or bottom-up.
  2. For each directory, create a corresponding directory in the destination root using os.makedirs().
  3. For each file, check the extension. If it's .npy, load the file using numpy.load(), convert it to a text format using numpy.savetxt(), and save it in the corresponding location in the destination root.
  4. If the file is not a .npy file, simply copy it to the corresponding location in the destination root using shutil.copy2().

Here's some skeleton code:

import os
import shutil
import numpy as np

def convert_npy_to_txt(src_root, dst_root):
    for src_dir, dirs, files in os.walk(src_root):
        dst_dir = src_dir.replace(src_root, dst_root, 1)
        if not os.path.exists(dst_dir):
            os.makedirs(dst_dir)
        for file_ in files:
            src_file = os.path.join(src_dir, file_)
            dst_file = os.path.join(dst_dir, file_)
            if file_.endswith('.npy'):
                data = np.load(src_file)
                np.savetxt(dst_file.replace('.npy', '.txt'), data)
            else:
                shutil.copy2(src_file, dst_file)

Please replace src_root and dst_root with the source and destination directories, respectively.

This script will convert all .npy files in the source directory and its subdirectories to .txt files in the destination directory, preserving the directory structure and other files. Be aware that the .npy files can be multi-dimensional arrays, and when these are saved as text files, they may lose some information or may not be easily reloaded. If you have multi-dimensional arrays and want to preserve the structure, you'll need a more complex solution.

This code is safe to run because it won't delete or overwrite any files in the source directory. However, I recommend making a backup of your data before running this or any other batch operation on your files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment