@jasonmklug
Last active July 16, 2025 17:06
# https://man7.org/linux/man-pages/man1/split.1.html
# split [-l line_count] [-b byte_count] [-n chunk_count] [file [prefix]]
TARGET_LINE_COUNT=1000
ORIGINAL_FILENAME=nr_file.csv
NEW_NAME_PREFIX=chunk_
# Split the original file into smaller "chunk" files of $TARGET_LINE_COUNT lines each
# (-d uses numeric suffixes: chunk_00, chunk_01, ...)
split -l "$TARGET_LINE_COUNT" -d "$ORIGINAL_FILENAME" "$NEW_NAME_PREFIX"
# Rename each resulting chunk file by appending .csv to its numeric suffix
# (a quoted glob is safer than word-splitting the output of find)
for f in "${NEW_NAME_PREFIX}"*; do mv -- "$f" "$f.csv"; done
# Copy the header row from the first chunk file onto every other chunk file
# (printf and a temp file keep the data intact; echo -e would interpret backslashes)
header=$(head -n 1 "${NEW_NAME_PREFIX}00.csv")
for f in "${NEW_NAME_PREFIX}"*.csv; do
  [ "$f" = "${NEW_NAME_PREFIX}00.csv" ] && continue
  { printf '%s\n' "$header"; cat "$f"; } > "$f.tmp" && mv "$f.tmp" "$f"
done
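
A quick sanity check after the split can catch a missing header or lost rows. The sketch below is not part of the original script: `check_chunks` is a hypothetical helper name, and it assumes the original file and every chunk end with a trailing newline, so that `wc -l` counts rows exactly. It compares the first line of each chunk with the original header and checks that the chunks' data rows add up to the original's.

```shell
#!/usr/bin/env bash
# check_chunks ORIGINAL PREFIX
# Hypothetical helper (an assumption, not from the original gist).
# Verifies that every PREFIX*.csv file starts with ORIGINAL's header row and
# that the data rows across all chunks sum to ORIGINAL's data-row count.
check_chunks() {
  local original=$1 prefix=$2
  local header f rows expected total=0

  header=$(head -n 1 "$original")

  for f in "$prefix"*.csv; do
    # If the glob matched nothing, $f is the literal pattern and does not exist.
    [ -e "$f" ] || { echo "no chunk files found" >&2; return 1; }

    if [ "$(head -n 1 "$f")" != "$header" ]; then
      echo "missing header: $f" >&2
      return 1
    fi

    rows=$(wc -l < "$f")
    total=$((total + rows - 1))   # subtract the header line
  done

  expected=$(( $(wc -l < "$original") - 1 ))
  if [ "$total" -ne "$expected" ]; then
    echo "row mismatch: chunks=$total original=$expected" >&2
    return 1
  fi

  echo "ok: $total data rows"
}
```

With the variables defined at the top of the script, it could be called as `check_chunks "$ORIGINAL_FILENAME" "$NEW_NAME_PREFIX"` once the header loop has finished.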