@nibzard
Last active January 23, 2024 22:07
#!/bin/zsh
# Description: This script transcribes the audio track of a video file to text and, optionally,
# generates a YouTube description from the transcription. Transcription is done with whisper.cpp.
# It takes a video file as input and writes the transcription to a text file. If a second, optional
# argument is provided (the name of a file containing a prompt), it also passes the transcription
# and that prompt to mods to generate a YouTube description.
#
# Usage:
# ./scribe.sh <video_file_name> [prompt_file_name]
# - <video_file_name>: Mandatory. The name of the video file to be processed.
# - [prompt_file_name]: Optional. A text file containing the full prompt for generating the YouTube description.
#
# The script performs the following steps:
# 1. Converts the specified video file to a 16 kHz mono WAV audio file (<name>.wav) using ffmpeg.
# 2. Transcribes the audio to English text (<name>.txt) using whisper.cpp with the ggml-large model.
# 3. If a prompt file is provided, generates a YouTube video description (<name>_description.md)
#    from the transcription and the specified prompt using mods.
#
# Requirements:
# - ffmpeg must be installed for video-to-audio conversion (set FFMPEG_PATH below).
# - whisper.cpp must be built, with the ggml-large model downloaded, for transcription
#   (set WHISPER_PATH and MODELS_DIR below).
# - mods must be installed and configured to generate YouTube descriptions, but only if a
#   prompt file is provided (set MODS_PATH below).
#
# Note: Ensure the script has execute permissions before running: chmod +x scribe.sh
# Config
FFMPEG_PATH="/opt/homebrew/bin/ffmpeg"
WHISPER_PATH="/Users/nikola/dev/whisper.cpp/main"
MODS_PATH="/opt/homebrew/bin/mods"
MODELS_DIR="/Users/nikola/dev/whisper.cpp/models"
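# Optional preflight check: a minimal sketch, not part of the original script.
# The Requirements above list ffmpeg, whisper.cpp and mods, but nothing verifies
# that the configured paths point at real executables. require_executables is a
# hypothetical helper: it reports every argument that is not an executable file
# and returns non-zero if at least one is missing.
# One possible use, before the first step in Main:
#   require_executables "$FFMPEG_PATH" "$WHISPER_PATH" || exit 1
require_executables() {
    local missing=0
    local tool
    for tool in "$@"; do
        # Require a regular file with the execute bit set (directories are rejected).
        if [ ! -f "$tool" ] || [ ! -x "$tool" ]; then
            echo "Error: '$tool' not found or not executable" >&2
            missing=1
        fi
    done
    return $missing
}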
# Functions
# Convert the input video to 16 kHz mono 16-bit PCM WAV, the format whisper.cpp expects.
convert() {
    local input=$1
    local output_wav=${input%.*}.wav
    "$FFMPEG_PATH" -i "$input" -ar 16000 -ac 1 -c:a pcm_s16le "$output_wav"
}
# Transcribe <name>.wav to <name>.txt; -of takes the output path without its extension.
transcribe() {
    local output_wav=${1%.*}.wav
    local output_text=${1%.*}
    "$WHISPER_PATH" -m "$MODELS_DIR/ggml-large.bin" -l en -f "$output_wav" -otxt -of "$output_text"
}
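# Generate <name>_description.md by feeding the transcription to mods along with the prompt.
generate_description() {
    local prompt_file=$1
    local transcription=${2%.*}.txt
    local output_desc=${transcription%.*}_description.md
    local prompt
    prompt=$(<"$prompt_file")
    echo "Prompt file: $prompt_file"
    echo "Transcription: $transcription"
    # Read the transcription on stdin directly instead of piping it through cat.
    "$MODS_PATH" -f markdown "$prompt" < "$transcription" > "$output_desc"
}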
# Main
if [ $# -lt 1 ]; then
    echo "Usage: $0 <video_file> [prompt_file]" >&2
    exit 1
fi

# Stop at the first failing step so later steps never run on missing output.
convert "$1" || exit 1
transcribe "$1" || exit 1

if [ $# -eq 2 ]; then
    prompt_file=$2
    generate_description "$prompt_file" "$1" || exit 1
fi

echo "Workflow completed for $1"