mttjohnson-jf/translate_transcribe_videos.md

## translate_transcribe_videos.md

      
Dependencies

Install the following dependencies
brew install pipenv
brew install ffmpeg
pipenv install --python 3.10

Instructions

Activate a pipenv shell and make sure the following Python packages are installed:
pipenv shell --python 3.10
pip3 install ffmpeg-python
pip3 install git+https://github.com/openai/whisper.git
pip3 install --pre --force-reinstall torch --index-url https://download.pytorch.org/whl/nightly/cpu
pip3 install --upgrade pip setuptools

Set up the input/output directories
mkdir -p input
mkdir -p output

Move files you want to process to a directory called "input"
Copy and paste commands from the vprocess.sh file into your pipenv shell
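
Before pasting the commands, it can help to confirm that the tools installed above are actually on the PATH. The following is a minimal sketch, not part of the original gist; the function name `missing_tools` is hypothetical.

```shell
# Print the names of any tools from the argument list that are not on PATH.
# `missing_tools` is a hypothetical helper, not part of vprocess.sh.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            # Append the tool name, separated by a space when the list is non-empty
            missing="${missing}${missing:+ }${tool}"
        fi
    done
    printf '%s\n' "$missing"
}

# Check the tools this README relies on and warn about any that are absent
missing="$(missing_tools pipenv ffmpeg whisper)"
if [ -n "$missing" ]; then
    echo "Missing tools: $missing" >&2
fi
```

Run it inside the pipenv shell, since `whisper` is only installed in that environment.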

Run Time Expectations

I ran this on a 15-minute video file and it took about 2 hours to process.
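
For planning purposes, that single data point works out to roughly 8 minutes of processing per minute of video. The sketch below extrapolates from it, assuming processing time scales linearly with video length on similar hardware; that linearity is an assumption, and the function name `estimate_processing_minutes` is hypothetical.

```shell
# Rough estimate only: scales the one measurement from this README
# (15 minutes of video took about 120 minutes) linearly to other lengths.
# Actual times depend heavily on hardware, model size, and the audio itself.
estimate_processing_minutes() {
    video_minutes="$1"
    echo $(( video_minutes * 120 / 15 ))
}

# Example: estimate for a 45-minute video
estimate_processing_minutes 45
```

Under that assumption, a 45-minute video comes out at about 360 minutes (6 hours).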

Troubleshooting

Reference defaults and command line option settings: https://github.com/openai/whisper/blob/main/whisper/transcribe.py
Use caffeinate to keep the system awake while a command runs (it holds a power assertion until the command exits)
Alternatively, check the current system sleep setting, disable sleep for the run, then restore the setting afterwards (60 minutes in this example)
sudo systemsetup -getcomputersleep
sudo systemsetup -setcomputersleep Never
sudo systemsetup -setcomputersleep 60
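
As a sketch of the caffeinate approach, a long-running command can be wrapped so that sleep is prevented only while it runs. The wrapper name `run_awake` is hypothetical; it falls back to running the command directly on systems without `caffeinate`.

```shell
# Run a command with idle sleep (-i) and system sleep (-s) prevented.
# `run_awake` is a hypothetical helper, not part of vprocess.sh.
run_awake() {
    if command -v caffeinate >/dev/null 2>&1; then
        # caffeinate holds its assertions until the wrapped command exits
        caffeinate -i -s "$@"
    else
        # No caffeinate available (e.g. not macOS): just run the command
        "$@"
    fi
}

# Example with a harmless command; substitute the whisper invocation in practice
run_awake echo "done"
```

Unlike changing the `systemsetup` value, nothing needs to be reset afterwards with this approach.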

Special Thanks

To Alex for figuring all this out in the first place and writing the initial script

  
## vprocess.sh
# While this could potentially be run as a script, I've typically just copied and pasted these commands

input_video_file="2023.10.06.mp4"
subtitle_file="2023.10.06.srt"
output_video_file="2023.10.06_engsub.mp4"

echo "$(date) ---------- starting whisper ----------"
caffeinate -i -s \
    whisper \
        --model large \
        --language Turkish \
        --task translate \
        --compression_ratio_threshold 2.4 \
        --logprob_threshold -1.0 \
        --no_speech_threshold 0.6 \
        --condition_on_previous_text False \
        --verbose True \
        --threads 6 \
        --fp16 False \
        "input/${input_video_file}"
echo "$(date) ---------- finished ----------"
echo "$(date) ---------- starting ffmpeg ----------"
ffmpeg -i "input/${input_video_file}" -vf subtitles="${subtitle_file}" "output/${output_video_file}"
echo "$(date) ---------- finished ----------"
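
# The file names above are hard-coded per video. As a sketch of one way to avoid
# editing them each time, the subtitle and output names can be derived from the
# input file name. The helper names below are hypothetical, and the sketch assumes
# whisper writes "<input name without extension>.srt" to the current directory,
# as the variables above imply.

```shell
# Derive the subtitle file name: strip the directory and the final extension
subtitle_name() {
    base="$(basename "$1")"
    echo "${base%.*}.srt"
}

# Derive the output video name using the same "_engsub" suffix as the script
output_name() {
    base="$(basename "$1")"
    echo "${base%.*}_engsub.mp4"
}

# Dry run: list what would be produced for each .mp4 in input/
for f in input/*.mp4; do
    [ -e "$f" ] || continue   # skip the unmatched glob when input/ is empty
    echo "$f -> $(subtitle_name "$f") -> output/$(output_name "$f")"
done
```

# The loop only prints the planned names; the whisper and ffmpeg commands above
# would go inside it, using these derived values.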