zoltanctoth/transcribe-comparison.md

## transcribe-comparison.md

      
| Solution | Transcription Speed (1hr audio) | Cost/hr | Hungarian Quality | Speaker Diarization | Notes |
| --- | --- | --- | --- | --- | --- |
| MacBook Air (local) | ~45-60 min | Free | Good | Needs separate model (pyannote) | Slow, pyannote adds ~20-30 min extra per hour |
| OpenAI Whisper API | ~1-3 min | $0.36 | Good | ❌ Not supported | Need to combine with separate diarization |
| Whisper + pyannote | ~5-10 min (GPU) | Free | Good | ✅ Yes (local) | Best free option, needs GPU + HuggingFace token |
| Deepgram | ~30-60 sec | $0.25 | Decent | ✅ Built-in | Fast, easy API, HU quality not the best |
| Google Speech-to-Text | ~1-2 min | $1.44 | Good | ✅ Built-in | Good HU support, expensive |
| Azure Speech | ~1-2 min | $1.00 | Good | ✅ Built-in | Good quality, mid-price |
| Amazon Transcribe | ~2-5 min | $0.72 | Decent | ✅ Built-in | HU supported, decent quality, nice S3 integration |

For 100 hours of audio:


| Solution | Total Time | Total Cost | Has Diarization | Recommended? |
| --- | --- | --- | --- | --- |
| MacBook Air (local) | 5-7 days | $0 | ⚠️ Extra setup | Only if budget is zero |
| Whisper API only | 2-5 hours | $36 | ❌ | Not if you need speakers |
| Whisper + pyannote | 8-16 hours | $0 | ✅ | Best free option, needs decent GPU |
| Deepgram | 1-2 hours | $25 | ✅ | Cheapest managed with diarization |
| Amazon Transcribe | 3-8 hours | $72 | ✅ | Good if already on AWS |
| Azure Speech | 2-3 hours | $100 | ✅ | Good quality, mid-price |
| Google STT | 2-3 hours | $144 | ✅ | Best quality, most expensive |
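
The 100-hour figures appear to be the per-hour numbers scaled linearly. A minimal sketch of that arithmetic for the paid APIs, assuming files are processed one after another and that per-hour prices stay flat at volume (both are assumptions, not something the comparison states). The `PER_HOUR` dictionary and `estimate` function are hypothetical names used only for illustration.

```python
# Per-hour figures copied from the first table:
# (cost in USD, fastest minutes, slowest minutes, built-in diarization)
PER_HOUR = {
    "OpenAI Whisper API": (0.36, 1, 3, False),
    "Deepgram": (0.25, 0.5, 1, True),
    "Amazon Transcribe": (0.72, 2, 5, True),
    "Azure Speech": (1.00, 1, 2, True),
    "Google Speech-to-Text": (1.44, 1, 2, True),
}


def estimate(hours_of_audio):
    """Scale cost and processing time linearly with the amount of audio."""
    rows = {}
    for name, (usd, fast_min, slow_min, diarization) in PER_HOUR.items():
        rows[name] = {
            "cost_usd": round(usd * hours_of_audio, 2),
            # Convert total processing minutes to hours.
            "hours_fast": fast_min * hours_of_audio / 60,
            "hours_slow": slow_min * hours_of_audio / 60,
            "diarization": diarization,
        }
    return rows


totals = estimate(100)
for name, row in totals.items():
    print(
        f"{name}: ${row['cost_usd']:.0f}, "
        f"{row['hours_fast']:.1f}-{row['hours_slow']:.1f} h"
    )

# Cheapest paid option that ships with speaker diarization.
cheapest_with_diarization = min(
    (name for name, row in totals.items() if row["diarization"]),
    key=lambda name: totals[name]["cost_usd"],
)
print("Cheapest with diarization:", cheapest_with_diarization)
```

The time ranges in the table are these results rounded to whole hours. Sending requests in parallel would shorten wall-clock time but not cost.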