Audio
Misc
- Packages
- {carelesswhisper} - Speech-to-text and translation; automatic speech recognition in R using whisper.cpp; can use Hugging Face and ggerganov models; no dependencies
- Strong Baseline Model: ResNet/EfficientNet
- These are typically image models, but converting the audio to a spectrogram and feeding it to them as an image produces good results
- Also see this guide for suitable baseline models: link
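
The audio-to-spectrogram step mentioned above can be sketched as follows. This is a minimal illustration, not a recommended pipeline: it uses only the Python standard library with a naive DFT, and the synthetic tone, `frame_len`, `hop`, and the function names are all assumptions made for the example. In practice a library such as librosa or torchaudio would typically compute a (mel) spectrogram much faster, and the resulting 2-D array would then be resized and passed to the image model.

```python
import cmath
import math


def hann(n):
    """Hann window of length n, used to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def log_spectrogram(samples, frame_len=64, hop=32):
    """Return a 2-D list indexed [freq_bin][frame] of log-magnitudes in dB.

    The result has the same layout as a single-channel image:
    rows are frequency bins, columns are time frames.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(n_bins):
            # Naive DFT for bin k (O(n^2); fine for a small illustration)
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(20 * math.log10(abs(acc) + 1e-10))
        frames.append(mags)
    # Transpose from [frame][bin] to [bin][frame]
    return [list(row) for row in zip(*frames)]


# Hypothetical input: a 1 kHz sine tone sampled at 8 kHz
sample_rate = 8000
tone_hz = 1000
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(1024)]

spec = log_spectrogram(samples)
n_bins, n_frames = len(spec), len(spec[0])

# The strongest bin in the first frame should sit at the tone frequency
peak_bin = max(range(n_bins), key=lambda k: spec[k][0])
peak_hz = peak_bin * sample_rate / 64  # 64 = frame_len used above
print(n_bins, n_frames, peak_hz)
```

The output is a "frequency by time" grid, which is why an image classifier can be applied to it: a steady tone appears as a horizontal line, and changes in pitch or timbre appear as shapes the convolutional layers can pick up.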