<div data-align="center">
<pre><code><br>
<br>
<div>
<img src="media/logo.png" alt="Awesome Whisper">
<br>
</div>
<br>
<p>
<a href="https://openai.com/research/whisper">Whisper</a> is an open-source AI-powered speech recognition system developed by <a href="https://openai.com">OpenAI</a>.
</p>
<br>
<a href="https://awesome.re">
<img src="https://awesome.re/badge-flat2.svg" alt="Awesome">
</a>
<br>
<br>
<br>
<br>
<br></code></pre>
</div>
<h2 id="contents">Contents</h2>
<ul>
<li><a href="#official">Official</a></li>
<li><a href="#model-variants">Model variants</a></li>
<li><a href="#apps">Apps</a></li>
<li><a href="#web-apps">Web apps</a></li>
<li><a href="#cli-tools">CLI tools</a></li>
<li><a href="#playgrounds">Playgrounds</a></li>
<li><a href="#packages">Packages</a></li>
<li><a href="#articles">Articles</a></li>
<li><a href="#videos">Videos</a></li>
<li><a href="#community">Community</a></li>
<li><a href="#third-party-apis">Third-party APIs</a></li>
<li><a href="#related-lists">Related lists</a></li>
</ul>
<h2 id="official">Official</h2>
<ul>
<li><a href="https://openai.com/research/whisper">Introduction</a></li>
<li><a href="https://github.com/openai/whisper">Source code</a></li>
<li><a href="https://cdn.openai.com/papers/whisper.pdf">White paper</a></li>
</ul>
<h2 id="model-variants">Model variants</h2>
<ul>
<li><a href="https://github.com/ggerganov/whisper.cpp">Whisper.cpp</a> -
Port of Whisper in C++.
<ul>
<li><a href="https://github.com/ggerganov/whisper.cpp#bindings">Bindings
for many languages</a></li>
</ul></li>
<li><a href="https://github.com/m-bain/whisperX">WhisperX</a> - Adds
fast automatic speech recognition with word-level timestamps and
speaker diarization.</li>
<li><a
href="https://github.com/guillaumekln/faster-whisper">faster-whisper</a>
- Faster reimplementation of Whisper using CTranslate2.</li>
<li><a href="https://github.com/sanchit-gandhi/whisper-jax">Whisper
JAX</a> - JAX implementation of Whisper for up to 70x speed-up on
TPU.</li>
<li><a
href="https://github.com/linto-ai/whisper-timestamped">whisper-timestamped</a>
- Adds word-level timestamps and confidence scores.</li>
<li><a
href="https://github.com/zhuzilin/whisper-openvino">whisper-openvino</a>
- Whisper running on OpenVINO.</li>
<li><a
href="https://github.com/usefulsensors/openai-whisper">whisper.tflite</a>
- Whisper running on TensorFlow Lite.</li>
<li><a href="https://huggingface.co/models?other=whisper">Whisper
variants</a> - Various Whisper variants on Hugging Face.</li>
<li><a href="https://github.com/YuanGongND/whisper-at">Whisper-AT</a> -
Whisper that can recognize non-speech audio events in addition to
speech.</li>
</ul>
<h2 id="apps">Apps</h2>
<ul>
<li><a href="https://sindresorhus.com/aiko">Aiko</a> - Audio
transcription iOS and macOS app.</li>
<li><a href="https://goodsnooze.gumroad.com/l/macwhisper">MacWhisper</a>
- Audio transcription macOS app. (Freemium)</li>
<li><a href="https://apps.apple.com/app/id6443658039">Whisper Memos</a>
- Audio transcription iOS app. (Freemium)</li>
<li><a href="https://apps.apple.com/app/id1671616134">FourYou</a> -
Audio journal iOS app.</li>
<li><a href="https://apps.apple.com/app/id1659864300">Jojo
Transcribe</a> - Audio transcription macOS app.</li>
<li><a href="https://github.com/chidiwilliams/Buzz">Buzz</a> - Audio
transcription and translation macOS app.</li>
<li><a
href="https://store.getwavery.com/l/whisperscript">WhisperScript</a> -
Audio transcription macOS app. (Freemium · Electron)</li>
<li><a href="https://apps.apple.com/app/id6449008295">Audio Podium</a> -
Audio/video management macOS app.</li>
<li><a href="https://superwhisper.com">superwhisper</a> - Global audio
transcription macOS menu bar app.</li>
<li><a href="https://github.com/mkiol/dsnote">Speech Note</a> - Audio
transcription Linux app.</li>
<li><a href="https://www.fridaygpt.app">FridayGPT</a> - Dictation macOS
app powered by OpenAI API.</li>
<li><a href="https://easywhisper.io">EasyWhisper</a> - Windows and macOS
app for audio transcription and speaker diarization. (Freemium)</li>
<li><a href="https://audionote.app">Audio Note</a> - Real-time audio
transcription on macOS and Windows. (Freemium · Electron)</li>
<li><a href="https://github.com/woheller69/whisperIME">Whisper</a> -
Android app for transcription and translation. (FOSS)</li>
<li><a href="https://github.com/Beingpax/VoiceInk">VoiceInk</a> -
Dictation and transcription macOS app. (FOSS)</li>
</ul>
<h2 id="web-apps">Web apps</h2>
<!-- ### Hosted and self-hosted -->
<h3 id="hosted">Hosted</h3>
<ul>
<li><a href="https://bigwav.app">bigWav</a> - Audio transcription and
annotation tool.</li>
<li><a href="https://freepodcasttranscription.com">Free Podcast
Transcription</a> - Runs locally in your browser.</li>
<li><a href="https://www.gladia.io">Gladia</a> - Transcription with
real-time processing.</li>
</ul>
<h3 id="self-hosted">Self-hosted</h3>
<ul>
<li><a href="https://github.com/abdeladim-s/subsai">Subs AI</a> -
Subtitle generation.</li>
<li><a href="https://github.com/schibsted/WAAS">WaaS</a> - GUI and API
for Whisper.</li>
<li><a href="https://github.com/beyondcode/writeout.ai">writeout.ai</a>
- Laravel app to transcribe and translate audio files.</li>
<li><a href="https://github.com/pas1ko/meeper">Meeper</a> -
Transcription, summaries, and more for meetings and any browser tab.
(Chrome app)</li>
</ul>
<h2 id="cli-tools">CLI tools</h2>
<ul>
<li><a href="https://github.com/m1guelpf/yt-whisper">yt-whisper</a> -
YouTube subtitle generation.</li>
<li><a href="https://github.com/platisd/phonix">phonix</a> - Generate
captions for videos.</li>
<li><a
href="https://github.com/Purfview/whisper-standalone-win">whisper-standalone-win</a>
- Standalone Windows executable for Whisper and Faster Whisper.</li>
<li><a
href="https://github.com/Softcatala/whisper-ctranslate2">whisper-ctranslate2</a>
- Whisper command-line tool based on CTranslate2, compatible with the
original.</li>
<li><a
href="https://github.com/ochen1/insanely-fast-whisper-cli">insanely-fast-whisper-cli</a>
- Achieve transcription speeds near 30x real-time with several
optimizations.</li>
<li><a
href="https://github.com/MahmoudAshraf97/whisper-diarization">whisper-diarization</a>
- Automatic speech recognition with speaker diarization.</li>
</ul>
<h2 id="playgrounds">Playgrounds</h2>
<ul>
<li><a href="https://huggingface.co/spaces/openai/whisper">Hugging
Face</a> - Whisper demo running on Hugging Face. (<a
href="https://huggingface.co/spaces/openai/whisper/tree/main">Source</a>)</li>
<li><a href="https://whisperui.monsterapi.ai">Monster API</a> - Whisper
demo running on Monster API. (<a
href="https://github.com/saharmor/whisper-playground">Source</a>)</li>
<li><a href="https://whisper.r3d.red">Web Whisper</a> - Whisper demo by
Pluja. (<a
href="https://codeberg.org/pluja/web-whisper">Source</a>)</li>
<li><a href="https://github.com/ArthurFDLR/whisper-youtube">YouTube
Video Transcription</a> - Running on Colab.</li>
</ul>
<h2 id="packages">Packages</h2>
<h3 id="javascript">JavaScript</h3>
<ul>
<li><a
href="https://github.com/chengsokdara/use-whisper">use-whisper</a> -
React hook.</li>
</ul>
<h2 id="articles">Articles</h2>
<ul>
<li><a
href="https://www.newyorker.com/tech/annals-of-technology/whispers-of-ais-modular-future">Whispers
of A.I.’s Modular Future</a> - The future of machine learning lies in
adaptable and accessible open-source speech-transcription programs.</li>
<li><a
href="https://www.assemblyai.com/blog/how-to-run-openais-whisper-speech-recognition-model/">How
to Run Whisper Speech Recognition Model</a> - Explains how to install
and run the model, and provides a performance analysis comparing
Whisper to other models.</li>
<li><a
href="https://blog.paperspace.com/whisper-openai-flask-application-deployment/">Create
your own speech to text app using Flask</a> - Demonstrates
Whisper’s speech-to-text model, with a demo on running it in a Gradient
Notebook and a guide for setting up a Flask app with Gradient
Deployments.</li>
<li><a
href="https://betterprogramming.pub/openais-whisper-tutorial-42140dd696ee">Convert
Podcasts to Text</a> - Tutorial on the Whisper API with Python for
speech-to-text transcription, showing how a GPU speeds up
transcription.</li>
</ul>
<h2 id="videos">Videos</h2>
<ul>
<li><a href="https://www.youtube.com/watch?v=OCBZtgQGt1I">Open AI’s
Whisper is Amazing!</a> - Introduction to Whisper.</li>
<li><a href="https://www.youtube.com/watch?v=msj3wuYf3d8">How to do Free
Speech-to-Text Transcription Better Than Google Premium API</a> -
Tutorial.</li>
<li><a href="https://www.youtube.com/watch?v=ywIyc8l1K1Q">Multilingual
AI Speech Recognition Live App</a> - Tutorial.</li>
</ul>
<h2 id="community">Community</h2>
<ul>
<li><a
href="https://github.com/openai/whisper/discussions">Discussions</a></li>
<li><a href="https://discord.com/invite/openai">Discord</a></li>
</ul>
<h2 id="third-party-apis">Third-party APIs</h2>
<p><em>APIs that use Whisper.</em></p>
<ul>
<li><a href="https://www.oneai.com/speech-to-text">Whisper+</a> -
Extension of the Whisper model that adds features such as
speaker identification, custom vocabulary, summarization, and chapter
generation.</li>
<li><a href="https://replicate.com/openai/whisper">Replicate</a> - Use
Whisper running on Replicate.</li>
</ul>
<h2 id="related-lists">Related lists</h2>
<ul>
<li><a
href="https://github.com/sindresorhus/awesome-chatgpt">awesome-chatgpt</a>
- ChatGPT resources.</li>
</ul>
<p><a href="https://github.com/sindresorhus/awesome-whisper">whisper.md
GitHub</a></p>