Speech-to-text on a Raspberry Pi
A very simple way to do speech-to-text directly on the Raspberry Pi.
This closely follows an existing guide but also includes the Pi dependencies:
sudo apt-get install swig oss-compat pulseaudio libpulse-dev automake autoconf libtool bison python-dev
Sorry, you can’t use 8-bit audio; record 16-bit samples instead.
For all benchmarks I recorded one file using:
arecord -f S16_LE -c1 -r16000 goforward.raw
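A quick sanity check on the recording is to work out its length from the file size. The sketch below assumes the exact format used in the arecord command above (headerless, 16-bit samples, one channel, 16000 Hz); the function name raw_duration_seconds is made up for illustration.

```shell
# Estimate the duration of a headerless recording made with
# `arecord -f S16_LE -c1 -r16000`: 2 bytes per sample, 1 channel.
raw_duration_seconds() {
    file=$1
    rate=${2:-16000}                      # sample rate in Hz (assumed default)
    bytes_per_sample=2                    # S16_LE is signed 16-bit little-endian
    size=$(( $(wc -c < "$file") ))        # file size in bytes, whitespace stripped
    awk -v s="$size" -v r="$rate" -v b="$bytes_per_sample" \
        'BEGIN { printf "%.2f\n", s / (r * b) }'
}

# Usage: raw_duration_seconds goforward.raw
```

A 64000-byte file comes out as 2.00 seconds under these assumptions. If the number looks far off, the file was probably recorded with a different sample format or rate.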
To transcribe the audio, you can use:
time pocketsphinx_continuous -fwdflat no -bestpath no -maxwpf 5 -maxhmmpf 10000 -topn 2 -pl_window 7 -infile goforward.raw
Intel(R) Core(TM) i5-4310U CPU @ 2.00GHz: ~1-2 seconds for 2 seconds of audio
Raspberry Pi 2: ~7-8 seconds for 2 seconds of audio
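One way to compare these timings is the real-time factor: processing time divided by audio length, where anything above 1 is slower than real time. The helper below is only a sketch of that arithmetic, and real_time_factor is a name I made up; the 7.5 is simply the midpoint of the Pi 2 range above.

```shell
# Real-time factor = processing time / audio duration (both in seconds).
# Values above 1 mean decoding is slower than real time.
real_time_factor() {
    awk -v t="$1" -v d="$2" 'BEGIN { printf "%.2f\n", t / d }'
}

real_time_factor 7.5 2   # prints 3.75
```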
Transcribe Audio (limited dictionary)
Make a dictionary first. Create a file language.txt with some words, like:
open browser
new e-mail
forward
backward
next window
last window
open music player
okay computer
Then generate a dictionary by uploading language.txt. Download the resulting tar archive, extract it with tar -xvzf TAR*.tgz, and then use the command:
time pocketsphinx_continuous -fwdflat no -bestpath no -maxwpf 5 -maxhmmpf 1000 -topn 2 -pl_window 7 -dict 4182.dic -lm 4182.lm -infile goforward.raw
Intel(R) Core(TM) i5-4310U CPU @ 2.00GHz: ~0.1 seconds
Raspberry Pi 2: ~1.4 seconds
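Before benchmarking, it can be worth checking that every word in language.txt actually made it into the generated dictionary. This is a rough sketch, assuming the .dic file has one entry per line with the word as the first field and alternate pronunciations marked with a suffix like (2); missing_words is a hypothetical helper, and it assumes the dictionary file is not empty.

```shell
# Print each word from the corpus that has no entry in the dictionary.
# $1 = corpus file (e.g. language.txt), $2 = dictionary file (e.g. 4182.dic)
missing_words() {
    awk '
        # First file (the dictionary): remember the first field of each line,
        # lowercased, with alternate-pronunciation suffixes like "(2)" removed.
        NR == FNR {
            w = tolower($1)
            sub(/\([0-9]+\)$/, "", w)
            known[w] = 1
            next
        }
        # Second file (the corpus): report each unknown word once.
        {
            for (i = 1; i <= NF; i++) {
                w = tolower($i)
                if (!(w in known) && !(w in seen)) {
                    print w
                    seen[w] = 1
                }
            }
        }
    ' "$2" "$1"
}

# Usage: missing_words language.txt 4182.dic
```

No output means every word was found. The comparison ignores case, so it does not matter whether the dictionary stores words in upper or lower case.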
Keyword search
pocketsphinx_continuous -kws phrases.kws -kws_threshold 1e-20 -infile goforward.raw
where phrases.kws has a couple of phrases to look for.
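For reference, a phrases.kws file could be generated like this. The sketch assumes pocketsphinx's keyword-list format of one phrase per line, optionally followed by a per-phrase threshold between slashes. The phrases are borrowed from the language.txt example, the threshold values are only illustrative and would need tuning, and write_kws is a made-up name.

```shell
# Write a small keyword list, one phrase per line with its own threshold.
# $1 = output path (e.g. phrases.kws)
write_kws() {
    cat > "$1" <<'EOF'
okay computer /1e-20/
open browser /1e-20/
next window /1e-20/
EOF
}

# Usage: write_kws phrases.kws
```

Longer phrases usually tolerate a smaller threshold than short ones, so treating the threshold as a per-phrase setting leaves room to adjust each one separately.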
Intel(R) Core(TM) i5-4310U CPU @ 2.00GHz: ~0.25 seconds
Raspberry Pi 2: ~2.5 seconds
Download
git clone https://github.com/cmusphinx/sphinxbase.git
git clone https://github.com/cmusphinx/pocketsphinx
Configure and Install Pocketsphinx
cd sphinxbase
./autogen.sh
./configure
make && sudo make install
Add the following to ~/.profile:
export LD_LIBRARY_PATH=/usr/local/lib
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
then
source ~/.profile
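If you rerun the setup, the two export lines can end up in ~/.profile more than once. A small helper like the following avoids that by appending a line only when it is not already present; append_once is a hypothetical name, and the usage lines are left commented out so nothing is modified by accident.

```shell
# Append a line to a file only if that exact line is not already there.
# $1 = line to add, $2 = target file
append_once() {
    line=$1
    file=$2
    touch "$file"    # make sure the file exists before grep reads it
    # -x: match whole lines, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage:
# append_once 'export LD_LIBRARY_PATH=/usr/local/lib' ~/.profile
# append_once 'export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig' ~/.profile
```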
cd ../
cd pocketsphinx
./autogen.sh
make && sudo make install