Speech synthesis: installing eSpeak/MBROLA.
Working distribution: Raspbian
1) Working directory:
[~] ➔ mkdir SYNTHESEVOCALE
[~] ➔ cd SYNTHESEVOCALE/
2) Installing PortAudio:
– Download the archive:
[~/SYNTHESEVOCALE] ➔ wget http://www.portaudio.com/archives/pa_stable_v19_20140130.tgz
– Unpack the archive:
[~/SYNTHESEVOCALE] ➔ tar xvfz pa_stable_v19_20140130.tgz
[~/SYNTHESEVOCALE] ➔ cd portaudio/
– Configuration:
[~/SYNTHESEVOCALE/portaudio] ➔ ./configure
– Build:
[~/SYNTHESEVOCALE/portaudio] ➔ make
– Installation:
[~/SYNTHESEVOCALE/portaudio] ➔ sudo make install
[~/SYNTHESEVOCALE/portaudio] ➔ sudo /sbin/ldconfig
[~/SYNTHESEVOCALE/portaudio] ➔ cd ..
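To check that the linker cache lists PortAudio after the ldconfig step, the output of ldconfig -p can be filtered. This is only a minimal sketch: the helper name lib_in_cache is hypothetical, and it assumes the usual "libname.so.N (...) => path" listing format.

```shell
# lib_in_cache NAME: succeed if a listing in `ldconfig -p` format, read on
# stdin, contains a shared library whose file name starts with NAME.so.
# Hypothetical helper, not part of PortAudio or eSpeak.
lib_in_cache() {
    grep -q "^[[:space:]]*$1\.so"
}
```

On the Pi it could be called as: /sbin/ldconfig -p | lib_in_cache libportaudio && echo "PortAudio found"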
3) Installing eSpeak:
– Download the archive:
[~/SYNTHESEVOCALE] ➔ wget http://downloads.sourceforge.net/project/espeak/espeak/espeak-1.48/espeak-1.48.04-source.zip
– Unpack the archive:
[~/SYNTHESEVOCALE] ➔ unzip espeak-1.48.04-source.zip
[~/SYNTHESEVOCALE] ➔ cd espeak-1.48.04-source/src/
– Build:
[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ make
Error:
g++ -o speak speak.o compiledict.o dictionary.o intonation.o readclause.o setlengths.o numbers.o synth_mbrola.o synthdata.o synthesize.o translate.o mbrowrap.o tr_languages.o voices.o wavegen.o phonemelist.o klatt.o sonic.o -lstdc++ -lportaudio -lpthread
wavegen.o: In function `WavegenOpenSound()':
wavegen.cpp:(.text+0x439): undefined reference to `Pa_StreamActive'
wavegen.o: In function `WavegenCloseSound()':
wavegen.cpp:(.text+0x543): undefined reference to `Pa_StreamActive'
collect2: ld returned 1 exit status
make: *** [speak] Error 1
Solution: the eSpeak sources default to the PortAudio v18 header, while the library installed above is v19, so build against the bundled portaudio19.h instead:
[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ make clean
[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ mv portaudio.h portaudio.h.old
[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ mv portaudio19.h portaudio.h
[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ make
Link:
http://spirit.blau.in/linuxmint/2012/06/06/install-espeak-1-46-02/
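The header swap can also be wrapped in a small function that is safe to run twice. This is only a sketch: the name use_portaudio19_header is hypothetical, and unlike the mv commands in the notes it copies portaudio19.h, so that file stays in place.

```shell
# use_portaudio19_header SRC_DIR: make portaudio.h a copy of portaudio19.h,
# keeping the original header as portaudio.h.old.
# Hypothetical helper; SRC_DIR is the eSpeak src/ directory.
use_portaudio19_header() {
    src_dir=$1
    if [ ! -f "$src_dir/portaudio19.h" ]; then
        echo "portaudio19.h not found in $src_dir" >&2
        return 1
    fi
    # Back up the default header only once, so a second run keeps the original
    if [ -f "$src_dir/portaudio.h" ] && [ ! -e "$src_dir/portaudio.h.old" ]; then
        mv "$src_dir/portaudio.h" "$src_dir/portaudio.h.old"
    fi
    # Copy rather than move, so portaudio19.h remains available
    cp "$src_dir/portaudio19.h" "$src_dir/portaudio.h"
}
```

From the src directory it would be called as: use_portaudio19_header . && make clean && make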
– Installation:
[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ sudo make install
[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ cd ../..
3) Installing MBROLA:
– Working directory:
[~/SYNTHESEVOCALE] ➔ mkdir MBROLA
[~/SYNTHESEVOCALE] ➔ cd MBROLA/
– Download MBROLA:
[~/SYNTHESEVOCALE/MBROLA] ➔ wget http://tcts.fpms.ac.be/synthesis/mbrola/bin/raspberri_pi/mbrola.tgz
– Unpack the archive:
[~/SYNTHESEVOCALE/MBROLA] ➔ tar xvfz mbrola.tgz
– Installation:
[~/SYNTHESEVOCALE/MBROLA] ➔ chmod 755 mbrola
[~/SYNTHESEVOCALE/MBROLA] ➔ cp ./mbrola ~/bin
[~/SYNTHESEVOCALE/MBROLA] ➔ sudo cp mbrola /usr/local/bin
– Check:
[~/SYNTHESEVOCALE/MBROLA] ➔ mbrola
MBROLA 3.02b - speech synthesizer
– Download and install the voices:
[~/SYNTHESEVOCALE/MBROLA] ➔ wget http://tcts.fpms.ac.be/synthesis/mbrola/dba/fr1/fr1-990204.zip
[~/SYNTHESEVOCALE/MBROLA] ➔ sudo unzip fr1-990204.zip -d /opt/mbrola
[~/SYNTHESEVOCALE/MBROLA] ➔ sudo mkdir -p /usr/share/mbrola/voices/
[~/SYNTHESEVOCALE/MBROLA] ➔ sudo cp -r /opt/mbrola/fr1/* /usr/share/mbrola/voices/
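A quick way to confirm that the voice database ended up where expected is to look for a file named after the voice. This is a rough sketch only: find_mbrola_voice is a hypothetical helper, and its default search roots are simply the two directories used in these notes (assuming the zip unpacks to fr1/fr1).

```shell
# find_mbrola_voice NAME [ROOT...]: print the path of the first voice
# database file called NAME found under the given roots.
# Hypothetical helper; the default roots are the directories used above.
find_mbrola_voice() {
    name=$1
    shift
    if [ $# -eq 0 ]; then
        set -- /usr/share/mbrola/voices "/opt/mbrola/$name"
    fi
    for root in "$@"; do
        if [ -f "$root/$name" ]; then
            printf '%s\n' "$root/$name"
            return 0
        fi
    done
    echo "voice $name not found" >&2
    return 1
}
```

Usage: find_mbrola_voice fr1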
4) Test:
– Create the phoneme file:
[~/SYNTHESEVOCALE/MBROLA] ➔ espeak -v mb-fr1 -q --pho --phonout=phoneme.pho "Coucou tout le monde"
[~] ➔ cat phoneme.pho
k 80
u 30 0 94 20 95 40 96 59 97 80 99 100 99
k 80
u 40 0 117 80 109 100 109
t 68
u 30 0 110 80 106 100 106
l 65
m 65
o~ 76 0 102 80 76 100 76
d 65
_ 301
_ 1
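Each line of a .pho file gives a phoneme, its duration in milliseconds, then optional pairs of (position in %, pitch in Hz). Assuming that layout, the total duration of the utterance can be estimated by summing the second column. The helper name pho_duration_ms is hypothetical.

```shell
# pho_duration_ms [FILE...]: sum the duration column (field 2, milliseconds)
# of MBROLA .pho data read from the files or from stdin.
# Lines starting with ';' are comments and are skipped.
# Hypothetical helper, shown as a sketch.
pho_duration_ms() {
    awk '!/^;/ && NF >= 2 { total += $2 } END { print total + 0 }' "$@"
}
```

Usage: pho_duration_ms phoneme.pho
For the phoneme file shown here this prints 901, which is the duration before mbrola applies any time ratio (-t).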
– Convert the phoneme file to a .wav file:
[~] ➔ mbrola -t 1.7 -e -C "n n2" /opt/mbrola/fr1/fr1 phoneme.pho coucou.wav
[~] ➔ ls -l coucou.wav
-rw-r--r-- 1 pi pi 48952 May 11 16:27 coucou.wav
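The two test steps (eSpeak producing phonemes, MBROLA rendering them) can be chained through a temporary file. This is a sketch under the same assumptions as the commands in these notes: the fr1 voice under /opt/mbrola and the same options. The function name text_to_wav is hypothetical.

```shell
# text_to_wav TEXT OUT.wav: run eSpeak to get the phoneme list, then MBROLA
# to render it, using a temporary .pho file.
# Hypothetical wrapper; voice path and options are copied from the notes.
text_to_wav() {
    text=$1
    out=$2
    pho=$(mktemp) || return 1
    # Step 1 writes the phonemes; step 2 runs only if step 1 succeeded
    espeak -v mb-fr1 -q --pho --phonout="$pho" "$text" &&
        mbrola -t 1.7 -e -C "n n2" /opt/mbrola/fr1/fr1 "$pho" "$out"
    rc=$?
    # Remove the temporary phoneme file whatever the outcome
    rm -f "$pho"
    return $rc
}
```

Usage: text_to_wav "Coucou tout le monde" coucou.wav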
– Convert to mp3:
[~] ➔ lame -m j -b 192 --resample 44.1 coucou.wav coucou.mp3
LAME 3.99.5 32bits (http://lame.sf.net)
Resampling: input 16 kHz output 44.1 kHz
Using polyphase lowpass filter, transition band: 20094 Hz - 20627 Hz
Encoding coucou.wav to coucou.mp3
Encoding as 44.1 kHz single-ch MPEG-1 Layer III (3.7x) 192 kbps qval=3
    Frame          |  CPU time/estim | REAL time/estim | play/CPU |    ETA
    60/60    (100%)|    0:00/    0:00|    0:00/    0:00|   3.1987x|    0:00
-------------------------------------------------------------------------------
   kbps       mono %     long switch short %
  192.0      100.0        65.0  13.3  21.7
Writing LAME Tag...done
ReplayGain: -0.8dB
– Playback:
[~] ➔ mpg123 coucou.mp3
High Performance MPEG 1.0/2.0/2.5 Audio Player for Layers 1, 2 and 3
version 1.14.4; written and copyright by Michael Hipp and others
free software (LGPL/GPL) without any warranty but with best wishes
Playing MPEG stream 1 of 1: coucou.mp3 ...
MPEG 1.0 layer III, 192 kbit/s, 44100 Hz mono
[0:01] Decoding of coucou.mp3 finished.
5) Link:
http://bothari.free.fr/weblog/post/Ubuntu-Text-to-Speech-%28TTS%29