Speech synthesis: installing eSpeak/MBROLA

Working distribution: Raspbian
1) Working directory:

[~] ➔ mkdir SYNTHESEVOCALE
[~] ➔ cd SYNTHESEVOCALE/

2) Installing PortAudio:
– Download the archive:

[~/SYNTHESEVOCALE] ➔ wget http://www.portaudio.com/archives/pa_stable_v19_20140130.tgz

– Unpack the archive:

[~/SYNTHESEVOCALE] ➔ tar xvfz pa_stable_v19_20140130.tgz
[~/SYNTHESEVOCALE] ➔ cd portaudio/

– Configure:

[~/SYNTHESEVOCALE/portaudio] ➔ ./configure

– Build:

[~/SYNTHESEVOCALE/portaudio] ➔ make

– Install:

[~/SYNTHESEVOCALE/portaudio] ➔ sudo make install
[~/SYNTHESEVOCALE/portaudio] ➔ sudo /sbin/ldconfig
[~/SYNTHESEVOCALE/portaudio] ➔ cd ..
[~/SYNTHESEVOCALE] ➔
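
Before building eSpeak, it can be worth confirming that the dynamic linker actually sees the freshly installed library. A minimal sketch, assuming `/sbin/ldconfig -p` is available (it lists the linker cache and does not need root); the helper name `lib_in_cache` is hypothetical:

```shell
# Hypothetical helper: succeeds if a library named lib<name>.so* appears in
# the linker cache listing read from standard input.
lib_in_cache() {
    grep -q "lib$1\.so"
}

# Typical use right after 'sudo /sbin/ldconfig'.
if [ -x /sbin/ldconfig ]; then
    if /sbin/ldconfig -p | lib_in_cache portaudio; then
        echo "PortAudio is visible to the linker"
    else
        echo "PortAudio not found in the linker cache"
    fi
fi
```

If the library is reported as missing, re-running `sudo /sbin/ldconfig` after `sudo make install` is the first thing to try.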

3) Installing eSpeak:
– Download the archive:

[~/SYNTHESEVOCALE] ➔ wget http://downloads.sourceforge.net/project/espeak/espeak/espeak-1.48/espeak-1.48.04-source.zip

– Unpack the archive:

[~/SYNTHESEVOCALE] ➔ unzip espeak-1.48.04-source.zip
[~/SYNTHESEVOCALE] ➔ cd espeak-1.48.04-source/src/

– Build:

[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ make

Error:

g++  -o speak speak.o compiledict.o dictionary.o intonation.o readclause.o setlengths.o numbers.o synth_mbrola.o synthdata.o synthesize.o translate.o mbrowrap.o tr_languages.o voices.o wavegen.o phonemelist.o klatt.o sonic.o -lstdc++ -lportaudio -lpthread
wavegen.o: In function `WavegenOpenSound()':
wavegen.cpp:(.text+0x439): undefined reference to `Pa_StreamActive'
wavegen.o: In function `WavegenCloseSound()':
wavegen.cpp:(.text+0x543): undefined reference to `Pa_StreamActive'
collect2: ld returned 1 exit status
make: *** [speak] Error 1

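Fix: the portaudio.h shipped in the eSpeak source tree targets the PortAudio v18 API, whereas the library installed above is v19. Swap in the bundled portaudio19.h and rebuild: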

[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ make clean
[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ mv portaudio.h portaudio.h.old
[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ mv portaudio19.h portaudio.h
[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ make

Reference:
http://spirit.blau.in/linuxmint/2012/06/06/install-espeak-1-46-02/
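
The link error names `Pa_StreamActive`, a PortAudio v18 function; v19 calls it `Pa_IsStreamActive`. A quick way to see which API the header in the source tree declares is to grep for those two names. This is only a sketch: the helper name `portaudio_api` is hypothetical, and it assumes the two symbol names are enough to tell the header generations apart.

```shell
# Hypothetical helper: report which PortAudio API generation a header declares.
# Assumption: v19 headers declare Pa_IsStreamActive, v18 headers declare
# Pa_StreamActive (the symbol the linker could not find above).
portaudio_api() {
    if grep -q 'Pa_IsStreamActive' "$1"; then
        echo v19
    elif grep -q 'Pa_StreamActive' "$1"; then
        echo v18
    else
        echo unknown
    fi
}

# Example: run from espeak-1.48.04-source/src before building.
if [ -f portaudio.h ]; then
    echo "portaudio.h declares the $(portaudio_api portaudio.h) API"
fi
```

After the two `mv` commands, the same check on portaudio.h should report v19.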
– Install:

[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ sudo make install
[~/SYNTHESEVOCALE/espeak-1.48.04-source/src] ➔ cd ../..
[~/SYNTHESEVOCALE] ➔

3) Installing MBROLA:
– Working directory:

[~/SYNTHESEVOCALE] ➔ mkdir MBROLA
[~/SYNTHESEVOCALE] ➔ cd MBROLA/

– Download Mbrola:

[~/SYNTHESEVOCALE/MBROLA] ➔ wget http://tcts.fpms.ac.be/synthesis/mbrola/bin/raspberri_pi/mbrola.tgz

– Unpack the archive:

[~/SYNTHESEVOCALE/MBROLA] ➔ tar xvfz mbrola.tgz

– Install:

[~/SYNTHESEVOCALE/MBROLA] ➔ chmod 755 mbrola
[~/SYNTHESEVOCALE/MBROLA] ➔ cp ./mbrola ~/bin
[~/SYNTHESEVOCALE/MBROLA] ➔ sudo cp mbrola /usr/local/bin

– Check:

[~/SYNTHESEVOCALE/MBROLA] ➔ mbrola
 MBROLA 3.02b - speech synthesizer

– Download and install the voices:

[~/SYNTHESEVOCALE/MBROLA] ➔ wget http://tcts.fpms.ac.be/synthesis/mbrola/dba/fr1/fr1-990204.zip
[~/SYNTHESEVOCALE/MBROLA] ➔ sudo unzip fr1-990204.zip -d /opt/mbrola
[~/SYNTHESEVOCALE/MBROLA] ➔ sudo mkdir -p /usr/share/mbrola/voices/
[~/SYNTHESEVOCALE/MBROLA] ➔ sudo cp -r /opt/mbrola/fr1/* /usr/share/mbrola/voices/
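
The test that follows also calls lame and mpg123, which none of the steps above install. A small sketch that lists whichever of the required tools are not yet on the PATH; the helper name `missing_tools` is hypothetical:

```shell
# Hypothetical helper: print the name of every argument that is not an
# executable command on the current PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools espeak mbrola lame mpg123)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
fi
```

On Raspbian the last two would typically come from the distribution packages of the same name (for example `sudo apt-get install lame mpg123`).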

4) Test:
– Generate the phoneme file:

[~/SYNTHESEVOCALE/MBROLA] ➔ cd ~
[~] ➔ espeak -v mb-fr1 -q --pho --phonout=phoneme.pho "Coucou tout le monde"
[~] ➔ cat phoneme.pho
k       80
u       30       0 94 20 95 40 96 59 97 80 99 100 99
k       80
u       40       0 117 80 109 100 109
t       68
u       30       0 110 80 106 100 106
l       65
m       65
o~      76       0 102 80 76 100 76
d       65
_       301
_       1
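
Each line of the .pho file holds a phoneme name, its duration in milliseconds, and optionally pairs of (position in percent of the duration, pitch in Hz). Summing the second column therefore gives the length of the utterance before any time scaling. A minimal sketch with awk; the helper name `pho_duration_ms` is hypothetical:

```shell
# Hypothetical helper: total duration in milliseconds of a MBROLA .pho file.
# Field 1 is the phoneme, field 2 its duration in ms; lines starting with ';'
# are comments in the .pho format and are skipped.
pho_duration_ms() {
    awk '!/^;/ && NF >= 2 { total += $2 } END { print total + 0 }' "$1"
}

if [ -f phoneme.pho ]; then
    echo "phoneme.pho lasts $(pho_duration_ms phoneme.pho) ms before any -t scaling"
fi
```

For the listing above the sum is 901 ms. The `-t 1.7` option used in the next step multiplies durations by 1.7, so roughly 1.5 s of audio is to be expected.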

– Convert the phoneme file to a .wav file:

[~] ➔ mbrola -t 1.7 -e -C "n n2" /opt/mbrola/fr1/fr1 phoneme.pho coucou.wav
[~] ➔ ls -l coucou.wav
-rw-r--r-- 1 pi pi 48952 mai   11 16:27 coucou.wav
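
The two commands (text to phonemes, phonemes to .wav) can be wrapped in one function. This is only a sketch: the name `text_to_wav` and the `RUN` variable are hypothetical, and it assumes espeak and mbrola are on the PATH with the fr1 database in /opt/mbrola/fr1/fr1 as installed earlier.

```shell
# Hypothetical wrapper around the espeak and mbrola commands shown above.
# Usage: text_to_wav "some text" output.wav
# The intermediate .pho file is named after the .wav file.
# Set RUN=echo to print the commands instead of running them.
text_to_wav() {
    text=$1
    wav=$2
    pho=${wav%.wav}.pho
    ${RUN:-} espeak -v mb-fr1 -q --pho --phonout="$pho" "$text" &&
    ${RUN:-} mbrola -t 1.7 -e -C "n n2" /opt/mbrola/fr1/fr1 "$pho" "$wav"
}

# Example:
#   text_to_wav "Coucou tout le monde" coucou.wav
```

With `RUN=echo` the two commands are printed rather than executed, which is a cheap way to check the arguments before synthesising anything.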

– Convert to mp3:

[~] ➔ lame -m j -b 192 --resample 44.1 coucou.wav coucou.mp3
LAME 3.99.5 32bits (http://lame.sf.net)
Resampling:  input 16 kHz  output 44.1 kHz
Using polyphase lowpass filter, transition band: 20094 Hz - 20627 Hz
Encoding coucou.wav to coucou.mp3
Encoding as 44.1 kHz single-ch MPEG-1 Layer III (3.7x) 192 kbps qval=3
    Frame          |  CPU time/estim | REAL time/estim | play/CPU |    ETA
    60/60    (100%)|    0:00/    0:00|    0:00/    0:00|   3.1987x|    0:00
-----------------------------------------------------------------------------------------------------------------------------------------
   kbps       mono %     long switch short %
  192.0      100.0        65.0  13.3  21.7
Writing LAME Tag...done
ReplayGain: -0.8dB

– Playback:

[~] ➔ mpg123 coucou.mp3
High Performance MPEG 1.0/2.0/2.5 Audio Player for Layers 1, 2 and 3
        version 1.14.4; written and copyright by Michael Hipp and others
        free software (LGPL/GPL) without any warranty but with best wishes
Playing MPEG stream 1 of 1: coucou.mp3 ...
MPEG 1.0 layer III, 192 kbit/s, 44100 Hz mono
[0:01] Decoding of coucou.mp3 finished.

5) Reference:
http://bothari.free.fr/weblog/post/Ubuntu-Text-to-Speech-%28TTS%29
