Festival: Linux Text-To-Speech Tutorial and Demo

Written by Tony Bhimani
Posted on July 22, 2006

Requirements
*nix based Operating System
Festival for Linux
PHP 4.1 or higher (the front-end code uses the $_POST superglobal)
LAME Ain't an MP3 Encoder (optional)

Download the source code: festival-tts.tar.gz

Contents
Introduction
Installing the Festival RPM with yum
Playing with Festival Text-To-Speech
Saving Text-To-Speech Audio to Disk with text2wave
Installing LAME MP3 Encoder
Converting WAV-RIFF Audio Files to MP3's
Building a PHP Front-End for Festival (text2wave)
PHP Text-To-Speech Demo
Conclusion


Introduction

This tutorial teaches you how to do text-to-speech (TTS) synthesis under Linux with Festival, a free software framework for Unix-like systems that converts plain text into audible speech. I'm sure we've all seen the Microsoft Merlin character (also known as a Microsoft Agent) that is part of MS Office and other Microsoft products. I'm not sure if that's where Merlin originally made his debut, but I'm pretty sure he can be chosen as an Office Assistant. MS Agents use speech synthesis to provide better interaction in their native application. I personally find it annoying, but there are people who like it (Clippy is more to my liking; he doesn't say a word). Festival is one way to get the same kind of speech output on Linux.

The Linux flavor of choice for this tutorial is CentOS 4.2, but any Unix-like operating system will work if you can get Festival compiled or installed via RPM (or some other means). In this tutorial I'll show you how to install Festival using yum, create an HTML form to accept user input, convert the text to speech with Festival's text2wave program and some PHP processing, and optionally convert the default WAV-RIFF audio file into an MP3 using LAME to reduce the file size. There are many more uses for text-to-speech than I will be showing here, but this will get your feet wet so you can venture on to create more interesting Linux text-to-speech applications (a reminder script invoked via cron, perhaps?).


Installing the Festival RPM with yum

For this tutorial you'll be installing Festival by RPM instead of from source, but if you'd rather build Festival from source code then you can download the package here. Building from source is worth considering, since Festival is up to version 1.95 (2.0 beta) while the CentOS 4.x RPM, festival-1.4.2-25.i386.rpm, is version 1.4.2. I haven't tried the source and can only speculate that the text-to-speech synthesis engine may have improved, but for this example the RPM release is good enough. If you aren't sure whether Festival is already installed on your system, you can check using the which command or with rpm.

which festival

[graphical representation of executing 'which festival']

As you can see above, Festival isn't anywhere to be found. That makes it unlikely to be installed, but if you want to double-check with rpm then issue this command.

rpm -qa | grep -i festival

If the Festival RPM is not listed then you most likely don't have it installed. No problem, that's where yum comes in.

yum install festival

[graphical representation of executing 'yum install festival']

yum will contact the CentOS repository, resolve any dependencies Festival has, and prompt you to download and install the Festival RPM. Enter y for yes and hit Enter. The Festival RPM is about 18 MB, so you may want to get up and stretch while it downloads. Once the download finishes you don't need to install anything manually; yum does it for you. There, you're done installing Festival. Next we'll play with it and make the computer do the talking for us.
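
Before moving on, you may want a quick sanity check that the programs used in the rest of this tutorial are on your PATH. Here is a minimal sketch using the shell's own command -v instead of which. The function names have_cmd and check_tools are made up for this example and are not part of Festival.

```shell
# Return success if the named program can be found on the PATH.
# have_cmd is a hypothetical helper name used only in this sketch.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print one "name: found" or "name: missing" line per program.
check_tools() {
    for tool in "$@"; do
        if have_cmd "$tool"; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}
```

Calling check_tools festival text2wave lame prints one line per program, so you can see at a glance what still needs to be installed.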


Playing with Festival Text-To-Speech

We'll take a look at a few examples of how to use Festival. You should read the Festival man page for more information on its options and usage. Essentially all you do is pass a string of text to Festival with the --tts option and it will synthesize it into speech. You can also use text files and scripts to produce more dynamic audio output.

Example 1: A beautiful day message (echo text)

echo "It's such a beautiful day! Why are you in front of the computer?" | festival --tts

Example 2: What's today's date? (program output)

date '+%A, %B %e, %Y' | festival --tts

Example 3: Random number of the day (PHP shell script; run chmod +x rand.php first so it can be executed)

./rand.php | festival --tts

#!/usr/bin/php -q
<?php
// seed srand
srand((double)microtime()*1000000);
// what is the random number?
echo "The random number for the day is " . rand(1,25);
?>
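
If you find yourself typing the same pipe over and over, the pattern from the three examples can be wrapped in a small shell function. This is only a sketch: say is a name I picked for illustration, and the only Festival feature it relies on is the --tts option reading text from standard input, as in the examples above.

```shell
# Speak the given words through Festival.
# say is a hypothetical wrapper name; it joins its arguments into one
# line of text and pipes that line to "festival --tts".
say() {
    printf '%s\n' "$*" | festival --tts
}
```

With the function loaded in your shell, say "The backup has finished" speaks the sentence, and say "$(date '+%A, %B %e, %Y')" repeats Example 2.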

As you can see, you have many options for sending text to Festival. If you took the time to read the Festival man page then you may have noticed that there is no option to save the audio output to a file. You can't redirect the audio the way you would redirect text to a file, so what can you do? Simple: there is a program in the Festival package called text2wave that lets you save the audio to disk as a WAV file (among other formats).


Saving Text-To-Speech Audio to Disk with text2wave

text2wave takes a text file's contents, converts it to speech, and saves it in ulaw, snd, aiff, riff, or nist format. The default sound format is riff (WAV, the format commonly found on Microsoft Windows). If you read the help for text2wave (text2wave --help) you'll see it has very few options in comparison to Festival, but you can still pass it text using echo, programs, or shell scripts. One of text2wave's useful options is -scale for volume scaling. The default speech volume is rather low, so you should raise it by passing a float value (50 should be sufficient).

Using the examples from above, here is how you can create WAV files for each.

echo "It's such a beautiful day! Why are you in front of the computer?" | text2wave -scale 50 -o beautiful_day.wav

date '+%A, %B %e, %Y' | text2wave -scale 50 -o date.wav

./rand.php | text2wave -scale 50 -o rand.wav
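
The three commands above share one pattern, so here is a sketch of a helper that captures it. The name speak_to_wav and its argument order are my own invention; only the -scale and -o options come from the commands above, and the default of 50 is the volume scale suggested earlier.

```shell
# Write spoken text to a WAV file with text2wave.
# speak_to_wav is a hypothetical helper name used only in this sketch.
#   $1 = text to speak
#   $2 = output file name
#   $3 = volume scale (optional, defaults to 50)
speak_to_wav() {
    text=$1
    out=$2
    scale=${3:-50}
    printf '%s\n' "$text" | text2wave -scale "$scale" -o "$out"
}
```

For example, speak_to_wav "Time for lunch" lunch.wav creates lunch.wav at scale 50, and a third argument overrides the scale.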

If you pipe in a large amount of text you'll notice the size of the WAV file can get very large. You can try one of the other audio formats text2wave supports, or you can use LAME to convert the WAV file to an MP3. Next we'll install LAME Ain't an MP3 Encoder.
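
To put "very large" into numbers, the size of an uncompressed WAV file can be estimated from its length. The sketch below rests on assumptions the tutorial does not state: 16-bit mono samples, a 16000 Hz sample rate, and the canonical 44-byte RIFF header. Check your own files if your voice or settings differ. The function name wav_bytes is made up for this example.

```shell
# Estimate the size in bytes of an uncompressed PCM WAV file.
# Assumptions (not from the tutorial): 16-bit mono samples (2 bytes each),
# a default sample rate of 16000 Hz, and a 44-byte RIFF header.
#   $1 = length of the audio in seconds
#   $2 = sample rate in Hz (optional, defaults to 16000)
wav_bytes() {
    seconds=$1
    rate=${2:-16000}
    echo $(( 44 + seconds * rate * 2 ))
}
```

Under these assumptions wav_bytes 60 prints 1920044, so every minute of speech costs a little under 2 MB.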


Installing LAME MP3 Encoder

To install LAME you first need to make sure you have development tools such as gcc installed on your system. If you don't have them, you can install them with yum (yum install gcc), which also pulls in their dependencies. Other packages not listed here may be needed as well, so you'll have to figure out which ones they are. LAME 3.97beta2 is provided as source (get it from SourceForge), and there may be RPMs available out there, but for this tutorial we'll compile the source to create our binaries. Make sure you're logged in as root and in your home directory, then download the source package using wget, extract the gzipped tarball, move into the extracted lame-3.97 directory, run the configure script, and run make and make install to build and install the binaries.

su -
[enter root password]
cd ~
wget http://easynews.dl.sourceforge.net/sourceforge/lame/lame-3.97b2.tar.gz
tar zxvf lame-3.97b2.tar.gz
cd lame-3.97
./configure --prefix=/usr
make
make install

If LAME compiled without problems then running it from the command line (lame) should give you this.

[graphical representation of executing 'lame']

With LAME installed now we can convert those big WAV files to smaller MP3 versions.


Converting WAV-RIFF Audio Files to MP3's

Using LAME is very straightforward. All you do is pass it the WAV file name and give it an MP3 output file name. I won't go into all the LAME options, but you can read up on them in the man page (man lame). Now we'll take our three examples from before and convert those WAV files to MP3.

lame beautiful_day.wav beautiful_day.mp3
lame date.wav date.mp3
lame rand.wav rand.mp3
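
Typing one lame command per file gets tedious once you have more than a handful. Here is a sketch of a loop that converts every WAV file in a directory, naming each MP3 after its WAV file. The function name convert_wavs is made up for this example, and it calls lame exactly as above, with an input name and an output name.

```shell
# Convert every .wav file in a directory to .mp3 with lame.
# convert_wavs is a hypothetical helper name used only in this sketch.
#   $1 = directory to scan (optional, defaults to the current directory)
convert_wavs() {
    dir=${1:-.}
    for wav in "$dir"/*.wav; do
        # with no matches the pattern stays literal, so skip it
        [ -e "$wav" ] || continue
        # strip the .wav suffix and add .mp3 for the output name
        lame "$wav" "${wav%.wav}.mp3"
    done
}
```

Running convert_wavs ~ would convert beautiful_day.wav, date.wav, and rand.wav in one go. The WAV files are left in place so you can still compare sizes.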

Now if we compare the WAV files to the MP3 versions you'll see the size difference and why MP3 would be the preferred format.

cd ~
ls -la *.wav
ls -la *.mp3

[graphical representation of executing 'cd ~', 'ls -la *.wav', 'ls -la *.mp3']
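
If you'd rather have a number than eyeball the ls output, the comparison can be reduced to a percentage. This is a sketch with made-up function names (file_bytes and size_percent); it relies only on wc -c to count bytes.

```shell
# Print the size of a file in bytes.
file_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Print the MP3 size as a whole-number percentage of the WAV size.
#   $1 = WAV file, $2 = MP3 file
size_percent() {
    wav_size=$(file_bytes "$1")
    mp3_size=$(file_bytes "$2")
    # refuse to divide by zero if the WAV file is empty
    [ "$wav_size" -gt 0 ] || return 1
    echo $(( mp3_size * 100 / wav_size ))
}
```

For example, size_percent date.wav date.mp3 prints how large the MP3 is relative to the WAV it came from.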

To tie together everything you've learned so far, we'll now create a PHP front-end so you can create text-to-speech files from your browser.


Building a PHP Front-End for Festival (text2wave)

To build a PHP front-end for text-to-speech processing, we'll create an HTML form that contains a textarea for the text you want converted to speech, a text input field for the volume scale, and a checkbox for whether to convert the WAV output to an MP3 file. I added a lot of comments so you can see what each part of the code does, so I won't explain it here in great detail. All you need to know is that you type some text in the textarea, adjust the volume if you want, and select whether the output is MP3 or WAV. Once you click the Text-To-Speech button, the form posts back to the same page and the form data is captured and processed. The textarea data is written to a temp file for text2wave to convert, and text2wave is executed using the PHP exec function. If the MP3 option is selected, a second exec converts the WAV to MP3 using LAME. When the page reloads there is a link to the audio file next to the submit button. You can click on it to listen, or right-click and choose 'Save Target As' to download it.

<?php
// define the temporary directory
// and where audio files will be written to after conversion
$tmpdir = "/tmp";
$audiodir = "/change/to/your/path";

// default values, so every variable the form reads is always defined
$speech = "Hello there!";
$volume_scale = 50;
$save_mp3 = true;
$show_audio = false;
$listen_file = "";
$file_type = "";

// if the Text-To-Speech button was clicked, process the data
if (isset($_POST["make_audio"])) {
  // stripslashes undoes magic quotes; remove it if magic_quotes_gpc is off
  $speech = isset($_POST["speech"]) ? stripslashes(trim($_POST["speech"])) : "";
  // limit the text to 1024 characters
  $speech = substr($speech, 0, 1024);
  // keep the volume scale between 1 and 100
  $volume_scale = isset($_POST["volume_scale"]) ? intval($_POST["volume_scale"]) : 0;
  if ($volume_scale <= 0) { $volume_scale = 1; }
  if ($volume_scale > 100) { $volume_scale = 100; }
  // an unchecked checkbox is not submitted at all, so test with isset first
  $save_mp3 = (isset($_POST["save_mp3"]) && intval($_POST["save_mp3"]) == 1);

  // continue only if some text was entered for conversion
  if ($speech != "") {
    // current date (year, month, day, 24-hour hours, mins, secs)
    $currentdate = date("ymdHis");
    // microtime() returns the string "usecs secs"; keep the digits after "0."
    list($usecs, $secs) = explode(" ", microtime());
    $usecs = substr($usecs, 2);
    // unique file name
    $filename = "{$currentdate}{$usecs}";
    // other file names
    $speech_file = "{$tmpdir}/{$filename}";
    $wave_file = "{$audiodir}/{$filename}.wav";
    $mp3_file  = "{$audiodir}/{$filename}.mp3";

    // open the temp file for writing
    $fh = fopen($speech_file, "w+");
    if ($fh) {
      fwrite($fh, $speech);
      fclose($fh);
    }

    // if the speech file exists, use text2wave
    if (file_exists($speech_file)) {
      // create the text2wave command (paths quoted for the shell) and execute it
      $text2wave_cmd = sprintf("text2wave -o %s -scale %d %s",
        escapeshellarg($wave_file), $volume_scale, escapeshellarg($speech_file));
      exec($text2wave_cmd);

      // create an MP3 version?
      if ($save_mp3) {
        // create the lame command and execute it
        $lame_cmd = sprintf("lame %s %s", escapeshellarg($wave_file), escapeshellarg($mp3_file));
        exec($lame_cmd);
        // delete the WAV file to conserve space, but only if the MP3 was created
        if (file_exists($mp3_file)) { unlink($wave_file); }
      }

      // delete the temp speech file
      unlink($speech_file);

      // which file name and type to use? WAV or MP3
      $listen_file = basename($save_mp3 ? $mp3_file : $wave_file);
      $file_type = ($save_mp3 ? "MP3" : "WAV");

      // show the audio file link only if the audio file was actually created
      $show_audio = file_exists("{$audiodir}/{$listen_file}");
    }
  }
}
?>
<html>
<head>
<title>Festival: Linux Text-To-Speech Demo</title>
<style type="text/css">
<!--
body { background-color:#ffffff; font-family:Arial, Helvetica, sans-serif; font-size:10pt; color: #000000; }
h1 { font-family:Arial, Helvetica, sans-serif; font-size:18pt; color: #000000; }
.tblfont { font-family:Arial, Helvetica, sans-serif; font-size:10pt; color: #000000; }
-->
</style>
</head>
<body>
<h1>Linux Festival Text-To-Speech Demo</h1>
<form method="post" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>">
  <table width="400" border="0" cellspacing="5" cellpadding="0" class="tblfont">
    <tr> 
      <td colspan="2"><textarea name="speech" wrap="VIRTUAL" style="width:350px;height:100px;"><?php echo htmlspecialchars($speech); ?></textarea></td>
    </tr>
    <tr> 
      <td width="135">Volume Scale 
        <input name="volume_scale" type="text" size="3" maxlength="3" value="<?php echo $volume_scale; ?>"> 
      </td>
      <td width="265">Save as MP3 
        <input name="save_mp3" type="checkbox" value="1"<?php if (!empty($save_mp3)) { echo " checked"; } ?>> 
      </td>
    </tr>
    <tr> 
      <td><input name="make_audio" type="submit" value="Text-To-Speech"></td>
      <td> 
        <?php if (!empty($show_audio)) { ?>
        <a href="audio/<?php echo $listen_file; ?>">Listen to the <?php echo $file_type; ?> file</a> 
        <?php } ?>
      </td>
    </tr>
  </table>
</form>
</body>
</html>
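
The flow the form handler follows (clamp the volume, write the text to a temp file, run text2wave on it, optionally run lame, clean up) can also be sketched as shell functions, which is handy for trying the pipeline without a web server. This is only a rough command-line equivalent, not a replacement for the PHP above. The names clamp_scale and make_speech are made up, mktemp stands in for the date-based file name, and unlike PHP's intval any non-numeric volume simply falls back to 1.

```shell
# Clamp a volume scale to the range 1-100, roughly as the PHP handler does.
# clamp_scale is a hypothetical helper name used only in this sketch.
clamp_scale() {
    scale=$1
    case $scale in
        ''|*[!0-9]*) scale=1 ;;   # empty or non-numeric input falls back to 1
    esac
    if [ "$scale" -lt 1 ]; then scale=1; fi
    if [ "$scale" -gt 100 ]; then scale=100; fi
    echo "$scale"
}

# Turn text into an audio file and print the name of the file created.
# make_speech is a hypothetical helper name used only in this sketch.
#   $1 = text, $2 = output name without extension,
#   $3 = volume scale (default 50), $4 = "yes" to convert to MP3 (default yes)
make_speech() {
    text=$1
    outbase=$2
    scale=$(clamp_scale "${3:-50}")
    want_mp3=${4:-yes}

    # write the text to a temp file for text2wave to read
    tmp=$(mktemp) || return 1
    printf '%s\n' "$text" > "$tmp"
    if ! text2wave -o "$outbase.wav" -scale "$scale" "$tmp" >/dev/null; then
        rm -f "$tmp"
        return 1
    fi
    rm -f "$tmp"

    if [ "$want_mp3" = yes ]; then
        # convert to MP3, then delete the WAV file to conserve space
        lame "$outbase.wav" "$outbase.mp3" >/dev/null || return 1
        rm -f "$outbase.wav"
        echo "$outbase.mp3"
    else
        echo "$outbase.wav"
    fi
}
```

For example, make_speech "Hello there!" /tmp/hello 50 yes should leave /tmp/hello.mp3 behind and print its name, while passing no as the last argument keeps the WAV file instead.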

PHP Text-To-Speech Demo

Here is the demo of the PHP front-end in action. Type in some text and click the Text-To-Speech button. When the page refreshes there will be a link next to the button. That's the text-to-speech synthesis Festival created. Have fun!

Festival Text-To-Speech Demo

The text-to-speech demo has been retired as of December 23, 2007. Sorry for the inconvenience.

Conclusion

There you have it. You now have the tools to create TTS audio files using Festival on Linux. You could build a variety of applications on text-to-speech synthesis, such as a reminder service that delivers audio over the phone or by email. Festival is a great tool, but sometimes it's hard to understand what the voice is saying. It has trouble with some words and phrases, so it might not be the ideal solution for a commercial venture. Check out AT&T Labs Natural Voices Text-to-Speech Engine and try out their demo. It's of commercial quality and sounds really good; the only problem is that it isn't free like Festival.



Copyright © 2004-2014 XenoCafe. All Rights Reserved. XenoCafe is Powered by Linux. Free your mind and your wallet. Switch to Linux.