Introduction

Installing PhoneBlogger is a little challenging due to the various moving parts, but hopefully you will find it worth the effort. PhoneBlogger is primarily a VoiceXML application, but complex VoiceXML apps with web integration often require a lot of coding in JavaScript or a CGI scripting language to accomplish the heavy lifting. Since I'm a big fan of Python and I had already written a Python CGI script for my SoccerPhone app, I decided to go with Python again. Porting the scripts to Perl or PHP probably wouldn't be that hard, if you're so inclined.

Prerequisites

You can host the VXML files, the CGI scripts, and your weblog all on the same server, or each on different servers. If you put the CGI scripts on a different server than your weblog, you will need to set the useNewMediaObject config setting to true.
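The name of the useNewMediaObject setting suggests the metaWeblog.newMediaObject method of the MetaWeblog API, which uploads a file to the weblog server over XML-RPC instead of copying it on a shared disk. That link is an assumption on my part, not something taken from the PhoneBlogger scripts. As a rough sketch in modern Python (where xmlrpclib is called xmlrpc.client), the request for such an upload could be built like this. The function name, blog ID, credentials, and file name are placeholders.

```python
import xmlrpc.client


def build_media_upload_request(blog_id, username, password, filename, mp3_bytes):
    """Return the XML-RPC request body for metaWeblog.newMediaObject."""
    media = {
        "name": filename,                          # path the weblog should store the file under
        "type": "audio/mpeg",                      # MIME type for an MP3 file
        "bits": xmlrpc.client.Binary(mp3_bytes),   # sent base64-encoded on the wire
    }
    return xmlrpc.client.dumps(
        (blog_id, username, password, media),
        methodname="metaWeblog.newMediaObject",
    )


# Placeholder values, for illustration only.
request_xml = build_media_upload_request(
    "1", "me", "12345", "200311/example.mp3", b"ID3fake"
)
```

Nothing is sent over the network here. The sketch only shows the shape of the call: blog ID, username, password, and a struct holding the file name, MIME type, and file contents.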

Weblog Tool

PhoneBlogger uses the MetaWeblog XML-RPC API to post to a weblog. I have tested it only with Movable Type and Blogger, but it shouldn't take too much work to get PhoneBlogger to post to another weblog tool that supports the MetaWeblog API or a similar API, like the Blogger API. I have already completed part of the implementation necessary to support many different blog tools via simple changes to an XML configuration file.
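To give an idea of what a MetaWeblog post looks like on the wire, here is a minimal sketch in modern Python, where the xmlrpclib module is called xmlrpc.client. It is not the code from PostAudioToBlog.py. The function name, blog ID, credentials, title, and link are made up for illustration.

```python
import xmlrpc.client


def build_new_post_request(blog_id, username, password, title, body_html, publish=True):
    """Return the XML-RPC request body for metaWeblog.newPost."""
    post = {
        "title": title,             # entry title
        "description": body_html,   # entry body; MetaWeblog calls this "description"
    }
    return xmlrpc.client.dumps(
        (blog_id, username, password, post, publish),
        methodname="metaWeblog.newPost",
    )


# Placeholder values, for illustration only.
request_xml = build_new_post_request(
    "1",
    "me",
    "12345",
    "Audio post",
    '<a href="http://example.com/blog/audio/200311/example.mp3">Listen</a>',
)
```

To actually send a post, you would create xmlrpc.client.ServerProxy(url) for your weblog's XML-RPC endpoint and call metaWeblog.newPost on it with the same five arguments. The call returns the ID of the new entry.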

VoiceXML Server Developer Account

Since PhoneBlogger is a VoiceXML 2.0 app, you need access to a voice server running a VoiceXML 2.0 browser. These voice servers include the speech recognition engine, text-to-speech engine, JavaScript engine, etc. that PhoneBlogger relies on. I originally developed PhoneBlogger in late 2002 using a free hosted account at TellMe. While I plan to test the VoiceXML code with other services, like BeVocal, Voxeo, and HeyAnita, I haven't had time to do that yet. If you want to take the quickest and easiest route, get an account with TellMe. If you want to be a trailblazer and help debug the code on another service, let me know and I will be glad to help you with it.

Signing up for a developer account at TellMe is very easy. The biggest challenge is coming up with a five-digit extension that is meaningful to you but hasn't already been taken by another developer.

Python CGI Scripts, SoX, and LAME

Python

PhoneBlogger uses several Python CGI scripts to copy files, convert audio formats, and post to your weblog. The CGI scripts can run on a web server other than where your blog is located. Although it will run slower, this could allow one person to host the complicated parts of PhoneBlogger for several other bloggers.

I started the development of the Python scripts on local installations of Python 2.2 and Apache 2.0 on both RedHat 8.0 Linux and on Windows XP. I then switched to a Python 2.1.3 install on Debian GNU/Linux on the server at DreamHost that hosts my website and blog. Python 1.5.2 may be sufficient, but try to get access to at least a Python 2.1 installation. If your website host doesn't have 2.1 or newer installed, you may be able to compile it and install it yourself. Email me if you run into trouble and I may be able to help you out.

If you use a version of Python older than 2.2, you will need to get a copy of the xmlrpclib library that ships with Python 2.2.

  1. Go to the xmlrpc lib download site and scroll down to the bottom of the page
  2. Download xmlrpclib-1.0.1.zip
  3. Unpack the zipfile and copy the file xmlrpclib.py to the same directory where you placed the PhoneBlogger Python CGI scripts

SoX

One of the Python CGI scripts (AudioUp.py) spawns a process to call SoX to convert the u-law WAV file generated by the TellMe voice server to an uncompressed WAV file. This step is needed because LAME does not support u-law WAV files. If your server does not have SoX already installed, you will need to install it. There are instructions below for Linux.

LAME

One of the Python CGI scripts (AudioUp.py) spawns a process to call LAME to convert the WAV file generated by SoX to a 96 kbps MP3 file. For a phone call on the PSTN, 96 kbps is more than sufficient. If your server does not have LAME already installed, you will need to install it. There are instructions below for Linux.
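The two conversion steps can be sketched roughly as follows. This is not the code from AudioUp.py, and the function names are mine. The SoX options shown assume a SoX 14.x command line; older releases such as the 12.17.x series spell these options differently, so check the man page for your version. The LAME -b option sets the bitrate in kbps.

```python
import subprocess


def sox_command(sox_path, ulaw_wav, pcm_wav):
    """Build the SoX command that rewrites a u-law WAV as 16-bit signed PCM."""
    # Assumption: SoX 14.x option names. Older releases use other spellings.
    return [sox_path, ulaw_wav, "-e", "signed-integer", "-b", "16", pcm_wav]


def lame_command(lame_path, pcm_wav, mp3_path, kbps=96):
    """Build the LAME command that encodes a PCM WAV as a constant-bitrate MP3."""
    return [lame_path, "-b", str(kbps), pcm_wav, mp3_path]


def convert(sox_path, lame_path, ulaw_wav, pcm_wav, mp3_path):
    """Run both steps in order, raising CalledProcessError if either one fails."""
    subprocess.run(sox_command(sox_path, ulaw_wav, pcm_wav), check=True)
    subprocess.run(lame_command(lame_path, pcm_wav, mp3_path), check=True)
```

Passing the arguments as a list rather than a single shell string means that file names containing spaces or shell characters are passed through untouched.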

Installing PhoneBlogger

First, download the files from the PhoneBlogger project site on SourceForge. Then, using an FTP client, ssh, or telnet, create a directory structure like the following on your server:

/home/your_username/bin <-- SoX and LAME
/home/your_username/downloads <-- download and compile area for SoX and LAME
/home/your_username/example.com/cgi-bin  <-- Python scripts
/home/your_username/example.com/blog <-- Doc root for blog
/home/your_username/example.com/blog/audio <-- MP3 files
/home/your_username/example.com/phoneblogger <-- VXML files, grammar files, etc.
/home/your_username/example.com/phoneblogger/test <-- HTML test harnesses

This is just the directory structure I used. You could use a very different structure with different directory names (e.g., using the root of your website instead of a sub-directory called blog, storing audio files in a directory called mp3 at the same level as the blog directory, etc.). Just make sure you adjust the paths correctly in the Blogs.xml config file.
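If you have shell access and Python on the server, the layout above can also be created with a short script. This is only a convenience sketch. The function name is mine, and the demo at the bottom uses a temporary directory as a stand-in for /home/your_username.

```python
import tempfile
from pathlib import Path

# Leaf directories from the listing above; parents are created automatically.
LAYOUT = [
    "bin",
    "downloads",
    "example.com/cgi-bin",
    "example.com/blog/audio",
    "example.com/phoneblogger/test",
]


def create_layout(home):
    """Create the directory layout under home and return the created paths."""
    created = []
    for relative in LAYOUT:
        path = Path(home) / relative
        path.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(path)
    return created


# Demo in a throwaway directory instead of a real home directory.
with tempfile.TemporaryDirectory() as home:
    created = create_layout(home)
    all_exist = all(path.is_dir() for path in created)
    blog_exists = (Path(home) / "example.com" / "blog").is_dir()
```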

Python CGI Script Setup

  1. Copy the two Python files to the cgi-bin directory
  2. Make the files executable. From a shell via telnet or ssh, that means:
    1. chmod 755 AudioUp.py
    2. chmod 755 PostAudioToBlog.py
  3. Update the first line of each Python script, if necessary, to point to the python binary on your system
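If you would rather check these steps from Python than by hand, something like the following sketch would do it. The helper names are mine, and the demo uses a throwaway script rather than the real CGI files. os.chmod with mode 0o755 is the equivalent of chmod 755.

```python
import os
import stat
import tempfile


def make_executable(path):
    """Equivalent of `chmod 755 path`."""
    os.chmod(path, 0o755)


def shebang_interpreter(path):
    """Return the interpreter named on the script's first line, or None."""
    with open(path, "rb") as script:
        first_line = script.readline().decode("utf-8", "replace").strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None


# Demo on a throwaway script.
with tempfile.TemporaryDirectory() as cgi_bin:
    script_path = os.path.join(cgi_bin, "Example.py")
    with open(script_path, "w") as script:
        script.write("#!/usr/bin/python\nprint('hello')\n")
    make_executable(script_path)
    mode = stat.S_IMODE(os.stat(script_path).st_mode)
    interpreter = shebang_interpreter(script_path)
    interpreter_found = os.path.exists(interpreter)  # False means: fix the first line
```

If interpreter_found is False for one of your scripts, the first line points at a python binary that does not exist on that machine.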

SoX Setup on Linux

  1. Download sox-12.17.3.tar.gz or newer from the SoX project site
  2. Copy/FTP it to the downloads directory (specified above) on your server.
  3. Unpack the archive, change into the directory it creates, and enter the following three commands at a shell prompt:
    1. ./configure --prefix=/home/your_username/bin/sox
    2. make
    3. make install

You should then have a SoX binary in the bin/sox/bin directory.

LAME Setup on Linux

If your server does not have LAME installed, you will need to install it.

  1. Download lame-3.93.1.tar.gz or newer from the LAME project site
  2. Copy/FTP it to the downloads directory (specified above) on your server
  3. Unpack the archive, change into the directory it creates, and enter the following three commands at a shell prompt:
    1. ./configure --prefix=/home/your_username/bin/lame
    2. make
    3. make install

You should then have a LAME binary in the bin/lame/bin directory.
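A quick way to confirm that both builds landed where the scripts will look for them is to test for the two binaries. This sketch assumes the --prefix values used above, so the binaries should be bin/sox/bin/sox and bin/lame/bin/lame under your home directory. The function name is mine, and the demo fakes a home directory where only SoX has been installed.

```python
import os
import tempfile


def missing_tools(home):
    """Return the names of the tools whose binary is absent or not executable."""
    expected = {
        "sox": os.path.join(home, "bin", "sox", "bin", "sox"),
        "lame": os.path.join(home, "bin", "lame", "bin", "lame"),
    }
    return [
        name
        for name, path in expected.items()
        if not (os.path.isfile(path) and os.access(path, os.X_OK))
    ]


# Demo: a fake home directory that contains a SoX binary but no LAME binary.
with tempfile.TemporaryDirectory() as home:
    sox_bin_dir = os.path.join(home, "bin", "sox", "bin")
    os.makedirs(sox_bin_dir)
    fake_sox = os.path.join(sox_bin_dir, "sox")
    with open(fake_sox, "w") as binary:
        binary.write("#!/bin/sh\n")
    os.chmod(fake_sox, 0o755)
    still_missing = missing_tools(home)
```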

VoiceXML Setup

  1. Copy the vxml, gsl, js, and xml files to the phoneblogger directory
  2. Create a sub-directory called test and copy the HTML files into it.

The HTML files contain test harnesses that are described below.

XML Setup

In the phoneblogger directory, open the Blogs.xml file, make a copy of one of the existing blog entries, and edit the copy to match your configuration. You can test whether the Blogs.xml file is formatted correctly by loading phoneblogger/test/testBlogsConfig.html in a web browser. If you still use IE as a web browser, you will need to use phoneblogger/test/testBlogsConfigIE.html instead. This test page also makes sure that the JavaScript code in the phoneblogger directory was not corrupted.
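Before loading the test page, you can also catch simple XML mistakes (an unclosed tag, a stray ampersand) from Python. This sketch only checks that a file is well-formed XML. It knows nothing about which settings PhoneBlogger expects, and the sample documents below are made up and are not the real Blogs.xml format.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, "") if xml_text parses, else (False, parser error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return False, str(error)
    return True, ""


# Hypothetical documents, for illustration only.
good_ok, good_message = check_well_formed("<blogs><blog name='My Blog'/></blogs>")
bad_ok, bad_message = check_well_formed("<blogs><blog name='My Blog'></blogs>")
```

For a real file, read it first, e.g. with open("Blogs.xml", encoding="utf-8").read(), and pass the text to check_well_formed. The error message includes the line and column where the parser gave up.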

Grammar File Setup

PhoneBlogger currently uses Nuance GSL formatted grammar files.

In BlogNames.gsl, you need to provide the name of your blog and the word or words you will say for the speech recognition engine to detect. If you are setting up PhoneBlogger for use with multiple blogs, you need to add an entry for each blog.

In this sample BlogNames.gsl entry:

[dtmf-1 mine (my blog)] {<blogName "My Blog">}

the part in [square brackets] is what you will say or do, and the part in {curly braces} specifies the name of your blog. In this example, I can press the 1 key on my phone keypad, say "mine", or say "my blog". Think of the words in square brackets as a phonetic spelling of what you will say. Although the speech recognition engine used by TellMe is quite good, you may need to spell words phonetically if they are unusual or not found in a dictionary. Be sure to use only lowercase. FYI, DTMF stands for dual-tone multi-frequency, the tones generated by the keys on a phone keypad.

In grammars, use parentheses to group words that must be said together and in order. In the above grammar, the caller can't just say "my" to generate a match with "My Blog". Saying "blog my" won't work, either. The caller has to say "my blog". When items are alternatives, group them in square brackets. In Boolean logic terms, items in parentheses are ANDed and items in square brackets are ORed.
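The AND/OR rule can be modelled in a few lines of Python. This is only an illustration of the matching logic for the sample entry, written by hand as data. It does not parse GSL, and it is not how the speech recognition engine works internally.

```python
# Hand-written model of: [dtmf-1 mine (my blog)] {<blogName "My Blog">}
# Each tuple is one alternative (OR); the words inside a tuple must all be
# present, in that order (AND).
ALTERNATIVES = [
    ("dtmf-1",),       # pressing the 1 key
    ("mine",),         # saying "mine"
    ("my", "blog"),    # saying "my blog"
]


def matches(utterance, alternatives=ALTERNATIVES):
    """Return True if the utterance is exactly one of the alternatives."""
    words = tuple(utterance.split())
    return any(words == alternative for alternative in alternatives)


results = {
    "mine": matches("mine"),
    "my blog": matches("my blog"),
    "my": matches("my"),
    "blog my": matches("blog my"),
}
```

Only "mine" and "my blog" come out True, which mirrors the behavior described above: a partial phrase or the right words in the wrong order does not match.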

In BlogUserNames.gsl, you will need an entry for each blog user. If you use the same username on two blogs, you will only need one entry.

In this sample BlogUserNames.gsl entry:

[dtmf-1 (mister blogging person) me] {<userName "me">}

In this example, I can press the 1 key on my phone keypad, say "me" or say "mister blogging person". Once again, you may need to adjust the spelling of the text in parentheses in order for the speech recognition engine to get the right match. Be sure to use only lowercase.

The text in quotes to the right needs to exactly match your username.
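The two rules for an entry (lowercase on the left, exact username on the right) are easy to get wrong, so here is a small checker as a sketch. It only understands single-line entries shaped like the samples in this section, not GSL in general. The function name and messages are mine.

```python
import re

# Matches one-line entries shaped like: [spoken part] {<slotName "value">}
ENTRY_RE = re.compile(
    r'^\s*\[(?P<spoken>[^\]]*)\]\s*\{<(?P<slot>\w+)\s+"(?P<value>[^"]*)">\}\s*$'
)


def check_entry(entry, expected_value):
    """Return a list of problems found in one grammar entry (empty if none)."""
    match = ENTRY_RE.match(entry)
    if match is None:
        return ["entry does not look like [words] {<slot \"value\">}"]
    problems = []
    if match.group("spoken") != match.group("spoken").lower():
        problems.append("the part in square brackets must be lowercase")
    if match.group("value") != expected_value:
        problems.append("the quoted value does not exactly match " + repr(expected_value))
    return problems


sample_problems = check_entry('[dtmf-1 (mister blogging person) me] {<userName "me">}', "me")
uppercase_problems = check_entry('[dtmf-1 Me] {<userName "me">}', "me")
mismatch_problems = check_entry('[dtmf-1 me] {<userName "Me">}', "me")
```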

TellMe Setup

After you log in to studio.tellme.com, click on MyStudio in the left menu. Be sure the Application URL tab is selected to the right. In the edit box under the label Application URL, enter the URL for the main.vxml file. For my site, the URL is: http://www.wombatnation.com/phoneblogger/main.vxml

Assuming you have already installed main.vxml on your server, click Update. After 5-10 seconds, you should see:

Your change was successful. The syntax checker was run. (No errors detected in your VoiceXML.)

Final Configuration

After installing the files, you need to configure a few things:

Configuration Testing

There are two HTML files that allow you to test the functioning of the Python scripts independently of the VXML files. The files are in the phoneblogger/test/ directory and their names correspond to the Python files they test.

First, load phoneblogger/test/testAudioUp.html in a browser. You will need to browse to or type in the path to an 8 kHz, 8-bit mono u-law WAV file. You can use any of the pre-recorded prompts that come with PhoneBlogger for this purpose. The other variables on the form correspond to entries in Blogs.xml.

After you click submit, the AudioUp.py script will be run exactly as it is run by the VoiceXML application. If it works, you will find a new MP3 file on the server. The file will be under the audio directory you specified on this HTML form, in a sub-directory named in the form YYYYMM, e.g., 200311 for November 2003.
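The YYYYMM naming is what strftime("%Y%m") produces, so you can work out where to look for a file recorded on a given date. This is a sketch with function names of my own, using the example audio directory from the layout above. It is not taken from AudioUp.py.

```python
import datetime
import posixpath


def audio_subdir(when):
    """Return the YYYYMM sub-directory name for a date."""
    return when.strftime("%Y%m")


def mp3_path(audio_dir, filename, when):
    """Return the server path where an MP3 recorded on `when` should appear."""
    return posixpath.join(audio_dir, audio_subdir(when), filename)


november = audio_subdir(datetime.date(2003, 11, 15))
example_path = mp3_path(
    "/home/your_username/example.com/blog/audio",
    "FILENAME.mp3",
    datetime.date(2003, 11, 15),
)
```

Here november is "200311" and example_path ends in audio/200311/FILENAME.mp3.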

Once you get AudioUp.py working, load phoneblogger/test/testPostAudioToBlog.html. Most of the variables on the HTML form correspond to entries in Blogs.xml. You can use the MP3 file that was created above with this form. Copy just the date part of the directory and the filename into the edit box, e.g., YYYYMM/FILENAME.mp3.

After you click submit, the PostAudioToBlog.py script will be run exactly as it is run by the VoiceXML application. If it succeeds, you will have a new entry on your blog.

Troubleshooting

The Python CGI scripts write logfiles to the directory they are in. AudioUp.log will tell you the filename used to save the MP3 file. The first line logged for each run is the date and time. If you don't see a date entry, it probably means the scripts were not found or could not be run. Check to make sure you set the permissions properly (see notes on chmod above).

If the SoX and LAME conversions worked, you should see:

Converted to uncompressed wav
Converted to mp3

If the conversions worked, listen to the MP3 file to make sure it sounds okay. If the audio is fine but no entry appears on your blog, check PostAudioToBlog.log for clues as to where something might have gone wrong. If you see lines that say "XML-RPC Fault: ServerInvalid login", you might have spelled your username wrong in BlogUserNames.gsl or you may have entered your password incorrectly. Your password on your blog must currently consist only of digits, so that it can be entered on a phone keypad.
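These checks can be automated with a few string tests. The sketch below looks only for the messages quoted in this section. The function names and the sample log text are mine, and a real logfile will contain other lines as well.

```python
EXPECTED_STEPS = ["Converted to uncompressed wav", "Converted to mp3"]


def missing_conversion_steps(audio_log_text):
    """Return the conversion messages that do not appear in AudioUp.log."""
    return [step for step in EXPECTED_STEPS if step not in audio_log_text]


def has_login_fault(post_log_text):
    """True if PostAudioToBlog.log contains the invalid-login fault."""
    return "XML-RPC Fault" in post_log_text and "Invalid login" in post_log_text


def is_keypad_password(password):
    """True if the password can be typed on a phone keypad (digits only)."""
    return password != "" and all(char in "0123456789" for char in password)


# Made-up log excerpts, for illustration only.
sox_only = missing_conversion_steps("Converted to uncompressed wav\n")
both_done = missing_conversion_steps("Converted to uncompressed wav\nConverted to mp3\n")
login_failed = has_login_fault("XML-RPC Fault: ServerInvalid login\n")
```

In the first excerpt only the LAME step is missing, which would point at the LAME install or its path rather than at SoX.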

If you can't sort out the logfiles, you are welcome to email them to me and I will try to help you out.
