Archive for November, 2004

11/30/2004: 12:34 am: Robert: Speech

The Baltimore Sun published an excellent, detailed article on expressive speech synthesis [requires free registration] that focuses mostly on work going on at IBM. The resulting speech synthesizer would not only be able to laugh, cough, pause for a breath, and say uh and um, but would also be able to smoothly switch between an uptone voice and a downtone or neutral voice when appropriate.

The article also discusses speech recognition research aimed at detecting when a speaker is frustrated or angry. One of the claimed values of this capability would be in more quickly detecting when a caller should be transferred to a live agent. I don’t think this is a huge step forward, though, since most speech applications can already handle this pretty well. Most callers know that if they press zero or say “agent”, they will get transferred to an agent relatively quickly, depending on how the speech application was programmed. One catch is that the other grammars need to be designed to avoid collisions with words like “agent” and “operator” as much as possible.

Also, some speech applications are designed to completely automate a service. If the company providing the service would lose a lot of money by allowing transfers to live agents, they might decide not to offer that feature. While that might sound short-sighted, there are definitely some scenarios where this makes a lot of sense.

Not only does the article provide a concise description of concatenative speech synthesis, but it also includes an interesting update on laughter research. Recent studies have shown that less than 15% of laughter is in response to intentional jokes. You’ll have to read the article to find out why people do laugh, and what the heck a laugh note is.

[via ACM News Service]

11/28/2004: 1:45 pm: Robert: Software, Speech, VoiceXML

My main experience with XML-based programming languages has been with VoiceXML. One nice advantage of an XML-based language is that the syntax checker is essentially free, assuming the language provides a DTD, or preferably a Schema. Of course, most language compilers and interpreters also come with a syntax checker, so the DTD/Schema advantage is primarily a time saver for the language creator.

The Ant build program also uses an XML-based programming language. I’ve written quite a few Ant build files by hand, and the experience has brought little joy. However, writing old school make files was even more frustrating. The many problems with the use of XML with Ant have been noted by Bruce Eckel, Martin Fowler, and Patrick Logan. Patrick points out YAML as a possible compromise.

VoiceXML has the further complication that good VoiceXML applications tend to be a lot more dynamic than build files. Since the goal of a VoiceXML application is to create a natural sounding, automated voice dialog, you generally want to customize the dialog for each caller and to slightly vary the spoken dialog on each call so as to avoid a completely robotic, scripted effect. While you can accomplish this to some degree with static VoiceXML files, it’s far easier if you dynamically generate some or all of the VoiceXML at runtime. With an Ant build file, I think it would be rare that you would want to dynamically generate the build script every time you run it.
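
To make the idea of runtime generation concrete, here is a minimal sketch of a shell function that prints a tiny VoiceXML 2.0 document with a greeting that changes from call to call. The function name, the three greetings, and the idea of passing in a per-caller call count are all assumptions made up for this illustration; a real application would sit behind a web or application server and would need to escape the caller name for XML.

```shell
#!/bin/sh
# Hypothetical helper: print a small VoiceXML 2.0 document for one call.
#   $1 = caller name (assumed to contain no XML special characters)
#   $2 = number of previous calls by this caller (assumed to come from
#        your own records)
render_greeting_vxml() {
    caller_name=$1
    call_count=$2

    # Rotate through three greetings so repeat callers don't always
    # hear exactly the same script.
    case $(( call_count % 3 )) in
        0) greeting="Welcome, $caller_name." ;;
        1) greeting="Hi $caller_name, good to hear from you again." ;;
        *) greeting="Hello again, $caller_name." ;;
    esac

    # Emit the document; the heredoc expands $greeting.
    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="greeting">
    <block>
      <prompt>$greeting</prompt>
    </block>
  </form>
</vxml>
EOF
}
```

Calling render_greeting_vxml Robert 1 prints a document whose prompt uses the second greeting. Even this toy version shows why static files get limiting once the dialog depends on who is calling.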

Manual creation and editing of Ant files and VXML files is frustrating and limiting. The resulting files seem overly verbose. Because they are so verbose, it’s difficult to keep large sections of the code in your head at one time.

Because manual VoiceXML coding is so difficult (hand-coding SALT-based applications is even worse), a lot of vendors have developed SDKs or graphical IDEs that mostly or completely hide the VoiceXML code from the developer. The good news is that this opens up speech application development to a lot more people. For the advanced developer, it also makes it easier to create large, dynamic applications. Of course, there are always downsides.

In my opinion, the biggest disadvantage is that most of these SDKs and IDEs throw portability out the window. While VXML code is relatively portable between different VXML browser implementations, the IDEs typically don’t generate VXML directly. Instead, they generate an intermediary form, usually consisting of POJOs (Plain Old Java Objects), servlets, JSPs, and/or XML files with custom tags. The intermediary form must then be processed by a server runtime layer that sits on top of a web or application server.

Nonetheless, there are several very good reasons that the SDK and IDE vendors have gone down this path:

  • Markup languages like VoiceXML and SALT are too low level for large, sophisticated applications.
  • Many developers, especially those in IT groups, prefer drag-and-drop graphical IDEs.
  • Mapping of VoiceXML tags directly to graphical forms would provide only minimal abstraction for a new developer.
  • The abstraction from VXML means a tool could theoretically use a single dialog design to dynamically generate applications in multiple markup languages, like SALT, XHTML, WML, XHTML+Voice, or a chatbot markup language for a single multi-modal interaction.

The VoiceXML 3 standard is intended to close some of this gap, but I haven’t been involved enough in the process so far to comment on how successful the Voice Browser Working Group will be.

The closest thing I’ve seen to this situation in the Java world is BEA WebLogic Designer. WebLogic Designer provides a significantly higher level abstraction above not only web services, but also web application and database integration development. The goal was to bring the good parts of Visual Basic to the corporate Java developer. The downside is that WebLogic Designer generates code that will run only with WebLogic runtime components. Therefore, the ease of use advantage costs you portability. Nonetheless, WebLogic Designer can provide a huge productivity advantage to many developers.

11/27/2004: 5:24 pm: Robert: Linux

I previously posted on getting the Unreal Tournament 2004 demo working on Fedora Core 1. The biggest problem was getting the right NVIDIA video card drivers installed. Now that I’ve upgraded to Fedora Core 2 on my desktop, I have given up on the Livna NVIDIA drivers and am now using NVIDIA’s proprietary drivers.

The first thing to try is running glxgears from a shell prompt. Every five seconds it will print out the number of frames per second that are being drawn to the screen. Press the Escape key when you’ve seen enough.

$ glxgears
9751 frames in 5.0 seconds = 1950.200 FPS
10738 frames in 5.0 seconds = 2147.600 FPS
10744 frames in 5.0 seconds = 2148.800 FPS
10740 frames in 5.0 seconds = 2148.000 FPS

I get about 2,000 frames per second on a 1.8 GHz P4 with an NVIDIA GeForce Ti200 video card. If you get a number in the hundreds, then you’re probably just using the generic frame buffer video driver. Forget about doing anything that requires 3D graphics until you install the proper driver.
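
If you would rather not eyeball the numbers, here is a rough sketch of a shell function that reads saved glxgears output and averages the FPS figures. The function name and the 1000 FPS cutoff are assumptions for this example; the cutoff is just a stand-in for "a number in the hundreds" and may need adjusting for other hardware.

```shell
# Read glxgears output on stdin and report whether the average FPS
# suggests hardware acceleration.
# Assumption: anything under 1000 FPS is treated as unaccelerated.
check_glx_fps() {
    awk '
        # Lines look like: "9751 frames in 5.0 seconds = 1950.200 FPS",
        # so the FPS value is the next-to-last field.
        /FPS/ { total += $(NF - 1); n++ }
        END {
            if (n == 0) { print "no FPS lines found"; exit 2 }
            avg = total / n
            if (avg < 1000) {
                printf "average %.0f FPS: probably the generic driver\n", avg
                exit 1
            }
            printf "average %.0f FPS: looks hardware accelerated\n", avg
        }
    '
}
```

One way to use it is to paste a few lines of glxgears output into a file and run check_glx_fps < gears.log. The exit status is 0 when the average clears the cutoff, so it can also be used in a script.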

The easiest way I have found to install the driver is to use the one on NVIDIA’s website. Select the appropriate items (for me, this was Graphics Driver -> GeForce and TNT2 -> Linux IA32), and then click Go! Download the driver file (it ends in .run).

Before going any further, make sure you have downloaded the source code for the kernel you are running. FC2 will do this by default, but FC3 doesn’t. You can use the command uname -r from a shell prompt to find out what kernel version you are running.
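
A quick way to check for the kernel source is sketched below. It assumes the common convention that /lib/modules/<version>/build points at the source (or headers) for that kernel; if your distribution puts the source somewhere else, the check will give a false negative. Both function names are made up for this example.

```shell
# Print the directory where module builds usually look for kernel sources.
# Assumption: the conventional /lib/modules/<version>/build location is used.
# $1 = kernel version; defaults to the running kernel from uname -r.
kernel_build_dir() {
    version=${1:-$(uname -r)}
    echo "/lib/modules/$version/build"
}

# Succeed only if that directory exists for the given (or running) kernel.
have_kernel_source() {
    [ -d "$(kernel_build_dir "$1")" ]
}
```

Running have_kernel_source || echo "install the kernel source first" before starting the driver install catches the missing-source case early.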

One catch about installing the driver is that you can’t be running the X Window System during the install. In my opinion, the best way to do this is to edit /etc/inittab so that X is not automatically started. If anything goes wrong during the install, it’s easier to fix the problem if X isn’t trying to automatically start. In the inittab file, change the line that looks like:

id:5:initdefault:

to

id:3:initdefault:
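
If you prefer not to open an editor, the same change can be sketched with sed. The function below is a hypothetical helper, not part of any tool; it only prints the edited file, so nothing is changed until you review the output and copy it into place as root.

```shell
# Print a copy of an inittab-style file with the default runlevel
# changed from 5 (graphical login) to 3 (text console).
# $1 = path to the file; defaults to /etc/inittab.
set_text_runlevel() {
    sed 's/^id:5:initdefault:/id:3:initdefault:/' "${1:-/etc/inittab}"
}
```

For example, set_text_runlevel > /tmp/inittab.new writes the edited copy to a scratch file that you can compare against the original before replacing it.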

Then reboot. When you get to a command line login prompt, log in as a regular user. Then su to root and cd to the location of the .run file that you downloaded and run it. Assuming you saved the file to a downloads directory in your home directory, this would look something like:

$ su
Password:
# cd downloads
# sh NVIDIA-Linux-x86-1.0-6629-pkg1.run

Go through the install and answer yes when asked if the installer should build a new kernel module. If you haven’t already installed the source RPM for your kernel, this will not succeed.

After the install completes, you need to edit some X11 config files. If you are running FC2 successfully, you have presumably switched over to using xorg instead of xfree86. Edit /etc/X11/xorg.conf to comment out the line that says dri in the “Module” section. Also, in the “Device” section, change the driver from “nv” to “nvidia”.
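
The two xorg.conf edits described above can be sketched as a pair of sed substitutions. This is a hypothetical helper that assumes the lines look like Load "dri" and Driver "nv"; it prints the edited file rather than changing it, so you can check the result before copying it over the original.

```shell
# Print a copy of xorg.conf with two edits applied:
#   1. comment out the Load "dri" line in the "Module" section
#   2. switch the device driver from "nv" to "nvidia"
# $1 = path to the file; defaults to /etc/X11/xorg.conf.
use_nvidia_driver() {
    sed -e 's/^\([[:space:]]*\)\(Load[[:space:]]*"dri"\)/\1#\2/' \
        -e 's/\(Driver[[:space:]]*\)"nv"/\1"nvidia"/' \
        "${1:-/etc/X11/xorg.conf}"
}
```

Lines that are already commented out, or that already say "nvidia", are left alone, so running it on an already-edited file is harmless.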

After saving your changes to xorg.conf, press ctrl+d to log out as root and return to the session as a regular user. Start X by typing startx at a shell prompt. As X is loading, you should very briefly see an NVIDIA logo on a white background.

After rebooting a couple times and convincing yourself that all is well in video driver land, you might want to change /etc/inittab back to always starting X. But, then again, maybe not. Whenever you start using a new kernel, you will have to update the NVIDIA driver. The good news is that you don’t have to download a new file from the NVIDIA website. Instead, you just need to drop out of X, uninstall the driver, and then reinstall it. When you go through this process, it will trigger the building of an interface for the new kernel.

# nvidia-installer --uninstall
# nvidia-installer --update --accept-license

Once you get the driver installed, run glxgears again to make sure it really is working. Then, play Onslaught in UT 2004. Repeatedly.

11/19/2004: 12:42 am: Robert: Software

If you’re a fan of O’Reilly (the book publisher, not the obnoxious lout on Fox who is discounted by nearly everyone), you should check out the big sale on O’Reilly books at BookPool. They have over 500 O’Reilly published computer books on sale for 43% off. Though I already have more books than I could read, even if I did nothing but that for the next year straight, I couldn’t resist picking up five more books.

One of them wasn’t an O’Reilly book, but it was still well discounted. It was The Java Developer’s Guide to Eclipse by Jim D’anjou, et al. This was a timely purchase, as I had just gone to an excellent presentation on Eclipse by Jim at a Java SIG meeting in Oakland last night. Jim works at IBM and is heavily involved in the Eclipse project.

11/16/2004: 10:03 am: Robert: Software

Get Firefox!

It has come to my attention through my website reports that some of you are still using Internet Explorer. Are you mad? Good god, my dear person, won’t someone think of the innocent computers, i.e., mine, that are subject to spam and hacking zombie attacks caused by people who use IE and have their computers taken over by dirtbags who have written viruses/worms/spyware that take advantage of one, two, or a hat trick of the legions of vulnerabilities in IE. Every time you use IE, a Teletubby dies.

So, cut it out with the IE junk. Please download and use Firefox. Firefox has been a great web browser for many, many months, and now the 1.0 release is out and is extremely stable. I swear you won’t regret switching. Maybe you don’t think IE is a piece of junk, but just wait until you’ve tried Firefox. You will quickly realize what a lame web browser IE really is. Could the IE bookmark manager possibly be any worse designed? I don’t think so.

If you use Windows, you do need to use IE for Windows Update and for a handful of other non-standard websites out there. The vast majority work as well or better in Firefox, though. And once you’ve tried tabbed browsing, you can’t go back.

One of the best things about Firefox is the number of cool extensions you can get for it. Foxytunes, for instance, is awesome. It lets you control your music player through a toolbar in the Firefox statusbar. Foxytunes works with a large number of music players on OS X, Windows, and Linux. I’ve got about another 15 cool extensions installed. One of these days, I’ll post a list.

Get Firefox!

11/13/2004: 7:55 pm: Robert: Linux

It seems like only a few months ago (August 22nd, to be exact) that I upgraded my Linux install from Fedora Core 1 to Core 2. Now, Fedora Core 3 is already out. For now, I’m going to leave FC2 on the desktop PC, but I will put FC3 on my new laptop.

Anyway, here’s a quick rundown of the problems I ran into with FC2, and how I fixed them.

Error activating XKB configuration

This was a well documented issue that came up because of the switch from xfree86 to xorg for the X11 implementation. The fix is to edit the /etc/X11/XF86Config file. Change the line:

Option "XkbRules" "xfree86"

to

Option "XkbRules" "xorg"

Yum Failed Due to Wrong Repositories

Some of the URLs in my yum.conf file were hard-coded for FC1. During the upgrade, the file /etc/yum.rpmnew was created. After backing up my FC1 yum.conf and renaming yum.rpmnew to yum.conf, I was back in business.
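
The backup-and-rename step can be sketched as a small function. The file names follow the description above; the function name and the backup name yum.conf.fc1 are made up for this example, and the directory is a parameter so you can try it on a scratch copy before running it as root against /etc.

```shell
# Back up the old yum.conf and put the file left by the upgrade in its place.
# $1 = directory holding the files; defaults to /etc.
# The backup name yum.conf.fc1 is a made-up convention for this sketch.
swap_yum_conf() {
    dir=${1:-/etc}
    # Refuse to do anything if the upgrade did not leave a new file behind.
    [ -f "$dir/yum.rpmnew" ] || { echo "no $dir/yum.rpmnew found" >&2; return 1; }
    # Only replace yum.conf if the backup copy succeeded.
    cp -p "$dir/yum.conf" "$dir/yum.conf.fc1" &&
        mv "$dir/yum.rpmnew" "$dir/yum.conf"
}
```

Because the move only happens after the copy succeeds, a failed backup leaves the original yum.conf untouched.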

Failed NVIDIA Dependencies

I was using the NVIDIA loadable kernel modules from Livna.org with FC1. I couldn’t get that trick to work with FC2. It turns out that the kernel provided with FC2 uses 4k stacks instead of 8k stacks. The NVIDIA drivers needed 8k stacks. Rather than recompile the kernel with 8k stacks or use one someone else compiled, I waited around long enough for NVIDIA to fix it. Version 6111 of the NVIDIA drivers worked for me. After downloading the installer from the NVIDIA website, I did the following as root.

# sh ./NVIDIA-Linux-x86-1.0-6111-pkg1.run

Then I made the following change in /etc/X11/xorg.conf

Changed Driver "nv" to Driver "nvidia"

Then, I confirmed that Load “glx” was in the Module section and that Load “dri” and Load “GLcore” were not present. I saw a message “Could not find kernel module interface” and then a warning about the “rivafb” kernel module. The install finished, I rebooted, and I had my fast NVIDIA drivers back again. Unreal Tournament 2004 worked great.

Rhythmbox Problems

# yum install gstreamer-plugins-mp3
# yum install libmad
# yum install libid3tag

XMMS Problems

# yum install xmms-mp3

I also needed to use OSS instead of ALSA. I used Options->Preferences->Audio I/O Plugins to change the Output Plugin to OSS Driver 1.2.1.0.

Evolution Error - Cannot Activate Component OAFIID

The full error message was something like:

Cannot activate component OAFIID
GNOME_Evolution_Mail_ShellComponent
The error from the activation system is
Unknown CORBA exception id: 'IDL:omg.org/CORBA/INV_OBJREF:1.0'

The fix was to delete /tmp/orbit-username (replacing username with my login name) and then reboot.

Make Firefox default browser

I used gconf-editor (desktop->gnome->applications->browser) and changed exec from “mozilla” to a symbolic link pointing to the copy of Firefox that I installed from a tarball. Since FC3 comes with Firefox, this should not be an issue going forward.

11/6/2004: 12:12 pm: Robert: Everything Else

Day of the Dead boots

Perhaps due to the influence of having lived in Texas for ten years, I was overcome about five years ago by the need to buy some cool-looking cowboy boots. Unfortunately, they weren’t the pair you see on this page. While I had thought my boots were pretty cool, my next door neighbor has since tipped me off to a couple of companies that produce absolutely amazing cowboy boots. Of course, amazing doesn’t come for cheap. Many of these boots come in at over $1,000 a pair.

Rocketbuster Boots makes these gorgeous Day of the Dead-themed boots. The Sasquatch boots are a bit much if you’re averse to shag carpets, but many of their other boots are way cool. Actually, the Sasquatch boots are kind of cool in a retro 70’s kind of way.

Another site for awesome boots is the Liberty Boot Company. My favorites are the Killaz, 62 Muertos, and Barbed Wire. You should check them all out, though.

11/3/2004: 12:22 am: Robert: Blogging and RSS

After many months of getting no more than a handful of spam comments per day on this blog, last week I started getting 50-100 spam comments a day. While almost all of them were caught by WordPress and put into the comment moderation queue, I was wasting a lot of time each day deleting them from the queue and deleting the emails that notified me of each comment.

After doing a little research, I came up with three tactics for fighting back. As with security, a layered approach works best. No single tactic is going to stop all black hat hackers or spammers. The goal is to create enough of a deterrent that they decide to go somewhere else.

With black hat hackers, deterrents don’t always work, since they tend to have lots of free time on their hands and they enjoy a good challenge. Fortunately, spammers just want to make money. They are seeking the cheapest way to generate lots of inbound links to boost the Google PageRank of their websites. In addition to using automated programs to post their annoying comments, the spammers often employ legions of people in India, China, etc. to post comments (I’m assuming this based on the reverse lookups on IP addresses). Assuming that the spammer’s orcs are getting paid based on comments posted, they also want to post the comments as quickly as possible.

While I would be glad to post the three changes I made, I fear that one of the spammers’ orcs might actually read this. So, email me if you want to know what changes worked for me.

So far, I’ve gone from nearly 100 spam comments a day to exactly 0 spam comments after making the changes about 48 hours ago. Also, if they figure out one of the changes, I can trivially alter that tweak to shut them down again. It will be faster for me to make the change than for them to update their system. With that equation, I believe I will come out the winner in the end.

I updated my WordPress install notes to suggest some of the changes.