Archive for November, 2005

11/30/2005: 12:43 am: Robert: Software

I couldn’t resist upgrading to Firefox 1.5, even though I expected that some of the extensions I had installed would not be compatible. The extensions that are still enabled (or were added during the upgrade) include DOM Inspector, Talkback, Nuke Anything, AdBlock, FoxyTunes, Web Developer, Piggy Bank, View Rendered Source Chart, del.icio.us, Customize Google, ASCIItoUnicode, AJAX Yahoo Mail, Tab X, and Disable Target for Downloads. Extensions that didn’t make it, or at least not yet, include Nuke Anything Enhanced, Foxylicious, LiveHTTPHeaders, SessionSaver, and Bookmarks Synchronizer. GreaseMonkey, Open Java Console, Tamper Data, and Venkman also need to be updated.

One new feature I’m glad to see built into Firefox is drag and drop tabs. I used to have to install an extension on all my machines to get that feature. I use Firefox regularly on five different machines (with a mix of Linux, Windows, and OS X), so that was a lot of extra installs.

Another feature that I don’t need much myself, but would find desirable on a public machine with Firefox installed, is the new Clear Private Data option in the Tools menu. It gives you the ability to clear any or all of the following – browsing history, saved form information, saved passwords, download history, cookies, cache, and authenticated sessions.

The Blazer browser on my Treo allows me to navigate through the links on a page using a keypad button rather than using the stylus. This feature turned out to be far more valuable than I expected. Firefox now lets you do the same thing by pressing the tab key. As long as there aren’t too many links on a page, this is a very convenient way to navigate a website without having to reach for the mouse.

Firefox 1.5 also includes, at least on Windows, integration with a mail program via options in the Tools menu. When I selected Read Mail, Outlook Express was automatically launched. No sense in suffering further with Outlook Express, so I finally was inspired enough to install Mozilla Thunderbird. It automatically imported all my settings from Outlook Express, which I had rarely used anyway. Now, that will be never. I’m already much happier with Thunderbird.

11/29/2005: 12:53 am: Robert: Speech

Practically since the dawn of the Web, people have written about the possibility of offering voice access to websites over the phone. One of the biggest challenges is that many websites aren’t even accessible to special web browsers, e.g., IBM Home Page Reader, that are designed for people who are blind or who have severely impaired vision. This is partly due to the fact that many websites are coded such that the visual presentation information is deeply mixed with the content. Separating out the visual presentation information is a major challenge, even for people who have control of their own website.

Most modern blogging or content management tools store the content in a database and use templates to render a presentation. However, the vast majority of the templates merely provide different skins, or themes, for a visual presentation. Perhaps you might get a template or two for printing or for PDA/mobile presentation, which are really just specialized visual presentations. Speech access requires a dialog-style interaction that is very different in nature. Perhaps even more importantly, our hearing system, a.k.a. our ears, doesn’t offer an analogue of page scanning. While most people can visually scan a web page quickly, it’s quite difficult for a person to pick out an item of interest from a series of short audio snippets that are played in quick succession. It’s also difficult or impossible to navigate between audio snippets at anywhere near the speed you can visually navigate a web page.

Now that I’ve rained on the SpeechWeb parade, let’s take a look at a recent article entitled “Call for a Public-Domain SpeechWeb” by Richard Frost in the November 2005 issue of Communications of the ACM. It’s not that I think the SpeechWeb is a bad idea; I just think it’s really hard to do well at a reasonable cost.

Frost has proposed an architecture where the speech browser (think web browser for your ears) and the speech recognition engine reside locally on an end user device, most likely a mobile device. The user would access a speech application on the browser in the same way that you access web applications that reside and mostly run on a remote web server. That seems fine so far. However, powerful speech recognition engines tend to be very CPU and memory intensive. That’s immediately a problem on small mobile devices.

An added complexity is that speech applications have to deal with much more ambiguity than web applications. With a web application, you can offer the user a wide variety of constrained input elements, such as buttons, combo boxes, and radio buttons. Imagine a web app where each step were a question followed by a free-form text box and where the app has to be coded to interpret whatever the user types. One of the tricks in writing a speech application, though, is that you design the questions so that people tend to answer them in a limited set of ways.

Frost has mostly avoided natural language interfaces and stuck with constrained recognition grammars. Natural language recognition always seems to be just a year or two away from being viable. Unfortunately, it’s been like that for over five years, and I haven’t seen any improvements or breakthroughs to make me think it is any less than 5 or even 10 years away from being commonly used in applications. At speech conferences, it seems like the same old directory assistance and call steering applications get trotted out as examples of how great the stuff works.
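To make the idea of a constrained recognition grammar a bit more concrete, here is a toy sketch in Python. It is not Frost’s design and not the grammar format of any real speech engine; it works on text that has already been recognized, and the prompt, the GRAMMAR table, the FILLER set, and the interpret function are all names I made up for illustration.

```python
import re

# Hypothetical grammar for the prompt:
#   "Would you like to hear news, weather, or sports?"
# Each key is the meaning handed back to the application;
# each value lists the phrasings the grammar accepts for it.
GRAMMAR = {
    "news": ["news", "the news", "headlines"],
    "weather": ["weather", "the weather", "forecast"],
    "sports": ["sports", "sports scores", "scores"],
}

# Words that are dropped before matching (an assumption for this sketch).
FILLER = {"um", "uh", "please", "i", "want", "would", "like"}


def interpret(utterance):
    """Return the grammar key matched by the utterance, or None if it is out of grammar."""
    words = [w for w in re.findall(r"[a-z']+", utterance.lower()) if w not in FILLER]
    phrase = " ".join(words)
    for meaning, phrasings in GRAMMAR.items():
        if phrase in phrasings:
            return meaning
    return None


if __name__ == "__main__":
    for said in ["Um, the weather please", "I would like sports scores", "Tell me a joke"]:
        print(repr(said), "->", interpret(said))
```

The point of the sketch is the last case: anything outside the small set of expected answers simply fails to match, which is why the wording of the question matters so much in a speech application.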

Near the end of the article, Frost talks about the promising development of lightweight, yet powerful, speech recognition engines, such as the X+V browser available with Opera via a collaboration with IBM. Maybe I need to set aside some time to download Opera and check this out.

There’s obviously a long way to go to get to an environment where speech applications can run viably on handheld devices, but I think Frost’s suggestion is worth looking into. Eventually, mobile devices will have the horsepower to perform speech recognition accurately with large vocabularies. One advantage of doing the recognition locally is that you don’t have to worry about network issues, such as noise and lost packets. However, that’s not really a major issue with speech applications in call centers. A bigger issue is dealing with the many different accents and styles of speech. If small speech recognition engines can be tuned to a single speaker in a cost-effective way, that could greatly improve their accuracy.

11/14/2005: 12:18 am: Robert: Arts and Education, Hurricane Katrina

If you’re looking for good ideas for Xmas gifts, please consider buying from people and businesses located on the Mississippi Gulf Coast. Here are a few links I managed to dig up.

Also, please consider making a donation to the Ohr O’Keefe Museum of Art or the Walter Anderson Museum or any of the other museums on the Mississippi or Louisiana Gulf Coast.

: 12:08 am: Robert: Hurricane Katrina

Mary Mahoney’s is one of the most famous restaurants in Biloxi. The Mary Mahoney’s restaurant, bars, and cafe suffered a lot of damage, but the upstairs restaurant is already open. They plan to expand the cafe downstairs, but it needs a lot more work.

Having Mary Mahoney’s reopen is a good sign for the return of Biloxi. I’m going to be in Biloxi December 2-7, and I plan to stop by. While I’m there, I plan to eat out as much as I can at locally owned restaurants (not exactly a difficult task, given how good the food can be on the Mississippi Coast) and do some Xmas shopping in locally owned stores.

11/13/2005: 8:21 pm: Robert: Hurricane Katrina

schooners

I just spoke with my mother about how things are going in Biloxi. She went out on a schooner this weekend along the Biloxi coastline and said that the devastation is still hard to believe. The only large landmark that looks normal is the Biloxi Lighthouse. The big hotels are still standing, but many windows are still missing and the bottom floors are still mostly wiped out.

Most, if not all, of the casinos were built along with a hotel. Some of the hotels are in good enough shape to have already opened or be opening soon. Many of these hotels are initially reopening a small casino inside the hotel. Later, they will build new casinos on land.

The Imperial Palace is probably in the best shape. The hotel is currently housing a lot of the FEMA personnel. They plan to open a small casino inside the hotel on December 20th. That’s December 20th, 2005. I was amazed to hear that they could be ready that quickly. The Isle of Capri is hoping to reopen on December 31st.

Treasure Bay Casino probably won’t be operating for at least 6 months and The Grand in Biloxi at least a year. The Grand might not rebuild in Gulfport, and instead focus on their Biloxi operation. The Beau Rivage may not reopen for 18 months. Casino Magic is the only casino I know of that is seriously considering not coming back at all, though there are a couple more casinos that I haven’t heard anything about.

: 3:58 pm: Robert: Everything Else

Real estate info is obviously extremely suitable for being paired up with Google maps. Two sites my brother recently told me about are Home Price Records and Trulia. These sites serve complementary purposes – Home Price Records lists properties that have been sold and Trulia lists properties that are for sale.

Home Price Records prompts you to enter an address and some basic search criteria (number of bedrooms, square footage, and price range), and then it presents a list view of nearby properties meeting your search criteria. You can switch to map view to take advantage of the Google Maps integration.
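As a rough sketch of what that kind of criteria search amounts to, here is a small Python example. It has nothing to do with how Home Price Records is actually implemented; the Listing class, the search function, and the sample properties are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    address: str
    bedrooms: int
    square_feet: int
    price: int


def search(listings, min_bedrooms, min_square_feet, price_range):
    """Return listings meeting all three criteria, cheapest first."""
    low, high = price_range
    matches = [
        item
        for item in listings
        if item.bedrooms >= min_bedrooms
        and item.square_feet >= min_square_feet
        and low <= item.price <= high
    ]
    return sorted(matches, key=lambda item: item.price)


# Made-up sample data, not real sales records.
SAMPLE = [
    Listing("1 Example St", 3, 1500, 650000),
    Listing("2 Example St", 2, 1100, 480000),
    Listing("3 Example St", 4, 2200, 910000),
    Listing("4 Example St", 3, 1700, 590000),
]

if __name__ == "__main__":
    for item in search(SAMPLE, min_bedrooms=3, min_square_feet=1400, price_range=(500000, 700000)):
        print(item.address, item.price)
```

A map view would then just plot the addresses of whatever the search returns.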

Trulia is currently in beta release and allows you to search for houses in California. If you click on the More Options link, you get access to the same search criteria as Home Price Records, with the addition of number of bathrooms. Trulia listings typically include a photo of the property.

While the additional info available at Trulia is obviously more important for properties that are still for sale, Home Price Records would be a more useful site for assessing the value of one’s own home if it also provided this information. Trulia does not use MLS info, so it indexes sites, like real estate broker sites, that provide their own info. It would be a bit trickier for Home Price Records to present the extra data, as they would likely need to cache all this info long after the real estate brokers removed it from their sites. But hard drives are pretty cheap these days.

Update 11/21/2005: Another complementary Google map mash-up is at For Sale by Owner Center. The listings are kind of sparse in Northern California, but there are houses listed for sale all across the country. In some areas, there are quite a few houses listed. You can directly contact the owner, as well as view the house via Google Earth. Listings can be filtered by price, bedrooms, bathrooms, and date listed, as well as sorted by all of those plus square footage. Listings are free for the owners of the homes.

11/6/2005: 8:51 pm: Robert: Blogging and RSS, PhoneBlogger, Software, Speech, VoiceXML

I haven’t posted about PhoneBlogger in quite a while, but I’m thinking about updating and enhancing some of the code. A lot has happened in the audio/phone blogging world since I announced PhoneBlogger January 9, 2003, and posted the PhoneBlogger source code on SourceForge.

One new buzzword is mobcasting. The Wikipedia page on mobcasting quotes Andy Carvin as writing:

A quick example: imagine a large protest at a political convention. During the protest, police overstep their authority and begin abusing protesters, sometimes brutally. A few journalists are covering the event, but not live. For the protestors and civil rights activists caught in the mêlée, the police abuses clearly need to be documented and publicized as quickly as possible.

This is quite similar to the scenario I was thinking of nearly three years ago when I announced PhoneBlogger:

A journalist could use it from a payphone (good luck finding one, though) or with a basic cellphone to immediately publish to the web from the scene of an unexpected event in progress. It’s moblogging for the people, man.

Note the quaint reference to a payphone. My point was that you wouldn’t need a fancy phone. Of course, mobile phones have come a long way since I wrote that. Carvin’s example also includes the use of camera/videophones, rather than just audio.

My favorite part of the Wikipedia article, though, is near the end where it says:

Carvin is now exploring the creation of an open-source mobcasting tool that could be installed on a server to allow for community mobcasts via a local telephone call.

I’ve been thinking about the same thing, too. While it makes life simpler for me to host the application with a VoiceXML hosting provider like BeVocal, I do like the idea of having a more self-contained app. It’s going to be pretty complicated, though, to sort out everything I need with a free PBX like Asterisk or sipX, a free VoiceXML browser like OpenVXI, a free ASR engine like Sphinx, and a free TTS engine like Festival. Dealing with PSTN calls will also be a hassle. If I implemented this, I would probably just deal with SIP. That led me down the path of looking into building or finding a SIP softphone that could run on a mobile phone. There is a Java API, JAIN-SIP, for building a Java SIP user agent. The phone would need only a J2ME runtime. What with all these acronyms and integration efforts, I think you can guess why I haven’t taken all of this on by myself yet.

I’m glad to see that people like Andy are doing really interesting things with audio blogging. I built PhoneBlogger solely because I thought it would be fun to build. I never really ended up using it.

11/2/2005: 11:31 pm: Robert: Everything Else

I suspect that you, just like me, wonder just what the friggin heck George Bush is trying to communicate when he gamely attempts to speak English, or in the case of one response during a debate with John Kerry, not speak at all. This video makes it all clear. It turns out that Bush has been carefully trained by Harlan Macraney not to speak smoothly, not to use “real words”, not to actually talk.

Public speaking is difficult. I sometimes get nervous when I speak in front of a lot of people, and I know that I say um and uh too often when speaking in public. But, my god, this man makes Dan Quayle look like a Toastmaster of the Year award winner.

Thanks for the link to the hilarious video, Michael!

11/1/2005: 10:56 pm: Robert: Hurricane Katrina

A friend of mine recently emailed some photos her husband took while driving down Highway 90 in Biloxi. He said the eastbound lanes are in relatively good shape, but the westbound lanes are still pretty bad. That’s surprising, as the eastbound lanes are closer to the water.

In one of my first posts after Hurricane Katrina hit, I included a photo I found somewhere on the web (Flickr?) of the McDonald’s near Edgewater Mall. Here’s a photo of the same McDonald’s on October 20th, along with the original image.