WombatNation: April 2004 Archives

April 28, 2004

Toilet of the Future

BBC News | HEALTH | Stay healthy with the toilet 'doctor'

I don't often get excited about toilets, unless of course we're talking about my incinerating toilet, but an English company has dared to do the unthinkable - combine a waste removal device with an automated diagnostician. Or some such thing. Dear readers, please direct your rapt attention to the Versatile Interactive Pan.

Okay, so maybe on the surface it looks like a Kevlar diaper or an uncomfortable Sumo wrestler's loincloth, but this device is full of technology for measuring what users were full of.

Although the device is still just in the concept phase, the promise of getting access within five years to a voice activated toilet seat and an instant analysis of my urine and stool samples has me already trying to figure out how to rip out the current, woefully outdated toilet that was installed at my house just 68 years ago. Of course, the incinerating toilet stays at the cabin, though I wish it didn't dim the streetlights every time someone uses it.

I'm a bit worried about the following quote, though.

[Spokesman] Mr Wooliscroft went on: "We also want to link to the local supermarket. If, for example, a person is short on roughage one day, an order of beans or pulses will be sent from the VIP to the supermarket and delivered that same day."

First, let's ignore the fact that I have absolutely no clue what pulses are, but I'm not sure I want to receive them to solve an inadvertent roughage shortage. It sounds kind of painful. No, the bigger concern is that this quote implies a significant disconnect from reality. If this is what passes for serious thinking at Twyford Bathrooms, then I'll go ahead and give up now on them ever being able to deliver to me a working version of my dream toilet of the future.

[via Good Morning Silicon Valley]

Posted by Robert at 08:19 PM | link | comments (0) | trackback (0)

April 27, 2004

AudioBlog.com

There's a new audioblogging service in town, although AudioBlog.com is currently only in beta. So far, the service supports Movable Type, TypePad, Blogger, and LiveJournal blogs. One cool aspect of AudioBlog.com is that in addition to audioblogging by phone, you can audioblog from any computer with a microphone and a Flash-enabled web browser. Also, it looks like Eric set things up so the audio recordings stream efficiently.

[via Audio/Mobile Blogging News]

Posted by Robert at 09:48 PM | link | comments (0) | trackback (1)

April 25, 2004

Cuff Update

It's been nearly three weeks since I damaged, and very possibly tore, the rotator cuff in my right shoulder. One or more of the tendons still produce a disturbing pop a couple times a day. Usually there is no pain, but sometimes it can be relatively painful. Overall, though, I think it is getting better. I have a noticeably greater range of safe motion than I did a week ago.

I went to physical therapy last Wednesday, and I have six more sessions spread out over the next two weeks. The therapist hooked up a four electrode electro-stim box to my shoulder, along with a huge heating pad, for about ten minutes. He then used a small wand that looked like a shower head to do ultrasound heating of the muscle and tissues. After a deep massage of my shoulder, he finished up by having me do a couple simple exercises lifting small weights. I was surprised by how much the injury weakened the muscles in my shoulder.

Since I definitely can't play tennis, I decided to go for a ride on my mountain bike. The biggest problems were that I hadn't been on my bike for over three weeks and the outside temperature was about 30 degrees F higher than when I had been riding in February and March. After doing only three or four hundred feet of climbing through the hills, I was ready to head home. The only thing that caused my shoulder to pop during the ride was pulling the water bottle from the cage on the bike frame and lifting it to my mouth to drink.

Posted by Robert at 05:54 PM | link | comments (0) | trackback (0)

April 24, 2004

Audio Credit Card

New Scientist - Credit card only works when spoken to

Beepcard has announced a new credit card they have developed that supports audio-based authentication for credit card transactions, via technology embedded within the card itself. This is a very cool idea, assuming they can get past a couple technology and personal adoption issues.

Beepcard had previously developed a credit card that could be used to verify that a remote customer had physical possession of the credit card being used for an online transaction. The customer holds the special credit card up to a microphone connected to the computer being used for the transaction and presses a button, and the card emits a pseudo-random sound. The actual sound is determined by an algorithm that runs both on a chip in the card and on a server. The sound is recorded by an applet that can be installed by the customer or downloaded from a website. Beepcard's software running on a remote server then verifies whether the correct sound was emitted. Since the sound is cryptographically (3DES) unpredictable, you don't have to worry about a replay attack.
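To make the replay argument a bit more concrete, here is a minimal Python sketch of a counter-based one-time code. It is not Beepcard's algorithm: the article says the card uses 3DES, while this sketch swaps in HMAC-SHA256 from the standard library. The names Card, Server, and one_time_code, the shared secret, and the look-ahead window are all made up for illustration.

```python
import hashlib
import hmac


def one_time_code(secret: bytes, counter: int) -> str:
    """Derive a short code from a shared secret and a button-press counter."""
    digest = hmac.new(secret, counter.to_bytes(8, "big"), hashlib.sha256).digest()
    return digest[:8].hex()  # a real card would encode these bytes as sound


class Card:
    """Hypothetical card: every button press uses the next counter value."""

    def __init__(self, secret: bytes) -> None:
        self.secret = secret
        self.counter = 0

    def press_button(self) -> str:
        self.counter += 1
        return one_time_code(self.secret, self.counter)


class Server:
    """Hypothetical server: accepts each code once, within a small look-ahead window."""

    def __init__(self, secret: bytes, window: int = 5) -> None:
        self.secret = secret
        self.last_counter = 0
        self.window = window

    def verify(self, code: str) -> bool:
        start = self.last_counter + 1
        for counter in range(start, start + self.window):
            if hmac.compare_digest(code, one_time_code(self.secret, counter)):
                self.last_counter = counter  # this code and all older ones are now useless
                return True
        return False
```

Because the server only ever looks forward from the last counter it accepted, a recording of an earlier beep fails verification. The look-ahead window is one possible way to tolerate button presses that never reached the server.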

Although the article doesn't mention it (but Beepcard's website hints at this), I don't see why a company couldn't ask the customer to hold the card up to a telephone's microphone and press the button, record the sound on the call center's equipment, and then verify the recording against the server's calculation. That would provide additional security even for orders through a human or automated call center agent. Of course, calls over cellphones or poor connections might have problems. Sampling rates for telephone calls are typically around 8 kHz with 8-bit samples, so a second or two of audio should be able to provide you with plenty of information bits for a secure audio code. Heck, the RSA SecurID token I used to have at work used only a six digit number as the ID code.
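The back-of-the-envelope arithmetic behind that claim can be sketched in Python. The raw figure is only an upper bound, since noise and whatever encoding the card uses would cut the usable payload way down, but it shows how much headroom there is compared to a six digit code. The function names are just for illustration.

```python
import math

SAMPLE_RATE_HZ = 8_000   # typical telephone sampling rate, from the text
BITS_PER_SAMPLE = 8      # typical telephone sample size, from the text


def raw_audio_bits(seconds: float) -> int:
    """Raw (uncompressed) bits captured in `seconds` of telephone audio."""
    return int(SAMPLE_RATE_HZ * BITS_PER_SAMPLE * seconds)


def digit_code_bits(digits: int) -> float:
    """Bits of information in a uniformly random decimal code of `digits` digits."""
    return digits * math.log2(10)


print(raw_audio_bits(1))              # 64000
print(round(digit_code_bits(6), 1))   # 19.9
```

So one second of telephone audio carries 64,000 raw bits, against roughly 20 bits for a six digit code.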

Their new credit card contains a microphone. You speak your password and the card authenticates you. Assuming they use digit-only passwords, the voice recognition software needs to distinguish between only ten digits, albeit in a speaker-independent manner. Of course, this is still quite an accomplishment for software running on a very small, extremely low-power CPU.

Some day, this will be extended to speaker authentication with non-secret phrases. You will speak a large set of phrases and a model will be constructed for your speech patterns. You will then be prompted to repeat a varying, non-secret phrase, such as count from 1 to 6, or say the alphabet from f to j. The randomness will make it harder for a thief to use a recording and the non-secret nature of the phrase will allow you to use it in public settings.

Of course, the challenges include:

Battery life - they are targeting to support 10 transactions a day for two years
Thicker, more fragile card - the card is three times as thick as a normal card, and obviously more fragile
Customer security concerns - even though the card should make transactions more secure, people often fear new technology, especially if it is difficult to explain to them exactly how it works
Spoken passwords - Since you have to speak your password, it is suitable for use only where you don't think anyone else can hear you
Hoarse voices - if the customer can't speak normally, they can't use the card unless they tell someone else their password. This will be an even bigger problem for speaker authentication.

Posted by Robert at 06:04 PM | link | comments (1) | trackback (0)

New MLS Website Broke SoccerPhone

Two bad things happened at the MLS website this week, at least with respect to SoccerPhone (it's a program I wrote that you can use to hear live soccer scores over the telephone). One change is pure unadulterated evil, and the other could be good or evil depending on your aesthetics.

The first pure evil change was to change the URL for the live scores page. What the heck was wrong with /scores.html? I'm sure I'm not the only MLS fan who has that page bookmarked. The new URL is /MLS/scoreboard/index.jsp, or equivalently, /MLS/scoreboard/. While I'm very happy they have gone to a Java-based solution for their website (the performance is now much, much better and the generated HTML quite a bit cleaner), I wish they could have at least redirected scores.html to the appropriate JSP page. While this change also broke SoccerPhone, at least it required only a one line code change.

So how could they have avoided changing the URL that most people used?

One approach would have been to put the original scores page at /scores/index.html. Then, when they upgraded their site to use JSPs, they could have set up their servlet engine to look for index.jsp in that same directory if someone requested the page http://www.mlsnet.com/scores/. The people who bookmarked http://www.mlsnet.com/scores/ instead of http://www.mlsnet.com/scores/index.html would never have noticed the change. Worst case, the webmaster could always have used Apache's mod_rewrite module to redirect the requests. Looking at the HTTP headers (thanks to the cool Live HTTP Headers plug-in for Mozilla browsers), they seem to be using Sun ONE Web Server for at least part of the site, which I assume has at least the basic URL rewriting capabilities that would have been needed. Some of the files on the MLS website are coming from an Apache 1.3.26 server.
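The redirect idea itself is tiny. Here is a sketch of it as a minimal Python WSGI app, standing in for whatever rewrite rule the real web server would use. The REDIRECTS table and the app function are hypothetical; only the two paths come from the discussion above.

```python
# Hypothetical mapping of retired paths to their replacements.
REDIRECTS = {
    "/scores.html": "/MLS/scoreboard/",
}


def app(environ, start_response):
    """Minimal WSGI app: 301 for known old paths, 404 for everything else."""
    path = environ.get("PATH_INFO", "")
    target = REDIRECTS.get(path)
    if target is not None:
        # A permanent redirect lets old bookmarks (and scripts) keep working.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found\n"]
```

To try it locally, something like wsgiref.simple_server.make_server("", 8000, app).serve_forever() should do. On a production site the same one-entry table would more likely live in the server configuration than in application code.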

Another approach would have been to hide the file extension. I've read a bit about how to do this using Apache. That allows you to still use reasonable names for files, without the person browsing your site having to know if the resources are implemented as static HTML, JSPs, PHP, CGI scripts, etc. The advantage of this approach over the previous one is that you don't have to create as many directories.

The bigger change to the MLS website was to revamp the UI for the entire website, including the live scores page. That change completely broke SoccerPhone. I grab the team names, scores, etc. by parsing the HTML using regular expressions in a Python CGI script. The new HTML isn't even remotely close to the old HTML.
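For anyone curious what regex-based scraping looks like, and why a redesign breaks it so thoroughly, here is a small Python sketch. This is not SoccerPhone's actual code, and the HTML below is invented for illustration; the real scoreboard markup isn't reproduced here. The class names, GAME_RE, and parse_scores are all assumptions.

```python
import re

# Made-up markup for illustration only; not the real MLS scoreboard HTML.
SAMPLE_HTML = """
<tr class="game"><td class="team">San Jose</td><td class="score">2</td>
<td class="team">Los Angeles</td><td class="score">1</td></tr>
<tr class="game"><td class="team">D.C. United</td><td class="score">0</td>
<td class="team">Chicago</td><td class="score">0</td></tr>
"""

# One match per game: team, score, team, score, with optional whitespace between cells.
GAME_RE = re.compile(
    r'<td class="team">(?P<home>[^<]+)</td>\s*<td class="score">(?P<home_score>\d+)</td>\s*'
    r'<td class="team">(?P<away>[^<]+)</td>\s*<td class="score">(?P<away_score>\d+)</td>'
)


def parse_scores(html):
    """Return a list of (home, home_score, away, away_score) tuples."""
    return [
        (m["home"].strip(), int(m["home_score"]), m["away"].strip(), int(m["away_score"]))
        for m in GAME_RE.finditer(html)
    ]


print(parse_scores(SAMPLE_HTML))
# [('San Jose', 2, 'Los Angeles', 1), ('D.C. United', 0, 'Chicago', 0)]
```

The pattern is welded to the exact tag and attribute layout, so any change to the markup means rewriting the expression.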

I've now ported the code for grabbing the team names and the scores to deal with the new HTML layout. Fortunately, the live scores page has a control that lets you easily and quickly see scores from previous weeks. That, in and of itself, is awesome. I had wanted to enhance SoccerPhone so you could retrieve previous weeks' scores. The other reason it is cool is that I don't have to wait until the games start tomorrow to do most of the work on grabbing the game time and the scoring details.

Posted by Robert at 12:47 AM | link | comments (0)

April 21, 2004

Threading Features in JDK 1.5

I attended the EBIG Java SIG tonight, which featured a presentation by Bruce Eckel on threading in JDK 1.5. Though I've not done much multi-threaded programming myself, I'm very wary of it because I've seen enough damage at work from other developers' failed attempts. Bruce confirmed my feelings: don't make your program multi-threaded unless you absolutely have to.

The new threading features in the JDK 1.5 are based on previous work by Doug Lea, which now shows up in the java.util.concurrent package. The new capabilities look really powerful, but trying to follow Bruce's example code and predicting the results before he ran the code had my head spinning. This is really tricky stuff.

Posted by Robert at 11:47 PM | link | comments (0)

April 18, 2004

PhoneBlogger and Radio

So, for any of you stopping by my blog because Dave Winer linked to it on scripting.com, just a heads up that PhoneBlogger doesn't yet support Radio. However, if anyone is interested in adding support for Radio, I would appreciate the help. PhoneBlogger is already modularized to support the differences between Movable Type and Blogger, so I hope it will be easy to add support for Radio. The Movable Type specific code uses the metaWeblog API, so, for all I know, support for a Radio blog may just be a matter of entering the right values in the XML config file.
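For anyone thinking about helping, here is roughly what a metaWeblog post looks like on the wire, sketched in Python with the standard xmlrpc.client module. This is not PhoneBlogger's code. The helper name and the argument values are made up, and whether Radio accepts exactly this call is something I haven't verified.

```python
import xmlrpc.client


def build_new_post_request(blog_id, username, password, title, body, publish=True):
    """Serialize a metaWeblog.newPost call as an XML-RPC request body."""
    # The metaWeblog API describes a post as a struct; title and description
    # are the two fields used here.
    post = {"title": title, "description": body}
    params = (blog_id, username, password, post, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


print(build_new_post_request("1", "robert", "secret", "Test post", "Hello from the phone"))
```

To actually send it, xmlrpc.client.ServerProxy(endpoint).metaWeblog.newPost(...) with the same arguments should work, where endpoint is the XML-RPC URL of the blog (the part that would presumably go in the config file).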

Posted by Robert at 12:21 PM | link | comments (0) | trackback (0)

April 15, 2004

Fragile Rotating Cuffs

A little over a week ago, I stood up at my desk after a long day of working from home. I had probably been hunched over the keyboard for far too long. I clasped my hands behind my back and pulled my shoulders back sharply. This often results in a slight popping sound that makes the muscles in my shoulders relax and feel better. This time, however, my right shoulder produced a much louder pop.

Since then, when I move my right arm across my body in the front and then bring it back to my side, I hear a similar popping sound (and so does anyone else in the room) and the feeling of a tendon possibly slipping past a muscle or bone. Raising my right hand and arm above my head also results in a similar pop. Sometimes it hurts a little, but sometimes it hurts a lot. I've managed to avoid a trip to a doctor's office for over four years, but my wife convinced me that a week of a painfully popping shoulder merited a trip to a physician.

My doctor thinks that most likely I have a torn rotator cuff, but he can't be sure without an MRI. Not only is this a somewhat painful injury that will probably result in weeks or months of rehab, I don't even have a good story, like coming in as a surprise relief pitcher for the Oakland A's and pitching too many innings without warming up properly. No, all I can say is that I was stretching at my desk. There will be no purple heart for me. I'm just stuck with naproxen, ice packs, and trips to the physical therapist. And, no, there is no useful moral to this story.

Posted by Robert at 11:45 PM | link | comments (0) | trackback (0)

April 10, 2004

Free WiFi at Ultimate Grounds

Ultimate Grounds on Park Blvd in Oakland offers free wireless Internet access! While working from home last week, I stopped by Ultimate Grounds for a cup of coffee. I brought my laptop along, hoping that either they or someone living nearby had an open hotspot. Thanks to Ultimate Grounds' open WiFi connection, I was able to VPN into my company's network and catch up on email and a bunch of other work. I ended up working there through lunch. I hadn't planned to eat there, but since I benefited from their free WiFi connection, I was more than happy to buy a sandwich and a drink from them.

While searching for their address tonight, I ran across a page at the AMD (Advanced Micro Devices) website that lists "AMD hotspots in Oakland". I already knew about Pacific Coast Brewing's hotspot, but I didn't know Cato's Alehouse offered free WiFi. I was actually there last night, but didn't notice any signs advertising free WiFi. I would've definitely brought the PowerBook along if I had known, since I was just tagging along with my wife who was having a meeting with someone else. Although I took the time to catch up on some reading, I would have equally enjoyed complementing a couple Downtown Browns with some web browsing.

Posted by Robert at 09:49 PM | link | comments (1) | trackback (0)

April 07, 2004

Speech Recognition

Automatic speech recognition (ASR) technology addresses a very difficult problem, but researchers have made a significant amount of progress over the last ten years or so. There are still many unsolved problems, but the quality of modern ASR engines has made speech a viable user interface for many different applications.

Speech recognition technologies are commonly used to recognize a speaker's response to a prompt or to transcribe what a speaker has said. Other uses are telematics, which usually means a speech interface to systems in an automobile, or command and control of other devices, like desktop computers. The two most common approaches used to recognize a speaker's response are often called grammar constrained recognition and natural language recognition. When ASR is used to transcribe speech, it is commonly called dictation. The telematics and other command and control systems that I am aware of (outside of science fiction movies) are grammar constrained.

Grammar Constrained Recognition: Works well when the speaker is providing very short responses to specific questions. To create a grammar, you specify the most likely words and phrases a person will say in response to a prompt and you map those words and phrases to a token, or a semantic concept. For example, you might create a "yes-no" grammar that maps yes, yeah, uh-huh, sure, and okay to the token "yes" and no, nope, nuh-uh, and no way dude to the token "no". If the speaker says something that doesn't match an entry in your grammar, recognition will fail.
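The yes-no grammar above boils down to a lookup table. Here is a minimal Python sketch, assuming the utterance has already been turned into text; a real engine matches audio against the grammar, which is the hard part. The names YES_NO_GRAMMAR and recognize are made up.

```python
# Phrases from the example above, each mapped to its token.
YES_NO_GRAMMAR = {
    "yes": "yes", "yeah": "yes", "uh-huh": "yes", "sure": "yes", "okay": "yes",
    "no": "no", "nope": "no", "nuh-uh": "no", "no way dude": "no",
}


def recognize(utterance, grammar=YES_NO_GRAMMAR):
    """Return the token for an utterance, or None when it is out of grammar."""
    return grammar.get(utterance.strip().lower())


print(recognize("Yeah"))         # yes
print(recognize("no way dude"))  # no
print(recognize("maybe later"))  # None
```

The None result is the "recognition will fail" case: anything outside the grammar simply has no token.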

Natural Language Recognition: Allows the speaker to provide more natural, sentence-length responses to specific questions. Natural language recognition uses statistical models. The general procedure is to create as large a corpus as possible of typical responses, with each response matched up to a token or concept. In most approaches, a technique called Wizard of Oz is used. A person (the wizard) listens in real time or via recordings to a very large number of speakers responding naturally to a prompt. The wizard then selects the concept that represents what the user meant. A software program then analyzes the corpus of spoken utterances and their corresponding semantics and it creates a statistical model which can be used to map similar sentences to the appropriate concepts for future speakers.

For example, let's say you want to route phone calls for a customer helpdesk by asking the caller to briefly describe their problem. For the concept "forward my call to the billing department", you would want to recognize sentences like "I have a problem with my bill", "I was charged incorrectly", "How much do I owe this month", etc. While you could construct a grammar with all the likely keywords (bill, charge, charged, owed, etc.), if the caller speaks in sentences, you may pick up multiple matches. You might also miss sentences that fit the right pattern, but just miss the pre-ordained keywords. It is difficult to create large, rich grammars that consider the context in which the words are said. In addition, as a grammar gets very large, the chances of having similar sounding words in the grammar greatly increase.

The obvious advantage of natural language recognition over the grammar constrained approach is that you don't have to identify the exact words and phrases. A big disadvantage, though, is that for it to work well, the corpus must typically be very large. Creating a large corpus is time consuming and expensive.
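To give a feel for the statistical approach, here is a toy Python sketch that routes text utterances to a concept using a naive Bayes model with add-one smoothing. That particular model is my choice for illustration, not necessarily what commercial engines use, and it works on text rather than audio. The six-sentence corpus is made up (three of the sentences are the billing examples above); the labels play the role of the wizard's choices.

```python
import math
from collections import Counter, defaultdict

# Tiny made-up corpus; a real one would hold thousands of labeled utterances.
CORPUS = [
    ("i have a problem with my bill", "billing"),
    ("i was charged incorrectly", "billing"),
    ("how much do i owe this month", "billing"),
    ("my internet connection is down", "support"),
    ("i cannot connect to the network", "support"),
    ("the modem keeps restarting", "support"),
]


def train(corpus):
    """Count words per concept and utterances per concept."""
    word_counts = defaultdict(Counter)
    concept_counts = Counter()
    for text, concept in corpus:
        concept_counts[concept] += 1
        word_counts[concept].update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, concept_counts, vocab


def classify(text, model):
    """Pick the concept with the highest naive Bayes log-probability."""
    word_counts, concept_counts, vocab = model
    total = sum(concept_counts.values())
    best, best_score = None, -math.inf
    for concept, n in concept_counts.items():
        denom = sum(word_counts[concept].values()) + len(vocab)
        score = math.log(n / total)
        for word in text.lower().split():
            # Add-one smoothing so unseen words don't zero out a concept.
            score += math.log((word_counts[concept][word] + 1) / denom)
        if score > best_score:
            best, best_score = concept, score
    return best


model = train(CORPUS)
print(classify("why was i charged so much this month", model))  # billing
print(classify("the network is down", model))                   # support
```

Neither test sentence appears in the corpus, which is the point: the model generalizes from word statistics instead of requiring exact phrases. With only six sentences it is easy to fool, which is the "corpus must be very large" problem in miniature.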

Dictation: Used to transcribe someone's speech, word for word. Unlike grammar constrained and natural language recognition, dictation does not require semantic understanding. The goal isn't to understand what the speaker meant by their speech, just to identify the exact words. However, contextual understanding of what is being said can greatly improve the transcription. Many words, at least in English, sound alike. If the dictation system doesn't understand the context for one of these words, it will not be able to confidently identify the correct spelling.
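Here is a very small Python sketch of how context can pick between sound-alike words: look at the previous word and choose the spelling seen most often after it. The counts and the homophone table are invented for illustration, and real dictation engines use much richer language models than a one-word lookback.

```python
# Hypothetical counts of (previous word, candidate) pairs from some text corpus.
BIGRAM_COUNTS = {
    ("over", "there"): 40, ("over", "their"): 1,
    ("lost", "their"): 25, ("lost", "there"): 1,
}

# Words the recognizer cannot tell apart by sound alone.
HOMOPHONES = {
    "there": ["there", "their"],
    "their": ["there", "their"],
}


def pick_spelling(previous_word, heard):
    """Choose the candidate spelling most often seen after previous_word."""
    candidates = HOMOPHONES.get(heard, [heard])
    return max(candidates, key=lambda word: BIGRAM_COUNTS.get((previous_word, word), 0))


print(pick_spelling("over", "their"))  # there
print(pick_spelling("lost", "there"))  # their
```

Without the previous word, both spellings are equally plausible, which is the confidence problem described above.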

The challenge for developers of ASR engines is that the end customer judges them on one criterion: did it understand what I said? That leaves little room for differentiation. Of course, there are areas like multi-language support, tuning tools, integration API (the proposed standard MRCP or proprietary), etc., but recognition quality is most visible. Because of the complex algorithms and language models required to implement a high quality speech recognition engine, it is difficult both for new companies to enter this market and for existing vendors to maintain the necessary investment level to keep up and move ahead.

Currently, Nuance and ScanSoft dominate the speech recognition market. There are a lot of other small vendors, like Aculab, Loquendo, Lumenvox, etc., but they are essentially niche players. The speech recognition side of ScanSoft is actually composed of SpeechWorks and the products of several former niche players. IBM has also participated in the speech recognition engine market, but their ViaVoice product has gained traction primarily in the desktop command and control (grammar-constrained) and dictation markets.

This is all changing. The big software heavyweights, Microsoft (Speech Server) and IBM (references - main site, voice toolkit preview, eWeek article, older InternetNews article, new InternetNews article on VXML toolkits) are now making substantial investments in speech recognition. IBM claims to have put one hundred speech researchers on the problem of taking ASR beyond the level of human speech recognition by 2010. Bill Gates is also making very large investments in speech recognition research at Microsoft. At SpeechTEK, Gates predicted that by 2011 the quality of ASR will catch up to human speech recognition. IBM and Microsoft are still well behind Nuance and ScanSoft in technology and market share, but they are gaining on them fast.

Posted by Robert at 11:05 PM | link | comments (0) | trackback (0)

April 03, 2004

Referees and Bad Language

Word Mapping [Not suitable for those with an aversion to prof*nity.]

This site provides advice to current and prospective referees in the Football Association in England. As an aside, the term "soccer" that is more commonly used in the US, Canada, Australia, and a few other countries comes from an abbreviation of "association football". Perhaps the abbreviation was created after an extended drinking bout that resulted in a significant slurring of speech.

And while I'm already off course, I take issue with those who say we in the States should call the sport football, since it is called football in the English speaking parts of the UK, futbol in Spanish, futebol in Portuguese, fussball in German, and voetbal in Dutch. But, what about Italy? They call the sport calcio (or perhaps more completely, gioco del calcio), which an Italian friend told me literally means "to kick", as in a ball or a person. How come I never hear people criticizing the Italians for not calling the game piedesfera? Okay, enough of that.

One of the highlights of the aforementioned referees' advice site is a Venn diagram that tries to illustrate what punishment a ref should dole out for specific examples of bad language. This is useful info for the non-Brit, as I had never received the memo indicating that bollocks and piss off were semantically so closely related. And who knew that shag needed to be carefully concealed as sh&g?

[via Boing Boing]

Posted by Robert at 10:07 PM | link | comments (0) | trackback (0)

Unreal Tournament 2004 Demo

Over the last few years, I've generally found that my interest in a specific video game rarely lasts longer than the amount of gameplay that is made available through the free demo. The upside of this is that now I don't pay for games that I end up playing only a couple times. I have not yet extended this discipline to books, so I still buy many books that I don't read or only read a chapter or two from.

Although I was tempted to immediately buy UT2004 (I loved the UT2003 demo, but lost interest before deciding I needed the whole game), I decided to once again give the demo a try. I got the UT2003 demo to work on XP, but not on Red Hat 8.

After downloading the Linux version of the UT2004 demo through one of the mirror sites and installing it [uncompress the bz2 file, su to root, and run sh ./ut2004-lnx-demo-3120.run], I got the following error when it tried to start:

Couldn't set video mode: Couldn't find matching GLX visual

Based on the info I found from googling for that phrase, it was pretty obvious the problem was with the NVIDIA drivers. I had not yet updated the drivers that came with Fedora Core 1. I used yum and the rpm.livna.org repository to install a kernel built with the newest version of the NVIDIA drivers. After a quick reboot, I was hard at work fragging aliens. And as an extra bonus, Tux Racer was finally playable.

Update 4/25/04 - In case you are a Fedora Linux user and need more help getting this to work, here are slightly more detailed instructions:

Add the following section to /etc/yum.conf (I normally leave gpgcheck set to 1, but I have sometimes had to turn off digital signature checking specifically to download Livna's kernel module for the Nvidia drivers)

###############
## Livna.org ##
###############
 
[livna-stable]
name=Livna.org - Fedora Compatible Packages (stable)
baseurl=http://rpm.livna.org/fedora/$releasever/$basearch/yum/stable
gpgcheck=0

su to root
# yum install kernel-module-nvidia-$(uname -r)
Reboot your machine (actually, exiting and restarting the windowing system may be sufficient, but I haven't verified that)

FYI, the command uname -r returns the release identifier for the kernel you are using. Also, you can run yum info kernel-module-nvidia* to see all the modules (e.g., kernel-module-nvidia-2.4.22-1.2188.nptl) that are available. If you have a single-processor machine, you need one that ends in nptl. If you have a multi-processor box, you want nptlsmp. NPTL = native POSIX thread library and SMP = symmetric multi-processing.
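If you want to double check which package name the yum command above will ask for, the naming rule is easy to sketch in Python. The helper names are made up; platform.release() returns the same string as uname -r on Linux.

```python
import platform


def nvidia_module_package(kernel_release=None):
    """Build the Livna package name for a given (or the running) kernel release."""
    release = kernel_release or platform.release()  # same value as `uname -r`
    return f"kernel-module-nvidia-{release}"


def kernel_flavor(kernel_release):
    """Report 'smp' or 'uniprocessor' based on the release suffix."""
    return "smp" if kernel_release.endswith("smp") else "uniprocessor"


print(nvidia_module_package("2.4.22-1.2188.nptl"))  # kernel-module-nvidia-2.4.22-1.2188.nptl
print(kernel_flavor("2.4.22-1.2188.nptlsmp"))       # smp
```

Since the package name is built from the running kernel's release string, the nptl versus nptlsmp choice should take care of itself as long as you booted the kernel you intend to use.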

Posted by Robert at 04:47 PM | link | comments (0) | trackback (0)