8/5/2003: 10:55 pm: PhoneBlogger

[via Audioblog/Mobileblogging News via Smart Mobs]

IndyMedia in Melbourne, Australia, recently announced PIMP, the Phone IndyMedia Patch system, as an automated system for anyone with a telephone to submit live reports to Indymedia. The first use of PIMP was to report on a demonstration. This is exactly what I imagined early on as a great use for PhoneBlogger.

The purpose of PIMP is very similar to PhoneBlogger. Their system is pretty low-tech, but it looks like it should work just fine. Sometimes, the simplest thing that works is best.

They definitely aren’t using VoiceXML, speech recognition, or text to speech. Instead, they’re using a program called vgetty to do much of the heavy lifting. When you call PIMP, you’re actually calling a voice modem. Vgetty saves your recording to a WAV file. They then use SoX and LAME to convert the WAV file to an MP3, much as I did with PhoneBlogger.
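A minimal sketch of that WAV-to-MP3 step, assuming the `sox` and `lame` command-line tools are installed. The function names, the intermediate file name, the 8 kHz mono resample, and the 32 kbps bitrate are illustrative assumptions, not settings taken from PIMP or PhoneBlogger.

```python
import shutil
import subprocess
from pathlib import Path


def conversion_commands(wav_path, mp3_path, rate_hz=8000):
    """Return the SoX and LAME command lines as argument lists."""
    wav = Path(wav_path)
    # Hypothetical intermediate file name; any scratch path would do.
    cleaned = wav.with_name(wav.stem + "-mono.wav")
    # SoX: resample to rate_hz and mix down to one channel.
    sox_cmd = ["sox", str(wav), "-r", str(rate_hz), "-c", "1", str(cleaned)]
    # LAME: encode the cleaned WAV to MP3 at an assumed 32 kbps.
    lame_cmd = ["lame", "-b", "32", str(cleaned), str(mp3_path)]
    return [sox_cmd, lame_cmd]


def convert(wav_path, mp3_path):
    """Run both steps; raises if either tool is missing or fails."""
    for cmd in conversion_commands(wav_path, mp3_path):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} is not installed")
        subprocess.run(cmd, check=True)
```

Calling `convert("msg.wav", "msg.mp3")` would run the two commands in order. Keeping the command construction separate from the execution makes it easy to inspect what will run before touching any files.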

For some people, setting up a voice modem and a couple Linux programs will be easier than setting up a VoiceXML gateway and bridging it to the telephone network. But, with the free hosted VoiceXML services provided by companies like TellMe, BeVocal, and Voxeo in the US, I think the VoiceXML approach ends up being a lot simpler, and definitely a lot more flexible. In countries without free hosted VoiceXML gateways, of course, Indymedia’s approach makes a lot of sense. However, the installation instructions for PIMP are a bit intimidating and somewhat less than clear. But, then again, so are the ones I wrote for PhoneBlogger.

I’m not sure if Hugh is using the same approach as IndyMedia, but he has a service called VoiceMonkey that also essentially acts like a voice mail system that can publish recordings to a website.

7/26/2003: 11:08 pm: Blogging and RSS, Intellectual Property, Music, PhoneBlogger, Reviews

The Digital Mix benefit last night for EFF was a highly entertaining mix of music, art, education, and motivation.

Ray Beldner

Ray Beldner, a local artist and educator, talked about the Illegal Art exhibit that has been showing at Fort Mason for the last three weeks. Unfortunately, today (Saturday the 26th) was the last day. Ray has created some very cool art works using US currency to emulate works by artists like Duchamp, Picasso, and Warhol. It turns out that defacing currency actually isn’t illegal, at least not seriously so, unless you do it with the intent to defraud someone, e.g., by making a $1 bill look like a $100 bill. I wonder if he was disappointed when he learned that he wasn’t as much of an outlaw as he thought he might have been. Of course, he’s still under potential legal and financial pressure from the estates of the artists he has copied. Ray showed photos of some of his work and work by other artists at the exhibit. He told several cautionary tales about how many of the artists involved with this exhibit have been threatened with legal action that has forced them to destroy previous works and alter their plans for future works.

FreshBlend performed before Ray spoke, but I had arrived too late. The first musical performance of the night I did see/hear was the polka electronic death country musical stylings of Mochipet. Unfortunately, the sound level was way too high for the acoustics of that room, and most of the audience quickly made their way back to the front room. You can download some of David’s music from his website.

Glenn Brown

Glenn Brown, the executive director of Creative Commons, gave an excellent summary of what Creative Commons is all about. If you haven’t seen it yet, I highly recommend watching the animated short at the CC website. That will give you a similar overview of the basics, at least.

Check out my previous post for an experiment with PhoneBlogger at the Digital Mix show. I tried to phoneblog one of the speakers, but the sound level wasn’t high enough from where I was sitting. Obviously cellphone designers design the mic placement to pick up just the person speaking into the cellphone from a very close distance. In this case, what would normally be considered background noise was the desired signal.

I wish I had tried to phoneblog part of Cat Five’s performance later that night. However, I didn’t think my attempt with Uprock had succeeded, so I didn’t even try. The audio quality of my telephone recording of Uprock isn’t that great, but it will give you a decent feel for their music. The heavily saturated visuals were pretty cool, but they really needed more material, as the images repeated a little too frequently to keep things interesting.

Fred von Lohmann

Fred von Lohmann, an intellectual property attorney with the EFF, gave a very educational and motivational speech about why it is incredibly important that each of us act now to defend our digital rights. He’s a really great speaker, which is obviously extremely important, given the vital litigation work he performs for the EFF. If you haven’t donated to the EFF recently, do it soon. That means me, too. I was member number 323 when I joined back around 1990, the year the EFF was founded. I still have a 3.5″ floppy disk labeled “EFF-Austin InfoDisk version 1.0 August 1992” that I picked up at a meeting along with a floppy disk Bruce Sterling gave me that is stamped Garbage In Garbage Out. EFF accidentally fell off my list of charities to donate to one year, and I kept forgetting to put them back on the list. Lazy, lazy, lazy. I won’t forget this year, though.

Meanest Man Contest

Meanest Man Contest definitely stood out from the other performers, if for nothing else because of the vocals and the guitar, though the guy on the left didn’t happen to be playing it when I took this photo. Check them out if you like intelligent rapping over sampled, cut-up music.

One downside of seeing a live performance of laptop music is that most of what you usually see are a couple people seated at laptops with the backs of the laptops facing you. Other than watching them move around a mouse and tap on the keyboard, they might as well be behind a wall. Mochipet’s performance would have been a lot more interesting if they had projected the image from his laptop up onto the big screen. Some of the other groups had some interesting videos projected on a big screen behind them, which leads me to…

Cat Five

Cat Five walked out wearing fright wigs, settled in behind their Mac laptops, cranked up the dry ice smoke, and launched into “American Military Operations”. The visual collage projected during their performance was a great backdrop for their music. This was definitely my favorite performance of the evening. The above photo has some weird artifact in between the two guys on the left. This is not a UFO, a ghost, a dinner plate, or a part of the show. Stuff like that sometimes happens with my camera when I try to take photos with way too low light levels. I had to crank the brightness way up from the original image. Anyway, Cat Five were very cool and you should definitely check them out if you get a chance.

7/25/2003: 11:35 pm: Blogging and RSS, Music, PhoneBlogger, VoiceXML

This post was created with PhoneBlogger. Click to listen to the recorded message.

This was an experiment in audioblogging a live music performance. The music you hear is from a performance by Uprock at Digital Mix, a benefit for EFF.

I don’t think that bands need to worry too much any time soon about people bootlegging shows this way, although my setup was by no means ideal.

  1. I wasn’t getting a strong signal from inside the performance space
  2. I have a 3+ year-old Sanyo SCP-4000 cellphone
  3. The acoustics in the performance space were not that great
  4. I could definitely have picked a better spot from which to take the recording

and, of course, there’s the roughly 4 kHz bandpass filter imposed by the telephone network. Nonetheless, I think this was a very promising experiment. Most importantly, I think it is a good justification for investing in a newer, better cellphone.
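To put a number on that last limitation, here is a small sketch. It assumes the standard 8 kHz telephony sampling rate and the conventional 300 to 3400 Hz voice band; the example frequencies for the sounds at a show are rough illustrations, not measurements.

```python
SAMPLE_RATE_HZ = 8000            # assumed: standard digital telephony sampling rate
NYQUIST_HZ = SAMPLE_RATE_HZ / 2  # highest frequency that rate can represent


def survives_phone_call(freq_hz, low_hz=300.0, high_hz=3400.0):
    """True if a tone falls inside the assumed telephone voice band."""
    return low_hz <= freq_hz <= high_hz


# Rough, illustrative frequencies for things heard at a show.
sounds = {
    "kick drum": 60,
    "vocals": 1000,
    "cymbal shimmer": 8000,
}
audible = {name: survives_phone_call(f) for name, f in sounds.items()}
```

Under these assumptions only the midrange makes it through, which fits a recording that gives a feel for the music but loses the bass and the high end.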

7/13/2003: 1:08 pm: PhoneBlogger

‘AOL Journals’ To Bring Blogs To Millions

AOL will give members three ways to update their blogs — through an online template with blank boxes for text input, through AOL’s instant-messaging system or by telephone. The phone option will be available only to subscribers to the extra-cost “AOL by Phone” service, who will be able to leave voice messages that will be posted as MP3 sound files.

Dang! I should have patented the invention of using a VoiceXML application to post to a blog by telephone when I had the chance. Actually, I would be pretty disappointed if the U.S. Patent Office granted a patent for phoneblogging. Nonetheless, given the kinds of ideas Amazon and others have recently been able to patent, I wouldn’t have been shocked if they had given me a patent for the invention. Based on estimates I have read that an independent inventor can expect to spend about $10,000 just for the US patenting process, there was no way I would have gone for the patent unless I planned to license it or turn it into a real business.

The AOL by Phone voice portal was spawned from AOL’s purchase of back in August 2000. I can’t remember who TellMe and BeVocal’s other competitors were back then. I didn’t start doing VoiceXML development until sometime late in 2001.

I started working on PhoneBlogger in late October 2002 and released it in January 2003. Between vacation, my real job, and not having a real plan for what to do with PhoneBlogger, I took far too long to finish it up. If I had been able to work on it full time, I’m sure I could have completed it start to finish in less than two weeks, maybe even less than one. That’s far more of a tribute to the richness of the code libraries and tools (VoiceXML, Python, xmlrpclib, TellMe VXML hosting service, Lame, SoX, etc.) I was able to use than my coding skills.
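As an illustration of how much those libraries do, here is a minimal sketch of posting an audio link to a blog over XML-RPC. It uses Python 3's `xmlrpc.client` (the renamed `xmlrpclib`) and assumes the blog accepts the MetaWeblog `metaWeblog.newPost` call; the function names and the post wording are hypothetical, and this is not PhoneBlogger's actual code.

```python
import xmlrpc.client


def build_audio_post_request(blog_id, username, password, title, mp3_url):
    """Serialize a metaWeblog.newPost call that links to a recording."""
    post = {
        "title": title,
        "description": f'<a href="{mp3_url}">Listen to the recorded message</a>',
    }
    params = (blog_id, username, password, post, True)  # True = publish now
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


def send_audio_post(endpoint_url, blog_id, username, password, title, mp3_url):
    """Make the same call against a live server (needs network access)."""
    server = xmlrpc.client.ServerProxy(endpoint_url)
    post = {
        "title": title,
        "description": f'<a href="{mp3_url}">Listen to the recorded message</a>',
    }
    return server.metaWeblog.newPost(blog_id, username, password, post, True)
```

`build_audio_post_request` only produces the XML payload, so it can be inspected or unit tested without a blog server. `send_audio_post` would need a real endpoint URL and credentials.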

The biggest nightmare by far in developing PhoneBlogger was dealing with XML documents in JavaScript. I estimate that I spent about 1/4 of the total time writing code and unit tests and then debugging what should have been some really simple code for reading and parsing XML. JavaScript desperately needs better APIs than the DOM. In hindsight, I would have figured out another way to deal with the config info, but I kept feeling I was just a couple lines of code away from getting it to work. One of the hardest decisions for a developer to make is when to abandon an approach. This time I let stubbornness get the better of me.

via Audioblog/Mobileblogging News via Joho Blog

6/2/2003: 10:47 pm: PhoneBlogger

While helping someone else set up their own install of PhoneBlogger, I realized that I had never gotten around to publishing much installation documentation on my website. The previous publicly available documentation was very sparse, in the sense of providing virtually no clue as to how to install the dang thing. Well, I have mostly remedied that.

5/21/2003: 10:42 pm: PhoneBlogger, VoiceXML

The Technology section of Der Spiegel Online has a long article on audio blogging [German | Google Translation] titled “audio blogs: Voices from the Web”. PhoneBlogger makes an appearance in the Internet links sidebar as “Audioblog solutions (III)” and in the main text of the article.

As translated by Google:

“The ink of the W3C-Empfehlung is not yet completely drying, there urge already ready for occupancy Web log solutions of Bevoice, Tellme, Audblog or open SOURCE projects such as Phoneblogger into the market.”

A better translation might be “Although the W3C standard for VoiceXML is only recently complete, the audio weblog solutions of BeVocal, TellMe, AudBlog, and open source projects like PhoneBlogger have already entered the market.”

Harold’s Audioblog/Mobileblogging News blog also showed up in the links sidebar.

5/13/2003: 10:56 pm: PhoneBlogger

alphaWorks : Transcription Portlet

Transcription Portlet is a voice portlet for transcribing telephony-based dictation. This technology makes large-vocabulary speech recognition technology available to telephony-based portal applications (portlets). It provides Java APIs through which developers can integrate transcription capabilities into a portlet application.

Oh, I could certainly use one of those. Unfortunately, it requires WebSphere Voice Application Access. WVAA includes a set of plug-ins for WebSphere Studio (which is based on Eclipse) and a bunch of runtime stuff.

WebSphere Voice Application Access also provides the runtime components that make up the voice portal server infrastructure: WebSphere Portal Server (WPS), WebSphere Application Server (WAS), IBM SecureWay®, IBM DB2®, IBM HTTP Server, and others.

If you haven’t seen the price tag for WebSphere Portal Server, take my word for it, it ain’t cheap.

Now if someone would be willing to host this as a web service, that would be sweet. I could easily modify PhoneBlogger to have it send the recorded audio file to the transcription web service and then retrieve the text via HTTP, or even email, which the Transcription Portlet already supports. The submission of the audio file would obviously include a userid for a user who had already spent time training the speech recognition system on her voice.
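A rough sketch of what that submission could look like from the PhoneBlogger side, using only the Python standard library. Everything about the service is an assumption: the endpoint URL, the `userid` query parameter, the content type, and the idea that the transcript comes back as the response body are all hypothetical, since no such hosted service exists as far as this post knows.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint for an imagined hosted transcription service.
TRANSCRIBE_URL = "http://transcribe.example.com/submit"


def build_transcription_request(wav_bytes, user_id):
    """Build (but do not send) a POST carrying the audio and the speaker's id."""
    # The userid identifies a speaker who has already trained the recognizer.
    query = urllib.parse.urlencode({"userid": user_id})
    return urllib.request.Request(
        f"{TRANSCRIBE_URL}?{query}",
        data=wav_bytes,
        headers={"Content-Type": "audio/x-wav"},
        method="POST",
    )


def fetch_transcript(request):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

The returned text could then be used as the body of the blog post, with the audio link appended.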

4/24/2003: 11:51 pm: PhoneBlogger, Privacy and Security, Software

Sun tackles privacy, speech recognition | CNET

Both parts of this announcement are very interesting. While presence awareness software from Sun is not that surprising (they already have an instant messaging server and an identity management server), I’m surprised to see them getting involved in speech recognition.

Although the article doesn’t mention it, I assume Virsona is built on JXTA. This is an interesting approach to giving people more control over their presence info, as opposed to having it tracked on a server over which they have no control. I hinted at a blog-based approach in my notes from a recent Weblog panel at Cal. I think a combination of these two approaches would be very powerful.

I’m very excited to hear that Sun is getting serious about working with the CMU Sphinx project to create a high quality open source speech recognition engine in Java. Currently, it has only a 1,000-word vocabulary and will be speaker dependent, i.e., each speaker will have to go through a training period before the recognition level will be acceptable. However, this should be sufficient for at least a first stage of auto-transcription for PhoneBlogger.

First, I want to add support for simple titles to audio posts. Right now, the title and text for an audio post are exactly the same. Unless you later go back and edit the post once you have Internet connectivity from a text entry device, a reader of your blog can’t tell what the post is about without listening to it.

Of course, it wouldn’t hurt if I stopped reading news and blogging until past midnight and got working on at least adding support for Movable Type categories.

4/3/2003: 9:26 pm: PhoneBlogger, SoccerPhone, VoiceXML

Bad news for my free, public SoccerPhone service, which ran as a TellMe Extension. I received the following email from TellMe today:

VoiceXML Developer,
Tellme has made many investments in VoiceXML over the past four years. One of these investments was in the Extensions program, with the goal of making VoiceXML a more utilized public standard. Now with VoiceXML well on its way to standardization in the W3C and with hundreds of thousands of VoiceXML applications in production, it is clear that investment has paid off. It is time for us to retire the Extensions program and invest in other areas. As of Wednesday, April 9th we will no longer host Extensions on 1-800-555-TELL or Developers can continue to build VoiceXML applications on Tellme Studio.
Thank you for your individual contribution in making VoiceXML the most widely-used and successful voice standard in the world.
The Tellme Development Team

Fortunately, it looks like TellMe will still support developer level access (i.e., you need the admin password) to a VoiceXML application, which should be sufficient for most deployments of PhoneBlogger. I’ll now have to look into BeVocal and HeyAnita, although a quick scan of their websites doesn’t suggest that they provide a service similar to TellMe Extensions.

Although I will miss it, this was one of the last remaining relics of the dotcom era. While Extensions got TellMe a decent amount of good PR, I imagine it cost them quite a bit of money to host it, especially when you consider the time that employees were putting into administering a free, hosted service as opposed to one of their services that generates revenue.

I just wish they had kept it, but without a toll-free number. A lot of people with cellphones have nationwide long distance included in their plan, so TellMe was paying toll charges for nothing. Or, at least I think most people choose the long distance plans. If they don’t, they should. I very rarely make a long distance call from my house anymore.

Eric Snowdeal indicates on his ex machina that he has run into the same problem.

3/19/2003: 10:47 pm: PhoneBlogger

I released the first version of the PhoneBlogger source code last Monday on the PhoneBlogger SourceForge project site. As with SoccerPhone, I released it under the GNU GPL.

I haven’t changed much of the code since January, as I have been working on a couple other projects. I’ll probably start working on PhoneBlogger again in the next couple weeks. One of the main things I need to work on is simplifying the install. Other possible enhancements include:

  • Movable Type category support
  • Append audio link to previous post
  • Dialog flow improvements
  • Send email with the URL for the audio file instead of posting directly to a blog

I would really like to add Ogg Vorbis support as an alternative to MP3, but I have had a hard time getting the Ogg libraries working from SoX.
