4/7/2012: 9:53 pm: Java

At work we have a Java-based service that caches a very large amount of data. I spend a lot of time optimizing performance and memory usage for this service. The amount of memory it uses at runtime to cache a sufficient amount of data for performance reasons is now reaching the 32 GB boundary.

One downside of using a 64-bit JVM is that the object pointers used by the JVM to reference objects would normally need to increase from 4 bytes (32 bits) to 8 bytes (64 bits). But, if the heap is under 32 GB, the JVM can take a shortcut because it knows that the offset will always fit in 4 bytes.
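The arithmetic behind that 32 GB figure can be sketched in a few lines. This assumes HotSpot's default 8-byte object alignment, which I haven't spelled out above: because every object starts on an 8-byte boundary, a 32-bit value can be treated as a count of 8-byte units rather than a count of bytes. The class and method names here are just for illustration.

```java
public class CompressedOopsMath {
    /**
     * Bytes addressable by a pointer of the given width when each pointer
     * value counts aligned units instead of single bytes.
     */
    public static long addressableBytes(int pointerBits, int alignmentBytes) {
        return (1L << pointerBits) * alignmentBytes;
    }

    public static void main(String[] args) {
        // Assumption: 32-bit compressed pointers, 8-byte object alignment.
        long bytes = addressableBytes(32, 8);
        System.out.println(bytes / (1024L * 1024 * 1024) + " GB"); // prints "32 GB"
    }
}
```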

The JVM argument -XX:+UseCompressedOops was added to tell the JVM to use compressed ordinary object pointers whenever possible. In Java 6 Update 23, the JVM was updated to enable Compressed Oops by default. Compressed Oops work very well for heaps up to about 26 GB, but can still be advantageous for larger heaps.

However, I found that with Java 6 Update 24, Compressed Oops are not used at times when they could be, even if you specify -XX:+UseCompressedOops.
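One way to see what a running JVM reports for this flag is to ask it directly. The sketch below assumes a HotSpot JVM on Java 7 or later (on Java 6 you would have to obtain the MXBean another way), and the class name is made up for this example. It only shows the flag value the JVM reports; I can't say whether it would have revealed the Update 24 behavior described here.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class CompressedOopsCheck {
    /** Returns the value of a HotSpot flag, or null if this JVM doesn't expose it. */
    public static String vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean == null ? null : bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            // Not a HotSpot JVM, or the flag doesn't exist (e.g. a 32-bit JVM).
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + vmOption("UseCompressedOops"));
        System.out.println("Max heap (bytes)  = " + Runtime.getRuntime().maxMemory());
    }
}
```

Running it with the same -Xmx and -XX options as the service gives a quick sanity check before loading the cache.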

Specifically, I was using a 31.7 GB heap and my service was unexpectedly running out of memory. On one of our other systems, the heap for the same service reached only 26 GB after its large internal cache was fully loaded. After a lot of investigation that weekend, I discovered that the working system was on Java 6 Update 26. I downgraded it to Update 24 and easily reproduced the problem.

I had previously done a lot of testing to see what our penalty was for crossing the 32 GB heap boundary and found that it was about 8 GB. That’s actually not too bad, as general estimates I had heard from others ranged from 30-50%. This is probably due to the fact the objects we are caching mostly have only primitive data types as fields.

I highly recommend the Everything I Ever Learned About JVM Performance Tuning @Twitter slides and presentation for more info on JVM tuning. Or read Andrew’s summary of Attila’s talk.

By reducing the cache size on the production server enough to not blow out the heap, I confirmed that the heap size when using 1.6.0.24 was about 8 GB higher than when using 1.6.0.26. JVM, meet smoking gun.

The release notes for 1.6.0.26 reference a Compressed Oops-related bug fix that may have resulted in this feature now working as described. I read through all of the many bugs fixed in that release and it seems to be the most likely candidate.

3/6/2012: 11:20 pm: MySQL

At work today I ran into a new reason for not keeping open MySQL connections for a long time. It involves dynamic session variables like long_query_time.

I wanted to capture a couple of hours’ worth of all queries in the slow query log so I could analyze them with pt-query-digest from the excellent Percona Toolkit. So, I used

set global long_query_time=0;

After a couple of hours I had a lot of data. I discovered later that I was actually missing some of the early queries.

Then I set long_query_time back to 1 so only queries longer than 1 second would be logged. To my initial amazement, lots of very short queries continued to be logged to the slow query log.

A little research turned up the fact that the long_query_time session variable for a connection is initialized from the global variable only when a connection is opened. So, any connections that were open when I set long_query_time to 0 continued on as if it were still set to 1. Therefore I missed capturing those queries.
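A connection can check for itself whether its session value has drifted from the global one. Here is a minimal JDBC sketch, assuming Java 7 or later and a MySQL connection obtained elsewhere (for example, from a pool). The class and method names are hypothetical, and it only inspects the connection it runs on, not other sessions.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class LongQueryTimeCheck {
    // MySQL system variable syntax: @@global.x is the server-wide value,
    // @@session.x is the copy this connection received when it was opened.
    public static final String QUERY =
            "SELECT @@global.long_query_time, @@session.long_query_time";

    /** True when the session copy no longer matches the global value. */
    public static boolean isStale(double globalValue, double sessionValue) {
        return Double.compare(globalValue, sessionValue) != 0;
    }

    /** Runs the comparison on the given connection (assumed to be MySQL). */
    public static boolean sessionIsStale(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(QUERY)) {
            rs.next();
            return isStale(rs.getDouble(1), rs.getDouble(2));
        }
    }
}
```

A pool could call something like sessionIsStale() when handing out a connection and recycle the connection if it returns true, though whether that is worth the extra round trip depends on the app.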

Worse, though, is that some of our code uses connection pools that can keep alive connections for a few hours. Short queries on those connections continued to be written to the slow query log after I set long_query_time = 1. Fortunately, I was tailing the slow query log, so I noticed this before it got too big. At this point, you can temporarily disable slow query logging, let it go and hope those connections don’t last too much longer, or go all club and hammer on the connections and kill them and hope your code handles it gracefully. The last one generally isn’t so great of an idea for a production app.

As I mentioned above, my slow query log capture didn’t include sub-second queries from connections that were open when I set long_query_time to 0. That would obviously affect the results of my analysis by leaving out the queries from long-lived connections.

So, if you’re doing something like this, you should capture data for a long enough time to offset this factor or at least throw out the data from the earlier parts of the log. You should also check out the open connections with show processlist; before changing long_query_time.

1/30/2012: 12:07 am: Food and Drink, The Unusual and the Weird

Mixology cover

Whenever I’m looking for a drink recipe, I first seek out my trusty Mixology pamphlet, courtesy of the Southern Comfort Corporation, ca. 1974. What better source could there be for cocktail recipes than a pamphlet that mixes astrology with photos of swinging dudes in gaudy polyester leisure suits accompanied by smiling Stepford wives? Sure, this guy is in a pretty reasonable looking sweater, but steel yourself now for what is to come.

Back in the ’70s, subliminal messaging was a controversial topic. I remember running across a book from that same year called Subliminal Persuasion that had a lot of images from advertising and movies with supposedly embedded suggestive words and images, primarily of a sexual nature. Check out the Subliminal Manipulation blog if you don’t sex believe me.

Take a closer look at the hair of the guy on the Mixology cover. Now, think about other parts of a man’s body. Good luck getting this image out of your head (no pun intended) anytime soon. I’m really, really sorry.

Next up we’ve got a dude confidently sporting a pink suit. The previous sentence is the only known sentence on the internet including the words dude and pink suit, but not the word pimp. Ignore the faint yellow polka-dots. I double-checked the pamphlet and they must be a scanning artifact. I was so hoping they weren’t, though. I think my scanner understandably puked on the image.

Almost everyone knows his Zodiac sign today. But few have any real knowledge of astrology.
Intent of astrology data herein is simply to inform, not to advise. Therefore any personal application is the individual’s responsibility.

Check out Mr. quilted pants on the left. What grandmother wouldn’t want to see her granddaughter coming home with a nice boy wearing a handmade quilt? OK, besides any grandmother with something against hobos. Those pants are so appalling that I almost didn’t notice the crazy blue plaid suit in the back. He’s channeling Rodney Dangerfield from Caddyshack, but coming up well short. Powder blue cardigan boy looks positively normal in this photo.

If your party kilt is at the drycleaner, a full tartan suit is always a great substitute. It’s a little hard to see, but, yes, those are matching pants. Fortunately, I don’t think it’s the royal Stewart tartan. Too bad his promiscuous plaid partner up front isn’t in a matching tartan. I can’t identify the tartans for certain due to the cumulative retinal scarring, but I’m suspecting they’re variants of the Montgomery Ward tartan.

If you’re daring and desperate for the full pamphlet in a high enough resolution to actually read it, download the 2 MB zipfile.

10/22/2011: 12:23 pm: The Unusual and the Weird

The competition for toilet supremacy is heating up. The NY Times has a great review of Kohler’s Numi, which opens up like a Transformer to accept your tributes. Someone should hack the opening chime to play a recording of Optimus Prime saying “No sacrifice is too great in the service of freedom.” And I would love to see it do battle with Toto’s Megatron, I mean Neorest.

Kohler Numi transforming

When I first glanced at the image of the remote control, I thought the bottom left button said “Lasers”. Now, that would be freaking awesome. Whether as a laser light show to accompany the event or as a modern alternative to the incineration of your contributions, I’m all for it. And surely a couple frickin’ lasers would come in handy when kicking some Neorest butt.

Kohler Numi remote control

6/3/2011: 5:00 pm: Java, MySQL

When you use the MySQL JDBC driver to select rows from a table, the connection will block until the entire ResultSet has been pulled over to the client. In most cases this makes sense, especially if the server is on a different host. Retrieving the entire ResultSet will minimize the number of TCP packets that must be sent from the server.

However, if you are returning a very large ResultSet, the client will have to allocate a lot of memory on the heap. If you end up accessing each row to create an object from the data, then you will need enough heap space for the entire ResultSet plus all of the objects you instantiate.

The driver documentation explains how to force the driver to stream the ResultSet row-by-row.

The first catch is that you must be using a regular Statement object, not a PreparedStatement.

The documentation says you need to add the following non-intuitive code before executing the query:

stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
              java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);

though you can actually just use conn.createStatement() since TYPE_FORWARD_ONLY and CONCUR_READ_ONLY are the defaults.

There are a couple caveats in the documentation, though they are fairly obvious. You should process the ResultSet as quickly as possible, since locks will be held as long as the statement (and any transaction it is in) is open.
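Putting the pieces together, a streaming read might look roughly like the sketch below. It assumes Java 7 or later for try-with-resources, and the StreamingQuery class and RowHandler callback are names invented for this example, not part of the driver. The point is to do as little as possible per row and close the statement promptly.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingQuery {
    /** Hypothetical per-row callback used by this sketch. */
    public interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    /** Streams the rows of a query to the handler and returns the row count. */
    public static long streamRows(Connection conn, String sql, RowHandler handler)
            throws SQLException {
        long count = 0;
        try (Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // MySQL-specific signal to stream row-by-row instead of buffering.
            stmt.setFetchSize(Integer.MIN_VALUE);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    handler.handle(rs); // keep this fast; locks are held meanwhile
                    count++;
                }
            }
        }
        return count;
    }
}
```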

In addition to being non-intuitive, setting the fetch size to Integer.MIN_VALUE might cause unexpected results if you run your code against a database server other than MySQL.

If you’re willing to go all out in committing to MySQL, you can cast the return value of createStatement() to com.mysql.jdbc.Statement and then call enableStreamingResults(). That will, at least, make the behavior of your code more obvious.

At work I needed to cache a lot of data from a couple of tables. Using the default behavior caused the heap to grow to over 12.5 GB. That made for trouble when running on my 8 GB laptop. By switching to streaming the ResultSet, the heap maxed out at only 5 GB.

3/20/2011: 3:06 pm: Google App Engine, PhoneBlogger, Python, VoiceXML

In late 2002, I thought it would be cool to build an application that allowed you to blog by phone. Tools, libraries and hosted services were a bit more limited back then, but after a few months of learning, coding and debugging, I managed to release the first version of PhoneBlogger in January 2003. Along the way, I learned a lot about Python, VoiceXML, JavaScript, XML-RPC, audio encoding, shared web hosting and command line tools for Linux.

Fast forward nearly ten years and not only have the tools and libraries come a long way, but there are many more free or inexpensive hosted services that simplify building a tool/service like PhoneBlogger. Instead of hosting the application code on a shared hosting site, I can now build and deploy on Google App Engine. Though scalability is not an issue for my personal use of PhoneBlogger, if it were turned into a public service, App Engine would make scaling much simpler and more economical. App Engine also makes deployment a snap, though with a small amount of work, so would Fabric. For my PhoneBlogger rewrite, I decided to use App Engine.

In the original version of PhoneBlogger, I coded a bunch of static VoiceXML and JavaScript for managing the telephone interaction with a caller. At the time, three of the most prominent services for VoiceXML developers were Tellme (now owned by Microsoft), BeVocal (now owned by Nuance) and Voxeo (still independent). I had to write slightly different code for Tellme and BeVocal, but the differences weren’t that significant. I think it would have been pretty simple to port to Voxeo, as well. Improved support of VoiceXML 2 would now likely allow me to use the same code on each platform.

While VoiceXML is still a great option for building speech apps, a couple of new services bring you simple APIs for building speech or DTMF (touchtone) applications, at the cost of portability. This time around I’ve started with Twilio. I very quickly turned a Python/GAE example from the Twilio website into a DTMF app for tweeting by phone. Although speech recognition allows you to build much more complex and natural applications, many simple applications can be built quickly and easily with just support for pressing keys to provide input. PhoneBlogger falls into that category, for now.

One very convenient thing about Twilio is that I can use their platform to capture and host recordings in a format that is simple to play back in a web browser. If I were really concerned about longevity of the recordings I could easily retrieve them and store them elsewhere, but I’m okay with keeping them on Twilio servers for now. That’s an easy enhancement to add later. The biggest downside for tweeting the Twilio links is that the Twilio recording URLs are ginormous. Fortunately, the goo.gl URL shortener made quick work of that problem.

I’m also going to take a look at porting my code to Tropo, which is a service offered by Voxeo. Tropo is built on Voxeo’s Prophecy platform and offers speech recognition as an option.

I decided to begin the rewrite by first supporting tweeting by phone. Twitter offers a great API, which is made even simpler by libraries like Tweepy. I highly recommend first checking out the OAuth support in any library for Twitter you might consider using. OAuth can be a complex beast, but libraries like Tweepy make it almost trivial.

The original PhoneBlogger source code and a couple iterations of it are available on SourceForge. I wasn’t particularly interested in learning about CVS at the time, so I just uploaded tarballs of all the code. While SourceForge has improved a lot, I’ve become more of a fan of GitHub. Google Code, LaunchPad and BitBucket are also great options. I started using LaunchPad when working on a Java library for Gearman, but then set up a couple of repos on GitHub when I started working on Log4mongo-Java. I’m much happier with Git, Bazaar and Mercurial than Subversion and CVS (Caveman Versioning System). I’ve already started posting code for the new phoneblogger project on GitHub.

As of now, the new version of PhoneBlogger supports tweeting by phone. All the code is on GitHub, along with a README file with the basic steps to set it up for yourself. In an upcoming blog post I’ll walk through those steps in a little more detail.

2/10/2011: 10:33 am: Software
12/29/2010: 10:45 pm: MySQL

If a replication slave gets out of sync with the master, you can bring them back in sync by running statements that don’t execute on every server in the replication chain. There are sane and insane ways to do this.

The right way is to execute SET SESSION sql_log_bin=0; on your current connection before running the statements you don’t want replicated. Then, either execute SET SESSION sql_log_bin=1; or close the connection.
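If you are doing this from code rather than the CLI, the main risk is forgetting to turn binary logging back on for a connection that gets reused. Here is a rough JDBC sketch, assuming Java 7 or later and an account with the privilege needed to change sql_log_bin; the class and method names are invented for the example.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

public class UnloggedStatements {
    /** Runs the statements on this connection without writing them to the binary log. */
    public static void runWithoutBinlog(Connection conn, List<String> statements)
            throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute("SET SESSION sql_log_bin=0");
            try {
                for (String sql : statements) {
                    stmt.execute(sql);
                }
            } finally {
                // Always restore logging, even if one of the statements failed.
                stmt.execute("SET SESSION sql_log_bin=1");
            }
        }
    }
}
```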

The crazy way is to execute the statements while the default database is set to a database that is not replicated. Many DBAs configure MySQL servers so that the mysql database is not replicated, since it may contain user and host info that is specific to a server instance. When the default database is one that is not replicated, mysqld will not replicate statements issued on that connection, even if they affect tables in other, replicated databases. (This is how database-level filters such as binlog-ignore-db and replicate-ignore-db behave with statement-based replication.) Feature or bug, you be the judge.

In my example scenario below, db1 is the master and db2 is the slave.

First, create a table in the test database and verify it is replicated. Here’s an example create statement.

mysql> CREATE TABLE test.rs (a INT);

Then, use the MySQL CLI to connect to the master database server (in this case, db1), set the default database to mysql (-D mysql) and execute an INSERT statement. The short version of this is:

[me@server ~]$ mysql -u me -D mysql -h db1 -e "INSERT INTO test.rs VALUES (1);"

Then, verify that the row was added on the master, but not on the slave.

[me@server ~]$ mysql -u me -h db1 -e "SELECT * FROM test.rs;"
+------+
| a    |
+------+
|    1 |
+------+
[me@server ~]$ mysql -u me -h db2 -e "SELECT * FROM test.rs;"
[me@server ~]$

And if you haven’t already guessed, the insane way is frequently the accidental source of many out-of-sync situations. Use the sane way to fix the damage.

Of course, the easy way to sync up tables on out-of-sync servers in a replication chain is to use mk-table-sync.

12/7/2010: 9:57 pm: Conference, MongoDB

Mongo SV 2010 badge

10Gen put on another excellent MongoDB conference last Friday, this time at Microsoft Research in Mountain View. Like Mongo SF, there was a good balance between intro and advanced material, as well as between 10Gen presenters and third party presenters, like myself. Registration was smooth, sessions ran on time, they made it easy on presenters, presentations were videotaped, audio was recorded with a direct feed, food was adequate (though putting squeaky bags of chips in rooms was not a good idea), 10Gen people were really helpful and the after party at Tied House was excellent.

And best of all, I talked my way out of an undeserved traffic ticket while leaving Microsoft Research. The car in front of me pulled into the street and then stopped to wait for a car to go by. I pulled up to the stop sign and stopped, since he was already in the street. When he finally pulled away, so did I. Fortunately, the cop accepted my side of the story and let me go. He was actually pretty nice about it.

Below are the slides from my presentation on Logging Application Behavior to MongoDB. If you’re interested in logging to MongoDB from Java, Python, Ruby, PHP and/or C#, I hope you’ll find them useful.

I’m currently on the agenda to present on the same topic at the January 18 San Francisco MongoDB Meetup. By then, I plan to have more detailed info on analyzing data that has been logged to MongoDB.

11/23/2010: 10:00 pm: The Unusual and the Weird

While it was tempting to write this post in all caps, that would cause me too much pain.

Yesterday I received my first ever 419-style spam/scam by US Postal mail. It was actually posted from Dar es Salaam, Tanzania, with a cool rhino stamp that cost somebody 800 Tanzanian shillings, which is currently about 55 US cents.

Tanzanian rhino stamp

My correspondent is allegedly CHARLES TAYLOR (JNR), son of Liberian strong man Charles Taylor. Daddy is now locked up in The Hague awaiting trial for his role in the civil war in Sierra Leone. And, no, he did not have a role in the design of the classic Converse sneakers, though it hasn’t been proven yet that Naomi Campbell didn’t decorate her chucks with sanguine diamonds courtesy of the Chuckster. Seems like Chuckie, Jr., is reaching out to people in Spain, as well.

Tanzanian spam letter

It’s a pretty sweet offer, as I would be in line for a minimum of nearly $90 million USD.

