7/20/2010: 8:40 pm: RobertMac

This morning I had a strange problem where sound stopped working on my Mac (Powerbook running 10.6). The speaker icon in the menu bar was grayed out and the image brought up when pressing the volume up and down controls indicated that sound was muted. Bringing up the Sound control panel in System Preferences didn’t help.

Last night I had been listening to music on headphones using the line out jack, so I was puzzled as to what might have caused the problem. Restarting and zapping the PRAM didn’t help. Interestingly enough, though, the start up chime played at system startup. So, I knew the built-in speakers still worked. However, the login sound was muted.

When I brought up the Sound control panel, the only output device showing was digital out. The normal Internal Speakers entry was missing. So, I guessed that maybe OS X was confused about whether I was still trying to use the line out.

I started up a song in Songbird and still no sound or ability to control the volume. I then plugged in my headphones and, voila, sound worked. I unplugged the headphones, but the internal speakers didn’t take over. In the Sound control panel, Headphones appeared as the output device when I plugged in the headphones, but Digital Out reappeared when I unplugged them.

Finally, on a whim I plugged back in the headphones and clicked on the speaker icon in the menu bar that represents the volume control. While holding down the mouse button, I unplugged the headphones and suddenly the internal speakers started working on. In the Sound control panel, the Internal Speakers entry had returned. I don’t know exactly why, but that solved the problem.

6/1/2010: 11:58 pm: RobertDjango, DreamHost, Python, Tutorials

I recently successfully migrated my first Django app from DreamHost shared hosting to a DreamHost virtual private server. I hope this info will be useful to others. The following two blog posts were absolutely key to me getting this working, especially the first one. Thanks Preston and Graham!

Disable Passenger if Migrating from Shared Hosting

If you set up your shared site for Django recently, you likely enabled Phusion Passenger to serve your Django projects. While it works great on shared hosting, you’re better off using the more conventional mod_wsgi approach on a PS. If you don’t disable it, it will interfere with mod_wsgi AND use up a lot of the memory you are paying for on your PS. In the graph below, memory use had been steady at about 250 MB. After disabling Passenger, memory use leveled off at around 50 MB. This chart is from the DreamHost control panel.

Disable DreamHost Management of Web Server

To enable mod_wsgi and make a few directories accessible by Apache, you will need to modify httpd.conf. The DreamHost control panel does a great job of managing httpd.conf for you. But, unless you want to be constantly merging in your changes, you need to manage httpd.conf yourself.

Under Private Servers, select Configure Server. Then in the Web Server Configuration section, uncheck DreamHost Managed and click the Save psNNNNN settings… button.

To edit the config and restart Apache, you need to setup a user with sudo privileges. I created a special user account solely for this purpose. After creating the new user account through the control panel, give that user sudo privileges via the Manage Admin Servers entry in the Private Servers section of the DreamHost control panel.

Important Directories

In addition to knowing about /usr/local/dh/apache2, there are a couple of other important directories, some of which need to be added. Replace YOU with your username. This directory layout isn’t required, but you’ll have to adjust the rest of the instructions if you save things elsewhere.

  • /home/YOU/src – Convenient place for downloading libraries for building and installing
  • /home/YOU/wsgi-scripts – Will contain a wsgi-script for each Django project
  • /home/YOU/projects – Will contain all Django projects

Since Apache will be accessing these directories as a different user, make sure the directories are accessible (i.e., chmod 755). I got bit by this as part of my shared hosting migration. On shared hosting, scripts get run under your user, so my project didn’t need to be group or other executable.

Inside of a Django project, I use the following layout. In this example, I’m serving up static files with the same web server. In the instructions below, I’ll be setting up an alias to the media directory from Apache.

__init.py__
manage.py
settings.py
templates
  |---- app1
     |---- base.html
     |---- index.html
     |---- ...
  |---- app2
     ...
media
  |---- css
  |---- js
  ...
...

To keep people from accessing directory listings of your media directory, you can add a .htaccess file to that directory with the following contents.

Options -Indexes

Setup Python

I ran into issues with Python 2.6 (No module named _md5), so I recommend sticking with Python 2.5. In case the default Python version on your server is 2.4, make 2.5 the default:

$ sudo rm /usr/bin/python
$ sudo ln -s /usr/bin/python2.5 /usr/bin/python

Create a directory in which to download and install libraries.

$ mkdir ~/src
$ cd ~/src

Download and install setuptools for Python 2.5.

wget http://pypi.python.org/packages/2.5/s/setuptools/setuptools-0.6c11-py2.5.egg#md5=64c94f3bf7a72a13ec83e0b24f2749b2
sudo sh setuptools-0.6c11-py2.5.egg

Upgrade Django, if desired.

$ sudo easy_install pip
$ pip install -U django

Download and build MySQLdb library. I prefer oursql, but it is easier to get started with MySQLdb.

$ tar xzf MySQL-python-1.2.3c1.tar.gz
$ cd MySQL-python-1.2.3c1
$ sudo python setup.py install

Preston’s instructions also cover the Python Imaging Library and virtualenv. I’ve skipped them here to keep the instructions shorter. You can easily install them later.

Before preceding with mod_wsgi, make sure your Django app runs on the Django dev server. It’s a lot easier to sort out issues with settings.py at this stage.

Create your Django project with django-admin.py or copy an existing project into the projects directory mentioned above.
In settings.py, add a variable that determines the location of your project directory. You’ll also need to add import os at the top of the file.

PROJECT_ROOT = os.path.realpath(os.path.dirname(__file__))

And then set up your media and templates:

MEDIA_ROOT = os.path.join(PROJECT_ROOT, 'media')
TEMPLATE_DIRS = (os.path.join(PROJECT_ROOT, 'templates'),)

In the instructions below, I set up aliases for Apache to the project media and admin media directoris, so I set both media URLs to /media/.

MEDIA_URL = '/media/'
ADMIN_MEDIA_PREFIX = '/media/'

Running with the Devserver

From your project directory, create the database if your app needs one (this assumes you’ve already set up a MySQL database through the control panel or manually).

$ chmod u+x manage.py
$ ./manage.py syncdb

Start the dev server and make sure your app loads.

./manage.py runserver 0.0.0.0:8000

If not, check the output that is being logged to the console. You may also want to temporarily change settings.py so that DEBUG = True. Kill the dev server with ctrl-c once you’ve got things sorted out.

Install mod_wsgi

Download, build and install mod_wsgi.

$ cd ~/src
$ wget http://modwsgi.googlecode.com/files/mod_wsgi-3.0c5.tar.gz
$ tar xzf mod_wsgi-3.0c5.tar.gz
$ cd mod_wsgi-3.0c5
$ ./configure --with-apxs=/usr/local/dh/apache2/template/sbin/apxs --with-python=/usr/bin/python
$ make
$ sudo make install

Create a WSGI Script

In the /home/YOU/wsgi-scripts directory create an PROJECT.wsgi file, where you replace PROJECT with your project’s name. Here’s the contents of mine . You just need to replace YOU and PROJECT with appropriate values.

import os
import sys
 
# setup_tools will try to cache egg files in surprising locations where the
# apache user likely doesn't have write access unless you do this
os.environ['PYTHON_EGG_CACHE'] = '/tmp'
 
# Ensure that parent directory of project is on PYTHONPATH
sys.path.insert(0,'/home/YOU/projects')
 
os.environ['DJANGO_SETTINGS_MODULE'] = 'PROJECT.settings'
 
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
 
def test_wsgi(environ, start_response):
    status = '200 OK'
    output = 'Hello World! wsgi py \n' + sys.version + '\n' + '\n'.join(sys.path)
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)
    return [output]
 
# Uncomment next line to test that basic mod_wsgi setup works
# and to see the contents of PYTHONPATH
application = test_wsgi

Note, that the last line is uncommented. That will allow us to first make sure that mod_wsgi is working, in general. Later, we’ll comment it out.

Test mod_wsgi

Next you need to edit the Apache config file. Change psNNNNN to your actual private server ID in each of the following steps.

To keep some of the following commands short, cd into the apache2 directory:

$ cd /usr/local/dh/apache2

To edit the Apache config (of course, you can use another editor besides vi):

$ sudo vi apache2-psNNNNN/etc/httpd.conf

To restart Apache:

$ sudo /etc/init.d/httpd2 restart apache2-psNNNNN

To view the Apache error log:

$ sudo less apache2-psNNNNN/logs/error.log

Open httpd.conf in an editor, search for the modules section and add the following line:

LoadModule wsgi_module /dh/apache2/template/lib/modules/mod_wsgi.so

Then, search for the VirtualHost section for your domain. At the end before , add the following while replacing YOU and PROJECT as appropriate:

# wsgi
<Directory /home/YOU/wsgi-scripts>
Order allow,deny
Allow from all
</Directory>

# absolute path
WSGIScriptAlias / /home/YOU/wsgi-scripts/PROJECT.wsgi

<Directory "/home/YOU/projects/PROJECT/media">
Allow from all
Order allow,deny
</Directory>

<Directory "/usr/lib/python2.5/site-packages/django/contrib/admin/media">
Allow from all
Order allow,deny
</Directory>

Alias "/media/" "/home/YOU/projects/PROJECT/media/"

# note some use hyphen - others _
Alias "/media/" "/usr/lib/python2.5/site-packages/django/contrib/admin/media/"

Only the first two sections are needed for a basic mod_wsgi test. If you run into any problems, try it with just the part up to and including the WSGIScriptAlias.

Restart Apache and access what would be the normal URL for your Django app. You should see something like:


Hello World! wsgi py
2.5 (release25-maint, Jan 24 2010, 14:37:54)
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)]
/home/YOU/projects
/usr/lib/python2.5/site-packages/setuptools-0.6c11-py2.5.egg
/usr/lib/python2.5/site-packages/MySQL_python-1.2.3c1-py2.5-linux-x86_64.egg
/usr/lib/python2.5/site-packages/pip-0.7.1-py2.5.egg
/usr/lib/python25.zip
/usr/lib/python2.5
/usr/lib/python2.5/plat-linux2
/usr/lib/python2.5/lib-tk
/usr/lib/python2.5/lib-dynload
/usr/local/lib/python2.5/site-packages
/usr/lib/python2.5/site-packages
/usr/lib/python2.5/site-packages/PIL
/usr/lib/site-python

Test Django with mod_wsgi

Once that’s working, comment out the last line in your WSGI script and hopefully your app will load.

Troubleshooting

I found it helpful to open a couple of console windows where I could:

  • Log in as admin user and tail the Apache error log (sudo tail -f apache2-psNNNNN/logs/error.log)
  • Log in as admin user and restart Apache as necessary
  • Log in as regular user and tail the HTTP error log (tail -f /home/YOU/DOMAIN/http/error.log)
  • Log in as regular user and edit my wsgi script, settings.py, etc.

The output in the error logs was key to solving most of the problems I ran into. The two web pages I mentioned at the beginning of this post were also crucial. Graham’s post includes a lot of useful detail on WSGI. Reading it enabled me to better understand what the error messages I was seeing really meant.

5/25/2010: 6:52 pm: RobertDjango, DreamHost, Google App Engine

DreamHost ended up at the top of a Lifehacker poll today for best personal web host. Well, not counting “Other”. Given how many web hosts there are and the likelihood that most of the people voting have experience with only one web host, it’s not surprising that Other came out on top. Still, DreamHost got almost 26% of the vote, more than double their closest competitor.

I’ve hosted this site on DreamHost since 2002 and have been very happy with their performance, attitude and sense of humor. I’ve added two more domains since then.

A few months ago I developed and deployed a simple Django app for calculating magic numbers for soccer leagues on shared hosting. The app scrapes league standings off websites and displays a matrix of teams and the magic numbers. DreamHost did a fantastic job of making it really easy to deploy a Django app on shared hosting with Phusion Passenger. I only had to make a few config changes from running it locally on the Django development server.

Recently I moved onto a Private Server and ported my app to run on mod_wsgi. Although mod_wsgi is the preferred way (I think) of deploying Django apps, it required a lot more debugging and config work on my part. I’m working on a blog post to cover everything I had to do to get it working. Part of the problem came from me having previously had it running on shared hosting. Someone starting with a private server would have had fewer issues.

I’m also planning to port the app to run on Google App Engine. Since there is no database backend, it should be very straightforward.

This is all leading up to a much more complicated web app that I plan to run on my DreamHost PS, as well as GAE. The data it will be storing is a natural fit for a relational database, so it will be interesting getting the app working on the App Engine datastore. I’m also going to try MongoDB as the datastore.

5/18/2010: 9:16 pm: RobertEverything Else

During a fun debug session at work, Drew and I tracked down the cause of a misbehaving web app. The problem related to how Apache Tomcat decides to recompile JSPs and how some of our source code was branched and modified. Beware of modified JSPs in branches when you move to newer branches.

The problem appeared after a new version was deployed in production. The symptom was that a chunk of HTML on a page in a web app displayed on the QA servers, but not on the production servers. We confirmed it was the same warfile, the same 6.0.16 version of Tomcat and the same 1.5.14 version of the JDK. There were no errors in the app log or in catalina.out. Having seen errors in other apps recently in the localhost log file, I decided to look there. We found a message logged at SEVERE regarding a JspException from not finding a value on an object using operator “.”. So, this obviously indicated  a JSP couldn’t be compiled because it referred to a non-existing field on a Java class.

So, then we hunted down the Java class that the Jasper compiler generated from the JSP and compared it between production and QA. We found that the Java code on production had an extra method that related to this missing property. So, even though we deployed the same warfile, the Java code for the JSP on the file system was different. By default, the code generated from JSPs ends up in tomcat/work/Catalina/localhost/{context}/org/apache/jsp/WEB_002dINF/jsp. Reverse engineering it to correlate the Java with the JSP code wasn’t as bad as I expected.

Next, we looked at the dependencies of the previous version and found that the class in question contained the field mentioned in the error message. Drew had switched to a newer branch for the new build, so the new file wasn’t strictly newer. And there’s the rub.

The timestamp of the class file in the older branch was newer than the timestamp of the class file in the newer branch, because it had been modified after it was branched.

When the app was deployed on the QA servers, the previous version had been undeployed. this caused the context directory for this app in the work directory to be deleted. When the app was deployed, Jasper compiled Java from all the JSPs as they were accessed.

However, on production, the new version was deployed as a replacement for the previous version. So, the context directory wasn’t deleted. When the JSP was accessed, Jasper saw that the timestamp for the previous version was newer, so it didn’t generate new Java code for the JSP file. However, the Java class the JSP page depended on had changed and the code no longer worked.

The quick fix was to undeploy the web app to let Tomcat clean up the work directory, and then to deploy it again.

However, it turned out that the change in the older branch was important. So, be sure to also diff the JSPs before moving to a new branch.

5/2/2010: 10:15 pm: RobertMySQL

Monitoring MySQL with Cacti

  • Cacti, like many other tools (Munin, Cricket, etc.), is a wrapper around RRDTool
  • Baron feels that existing tools (zabbix, ZenOSS, OpenNMS, etc.) that try to do graphing and alerting do at least one of them poorly, though Reconnoiter looks promising.
  • Cacti does graphing well
  • RRDTool does interpolations, so you lose the original data. By default, it doesn’t keep data very long.
  • With MRTG you have to create new graphs. Cacti has simpler approach using templates.
  • Data source, graph and host templates.
  • Problem – templates not always well written. Not always version compatible. Not always well parameterized.
  • Better Cacti Templates project has more than just MySQL templates.
  • Cacti prefers to use SNMP, but often easier to directly connect to mysqld or to use ssh and command line client
  • The wiki for Better Cacti Templates has very detailed installation instructions.

High Throughput MySQL at Facebook

  • Currently upgrading to 5.1.45 and innodb plugin 1.6
  • Perf testing starts with Sysbench. Have a tool called Shadow that allows them to direct all traffic to another server that is heavily instrumented.
  • Across shards – 7ms read time on avg, 13 million reads/sec peak, 370 million rows read/sec peak 210 m avg, 3.5 m rows modified peak 1.9 m avg, InnoDB disk IO/sec 4.4 m peak 3.3m avg
  • Row-based semi-sync replication
  • InnoDB plugin for perf and compression
  • mk-query-digest, mk-slave-prefetch, mk-upgrade
  • They don’t use stored procedures, but they might be helpful to solve some of their network latency issues
  • Get about 1500 IOPS from locally attached storage. 8 disks per server.
  • For 5.1 upgrade, planning to dump and reload and switch to InnoDB file per table (and moving to new hosts)
  • Paging via LIMIT is O(N*N)
  • Do pessimistic concurrency control with select for update and retries
  • Internal requirement for performance critical queries to be index only
  • They improved high concurrency throughput by disabling deadlock detection and depending on lock wait timeout. Reduced wait timeout to 15 sec, but dynamic per session. Dynamically reduce as low as 1 when server extremely busy.
  • They make temporary changes to some MySQL settings to survive extreme peak
  • They use gdb (see posts by Domas) to change variables that normally aren’t dynamic
  • LRU patch protects buffer pool from full table scans (e.g., from mysqldump).
  • Need to also monitor on client side, since server doesn’t see network issues, client timeouts, etc.
4/25/2010: 9:05 pm: RobertMySQL

Replication Tips – slides

  • Single master can typically handle a few dozen slaves
  • If you have a lot of slaves and want to do filtering for all, maybe use blackhole engine to do filtering in one place. Especially if running blackhole slave on same server to limit network bandwidth.
  • Replicate over WAN – slave_net_timeout, master_connect_retry, slave_compressed_protocol

Gearman – slides

  • UDFs allow you to run functions in a wide range of locations on servers other than the database server
  • Synch and asynch jobs
  • gman_sum() provides simple example of map-reduce with gearman
  • - Asynch queues, a.k.a., background tasks
  • e.g., email notifications (avoid blocking on sendmail), full text indexing, stat counters, etc.
  • Can enable you to do batch operations, when you don’t want to block on a large batch
  • Maybe use trigger to call a UDF that does a background task. Allows you to implement much more complex triggers with external side effects (e.g., sending notifications).
  • Putting Gearman workers between app and MySQL can allow you to parallelize queries (scatter-gather).
  • Persistent queues – if job server crashes, it can replay the queue on startup or you start another job server and point it at the queue
  • Queues can be stored in Drizzle, MySQL, PostgresQL, SQLite and even in Memcached.

Mastering the Art of Indexing – slides

  • Lock contention and indexing
    • update/delete by full scan is sequential rw. By index scan is random. So, full scan can be better in some cases.
    • In 5.1 can disable next key locking. use binlog-format=row and transaction-isolation=read-committed. But read committed can perform worse with high concurrency.
  • Deadlock caused by indexing
    • Sometimes single query update or delete has to update secondary index as well as PK table in InnoDB.
    • If 1 statement accesses and modifies sec index before accessing table records while another statement modifies same table record and needs to update same row in secondary index, you can get deadlock.
  • Covering index with range, limit, count
    • Range queries over large ranges often slow, especially if looking in secondary index before using range on table records. Can result in many random reads. Covering index can improve perf by eliminating random reads.
    • Covering index can also help a lot with LIMIT (especially when skipping) if the query would otherwise have to randomly access table records to apply other where clause conditions.
    • Similarly, covering index can help with COUNT, since you don’t have to randomly access table records.
  • Covering index with blob
    • Assuming you have a table with a blob that is rarely selected (e.g., < 10% of time). To avoid blob being loaded into buffer pool, typical optimization is to move blob into another table with 1:1 relationship, though this breaks normalization.
    • For his tests, performance with 2 tables is better until percentage of queries requesting blob (thus requiring two SELECTs) get up to around 45%.
    • Instead, can use covering index to include all desired columns (except the blob, of course) for most queries. To get good perf, initial column should be chronologically order, e.g., auto inc or a timestamp.
    • May have to use FORCE INDEX to get optimizer to use your index.
  • Sorting, indexing and query execution plans
    • If query does ORDER BY on a secondary index that is used for the query, can skip filesort, which is O(nlogn).
    • If not, optimizer chooses btw full scan and using index for sort vs. index scan and a filesort.
    • Index merge can be used to filter records, but not to sort.

Understanding the Role of IO as Bottleneck – slides

  • 7200 RPM drive can generally do about 100 IOPS.
  • HDD disk seek about 10 ms. Reading 1 MB sequentially from HDD is about 20 ms.
  • When you do a read on InnoDB, rows are loaded into buffer pool.
  • Writes to InnoDB go to buffer pool, but also are flushed to log file. Sequential IO to log file is fast. Buffer pool pages (16 KB/page) are marked as dirty until flushed to tablespace.
  • Writes to tablespace are sorted by innodb to be sequential as much as possible.
  • iostat is more useful than vmstat for IO analysis
  • If svctm large (maybe more than a few milliseconds), may have large competing IO requests, e.g., a busy shared SAN.
  • Queue size tells you how much data could be lost if server crashes. Wait time indicates how long ops may be blocking and also how much risk of lost data from power loss with no battery back up (or non-catastropic disk crash).
  • Indexes and archiving old data can reduce your working set. If your server is read-heavy, lots of RAM can really help.
  • RAID 5 good for reads, but worse for writes. RAID 10 good for both.
  • RAID controller with BBWC can reorder some random writes to sequential writes without risk of a power loss losing data.
  • Fusion-IO card can theoretically do up to 100,000 IOPS. Another presenter suggested that InnoDB is currently limited to around 15,000 on these cards.
  • Often cheaper to buy more memory, though that doesn’t help as much with writes.
  • Write options – mount files sytem noatime, compress blobs, reduce number of filesorts on disk.
  • If RAID with no BBWC, use innodb_flush_log_at_trx_commit of 2 (but be aware of possible data loss) or group statements into short transactions.
  • Referred to Facebook talk – they said maximum desirable transaction length for their workload and servers is about 4 seconds.

Mysteries of the Binary Log – slides

  • Safer to purge binary logs by using PURGE BINARY LOG {TO, BEFORE}, since it updates index file.
  • If you use expire_log_days, removal happens at server restart or log flush (which happens during a file rotate).
  • RESET MASTER will delete everything and create new binlog file. Disconnect any slaves before doing this.
  • Common header about 19 bytes. 4 byte timestamp, type, server id, length, file position, flags. File position is the end of the event.
  • SYSDATE() is dangerous, because it actually runs on the slave, so the generated timestamp will be different than on the master.
  • Context events (user defined vars, rand seed, auto_increment, last_insert_id()) are grouped with the query into a binlog event.
  • Since replication user can read binary logs, it can read all changes. Can be security issue.
  • For setting passwords, better to set passwords by setting them into a user variable as a hash and then setting hash directly as the password. Only the already hashed value in the SET statement is replicated.
  • Statements in transaction go into a cache. It is flushed to binary log on commit. It is emptied if a transaction with only transactional statements is rolled back. Complicated if you mix in MyISAM changes.
  • Transaction cache spills to a file if it gets too big. This is one of the reasons large transactions can cause bad performance.
  • DDL statements like CREATE and ALTER cause implicit commits. Changes to mysql databse and LOAD DATA INFILE also are implicitly committed.
4/19/2010: 9:01 am: RobertMySQL

What’s New in MySQL 5.5? Performance Unleashed! - slides

  • Default storage engine is InnoDB (what had been called the InnoDB plugin, but now is the only version).
  • Many locks and mutexes held back scalability in 5.1.
  • Biggest win was fixing uses of LOCK_open. It was used for many things, some of which made no sense.
  • Introduced MetaData Locking (MDL). Added new mutex to protect MDL hash, but hash is calculated outside mutex.
  • Moved some things to 3 new mutexes, made some things atomic variables and removed useless uses of LOCK_open.
  • LOCK_thread_count used for too many things. Still used to protect list of connections, but changed other uses to atomic variables.
  • LOCK_open and the new MDL hash mutex are the biggest bottlenecks, but they are a problem at 32 cores now, instead of 8 cores.
  • InnoDB had many mutex issues for large number of cores. Buffer pool and log mutexes are worst. Adaptive hash index mutex is global and can be a problem, but can be disabled.
  • InnoDB Kernel mutex started to become a problem again, since other fixes allow you to reach higher number of concurrent transactions.
  • Log mutex held about 75% of time (for Sysbench RW) and buffer pool mutex about 50%.
  • Split the Log mutex into separate mutexes for independent functionality. Need to grab 2nd while holding 1st, but then can release 1st.
  • 1st was hottest, so separating acquisition of it from acquisition of buffer pool index was a big win.
  • Buffer pool, either split mutex functionality into multiple mutexes or split buffer pool into multiple pools.
  • InnoDB plugin with compression made having multiple mutexes much more complex, so decided to split buffer pool. Need 3 locks to get buffer pool page. Adding page hash mutex would have made it 4.
  • New config setting. Defaults to 1. Also split the flush list from buffer pool mutex.
  • Initial testing suggests 8-16 buffer pools gives best performance, but need lots more testing before making recommendations.
  • Rollback segment mutex problem with many connections. Can even perform IO while holding it. Split into 128 mutexes.
  • Purge was done as part of master thread. HIgh transaction rate mean lots of purging. Master thread can end up spending most of its time purging. Moved purging to separate thread.
  • Huge improvement in recovery (e. g., 7 hours to 14 minutes). Reduced algorithmic complexity of 2 impt activities.
  • Also huge improvement to delete performance.
  • Still looking into query cache issues.

MySQL Cluster: an Introduction – slides

  • NDBCluster Engine does not store to local disk. It pushes data to data nodes, which don’t need to be running MySQL.
  • Data nodes are ndbd processes and not necessarily bound to a single server. For testing, you can run multiple on the same server.
  • Data usually stored in 2 data nodes. Storing in 3 is experimental feature, requires more servers and performs worse. Storing in 1 obviously reduces availability, but gives great performance.
  • Storage is transactional.
  • You can have many mysqld instances all accessing the same data nodes. You can also use NDB API to access data nodes.
  • A hash is used to partition data across data nodes.
  • Always at least 1 replica of stored data. Node groups allows you to group replicas. Group selection is done according to order the server is listed in config file.
  • If a node goes down, its replica takes over as primary for the data. When node comes back, if not too far behind, it will sync the changes. If too far behind, it will sync a dump of all data.
  • Or you can bring up a replacement data node server with same IP address. Config is by IP.
  • If 2 data nodes in a group goes down, cluster must halt. Consistency over availability, CAP-wise.
  • SQL Node is a node running MySQL server.
  • Up to 48 data nodes. Very rarely more than 10 in practice, due to network and maintenance issues. Usually just a few.
  • Management node started first and it reads in the config. All data nodes need it in order to start. Also used for monitoring and arbitration.
  • If 2 data nodes can’t communicate, but 1 can still reach management node, that data node will take over. The one that can’t reach the mgmt node will shut down.
  • Mgmt node should be on different physical machine than the data nodes. You generally want two management nodes.
  • Only one will be the arbitrator. If primary fails, the data nodes will elect the 2nd mgmt node as the arbitrator.
  • There are also API nodes. A SQL node is a special case of an API node. API nodes use C++ library. Can also use Cluster/J in 7.1 for Java apps.
  • Only have to create tables on one MySQL server. The table creation will be auto detected by other servers.
  • User and grant info is independent, so that must be done on each server.
  • But, you can set up replication between MySQL servers. This allows you to more easily mix in non-cluster engine tables.
  • Need to handle retries in app. If data node dies during transaction, app must retry so transaction goes to other data node.
  • Table limits: 8052 bytes per row (can use blobs, but perf not good), 128 columns per table.
  • Indexed data limited to available RAM. Non-indexed data can be stored on disk.
  • Features: Online backup, global replication (can use MySQL to replicate clusters), NDB & MGM API, online additions of new data nodes.

EXPLAIN Demystified

  • Estimated query plan. Only for SELECT. Does not generate byte code. Unfortunately, you can’t ask MySQL for the plan it actually used for an executed query.
  • MySQL does everything as nested loop JOINs. Even a single table query is a single iteration loop. Always a left-deep tree, though it may reorder joins within the tree.
  • EXPLAIN output starts with deepest JOIN.
  • If no subquery or UNION, id for each row will be 1.
  • For select_type, ‘simple’ means only one SELECT keyword in entire query.
  • Derived means executed as a temp table, e.g., result of a subquery. If the table column is , the n is the subquery’s id column.
  • Union – rows spooled into temp table then read out with null in a row that says UNION RESULT
  • Be aware that EXPLAIN actually executes subqueries and unions, which can make it run very slowly. Also, can cause side effects if you update variables.
  • For derived, the higher id row is executed first. For UNION, the output tells you the execution order, e.g., .
  • IN and NOT IN queries are poorly optimized. They are converted into a dependent subquery.
  • UNCACHEABLE refers to the item cache. Not well documented. A few non-deterministic things like RAND() or a user variable can cause this.
  • Type is the JOIN type. Really more of an access type. Worst to best is ALL, index, range, ref, eq_ref, const, SYSTEM, null.
  • eq_ref means if there is a match, there will only be one row, perhaps due to a unique constraint on an index
  • Some non-range clauses get labeled as range because of details of the optimizer implementation.
  • ref tells you which columns or constants from preceding tables are used for lookups in the index named in the key column
  • rows is number of estimated rows that must be read. 5.1 tries to use LIMIT clause, but it is still sometimes inaccurate.
  • filtered: # of rows satisfying a condition. Added in 5.1. Not very useful.
  • extras: useful extra info (using index -> covering index; using where -> server post filters rows after using index; using temporary -> implicit temp table but not specified if memory or disk, using filesort -> means some kind of sort, but could be in memory)
  • filesort is an algorithm name. It doesn’t mean a disk-based file was used.
  • Use “pager mk-visual-explain” to pipe output through mk-visual-explain, assuming you installed Maatkit.

Connecting MySQL and Python – slides

  • Standard driver is MySQLdb. Another option is oursql.
  • If using MySQLdb library – db.connect(use_unicode=True, charset=’UTF8′)
  • The oursql driver has same API. Uses mysqlclient. Also implemented in C.
  • Both depend on MySQL client already being installed.
  • MySQL Connector/Python doesn’t depend on MySQL client libraries. Written by Geert.
  • Connector/J can be used with Jython. from java.sql import DriverManager
  • Current version is 0.1.3, but 0.1.4 coming soon.
  • Performance for INSERTs is about twice as slow as for MySQLdb or oursql.
  • Need more benchmark tests, since not clear whether that was including the network roundtrip time, since that should be roughly the same.
  • After the talk I talked to Chromakode about drivers and he recommended oursql. Much better code and documentation. I’ve switched to oursql and I like it.

Better Database debugging for shorter downtimes – slides

  • Tracking changes over time
    • Slowly Changing Dimension type-2. Start with a baseline and only track changes.
    • Need PK of original object, maybe from information_schema. Select attributes to track.
    • Add new row for each new property and when properties change. Use current_flag column to indicate which version is current. Also use effective_date and end_date, which can also tell you the current version.
  • Job logging via stored procedures
    • When they run a job (e.g., crontab entries), they use stored procedures to track when and how long it ran.
    • Really need to review slides and the stored procedures to get much value from this.

Linux Performance Tuning and Stabilization Tips – slides

  • To minimize mem usage, use compact datatypes. Remove unused indexes. Archive data when possible to minimize your working set.
  • Use direct IO -> innodb_flush_method=O_DIRECT.
  • Don’t alloc too much memory to mysqld. Makes filesystem cache too small and causes swapping.
  • If you disable swap & run out of memory, OOM killer will likely kill mysqld. It takes a long time to kill it and it aborts the process, which triggers crash recovery.
  • MyISAM, backup tools & admin tools use the filesystem cache.
  • Copying large files can cause swap. This can happen during backup.
  • set vm.swappiness=0 in /etc/sysctl.conf. Default is 60. This makes OS prefer making filesystem cache small over swapping.
  • tcmalloc faster than malloc. Can config InnoDB to use tcmalloc. Mostly matters for CPU bound loads with large buffer pool.
  • Don’t use too much session memory This can happen if you make per session memory variables to big. read_buffer_size can cause large memory blocks to be allocated.
  • Use BBWC on raid cards. Then, disable write cache on disks.
  • Don’t set write barrier if using BBWC (in ext3, barrier = 0)
  • Appending+fsync much slower than updating+fsync
  • If you use sync-binlog=1, fsync is called on append. Very slow.
  • Set innodb_autoextend_increment to higher than default of 8, maybe use 20.
  • pdflush takes care of async writes for MyISAM, file copy, mysqldump, etc.
  • Facebook moved from ext3 to xfs. ext3 very slow to delete large files. Makes dropping large table slow.
  • Writing to file is serialized.
  • Use dir_index to speed up searching and disable write barrier.
  • With xfs, disable write barrier with “nobarrier”. Concurrent writes possible with O_DIRECT.
  • Fusion SSD cards are 2-3 times faster than Intel SSD cards and 10-20 times faster than HDD. Fusion cards really benefit from xfs.
  • I/O Scheduler default is cfq, but noop or deadline are often better. InnoDB is already doing I/O scheduling itself, so cfq results in useless overhead.
  • Big difference in write performance between cfq and noop when lots of random reads mixed with writes. Recommend noop.
  • I/O scheduler queue size. Default is 128. Setting large like 100000 can greatly improve perf for MyISAM, since it doesn’t sort writes. Doesn’t help InnoDB, since it already sorts them.
  • iostat -x
  • Check r/s, w/s, svctm, %util, though svctm more reliable than %util
  • mpstat gives per core stats, vmstat averages across cores
  • mpstat -P ALL 1
  • SystemTap can be used with MySQL binary builds with DWARF symbols
  • iotop provides per process io stats
4/18/2010: 10:15 pm: RobertMySQL

This year’s MySQL Conference was once again jam packed with very technical presentations. Best of all was the news about all the performance and feature enhancements in MySQL 5.5. The benchmark results for the performance improvements were very impressive. After attending this year’s conference, I’m more confident about MySQL’s future than I’ve been for a very long time. I’ll post my notes from the three days I attended.

It was also a great time to meet up with the friends I’ve made in the MySQL community over the last few years. Some other open source communities have a really poisonous atmosphere of spending a lot of time attacking competing open source products. Fortunately, I see very little of that time-wasting crap within the MySQL community. Well, except for the occasional fork sniping, but that’s easy for me to ignore.

Update: Finally managed to post my notes from all 3 days.

3/14/2010: 12:10 am: RobertEverything Else

This week I attended a showing of beer-related films at the Hearst Museum of Anthropology at Cal. Research anthropologist Ira Jaknis, curator of the ongoing exhibit 99 Bottles of Beer: Global Brewing Traditions 2500 B.C. — Present was our host for the films. Sadly, only three other people were attendance. But, because the Prelinger Archives are 99.9% awesomesauce, you can watch them, too.

The original event title was “Two Short Films About Traditional Beers”, but Ira wasn’t able to obtain the copy of “Caxiri or Manioc Beer” that is at the Moffitt Library on campus. That short was filmed of the Waiãpi Indians of Brazil by Victor Fuks in 1988. I was bit bummed, because I had tried to brew manioc beer last week. I got no fermentation action, so I was hoping to pick up some video tips on proper chewing and spitting of the manioc. It appears one key step is to mix in a bit of still fermenting brew from the previous batch, which is a bit problematic for me.

However, we did get to see the fascinating “Brewing Millet Beer in Africa” by Christopher D. Roy (2006). You can watch a trailer for the film via a link on that page. However, the trailer gives you only a hint of how interesting and well-composed the film is. The opening shot of the women grinding the millet is pretty great, though. The film was shot in Burkina Faso.

As in a few other cultures, the women brew and sell the beer. Apparently, some of the more successful African women are able to develop a relatively large scale business and purchase trucks to ship the beer outside of their village.

In the film we get to watch and listen to Zenabu Bamogo as she sprouts the millet, dries it, grinds it, boils it, filters out the wort (albeit with only a woven basket), ferments it overnight, cleans clay pots, pours out the beer and sells it. It’s fun to listen to her as she probably sees the action of the yeast at night for the very first time, since she wouldn’t normally have access to a bright flashlight. I’m not sure exactly which language she is speaking, but you can hear a man translating in French in the background. The film is subtitled in English.

In place of “Caxiri or Manioc Beer”, Ira showed us 4 of the 6 shorts on a DVD he personally owns. I believed the DVD was named something like Historic Beer Film in America. Anyway, that describes the content pretty well.

First, we watched “For Good Living“, a hysterically poorly produced silent film from 1937 made as a promo by Safeway for their Brown Derby beer, which they declared to be “Imported’s Equal”. The West Coast Grocery company actually contracted with Humboldt Malt and Brewing Company in Eureka, California, and later many other breweries, to produce this pilsner-style beer with “the genuine Pilsner tang!” West Coast Grocery distributed the beer to Safeway and at least one other grocer, though it later became solely Safeway’s house brand. According to rustycans.com (an awesomely informative website), the Brown Derby restaurant in LA sued them for copyright infringement and forced them to redesign the label.

The captions are scrolled into view at times by someone lowering a black piece of paper at glacial places, and other times by turning a large roller on which they are printed. The scene about 3.5 minutes in of entering the Office of the Brewmaster feels like a scene from Psycho. Any second, I was expecting to see him tossed head first into a lauter tun.

We also get to see malt that is “finely ground and cleaned” before apparently being quality checked by being sifted through by hand by a guy in suspenders with greasy fingers. There is more quality checking on the assembly line, which appears to consist of removing every 10th bottle.

Thanks to the Prelinger Archives, you can watch all 19 wacky minutes of “For Good Living“.

After the film ended, I commented to Ira that film editing was apparently not invented until after 1937.

This was in sharp contrast with “Help Yourself” by Pabst, which reminded me of the old newsreels that were popular in theaters back in the 1940′s and 50′s (not that I was around to know). More than a little heavy-handed, this film attempts to browbeat and guilt trip distributors into putting all their efforts into getting retailers to sell more Pabst beer.

While the first film muddles up the picture of the three-tier system (the US system where breweries sell to distributors who sell to retailers who sell to customers), the Pabst promo film makes the picture clearer. The third film mentioned below lobbies hard for the three-tier system.

Next up was “As We Like It“, a not even remotely Shakespearian tale produced on behalf of the United States Brewers Foundation. Starting off with the ancient history of beer and quickly moving through the pilgrims up to the modern science of food, we get a full on lobbying press for the status quo. And did we mention brewers pay lots of taxes?

As shown in the mirror of our friendly bartender, the five principles that will keep the prohibitionist bogey man away are:

  • Quality of product
  • Cleanliness
  • Friendliness
  • Observance of the law
  • Good citizenship

You can watch this video at the Internet Archive, or on the Beer In Food blog/YouTube.

There was also a 20-second Labatts commercial that was only slightly creepy.

1/10/2010: 11:43 pm: RobertDjango, Python

Django includes a Bash shell completion script, which can save a lot of time. The tutorial recommends installing it, but doesn’t explain how. Fortunately, its the same as for any other Bash shell completion script. Here’s how I did it for Django.

  • Add an environment variable for the location where you installed Django. For example, add the following line to ~/.bash_profile:
    export DJANGO_HOME=/Users/rstewart/code/Django-1.1.1
  • Add a line after this one in ~/.bash_profile to read the script, e.g.,
    source $DJANGO_HOME/extras/django_bash_completion
  • After saving .bash_profile, open a new terminal or run source ~/.bash_profile

Now, you can press tab to complete commands. Below, I pressed tab after entering “manage.py”.

$ python manage.py
compilemessages   diffsettings      loaddata          runserver         sqlclear          sqlreset          syncdb
createcachetable  dumpdata          makemessages      shell             sqlcustom         sqlsequencereset  test
createsuperuser   flush             reset             sql               sqlflush          startapp          validate
dbshell           inspectdb         runfcgi           sqlall            sqlindexes        startproject