The New Technology of Daily Kos

Or, a look at what I've been working on for the site.

It's been three months or so in the making, but today I've finally finished moving the site to its new hardware. Many of the major and minor things that could go wrong did go wrong, but at last everything's squared away and we're up and running on the new machines. Now that the move is done, I can even start rolling out some new features I've been holding back.

WARNING: There will be honest, frank discussion of computers and technology issues in this diary. Please, for the love of God, do not post a comment saying that you don't understand what I'm talking about. It will only antagonize the violent beast within.

Let's get started.

In the lead-up to the various primary elections and caucuses, I made a lot of optimizations to the site's internals. It's not stuff you'd generally notice from casual browsing, but there was a wide expansion of memcached usage and changes to how certain tables are indexed. You can see it in the fancy (but backwards-compatible) URLs with the story id at the end. My last-minute hack adding memcached support to mod_magnet for lighttpd was an immense help as well. Future expansion became a concern, however, so I looked into having our webservers netboot and mount their root partitions over NFS from a central location. At the same time, I was getting a little worried about disk space on the database servers as the db kept growing, so I asked about upgrading the db servers down the road.

We hadn't upgraded our machines in a couple of years, so we had 10 dual Xeons with 2GB RAM each running as webservers, and 2 dual Opterons running MySQL, with 8GB RAM and a dedicated 73GB SCSI disk carrying a nicely tuned XFS setup for the InnoDB files. There's also the outgoing SMTP & search server, the image/NFS server, the memcached server, and the MySQL slave Xen VM we take backups off of, but those aren't particularly relevant to the discussion at hand, since they're sticking around.

For the new webservers, we got six quad core Xeons, each with 8GB RAM and an 80GB SATA disk for logs and such. After much trial, tribulation, and confusion with both nfsroot and iSCSI, they were finally set up with an nfsroot served up from a Sun x4500. This way, they all share one root filesystem for ease of maintenance. Plus, if required we can throw extra machines into the pool and they'll come right up, even configuring swap space if it isn't there.

Unfortunately, it took a while to get to this place. Trying to run off an nfsroot exposed an obscure bug where Linux, when mounting an nfsroot partition from a Solaris server, would send a malformed RPC request that the Solaris server would never answer (Linux NFS servers, however, would happily respond). I spent a week or so digging through multiple versions of the kernel and adding debug code to sunrpc, trying to see where the problem was. Nothing I tried would even get the messages to print, though. Eventually it occurred to me that maybe it wasn't the kernel, because the normal mount program could mount NFS volumes from a Solaris server just fine. The problem, in the end, turned out to be the "nfsmount" program supplied by busybox and used by the initrd in Debian etch. Using the most recent busybox, instead of the one from May 2006, cleared the nfsroot problem up. This was, of course, not the last problem with these machines, but progress picked up considerably after that.
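For anyone who hasn't set one up, an nfsroot boot is driven by a few kernel command-line parameters. The sketch below only builds such a command line; the server address and export path are made-up placeholders, not our actual setup, and how your initrd handles these parameters will depend on the distribution.

```shell
#!/bin/sh
# Build a kernel command line for an NFS-root boot.
# $1 = NFS server address, $2 = exported root directory.
# Both values used below are hypothetical placeholders.
nfsroot_cmdline() {
    server="$1"
    export_dir="$2"
    # root=/dev/nfs selects an NFS root, nfsroot= names the export,
    # and ip=dhcp asks for the network to be configured before mounting.
    printf 'root=/dev/nfs nfsroot=%s:%s ip=dhcp\n' "$server" "$export_dir"
}

nfsroot_cmdline 192.0.2.10 /export/webroot
```

The resulting string would go on the kernel line of whatever bootloader or PXE config the machines netboot from.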

The new database machines are eight core (or, as I like to say, OCTOCORE) Xeons with 16GB RAM, one 73GB disk for the OS, one 73GB disk dedicated to /tmp, and a 6x73GB RAID-10 for the database files (with /tmp and the db RAID each having a finely tuned XFS filesystem set up on them). Setting those machines up was easier than the webservers, except for the time involved in loading all the data onto them and getting kicked in the head by this MySQL bug, which forced me to upgrade all the MySQL servers to 5.0.51. For the database servers, I'm running a 64-bit Debian etch and the icc-compiled MySQL 5.0.51 server. The difference between the icc and gcc versions of MySQL doesn't seem to be too extreme, but I'm keeping icc for the moment anyway.

The MySQL settings are still being tweaked, but for Google's benefit this is what I've done so far:

  1. /tmp was formatted with mkfs.xfs -l size=64m /dev/sdb1, and mounted with rw,auto,nouser,async,dev,exec,noatime,nodiratime,logbufs=8 as the mount options.
  2. The RAID-10 was formatted with mkfs.xfs -l size=128m -d agcount=90 /dev/sdc1 and the same mount options as above.
  3. I've changed /proc/sys/vm/swappiness (also settable as vm.swappiness in sysctl.conf) to 15 from its default of 60. This value is subject to adjustment as needed.
  4. I've set all the disks to use the deadline I/O scheduler. A convenient way to turn that on for a running system is echo "deadline" > /sys/block/(dev)/queue/scheduler. Add elevator=deadline to your kernel's boot parameters to do it on startup.
  5. renice your mysql server process to -20.
  6. Finally, remember kids: if at all possible, adjust your MySQL settings to avoid swap. Hitting the swap unnecessarily will send your load through the roof and generally make things slower.
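If you want to check that the scheduler and swappiness changes actually took, something like the following would do it. It's a rough sketch, not what runs on our machines: the function names and the root-prefix argument are made up for illustration, and the prefix is only there so you can try it against a scratch directory first.

```shell
#!/bin/sh
# Print the active I/O scheduler named in a sysfs scheduler file.
# The kernel lists the available schedulers and brackets the active one,
# e.g. "noop anticipatory [deadline] cfq".
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Report the active scheduler for each block device, plus vm.swappiness.
# $1 is a hypothetical root prefix: pass "" for the live system, or a
# scratch directory that mimics the /sys and /proc layout.
report_tuning() {
    root="$1"
    for f in "$root"/sys/block/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#"$root"/sys/block/}   # strip the leading path
        dev=${dev%%/*}                # keep only the device name
        echo "$dev scheduler: $(active_scheduler "$f")"
    done
    echo "swappiness: $(cat "$root/proc/sys/vm/swappiness")"
}
```

Running report_tuning "" on the database server should list deadline for every disk and whatever swappiness value you settled on.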

THE FUTURE

In the grim future of Hello Kitty, there is only war. You, however, have a little more to look forward to - a friendlier preview, Hunter's new comment rendering system, some memcached tweaks, and coming improvements to how tags are handled internally. Most of these aren't things you'll actually notice, but they'll make your experience at Daily Kos that much smoother. There are some other goodies coming, but sadly they are still Very Secret, so you'll just have to wait.

Ta ta, and I know you're not going to needlessly antagonize me in the comments, right?

 