Mod_mcpage

mod_mcpage for lighttpd updated

This is not a huge update for mod_mcpage, but it changes the receive timeout, and adds configuration options for auto-ejecting hosts, the server failure limit, and the retry limit. This fixes a problem where, if a memcached server went down, lighttpd + mod_mcpage would wait way too long before giving up on memcached. The README.mod_mcpage has more information on how to use the new options. The repository is, of course, at https://github.

mod_mcpage for lighttpd 1.5 svn patch 0.96.1 (working)

This new version of the mod_mcpage only adds one new feature (an option, mcpage.announce, to enable or disable the 'X-served-by-memcached' response header), but features a number of behind the scenes improvements. Internally, the request handling was cleaned up and taking content from the chunkqueue was streamlined. The module will also store pages that were compressed by the backend. If the client sends the appropriate headers, it will receive the pre-compressed page, but if it does not the module will decompress it before sending it to the client.
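The choice between sending the stored compressed page as-is and decompressing it first comes down to inspecting the request's Accept-Encoding header. Below is a minimal sketch of that check, not the module's actual code: the function name client_accepts_gzip is hypothetical, and the matching is deliberately simplified (q-values are ignored).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the Accept-Encoding header value lists
 * the "gzip" coding, 0 otherwise. Simplification: q-values are ignored, so
 * "gzip;q=0" would wrongly count as accepted. */
int client_accepts_gzip(const char *accept_encoding)
{
    const char *p = accept_encoding;

    if (p == NULL)
        return 0; /* no header at all: send the decompressed page */

    while (*p != '\0') {
        /* Skip separators and whitespace in front of the next token. */
        while (*p == ',' || *p == ' ' || *p == '\t')
            p++;

        /* Compare the token against "gzip", ignoring case. */
        const char *want = "gzip";
        const char *t = p;
        while (*want != '\0' && tolower((unsigned char)*t) == *want) {
            t++;
            want++;
        }
        if (*want == '\0' &&
            (*t == '\0' || *t == ',' || *t == ';' || *t == ' ' || *t == '\t'))
            return 1;

        /* No match: move on to the next comma-separated token. */
        while (*p != '\0' && *p != ',')
            p++;
    }
    return 0;
}
```

If this returns 1, the stored gzip data can go out unchanged with a Content-Encoding: gzip header; otherwise the page has to be decompressed first.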

mod_mcpage for lighttpd 1.5 svn patch 0.94.1 (preliminary)

In preparation for some DK4 stuff, I went and revisited my mod_mcpage work and finally knocked a couple of little things off the list of things to do with it. As usual, this is a patch for lighttpd 1.5. As usual again, there's a tarball too. This patch still only applies to lighttpd 1.5; a version for 1.4 is on the to-do list. In terms of obvious new features, this one doesn't have much.

mod_mcpage for lighttpd 1.5 svn patch 0.93.0 (working)

And working quite nicely, too. This is a mod_mcpage patch for lighttpd 1.5. (There's a tarball too.) This is the long awaited mod_mcpage where you can add a local memcached instance as a fast local cache with a quick timeout. From README.mod_mcpage in the tarball: This patch adds the ability to add a local fast memcached with a quick time-out for extra quick serving. It tiers with the other memcached servers (which you can have an array of) so that data from the remote memcached servers gets put into the fast local memcached server, where it will live for a few seconds until it times out and gets fetched from the remote again.
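The tiering described in the README excerpt can be sketched as a two-level lookup. This is a toy illustration, not the patch's code: each "tier" here is a hypothetical one-slot in-process struct standing in for a memcached server, and all names (struct tier, tier_set, tier_get, tiered_get) are made up for the example. The current time is passed in explicitly so the expiry behavior is easy to follow.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Toy stand-in for one memcached tier: a single slot with an expiry time. */
struct tier {
    char key[64];
    char value[256];
    time_t expires; /* 0 means the slot is empty */
};

/* Store key/value in the tier until now + ttl (over-long input is truncated). */
void tier_set(struct tier *t, const char *key, const char *value,
              time_t now, time_t ttl)
{
    assert(ttl > 0);
    snprintf(t->key, sizeof t->key, "%s", key);
    snprintf(t->value, sizeof t->value, "%s", value);
    t->expires = now + ttl;
}

/* Return the stored value, or NULL if the slot is empty, expired, or holds
 * a different key. */
const char *tier_get(const struct tier *t, const char *key, time_t now)
{
    if (t->expires == 0 || now >= t->expires)
        return NULL;
    return strcmp(t->key, key) == 0 ? t->value : NULL;
}

/* Look in the fast local tier first; on a miss, fall back to the remote
 * tier and copy the hit into the local tier with a short time-out. */
const char *tiered_get(struct tier *local, const struct tier *remote,
                       const char *key, time_t now, time_t local_ttl)
{
    const char *v = tier_get(local, key, now);
    if (v != NULL)
        return v;

    v = tier_get(remote, key, now);
    if (v != NULL) {
        tier_set(local, key, v, now, local_ttl);
        return local->value;
    }
    return NULL; /* miss in both tiers: the request goes to the backend */
}
```

With a local time-out of a few seconds, a hot page is served from the local tier almost all of the time, while changes in the remote tier still show up once the local copy expires.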

Surprising mod_mcpage finding

Apparently I'll be able to get the patch ready sooner than I expected. I had been experimenting with using localmemcache's C API to store pages in the fast local cache, assuming that it would be faster than using a local instance of memcached running on the server, and that I'd just have to add expiration and LRU removal to localmemcache (not the smallest task, but certainly quite doable). After converting mod_mcpage to use the binary protocol, I got a bit of a speed boost, so that mod_mcpage serves stuff up about as quickly as lighttpd serves a static file (sometimes a few microseconds faster, sometimes a few microseconds slower).

Intriguing mod_mcpage development using a local memcached instance

No patch yet, but I added a local memcached instance that only stores stuff for a few seconds to mod_mcpage to see how it worked, and got some very interesting benchmarks. My original plan had been to craft up a cache with POSIX shared memory (which is pretty sweet), semaphores, and Glib hashes, but I thought about it some and decided to just let memcached handle the housekeeping stuff, since with a custom shared mem cache I'd have to worry about expiring older stuff and removing unused entries myself.

In summary, though, being served from a paravirtualized one proc Xen VM with 128MB RAM, a local memcached instance of 16MB, and a remote memcached server with 64MB running on a fully virtualized one proc Xen VM with 128MB RAM (processor-wise, the Xen host has an Intel Core2 Duo E8500 @ 3.16GHz):

Running 1000 requests with a concurrency of 1 had 95% of all requests served in 3ms with both local and remote memcacheds, 95% within 17ms with just the remote memcached, and 95% within 780ms with no memcached at all.

Running 1000 requests with a concurrency of 5: with both local and remote memcacheds, 95% were within 135ms (80% within 9ms, 50% within 7ms); with just the remote memcached, 95% were within 90ms (80% within 69ms, 50% within 63ms); and with no memcached at all, 95% were within 3766ms (80% within 3621ms, 50% within 3554ms).

Mean times for concurrency of one with both, remote only, and no memcached: 4.004ms, 13.878ms, and 778.267ms.

Mean times (across all requests) for concurrency of five with both, remote only, and no memcached: 3.926ms, 13.359ms, and 718.912ms.

Update: As I was cleaning up mod_mcpage to make a patch out of it in the relatively near future, I took out a bunch of random debugging stuff that was going into the log and re-ran ab again, this time with the index page of the site I was testing with using the tiered local + remote caches against the same file being served statically. Over 10,000 requests at a concurrency of one, the tiered cache had a mean time per request of 2.216ms. The static file had a mean time of 2.110ms. Over 10,000 requests with a concurrency of five, the tiered cache had a mean time per request of 1.120ms across all requests. The static file had a mean time of 1.025ms across all requests.

Not too bad. Doing it in shared memory would be blazingly fast, but who knows when I'll have time to get all the little bits of that done. Something worth shooting for down the road though.

Full benchmarks (including the update) below the fold.

mod_mcpage for lighttpd 1.5 svn patch 0.92.2 (working)

This is a working mod_mcpage patch for lighttpd 1.5. The issues working with compressed data, mod_deflate, uncompressed data, and some other strange combinations have been ironed out. Local files, fastcgi, and proxy data have been tested with various combinations of mod_deflate being turned on and off (in lighttpd and the backend), and it's all working now. Took some jumping around, too. This module stores content, either local or proxied, in memcached so it can be served out of there rather than hitting the disk or the backend server.

mod_mcpage for lighttpd 1.5 svn patch 0.92.1 (preliminary)

Another preliminary patch for lighttpd 1.5 mod_mcpage has been released. It has all the features of the previous version, with some resolved issues: checks were added so it doesn’t try to load objects larger than 1MB (or a limit you define at compile time) into memcached, and it now stores the Expires: and Cache-Control: HTTP headers. To do: it needs to be non-blocking; an option to MD5 keys; binary and local data need more testing; MIME type checking for compression and for appending debug data to pages.

Preliminary 1.5 mod_mcpage patch

It actually works pretty well, but I'd still like to unify the 1.4 and 1.5 versions of it. The plugin interface changed between the versions, so they work pretty differently now, but the gist is the same. The preliminary patch is here: mod_mcpage-0.90.5-1.5-svn.patch. It has all of the functionality of the 1.4 version, plus: FILE_CHUNK now supported (aka local files). Content compressed by the backend will now be passed through correctly.

Back from the dead! Now with more mod_mcpage stuff.

I'm finally caught up on a huge glob of work that didn't lend itself to blogging (stuff with poll data parsing -- just not really worth writing about), especially on top of holidays and then a recent trip to Disneyland. However, while on said trip, I started poking at mod_mcpage again, and managed to fix some stuff. Then, while I was at it, I made some improvements and cleaned some things up too.

New mod_mcpage patch

If it seems like all I'm writing about right now is mod_mcpage, well, it's what I've been doing the most work on. Thanks to the helpful folks in #lighttpd on freenode, who made some suggestions and found some problems, there've been some changes. I've made a new patch for mod_mcpage, this time against the svn branch of lighttpd 1.4.x, although it should apply against 1.4.19 and 1.4.20 as well. However, if you apply this patch against an earlier release (which I haven't actually tested yet), you would at minimum need to rerun autoreconf -fi, and possibly .

New, improved version of mod_mcpage

Update: And that's what I get for not upgrading to the latest version of libmemcached before making the patch; MEMCACHED_HASH_KETAMA was dropped from libmemcached a little while ago, and the previous version of the patch didn't reflect that. A new version of the patch has been uploaded.

Updated Update: Slightly updated patch uploaded again. A small bug turned up in production use that didn't show up in testing: pages would occasionally be stored in memcached when they shouldn't have been, despite not actually being served up from memcached down the road. The latest version fixes this.

You can get it from here. The patch applies to lighttpd 1.4.19 (I haven't tried it against 1.4.20 yet, but I'd be surprised if it had much trouble). There's a README that explains what it does and how to set it up in more detail, but here's the short version:

With the older version of mod_mcpage, the backend application was responsible for placing the page into memory, gzipping it if necessary, so lighttpd + mod_mcpage could serve it up directly without having to hit the backend at all. Now, no backend changes are required -- mod_mcpage will place pages (and their content types; no more "text/html" only) into memcached, handle compressing them if the page will be too large, and serve up pages from memcached. If lighttpd + mod_mcpage finds a page in memcached, it will decompress the page (if necessary), set the content type, and return the page to the user. In production use, serving pages up for anonymous users from memcached has been a massive help for Daily Kos. This new version should simplify things a bit by not requiring the backend app and lighttpd to coordinate the page caching. Note: I have only used this with lighttpd running as a proxy in front of a webserver running a backend app (Scoop in this case). Also, mod_mcpage uses libmemcached, not the older libmemcache. Make sure libmemcached is installed before you try to install this.

TODO:

As always, there's more stuff to do. At the moment, expiration headers aren't set. That might be nice, although probably of somewhat limited utility with dynamic pages that are likely to change soon. The plugin only supports reading in write_queue content that's in memory, not from a file. This works fine with content returned by mod_proxy, which is what I wanted it for, but does not currently support putting files that would be served from disk into memcached. In my opinion, that wouldn't be particularly useful, but it might be desirable behavior. Small frequently loaded files can be placed in a tmpfs mount and read out of memory that way without the overhead of memcached. Still more memcached behaviors could be set from the configuration file, but aren't. Only text files are supported currently - binary data gets garbled up. This should be fixed, but the workaround is to exclude those from being served by memcached (and why are you proxying them anyway?).

Lessons from mod_mcpage

I made a bunch of progress over the weekend on mod_mcpage, and now the plugin is placing pages into memcached as well as retrieving them, and setting the content type correctly. It's pretty cool, although there's still some rough edges to wear down, and I'll need to run it through valgrind some to make sure there's not horrid memleaks somewhere. While I was working on getting mod_mcpage to put pages into memcached, I also figured out some stuff that seems painfully obvious now.

Working on mod_mcpage - Stupid Programming Tricks Edition

One of the things I want to do with mod_mcpage is move placing pages into memcached out of the backend application and into lighttpd, so it's all handled there. To do so, the page's content type will need to be stored along with the page in memcached. The easiest way to store it was to just put the content type at the beginning of the string to store, but I had to think for a bit to figure out the best way. I came up with two. Each involves jumping through some hoops, but at different points.

The First Way

This is probably the more correct way to store the content type, but requires somewhat fancier footwork further down the line. Here, we store the content type and the page together with null bytes separating them.


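#include <stdlib.h>
#include <string.h>

const char *content_type = "text/plain";
const char *page = "Some text and whatnot";

/* Room for both strings plus one null terminator each. */
char *store = malloc(strlen(content_type) + strlen(page) + 2);

char *i; /* We'll want this later to reset the pointer */
i = store;

/* strcpy rather than strcat: malloc'd memory is uninitialized, so there is
   no existing string to append to. */
strcpy(store, content_type);
store += strlen(content_type) + 1; /* step past the content type's null byte */
strcpy(store, page);

/* Reset *store back to the beginning */

store = i;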

The hoop jumping here is that strlen() won't work anymore, since there's a null byte in the middle of the string, so you'll need to get the length of the string to be stored in two steps. You'll also need to extract the content type and the page in the same manner. Below, we'll go ahead and do both at the same time to illustrate.


size_t contype_len = strlen(store);
i = store; /* Using char *i from our earlier example - if you want to free the stored string, you'll need the pointer somewhere. */

char *contype = malloc(contype_len + 1);
strcpy(contype, store); /* Now we have the content type in its own string */

/* Advance *store past the end of the content type. */

store += contype_len + 1;

/* Now let's get the length of the page, and the overall length of the stored string */
size_t page_len = strlen(store);
size_t overall_len = contype_len + page_len + 2;

/* Now we have both the content type and the original page (accessed from *store) so we can use them. */

... do stuff ...

/* Once we're done, we can free them up. This is why we saved the pointer in i: store has been advanced past the start of the allocation, so freeing *store will lead to Bad Things. */

free(i);
free(contype);

Again, pretty straightforward. The downside is that if you want the length of the string to store, you need to make sure you count both null terminated strings inside of it.
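The first way can also be wrapped into a pair of small helpers that keep track of the overall length, which is what a memcached client needs when the value contains an embedded null. This is a sketch under stated assumptions rather than the code that ended up in mod_mcpage: the names pack_page and packed_page are hypothetical, and it assumes the page itself is plain text with no null bytes in it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pack "content_type\0page\0" into one malloc'd buffer. Returns NULL if the
 * allocation fails; otherwise stores the total length (both terminators
 * included) in *out_len. The caller frees the returned buffer. */
char *pack_page(const char *content_type, const char *page, size_t *out_len)
{
    assert(content_type != NULL && page != NULL && out_len != NULL);

    size_t type_len = strlen(content_type);
    size_t page_len = strlen(page);
    char *buf = malloc(type_len + page_len + 2);
    if (buf == NULL)
        return NULL;

    memcpy(buf, content_type, type_len + 1);        /* copies the '\0' too */
    memcpy(buf + type_len + 1, page, page_len + 1); /* page and its '\0' */
    *out_len = type_len + page_len + 2;
    return buf;
}

/* The content type is simply the start of the buffer; the page begins one
 * byte past the content type's terminator. Returns a pointer into buf,
 * nothing is copied. */
const char *packed_page(const char *buf)
{
    return buf + strlen(buf) + 1;
}
```

Because nothing advances the original pointer, there is no need to save and restore it before calling free().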

The Second Way

This was actually the first way I thought of, but is probably the less correct way to do it. However, it avoids having null bytes in the middle of the string. With this method, we store the length of the content type in one and a half chars at the beginning of the string we're storing. Using one and a half chars does limit the possible content type lengths to 4095 bytes (the first char is ORed with 0xF0 to avoid a null byte at the beginning of the string). The bit manipulation here may also cause problems with endianness - if I end up using this way, I'll have to set up Debian/390 under Hercules and see what happens.


char *content_type = "text/plain";
char *page = "Some text and whatnot";

/* Two length bytes, both strings, and one null terminator. */
char *store = malloc(strlen(content_type) + strlen(page) + 3);

/* Get the content type length and transfer it to the chars. */

#define MAX_CONTENT_TYPE_LEN 4095
size_t c = strlen(content_type);
/* Make sure that the length of the content type string isn't over 4K - 1. Unlikely, but you never know. */
if(c > MAX_CONTENT_TYPE_LEN){
   /* ... handle the error ... */
   }

unsigned char b, d;
unsigned int fl = 0x00000F00;
unsigned int fg = 0x000000FF;
b = (c & fl) >> 8; /* mask out everything but the second byte, shift right by 8 bits so the second byte becomes the first, and assign to b. */
d = c & fg; /* mask out everything but the first byte. Note that d is still 0 when the length is a multiple of 256. */
b |= 0xF0; /* Don't want to end up with a null byte there. */

*store++ = b;
*store++ = d;
strcpy(store, content_type); /* strcpy first: the malloc'd memory is uninitialized, so strcat has nothing to append to yet. */
strcat(store, page);
store -= 2; /* get it back down to where we started. */

So now the string is ready for storage. What do we do when we want to use it?


char *i; /* For storing *store's starting point for later */
i = store;

unsigned char n = *store++;
unsigned char m = *store++;
n &= 0x0F; /* clear the bits in the top half of that byte that were there to prevent having a null byte. */

unsigned int clen = 0;
clen = n << 8;
clen |= m;

unsigned int y; /* unsigned, to match clen in the loop comparison */
char *content_type = malloc(clen + 1);
char *bb;
bb = content_type;
for(y = 0; y < clen; y++)
   *content_type++ = *store++;
*content_type = '\0';
content_type = bb;

/* Now we have the content type string and the page (accessed, as in the previous example, from *store). */

.... do stuff ....

/* We remembered to save the stored string's pointer in *i earlier, because we need it to free our memory. */
free(i);
free(content_type);
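The length encoding on its own can be isolated into two small functions, which makes it easier to see what ends up in the two header bytes. This is a minimal sketch with hypothetical names (encode_len, decode_len), not code from the module. Since the shifts and masks operate on integer values rather than on how those values are laid out in memory, the two bytes should come out the same on big-endian and little-endian machines. One caveat the sketch makes visible: only the first byte is protected from being null, so a length that is a multiple of 256 still puts a 0x00 in the second byte.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONTENT_TYPE_LEN 4095

/* Write a 12-bit length into two bytes. The top four bits of the first
 * byte are forced to 1 so that byte can never be a null byte. */
void encode_len(size_t len, unsigned char out[2])
{
    assert(len <= MAX_CONTENT_TYPE_LEN);
    out[0] = (unsigned char)(((len >> 8) & 0x0F) | 0xF0);
    out[1] = (unsigned char)(len & 0xFF);
}

/* Recover the length: drop the 0xF0 marker bits from the first byte and
 * combine the remaining four bits with the second byte. */
size_t decode_len(const unsigned char in[2])
{
    return ((size_t)(in[0] & 0x0F) << 8) | in[1];
}
```

For "text/plain" (10 bytes) the header bytes are 0xF0 0x0A; for a 256-byte content type they would be 0xF1 0x00.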

Those were the two ways I'd come up with to handle storing the content type. The second one occurred to me because I wanted to avoid dealing with embedded nulls, but it's also pretty complicated. The first requires remembering to handle the null bytes, but is a lot less complicated (and can handle theoretical content type strings longer than 4K, should they come up). I'm planning on using the first method for storing the pages in memcached, but I'm keeping the second in reserve should there end up being some overriding reason. Also, it's so ridiculous, I couldn't help but share it with the world (although the world will likely point and laugh, and I'll deserve it).