time to pull the plug

Tiered Caching Thoughts

19 Sep 2009, 21:20

I should have written about this a while ago, but predictably I never got around to it, so I have to recreate the benchmarks. These are just some initial comments.

One of the best things I did for Daily Kos when I was getting ready for the 2006 midterm elections was to start using memcached. When I was getting ready for the 2008 elections, I greatly expanded the use of memcached throughout the backend, and it was one of the biggest factors in getting through Election Day in one piece. With a multi-webhead setup like ours, memcached let us spread the work around, so each webhead could take advantage of page-rendering work already done by the others. Really handy, and it helped keep our load below 3 on Election Night.

Unfortunately, I saw areas where memcached wasn't perfect. No matter what, it's better than not caching at all, or just caching locally, but using it as your only cache isn't ideal either. Fetching data from memcached over the network takes time that can really add up. You could cache stuff locally in your app too, but the other processes wouldn't be able to take advantage of it.

What I've experimented with, and have Perl and Ruby clients for, is a tiered cache. Basically, the cache uses memcached for longer-lived, distributed cache storage, but also keeps a fast cache that quickly times out (I've been using 5 seconds, but a shorter time might be better) and is local to the machine it's running on, so all of the processes of your app can make use of it. When your app gets something from the cache, it first looks in the local cache, and if it's not there it looks in memcached. If the data is in memcached, it retrieves it; if not, it does whatever the app would do without the cache and then puts the result in memcached for future use. Either way, it stores the data in the local cache before moving on. After that, until the entry times out of the local cache, the app can fetch the data from the fast local cache without having to hit memcached at all. The short expiration time is an attempt to avoid the problem of data going stale, and both clients also provide ways to delete data from the local cache at the same time it's deleted from memcached.
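As a rough illustration of the lookup order described above, here is a minimal Python sketch. It is not the Perl or Ruby client: the `TieredCache` class, its method names, and the `clock` parameter are all made up for the example, a plain dict stands in for a memcached client, and the local tier is an in-process dict rather than a store shared by every process on the machine.

```python
import time


class TieredCache:
    """Short-lived local tier in front of a shared, longer-lived remote tier."""

    def __init__(self, remote, local_ttl=5.0, clock=time.monotonic):
        self.remote = remote        # assumption: dict-like stand-in for memcached
        self.local_ttl = local_ttl  # seconds an entry stays in the local tier
        self.clock = clock          # injectable so expiry can be tested
        self._local = {}            # key -> (expires_at, value)

    def get(self, key, compute):
        """Return the cached value, calling compute() only on a full miss."""
        now = self.clock()
        hit = self._local.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]                    # fresh local hit, no remote trip
        value = self.remote.get(key)         # local miss: ask the remote tier
        if value is None:                    # note: None is treated as a miss
            value = compute()                # do the real work
            self.remote[key] = value         # share it with other machines
        self._local[key] = (now + self.local_ttl, value)
        return value

    def delete(self, key):
        """Drop the key from both tiers at once."""
        self._local.pop(key, None)
        self.remote.pop(key, None)
```

Between expirations the local tier can serve a value that has since changed in the remote tier, which is the staleness window the short timeout is meant to bound.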

To implement the local cache, the Perl client uses Cache::Mmap (although now I've found Cache::FastMmap, which I don't remember seeing before). The Ruby client, developed with the help of the ever-helpful wycats, uses Berkeley DB for the local cache, but since the Ruby client is built with Moneta, swapping BDB out for whatever else you'd prefer would be easy.

I can't find the benchmarks I did back in March or thereabouts, so I'm going to have to recreate them. Some random guy on the internet's memory isn't a good benchmark, after all. I remember that the difference between the tiered cache and just memcached was pretty astonishing, but I don't remember the specifics. I'll get some benchmarks together over the next few days. I do remember that the difference wasn't as extreme if the cache was write heavy, but the tiered cache was significantly faster with a read heavy benchmark (which is more like DKos' cache usage). When the new benchmarks are done, I'll post them.

Astro log: Sept. 18th, 2009

19 Sep 2009, 20:42

Equipment: Orion Skyview 6 Deluxe EQ, using the 32, 15, and 10 mm eyepieces and the broadband and ultrablock filters.
New objects observed: M28, Lagoon Nebula (M8) & NGC 6530, M20 (kinda), M21, M54.
Notes: Another night of below-average conditions. The only reason I went out for a little while in the evening was to try out the viewing shelter I'd built, since my mom was nice enough to come and help by making the curtains for it (more on that later), and I decided to do some more observing around Sagittarius.

Astro log: Sept. 17th, 2009

18 Sep 2009, 18:45

Equipment: Orion Skyview 6 Deluxe EQ, using the 32, 25, and 10 mm eyepieces and the broadband and ultrablock filters.
New objects observed: M22
Previously observed objects: M31, M32
Notes: The night of the 17th was partly cloudy and had poor transparency, so I didn't go out for very long. I had originally intended to look for M81 and M82 briefly, but clouds were obscuring that part of the sky. I turned my scope towards Sagittarius, but misidentified what part of the constellation I was looking at.

Astro log: July 2009 - September 15th, 2009

16 Sep 2009, 02:15

I've decided to keep a running log of what I've been observing here. This initial entry is, sadly, from memory over the last couple of months, so no dates are provided. It's just a list of what I've been able to see from my backyard in Tacoma, WA. Hopefully once the shelter I'm working on (more on that later) gets done, I'll be able to see more things. Equipment: Telescopes: Celestron AstroMaster 76 EQ, Orion Skyview 6 Deluxe EQ; Filters: Polarizing, set of color filters, Broadband, H-Beta, OIII; Eyepieces: 32mm, 25mm, 20mm correcting, 15mm, 10mm, two 9mm, 6mm, 4mm

deep breath

16 Sep 2009, 01:58

OK, I can get off my ass and do this. Hooah!

Stupid db tricks - ORDER BY time DESC vs. ORDER BY countdown

14 Mar 2009, 00:38

I've been experimenting with some relatively large MySQL tables, trying to get around using a filesort when selecting rows ordered by a datetime in descending order. I'm not sure how much real world benefit this has, but some of the initial results are encouraging. We'll see.

Since MySQL doesn't allow you to define a reverse index (which would be useful for a field that you're constantly sorting in descending order), I'm experimenting with an indexed column that's set to 2^32 - UNIX_TIMESTAMP(time), where time is a normal MySQL timestamp. This way, you can ORDER BY countdown without the DESC keyword and still get the newest rows first. I'm still experimenting with different indexes and queries, but there seem to be at least some areas where it leads to improved performance by avoiding the filesort. If more results seem encouraging enough, I'll write up the comparisons.
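To make the countdown arithmetic concrete, here is a small Python sketch that checks the ordering claim. It uses an in-memory SQLite database as a stand-in for MySQL, and the `posts` table, its column names, and the sample timestamps are all hypothetical. It only shows that sorting ascending on the countdown column returns rows in the same order as sorting descending on the timestamp; it says nothing about filesort behavior or performance.

```python
import sqlite3

WRAP = 2 ** 32  # same constant as in the 2^32 - UNIX_TIMESTAMP(time) expression

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " ts INTEGER NOT NULL,"          # Unix timestamp of the row
    " countdown INTEGER NOT NULL)"   # WRAP - ts, so newer rows get smaller values
)
conn.execute("CREATE INDEX posts_countdown ON posts (countdown)")

# Hypothetical sample timestamps, deliberately inserted out of order.
timestamps = [1236990000, 1236990600, 1236989000, 1236991200]
conn.executemany(
    "INSERT INTO posts (ts, countdown) VALUES (?, ?)",
    [(ts, WRAP - ts) for ts in timestamps],
)

# Newest first, the usual way.
by_desc = [row[0] for row in conn.execute("SELECT ts FROM posts ORDER BY ts DESC")]

# Newest first, via an ascending sort on the countdown column.
by_countdown = [row[0] for row in conn.execute("SELECT ts FROM posts ORDER BY countdown")]

print(by_desc == by_countdown)
```

The countdown value has to be computed whenever the row is written, and it stays positive only while the timestamp is below 2^32.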

Preliminary 1.5 mod_mcpage patch

27 Feb 2009, 03:25

It actually works pretty well, but I'd still like to unify the 1.4 and 1.5 versions of it. The plugin interface changed between the versions, so they work pretty differently now, but the gist is the same. The preliminary patch is here: mod_mcpage-0.90.5-1.5-svn.patch. It has all of the functionality of the 1.4 version, plus two additions: FILE_CHUNK (aka local files) is now supported, and content compressed by the backend is now passed through correctly.

Back from the dead! Now with more mod_mcpage stuff.

25 Feb 2009, 09:07

I'm finally caught up on a huge glob of work that didn't lend itself to blogging (stuff with poll data parsing -- just not really worth writing about), especially on top of holidays and then a recent trip to Disneyland. However, while on said trip, I started poking at mod_mcpage again, and managed to fix some stuff. Then, while I was at it, I made some improvements and cleaned some things up too.

Simple templating system

22 Dec 2008, 23:37

Sadly, going to New York, banging one's face against CSS and web design issues, and being snowed in a few days before Christmas doesn't lead to a lot of posting for some reason. However, the horrible stuff with CSS and web design got me thinking about better ways to handle templating in Scoop.

Right now, when Scoop builds a page, it takes a template out of the database, substitutes a bunch of keys, and returns the page. Unfortunately, at least in some circumstances, the substitutions can get stuck and just sit spinning, eating up CPU and RAM. Also, the current box mechanism could be more flexible. Finally, putting a new page design in can involve a lot of delicate surgery which gets pretty old.

I looked at stuff like HTML::Embperl, but none of the perl things out there really did what I wanted. I wanted something lighter-weight that I could use to build pages inside of Scoop, not something to do all the Apache handling. After working on it for a while, I wrote a parser that can generate code from a page with <% %>, <%= %>, and <%# %> tags in it. I haven't integrated it into Scoop yet, but here's what the code looks like:

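sub parse_text {
    my $T = shift;
    my $string = shift;
    $string .= "<% ; %>"; # helps parsing
    my $code;
    # Each match is a run of literal text followed by one <% ... %> tag.
    while ($string =~ /(.*?)<%(.*?)%>/sg) {
        my $t = $1;
        my $snip = $2;
        # Escape backslashes and single quotes so the text survives
        # inside a single-quoted string.
        $t =~ s/\\/\\\\/g;
        $t =~ s/'/\\'/g;
        $code .= qq|\$pageRR .= '$t';| if $t;
        if ($snip =~ s/^=//) {
            # <%= %>: append the value of the expression to the page
            $code .= qq|\$pageRR .= $snip;|;
        }
        elsif ($snip =~ /#/) {
            # contains a comment: end with a newline so the comment
            # doesn't swallow the code that follows
            $code .= $snip . "\n";
        }
        else {
            # plain code
            $code .= $snip . ";";
        }
    }
    return $code;
}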

Using this text as an example:

my $string = shift; <%= "florp" %> This is some text. And \tmore text. And "text\". <% my $foo = "feep"; %> A perl value: <%= $foo %> <% my $k = 0; %> More text to be had, testing a for loop: <% foreach my $j (qw(a b c d)){ %> Item: <%= $j %> glop <% } %> A little more. 'text' \\\'? ''''''\' <% # a comment %> Finish it up. hmm. <% my $i = 0; while ($i < 5){ # Huh? $i++; } %> <%# feep %> i?<%= $i %> Now finished for real. |

And running it through that perl function above, it spits this perl out:

$pageRR .= "florp" ;$pageRR .= ' This is some text. And \\tmore text. And "text\\". '; my $foo = "feep"; ;$pageRR .= ' A perl value: ';$pageRR .= $foo ;$pageRR .= ' '; my $k = 0; ;$pageRR .= ' More text to be had, testing a for loop: '; foreach my $j (qw(a b c d)){ ;$pageRR .= ' Item: ';$pageRR .= $j ;$pageRR .= ' glop '; } ;$pageRR .= ' A little more. \'text\' \\\\\\\'? \'\'\'\'\'\'\\\' '; # a comment $pageRR .= ' Finish it up. hmm. '; my $i = 0; while ($i < 5){ # Huh? $i++; } $pageRR .= ' ';# feep $pageRR .= ' i? ';$pageRR .= $i ;$pageRR .= ' Now finished for real. | '; ; ;

As you can see, it generates perfectly good, although kind of hideous, perl, which is suitable for evaling and putting into a namespace for later use, like so (NB: this is stripped down from what I'm actually doing with it so far):

my $wrapped = 'package Foobaricus; sub foo { my $pageRR; ' . $code . ' return $pageRR; }';
eval $wrapped;
die $@ if $@; # surface compile errors in the generated code

After that, you can call Foobaricus::foo() and get the formatted page back. This code also does its best to preserve line numbers, although inline comments may lead to extra lines.

Moving forward, Scoop integration is the next step. After that, if there seems to be a need or interest, I'll look at generalizing this code for use elsewhere. Note that this code is not designed to do the heavy lifting of Apache handling, like Apache::ASP or HTML::Embperl do -- this is just an attempt to render pages better, leaving the rest of it to other code. The syntax of the <% %> tags is pretty common, but here's how the tags work:

<% %> - used for perl code you don’t want to return a value to the page.
<%= %> - inserts the value of the perl variable in that code. If you want to have more than one statement in a set of <%= %> tags, either move most of the code to <% %> tags and only have the value you want in the page in the <%= %> tags, or wrap your code in a do { }; statement.
<%# %> - comments.

I'll provide more code and examples later, once this is integrated and I have a better idea of real world usage.
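For a quick feel of how the three tag types split up a page, here is a small Python sketch. It is not a port of the Perl parser above and does not generate any code: the `split_template` function and the `text`/`expr`/`comment`/`code` labels are invented for the example. It borrows the same `(.*?)<%(.*?)%>` pattern and only labels each piece. Unlike the Perl version, it treats a tag as a comment only when the `#` comes right after `<%`.

```python
import re

# Same shape as the Perl regex: literal text, then one <% ... %> tag.
TAG = re.compile(r"(.*?)<%(.*?)%>", re.S)


def split_template(text):
    """Return a list of (kind, body) pairs for a template string."""
    parts = []
    pos = 0
    for match in TAG.finditer(text):
        literal, snip = match.group(1), match.group(2)
        if literal:
            parts.append(("text", literal))
        if snip.startswith("="):
            parts.append(("expr", snip[1:].strip()))     # <%= %>: value goes into the page
        elif snip.startswith("#"):
            parts.append(("comment", snip[1:].strip()))  # <%# %>: dropped from the page
        else:
            parts.append(("code", snip.strip()))         # <% %>: run, but emit nothing
        pos = match.end()
    if text[pos:]:
        parts.append(("text", text[pos:]))               # trailing text after the last tag
    return parts
```

Called on a string such as `'Hi <%= $name %>!'`, it returns the literal text and the expression as separate pieces, which is the same split the Perl parser makes before turning each piece into a statement.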

Totally quiet note

08 Dec 2008, 04:47

I've been meaning to merge all the DK Scoop changes back into mainline Scoop for a while now, but somehow never found the time. I'd like to get those changes out there, though, so right now I'm leaning towards making a clean copy of the database, doing yet another audit of the code for vulnerabilities, and just releasing it that way. A script to allow importing an existing Scoop site into the DK version of Scoop might be handy too. I'll need to run another diff between the two branches, but I know from when I've tried to do it before that merging would be an absolutely beastly endeavour. We'll see what happens.
