Simple templating system

Sadly, going to New York, banging one's face against CSS and web design issues, and being snowed in a few days before Christmas doesn't lead to a lot of posting for some reason. However, the horrible CSS and web design stuff got me thinking about better ways to handle templating in Scoop.

Right now, when Scoop builds a page, it takes a template out of the database, substitutes a bunch of keys, and returns the page. Unfortunately, at least in some circumstances, the substitutions can get stuck and just sit spinning, eating up CPU and RAM. Also, the current box mechanism could be more flexible. Finally, putting a new page design in can involve a lot of delicate surgery which gets pretty old.

I looked at stuff like HTML::Embperl, but none of the perl things out there really did what I wanted. I wanted something lighter-weight that I could use to build pages inside of Scoop, not something to do all the Apache handling. After working on it for a while, I wrote a parser that can generate code from a page with <% %>, <%= %>, and <%# %> tags in it. I haven't integrated it into Scoop yet, but here's what the code looks like:

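sub parse_text {
       my $T = shift;      # first argument (not used below)
       my $string = shift;
       $string .= "<% ; %>"; # sentinel tag so any trailing text gets matched too
       my $code = '';
       while($string =~ /(.*?)<%(.*?)%>/sg){
               my $t = $1;    # literal text before the tag
               my $snip = $2; # contents of the tag
               # escape the text so it can sit inside a single-quoted string
               $t =~ s/\\/\\\\/g;
               $t =~ s/'/\\'/g;
               # length test, so a text chunk of just "0" isn't dropped
               $code .= qq|\$pageRR .= '$t';| if length $t;
               if ($snip =~ s/^=//){
                       # <%= %>: append the expression's value to the page
                       $code .= qq|\$pageRR .= $snip;|;
                       }
               elsif ($snip =~ /#/){
                       # contains a comment: end with a newline so the comment
                       # doesn't swallow the code that follows
                       $code .= $snip . "\n";
                       }
               else {
                       # <% %>: plain code
                       $code .= $snip . ";";
                       }
               }
       return $code;
       }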

Using this text as an example:

       my $string = shift;
<%= "florp" %>
       This is some text. And \tmore text. And "text\".
       <% my $foo = "feep"; %> A perl value: <%= $foo %>
<% my $k = 0; %> More text to be had, testing a for loop:
       <% foreach my $j (qw(a b c d)){ %>
               Item: <%= $j %> glop
       <% } %>
       A little more. 'text' \\\'? ''''''\'
       <% # a comment %>
       Finish it up.
       hmm. <% my $i = 0;
               while ($i < 5){ # Huh?
                       $i++;
                       }
               %>
      <%# feep %> i?<%= $i %>
Now finished for real.
|

And running it through that perl function above, it spits this perl out:

$pageRR .=  "florp" ;$pageRR .= '
       This is some text. And \\tmore text. And "text\\".
       '; my $foo = "feep"; ;$pageRR .= ' A perl value: ';$pageRR .=  $foo ;$pageRR .= '
'; my $k = 0; ;$pageRR .= ' More text to be had, testing a for loop:
       '; foreach my $j (qw(a b c d)){ ;$pageRR .= '
               Item: ';$pageRR .=  $j ;$pageRR .= ' glop
       '; } ;$pageRR .= '
       A little more. \'text\' \\\\\\\'? \'\'\'\'\'\'\\\'
       '; # a comment
$pageRR .= '
       Finish it up.
       hmm. '; my $i = 0;
               while ($i < 5){ # Huh?
                       $i++;
                       }
               
$pageRR .= '
      ';# feep
$pageRR .= ' i? ';$pageRR .=  $i ;$pageRR .= '
Now finished for real.
|
'; ; ;

As you can see, it generates perfectly good, although kind of hideous, perl, which is suitable for evaling and putting into a namespace for later use, like so (NB: this is stripped down from what I'm actually doing with it so far):

my $wrapped = 'package Foobaricus; sub foo { my $pageRR = ""; ' . $code . ' return $pageRR; }';
eval ($wrapped);
die "template didn't compile: $@" if $@;

After that, you can call Foobaricus::foo() and get the formatted page back. This code also does its best to preserve line numbers, although inline comments may lead to extra lines.
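
A minimal end-to-end sketch of that round trip, assuming a function-style copy of the parser above (without the extra first argument) and a made-up package name, Demo::Page. The wrapper returns $pageRR explicitly and checks $@ after the eval, so a template that doesn't compile fails loudly instead of silently defining nothing:

```perl
use strict;
use warnings;

# Function-style copy of the parser shown earlier (assumption: no $T argument).
sub parse_text {
    my $string = shift;
    $string .= "<% ; %>";    # sentinel tag so trailing text is captured
    my $code = '';
    while ($string =~ /(.*?)<%(.*?)%>/sg) {
        my ($t, $snip) = ($1, $2);
        $t =~ s/\\/\\\\/g;    # escape backslashes for a single-quoted literal
        $t =~ s/'/\\'/g;      # escape single quotes
        $code .= qq|\$pageRR .= '$t';| if length $t;
        if    ($snip =~ s/^=//) { $code .= qq|\$pageRR .= $snip;|; }
        elsif ($snip =~ /#/)    { $code .= $snip . "\n"; }
        else                    { $code .= $snip . ";"; }
    }
    return $code;
}

# A tiny template: a loop, an expression tag, and text containing a quote.
my $template = q{<% foreach my $j (qw(a b c)) { %>[<%= $j %>]<% } %> it's done};

# Wrap the generated statements in a named sub and compile it once.
my $src = 'package Demo::Page; sub render { my $pageRR = ""; '
        . parse_text($template)
        . ' return $pageRR; }';
eval($src);
die "template did not compile: $@" if $@;

# From here on, building the page is just a sub call.
my $page = Demo::Page::render();
print "$page\n";
```

The template only gets parsed and compiled at the eval; every later call to Demo::Page::render() runs already-compiled perl.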

Moving forward, Scoop integration is the next step. After that, if there seems to be a need or interest, I'll look at generalizing this code for use elsewhere. Note that this code is not designed to do the heavy lifting of Apache handling, like Apache::ASP or HTML::Embperl do -- this is just an attempt to render pages better, leaving the rest of it to other code. The syntax of the <% %> tags is pretty common, but here's how the tags work:

  • <% %> - used for perl code you don’t want to return a value to the page.
  • <%= %> - inserts the value of the perl expression in the tags into the page. If you want to have more than one statement in a set of <%= %> tags, either move most of the code to <% %> tags and only have the value you want in the page in the <%= %> tags, or wrap your code in a do { }; statement.
  • <%# %> - comments.
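
As a rough illustration, here is hand-written perl equivalent to what each kind of tag ends up as inside the generated sub. $pageRR is the accumulator the parser uses; the @authors data is made up for the example, and the tag each statement stands for is shown in the comment above it:

```perl
use strict;
use warnings;

my $pageRR = '';

# <% my @authors = ('bob', 'alice'); %>
# Plain code tag: the code runs, but adds nothing to the page.
my @authors = ('bob', 'alice');

# <%= scalar @authors %>
# Expression tag: the value of the expression is appended to the page.
$pageRR .= scalar @authors;
$pageRR .= ' authors: ';

# <%= do { my @sorted = sort @authors; join ', ', @sorted } %>
# Several statements in one expression tag, wrapped in do { } so the
# block as a whole yields a single value (its last expression).
$pageRR .= do { my @sorted = sort @authors; join ', ', @sorted };

# <%# just a note %>
# Comment tag: becomes a perl comment, so nothing reaches the page.

print "$pageRR\n";
```

Without the do { }, the generated line would read `$pageRR .= my @sorted = sort @authors; join ...`, which is not what you want appended to the page.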

I'll provide more code and examples later, once this is integrated and I have a better idea of real world usage.

Comments imported from the old site.
Sadly

Now I have to get back to web design/CSS hell, so I'll have to put this down for a few hours. :-( Once it's working, though, I hope it'll make that job easier.

XSLT

Have you thought about using XSLT for the templating?

Hmm, hadn't yet, no.

I'll check it out though. I'm (once again) being pulled in about 16 different directions, so this has been chugging along at a slower pace than I'd like. It's also complicated by still having to straddle the two different templating systems right now, which is a bit constraining.

Coupla questions...
  1. Have you already considered, and rejected, HTML::Mason and Text::MicroMason?  Or other templating systems from CPAN?
  2. What's the context where you're parsing these templates?  Regex is a pretty expensive way to parse, and if there's frequent re-parsing you might get better results using XML/XHTML.  The catch being that then you have to stick to well formed XML in all your templates, but some people think that's a feature :D

Disclaimer: I don't particularly like XSLT (too high maintenance), and I very much like Mason, but I've written some taglibs in Perl using Expat, and been happy with the results...  

Coupla answers
  1. Yeah, I looked through a bunch of stuff on CPAN, but nothing fit what I wanted exactly. This is what some would call a "bummer".  I was looking for something that was relatively lightweight, would allow putting actual perl in the template, and wouldn't try to do the Apache handling itself, and I couldn't find anything that fit the bill.
  2. You're absolutely right about regexp being an expensive way to parse, which is why I want to get away from that. Currently, Scoop uses a bunch of regexps in its templating system, and interpolates over several passes to make sure it gets everything. Beyond not being super efficient, I've noticed that in some cases the substitutions can get stuck and spin endlessly, eating up CPU and RAM until apache's restarted. That's no good. With this, the idea is to render the template as a subroutine once, and then be able to build the pages after that by passing it arguments for the values of the page like any other subroutine. Make sense?
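
To make that a bit more concrete, here's a small sketch of the compile-once idea, assuming the generated statements get wrapped in an anonymous sub that reads its values from a hash of arguments. The %compiled cache, the render helper, and the 'story' template name are all hypothetical, and the generated source is written by hand here rather than produced by the parser:

```perl
use strict;
use warnings;

my %compiled;    # template name => code ref, filled in on first use

# Hand-written stand-in for parser output; a real template would be parsed instead.
my %generated = (
    story => q{$pageRR .= '<h1>'; $pageRR .= $arg{title}; $pageRR .= '</h1>';},
);

sub render {
    my ($name, %values) = @_;
    # Compile on first use only; later calls reuse the cached code ref.
    $compiled{$name} ||= do {
        my $sub = eval('sub { my %arg = @_; my $pageRR = ""; '
                     . $generated{$name}
                     . ' return $pageRR; }');
        die "template '$name' did not compile: $@" if $@;
        $sub;
    };
    return $compiled{$name}->(%values);
}

my $first  = render('story', title => 'First post');
my $second = render('story', title => 'Second post');
print "$first\n$second\n";
```

The second call never touches the template source again; it just runs the cached sub with different arguments.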

I promise I'll have more on this soon, but I've been swamped with less glamorous, but very time consuming, work, so I've had to push the more interesting stuff aside for a while. I'll be back on interesting stuff before long.

Coupla answers

"the idea is to render the template as a subroutine once, and then be able to build the pages after that by passing it arguments for the values of the page like any other subroutine. Make sense?"

which is what you can do with XSLT.

you can pre-compile the templates and pass in raw xml and parameters for the transform. the performance is WAY better than regexp's or even string replacements. if you are really a speed freak; Intel has a high performance XML suite that blows the doors off of the fastest stuff out there. relatively cheap for a couple hundred bucks.

 