The Curious Schemer

The following sentence is false. The preceding sentence is true.

Archive for the ‘JavaScript’ Category

Y Combinator for Dysfunctional Non-Schemers

with 21 comments

A lot of software developers don’t come from a Computer Science background. I think in the long run this doesn’t matter, since I’ve seen a lot of CS grads who have completely forgotten things that are supposed to set them apart from the rest of us (“Lisp? You mean that AI language with a lot of parentheses? Yeah I used it before in uni. So?”). Besides, a lot of CS grads can’t program anyway. Plus, if one really cares about the programming craft, during his/her journey of ever improving his/her efficiency and effectiveness as a software programmer, one tends to come full circle and go back to the root, which is made of the stuff CS grads are forced to read about in university.

Now one of the thing that I find really interesting, yet had baffled me for a long time, is the Y Combinator (no, not Paul Graham’s company). Maybe CS students eat Y Combinator for breakfast. But I graduated as an electrical engineer. It’s only recently, when my programming self-improvement routine brought me to study Lisp, Scheme, and recursion in greater details, that I came across this strange Y thing that so many very smart people, like this guy, this guy, and this guy have written about. Before this I’ve never heard of it in my life. It’s like I’m trying to digest what I think is a very cool and profound concept, and then I come across these mental landmines like “pass the function as the first argument to itself”, and my brain will just explode and I have to restart all over again.

What is Y Combinator, exactly? Why does it work? I do real-world applications in Java/C#/C++/JavaScript or whatever, I don’t do Scheme for a living. What’s in it for me? Is it just a cool idea with no practical applications whatsoever? I find that the best way for me to understand something is to write about it. So here it is.

JavaScript: Lisp in C’s Clothing

All examples will be in JavaScript. Examples in Lisp/Scheme can be difficult to read, especially when you’re not used to the language yet. Writing the examples and illustrations in the familiar C-syntax will make it easier for me (and you, if you’re one of the two or three people who are reading this). The examples won’t be in Java, because Java’s obsession with nouns will make it awkward to write them. The fact that functions are first-class in JavaScript makes things a LOT easier. (If you need a short intro of JavaScript’s true capabilities, I’d shamelessly recommend this article.) So, with that out of the way, let’s start!

The Problem

Let’s start by a simple basic recursive function: factorial. Simplest function in the world, right? It’s like the Hello World of recursion.

function factorial(num) {
    if(num < 2) {
        return 1;
    }
    return num * factorial(num - 1);
}

Of course, functions are first-class in JavaScript, so we could’ve written it like this:
var factorial = function(num) {
    if(num < 2) {
        return 1;
    }
    return num * factorial(num - 1);
};

But now we have a potential problem, don’t we? The recursive call to factorial within the method only works because we happen to name the variable “factorial” as well. Should we name the variable differently, say, “fact”, instead of “factorial”:
var fact = function(num) {
    if(num < 2) {
        return 1;
    }
    return num * factorial(num - 1);
};

Then we get an error, because when we call “factorial(num – 1)”, the name “factorial” is not bound to anything. We can fix it for this case by changing the call to “fact(num – 1)”, of course, but this approach is a quick fix that doesn’t work, because this function can be assigned to any variable of any name.We have a problem that can be summed up thus: a recursive function is a function that calls itself. But an anonymous function has no name. So… how is it supposed to call itself for recursion?

First Attempt

(If you’re thinking: “Just give the bloody function a bloody name so you can make it recursive and get on with your life!”, I can’t say I totally disagree with you at this point. But anyway.)

So what can we do here? Well for one, we can keep the function anonymous, and get the name for recursion from a parameter passed to the anonymous function, like this:

var fact = function(forRec, num) {
    if(num < 2) {
        return 1;
    }
    return num * forRec(forRec, num - 1);
};

Then when we want to use it, we just pass the name of the function to itself, like this:
js> fact(fact, 0)
1.0
js> fact(fact, 1)
1.0
js> fact(fact, 2)
2.0
js> fact(fact, 3)
6.0

No matter what the name is, as long as we keep passing the same name as the first parameter, we’ll be OK–the anonymous function will be correctly calling itself.

But… OK, Second Attempt

The solution kinda works. But it’s not nice, requiring your users to your function name twice everytime they want to use it. Besides, now the code becomes less clear–everybody knows factorial, but this self-passing-to-self business is obfuscating the code. I believe we can do better. Let’s try to separate the “passing a function to itself” bit from the “calculate factorial of” bit, by currying it. Like this:

var createFact = function(forRec) {
    return function(num) {
        if(num < 2) {
            return 1;
        }
        return num * (forRec(forRec))(num - 1);
    };
};

(Currying, or Schönfinkelisation, is a lesson to all of us to choose names that are easy to spell, remember, and pronounce. Or else you may invent something and the other guy with the catchier name–who can compete with Curry?–gets the credit.)In the snippet above, the outer anonymous function (the one with forRec as a parameter) returns another anonymous function (the one accepting parameter num). The latter is very similar to our original factorial function (remember that our objective is to separate the passing-function-to-itself bit from the factorial bit), except for the bit in green:

(forRec(forRec))(num - 1);

That line is where the inner function needs to recurse. But instead of requiring a name to recurse, it calls the outer function… which returns the inner function itself. And that returned inner function is in turn called, with “num – 1″ as its argument. There we have our recursion.So now we have a slightly cleaner solution. We can use the outer function to create the inner function like this:

js> var factorial = createFact(createFact);
js> factorial(10)
3628800.0

Note that this is equivalent to this one-liner:

js> createFact(createFact)(10)
3628800.0

Hmmm. The code for the factorial function is still polluted, though. Let’s try to take out the anonymous recursion part from the factorial function entirely.

Third Attempt: Wrap, and wrap, and wrap, and wrap…

Our second attempt is still not as clean as we want it to be. Ideally, we want to separate the part that takes care of the anonymous recursion, from the part that actually does the factorial computation. Let’s see our last function again:

var createFact = function(forRec) {
    return function(num) {
        if(num < 2) {
            return 1;
        }
        return num * (forRec(forRec))(num - 1);
    };
};

The only difference between the inner function and a typical factorial function is the recursive part. Let’s try to take that forRec(forRec) bit out–that is, instead of doing it inside, let’s see if we can do it outside and pass it in as a parameter. Here’s the function again, with the forRec(forRec) taken out of the picture:
var fact = function(rec) {
    return function(num) {
        if(num < 2) {
            return 1;
        }
        return num * rec(num - 1);
    };
};

And like I said above, we do the forRec(forRec) outside, and then pass it to the fact function:

var recur = function(forRec) {
    return function(n) {
        return fact(forRec(forRec))(n);
    }
};

Then we use it like this:js> var factorial = recur(recur);
js> factorial(6)
720.0

Or, as we’ve seen above:

js> recur(recur)(6)
720.0

Did you follow what happened during the recur(recur) call this time? From recur‘s definition, a call to recur(recur) returns an anonymous function like this (substituting the parameter with the actual argument):


function(n) {
    return fact(recur(recur))(n);
}

Let’s see what happens when this anonymous function is called: it returns the result of calling fact(recur(recur)) with argument n. Now what does fact(recur(recur)) evaluate to? If we go back to the definition of fact, it returns the following anonymous function:
function(num) {
    if(num < 2) {
        return 1;
    }
    return num * recur(recur)(num - 1);
};

which does the actual computation of the factorial. And when we reach this line:

return num * recur(recur)(num - 1);

We see that we have recur(recur). Which we have shown above, to eventually evaluate back to this factorial-computing anonymous function itself. In other words, it is calling itself. Ladies and gentlemen, we have recursion!

Okay… So What?

Indeed. So what? In the last attempt, we still have to call recur(recur) before using it? Well, the difference is that we have separated the anonymous recursion mechanism from the factorial logic. So for instance, instead of hard-coding the call to fact inside recur, we can make it a parameter, like this:

var recurWrapper = function(f) {
    var recur = function(forRec) {
        return function(n) {
            return f(forRec(forRec))(n);
        }
    };
    return recur(recur);
};

Then we can use it like this:js> recurWrapper(fact)(6);
720.0

Eh, that’s better! No more of this passing self to self bit (because it’s wrapped inside the wrapper). And now we can tidy up recurWrapper a bit. Shortening parameter names a bit and naming the recur function (instead of assigning the anonymous function to a variable called recur) gives us this:

var recurWrapper = function(f) {
    function recur(forRec) {
        return function(n) {
            return f(forRec(forRec))(n);
        }
    };
    return recur(recur);
};

There is a better name for recurWrapper, and that is Y:
function Y(f) {
    function recur(r) {
        return function(n) {
            return f(r(r))(n);
        }
    };
    return recur(recur);
}

which is really the JavaScript version of the (applicative-order or not? we shall see) Y Combinator. And it works with any single argument anonymous function that is supposed to be recursive. For example, here’s Y with a function to compute the Fibonacci number:
var fibo = Y(function(f) {
    return function(n) {
        if(n <= 2) {
            return 1;
        } else {
            return f(n - 1) + f(n - 2);
        }
    }
});

Applicative Order Y Or Not (And What The Heck Is That?)?

Here’s a good explanation. In short, there are two ways of evaluation in programming languages. Applicative order is eager evaluation: arguments to a function are evaluated first before the function itself is executed. Whereas the normal order is lazy. Arguments to functions are evaluated when they need to be evaluated.

As such, there are two flavours of Y as well. The classic Y Combinator works when we’re using normal order of evaluation, but will hang when the evaluation is applicative order (just like in JavaScript, which evaluates the arguments first before a function is called). This normal Y Combinator is defined as such in lambda calculus:

Y = λf·(λx·f (x x)) (λx·f (x x))

which is closer to this:

function normalY(f) {
    function recur(x) {
        return f(x(x));
    }
    return recur(recur);
}

Which will hang, obviously, if you think in the applicative order way of thinking. The Y we just derived earlier, on the other hand, is applicative order. Note the difference in lambda calculus definition (the difference is in bold italic):

Z = λf. (λx. f (λy. x x y)) (λx. f (λy. x x y))

and its corresponding JavaScript version:

 

function applicativeOrderY(f) {
    function recur(x) {
        return function(y) {
            return f(x(x))(y);
        }
    };
    return recur(recur);
}

Right. Okay. So What’s In It For Me?

Yeah. That’s it. What’s in it for me beyond the “oh, neat” factor? Er. Frankly, I’m not sure. In the real world, if I need to write a recursive function, I will just give it a bloody name. Like alucard(). And doing Y combinator in your JavaScript codebase will probably piss off a web designer who has the misfortune of maintaining your code in the future.

I guess the main benefit of this whole exercise is that I feel good about understanding the Y combinator at last. It won’t make me a better programmer, at least in the short run, but heck. Having an iPod also doesn’t make your life any better other than making you feel good. So there.

UPDATE: I was surprised to see a big jump in my blog stat! Turned out that Matt Jaynes submitted this article to Y Combinator Startup News, and then linuxer submitted it to programming subreddit!

Christophe Grand and others in reddit and news.ycombinator pointed out that JavaScript has a built-in way of doing this using arguments.callee (see also Christophe’s comment below for a short example of how this is done). My intention was to derive the Y Combinator using a language with which a lot of people (including myself) are familiar (that is, JavaScript), instead of answering the question “how does one make an anonymous function call itself in JavaScript?”, but thanks anyway, guys!

Chris Rathman pointed out here that I’m still using a named function for my definition of Y(). Here’s my definition of Y again:

function Y(f) {
    function recur(r) {
        return function(n) {
            return f(r(r))(n);
        }
    };
    return recur(recur);
}

This definition is probably easier to understand because it uses JavaScript constructs that are familiar to most people, but like Chris said, we can go for the fully anonymous variant. Douglas Crockford also has a fully anonymous version in his The Little JavaScripter page, but let’s see how we can get to there from the definition we’ve seen in this article.First of all, remember that in JavaScript, we can define and call a function at one go like this:js> var y = function(x) { return x * x; }(2);
js> y
4.0

So with that in mind, we can replace the last line “return recur(recur);” with an anonymous function that wraps around it like this:
var Y = function(f) {
    return function(recur) {
        return recur(recur);
    }( /* we must pass something here */ );
};

And now what’s left, is to call this anonymous function directly, passing the (anonymized) body of the recur function in my original definition of Y. Like this:
var anonY = function(f) {
    return function(recur) { return recur(recur); } (
        function (r) {
            return function(n) {
                return f(r(r))(n);
            }
        }
    );
};

Which looks a little different from Douglas Crockford’s version (which is more similar to the one Chris posted in reddit), but they’re really different ways of saying the same thing. Man! This whole thing surely has been real educational for me :) (And I hope for you too!) Thanks very much, everyone!

Written by rayfd

May 6, 2007 at 5:37 am

Posted in Java, JavaScript, Technology

I’m Singing 99 Bottles of Beer!

leave a comment »

Man! Just when I thought things couldn’t get any better with the publication of my article, the gentlemen at 99 Bottles of Beer sent me a mail, telling me that my E4X submission had been accepted! Damn that feels good :)

Written by rayfd

April 15, 2007 at 11:08 am

Posted in Java, JavaScript, Technology

My MSDN Magazine JavaScript Article, Published!

with 2 comments

Woohooo!!! Few things are more satisfying than seeing your article being published in a respectable magazine like MSDN Magazine. Without further ado, ladies and gentlemen, I present you with… JavaScript: Create Advanced Web Applications With Object-Oriented Techniques.

In fact, let me be cheesy for a while and save a screenshot of MSDN Magazine with my name on it to, er, show my future children or something…

D

Written by rayfd

April 12, 2007 at 2:30 pm

Posted in JavaScript, Technology

Singing 99 Bottles of Beer With E4X (And How Mozilla Lied To Me)

leave a comment »

How would you generate a song about 99 bottles of beer programmatically? This site holds a collection of the song in 1071 languages (and counting!) at the time of this writing. I was thinking of submitting a tail recursive implementation in JavaScript… but that’s kinda boring. Besides, the submission page says: “Your example should demonstrate the main advantages and features of the language“. Hmmm. JavaScript and tail recursion are kinda too… well-known.

And then it hit me: nobody has submitted E4X implementation of the Beer Song! E4X is one of the more interesting new features of Rhino: JavaScript for Java, which was excluded from the version of Rhino that comes bundled with Java SE 6.0. Luckily, getting E4X is not difficult. In minutes I’m ready to do my beer song.

Setting E4X In 3 Simple Steps

  1. Download Rhino here. By the way, ignore what they say about how you need xbean.jar from XMLBeans. They lied to you, you need an extra jar to get it working. I banged my head against the wall for about an hour wondering what could I’ve done wrong with such a simple line: java -cp xbean.jar;js.jar org.mozilla.javascript.tools.shell.Main, until I found out that another JAR was required. Grrr.
  2. E4X requires xbean.jar AND jsr173_1.0_api.jar from XMLBeans. You can download it here, and get the 2 JARs from the “lib” directory.
  3. Type the following command (assuming that all those 3 JARs are in the same directory):

java -cp js.jar;xbean.jar;jsr173_1.0_api.jar org.mozilla.javascript.tools.shell.Main

And you’re set. Try this out at the prompt:

js> var me = <person><name>Ray</name></person>;
js> me.name
Ray

Shweet!

Beer Drinkin’ Rhino

OK, once we’ve configured this thing, the rest is easy. Here’s the code, in it’s full glory:

var rawLyrics = readUrl(“http://99-bottles-of-beer.net/lyrics.html”);

var wellFormedRawLyrics = rawLyrics.replace(/<br>/gi, “<br/>”);

var xmlLyrics = new XML(wellFormedRawLyrics);

default xml namespace = xmlLyrics.namespace();

var songVerses = xmlLyrics..*.(/^99 bottles of beer/.test(p.text()));

for each(var verse in songVerses.children()) {

var verseText = verse.text().toString().replace(“\.”, “.\n”);

print(verseText + “\n”);

}

I saved this snippet into a file, and run the following command:

java -cp js.jar;xbean.jar;jsr173_1.0_api.jar org.mozilla.javascript.tools.shell.Main -f 99BottlesOfBeer_E4X.js

to generate (or rather, rip) the full song from the 99 bottles website. Let’s hope they’ll take in my submission! :)

Written by rayfd

April 7, 2007 at 6:54 am

Posted in Java, JavaScript, Technology

Slimming Your JavaScript Files Down

with one comment

When you have hundreds of thousands of users accessing your website, even a seemingly small difference in file size can add up to eat a LOT of bandwidth. A friend told me that in his company, they even wrote their own JavaScript library from scratch to squeeze away those tens of kilobytes (which amounts to saving a lot of bandwidth money in their case).

Personally, I wouldn’t do such a thing, though. I have more faith on widely tested libraries like this one instead of a homegrown JavaScript library written specifically for size. So I googled around a bit for JavaScript compressors, of which there are many. But in this post I’ll just take a look at two of them.

One Can Never Be Too Slim

Prototype 1.5.0 is 70kb. Prototype 1.5.1 rc2 is 92kb in size. I think it’s quite reasonable to assume that by the time it’s released it’ll probably break the 100kb barrier. Some slimming sessions are certainly in order here! Let’s take a look at a few alternatives. Before slimming down Prototype, I’ll test the minifiers discussed in this article using my own JavaScript-Commons StringUtils, simply because I have the unit test written for it and I can check for the minification accuracy.

JSMin, The JavaScript Minifier

JSMin is written by none other than Douglas Crockford. After you’ve downloaded it, running it is easy. My StringUtil.js is originally 17,415 bytes. After running it:

C:\Documents and Settings\schemer\jsmin>jsmin <StringUtils.js >MinifiedStringUtis.js

C:\Documents and Settings\schemer\jsmin>dir

03/31/2007 11:52 AM 17,415 StringUtils.js
03/31/2007 12:27 PM 7,379 MinifiedStringUtils.js

Ain’t that sweet? JSMin reduces the size of StringUtils by more than half. But a small JS file is of no use if it’s broken…

And turned out JSMin’s output passed all StringUtils’s JSUnit tests with flying colours. Sweet! It surely looks ugly though…

Minified StringUtils

Now let’s try it with Prototype 1.5.0 and 1.5.1. The result:

C:\Documents and Settings\schemer\jsmin>jsmin <prototype-1.5.0.js >prototype-min-1.5.0.js

C:\Documents and Settings\schemer\jsmin>jsmin <prototype-1.5.1_rc2.js >prototype-min-1.5.1_rc2.js

C:\Documents and Settings\schemer\jsmin>dir

03/31/2007 12:47 PM 71,260 prototype-1.5.0.js
03/31/2007 12:49 PM 54,193 prototype-min-1.5.0.js

03/31/2007 12:47 PM 94,024 prototype-1.5.1_rc2.js
03/31/2007 12:49 PM 70,353 prototype-min-1.5.1_rc2.js

Hmmm, not bad. 24% and 25% for 1.5.0 and 1.5.1_rc2, respectively. Douglas Crockford is a very intelligent guy, but the fact that JSMin does its minification using text manipulation means that we can’t be sure that the result of JSMin will be 100% functionally identical to the original file. For minification fidelity, we have Dojo ShrinkSafe.

Shrink Your JavaScript Safely

First of all, if all you care about is just slimming your JavaScript file RIGHT NOW, go here to do it without any download or configuration whatsoever. The compression result of the three files above are shown below (note that the online compressor seems to have a maximum file name length set to 31–as you can see the name of the shrunk prototype 1.5.2 gets truncated in the beginning):

C:\Documents and Settings\schemer\shrinksafe>dir

03/31/2007 11:52 AM 17,415 StringUtils.js
03/31/2007 12:27 PM 7,379 MinifiedStringUtils.js
04/01/2007 10:18 AM 6,762 StringUtils.compressed.js

03/31/2007 12:47 PM 94,024 prototype-1.5.1_rc2.js
03/31/2007 12:49 PM 70,353 prototype-min-1.5.1_rc2.js
04/01/2007 10:22 AM 64,332 ototype-1.5.1_rc2.compressed.js (truncated name?)

03/31/2007 12:47 PM 71,260 prototype-1.5.0.js
03/31/2007 12:49 PM 54,193 prototype-min-1.5.0.js
04/01/2007 10:21 AM 49,059 prototype-1.5.0.compressed.js

So, Dojo ShrinkSafe actually outperforms JSMin in all cases. Not surprising, because instead of doing text manipulation, Dojo ShrinkSafe is based on Rhino, a true JavaScript engine. Which means that ShrinkSafe has much more knowledge about your JavaScript file compared to your regular text manipulation compressor–it knows with a lot more certainty the context of a variable, function, and so on.

Ah, so there. Next time I’m creating another release of JavaScript-Commons, I know which compressor I’ll be using.

Written by rayfd

April 1, 2007 at 2:37 am

Posted in Java, JavaScript, Technology

Follow

Get every new post delivered to your Inbox.