Archive for the ‘Java’ Category
But It Works On My Machine…
public class TestVolatile implements Runnable {
private boolean stopRequested = false;
public void run() {
while(!stopRequested) {
// do something here...
}
}
public void stop() {
stopRequested = true;
}
public static void main(String[] args) throws InterruptedException {
TestVolatile tv = new TestVolatile();
new Thread(tv, "Neverending").start();
Thread.sleep(1000);
tv.stop();
}
}
What happens when we run main()?
Of course, the erudite, tech-savvy, good-looking Java people (the type who read my blog), can answer that in a heartbeat.
The answer is that we don’t know for sure when the program will terminate, because the stopRequested variable is not marked volatile. So when we call stop() in the main thread, the Neverending thread may never see that stopRequested has changed to true, and it just keeps going… and going… and going… just like the Energizer Bunny. We never know when it’s gonna stop.
But… it works just fine on my machine
A friend of mine wasn’t convinced though, so just to prove it to him, I typed that in Eclipse and ran it, fully expecting that it would run forever.
It stopped in about one second. Hmmm. Weird.
Seems that the change made from the main thread to stopRequested is immediately visible from the Neverending thread, although stopRequested is not volatile.
Out of curiosity, I modified the program a bit to look like this:
import java.util.concurrent.TimeUnit;
public class TestVolatile implements Runnable {
private boolean stopRequested = false;
private volatile long justAfterStopRequested;
private volatile long afterNeverendingHasStopped;
public void run() {
while(!stopRequested) {
// do something
}
afterNeverendingHasStopped = System.currentTimeMillis();
}
public void stop() {
stopRequested = true;
justAfterStopRequested = System.currentTimeMillis();
}
public static void main(String[] args) throws InterruptedException {
TestVolatile tv = new TestVolatile();
Thread t = new Thread(tv, "Neverending");
t.start();
// let main thread sleep for 1 second before requesting
// Neverending to stop
TimeUnit.SECONDS.sleep(1);
tv.stop();
// wait until Neverending has stopped
t.join();
System.out.println(tv.afterNeverendingHasStopped - tv.justAfterStopRequested);
}
}
I’m trying to see how much time the Neverending thread needs to realize that stopRequested has changed to true, and get outside the loop. The result is 0.
But it can’t be instantaneous–there must be some difference in time. So I changed the System.currentTimeMillis() calls to System.nanoTime(). The result is virtually the same, ranging from around -300 to +300 nanoseconds (we can’t say which one gets modified first, justAfterStopRequested or afterNeverendingHasStopped–it differs on each run). But for all practical purposes, we can say that the Neverending thread sees the change to stopRequested almost immediately.
Strange.
In Effective Java, 2nd ed., Joshua Bloch says in his machine such a program never stops running. Not that I have a self-esteem problem or whatever, but when it comes to Java, if I have to choose whether to trust Joshua Bloch or me, I choose the former. Sorry, me.
On another machine, however…
But you know what? I found that TestVolatile did run forever on Solaris! (Well, it ran until I was back 15 minutes later and killed it anyway.) Is it because of the differences between Windows and Solaris? Not really–it’s more about the differences between the Server VM and the Client VM. My Solaris test platform is a server-class machine, so by default I was using the Server VM. Whereas on my Windows machine by default I was running the Client VM.
Indeed, when I ran the program again on Solaris with the Client VM (with the -client option), it stopped after roughly a second, without having to make stopRequested volatile. Conversely, when I ran the program on Windows with the -server option, it never stopped, and would only stop if I made stopRequested volatile.
This shows that the Client VM may deceive us into thinking that we don’t need volatile (until we run our program on a server-class machine and things start breaking in strange ways). Superficially, the Client VM and the Server VM may sound like they’re not that different. But some differences do matter: what we see here is a real example where the differences between the Client VM and the Server VM can make or break your application.
And that’s not all. There’s another non-obvious thing that may mask the need of volatile in our programs. Like System.out.println(), for example.
Hidden synchronized blocks sometimes hide the need for volatile
If we want to print a counter variable within the loop like this:
public class TestVolatile implements Runnable {
private boolean stopRequested = false;
private int i = 0;
public void run() {
while(!stopRequested) {
System.out.println(i++);
}
}
public void stop() {
stopRequested = true;
}
// the rest
}
Then the program will always terminate even if we don’t mark stopRequested as volatile. Why? Because there is a synchronized block inside System.out.println(). When the thread that runs the run() method calls System.out.println(), its copy of stopRequested gets updated with the latest value, and the while loop terminates.
Note though, that this is NOT guaranteed by the Java Memory Model. The JMM guarantees visibility between two threads that enter and exit synchronized blocks protected by the same lock. If the synchronized blocks are protected by different locks, then the only safe assumption is to assume that there’s no synchronization at all.
This program below does stop,
public class TestVolatile implements Runnable {
private boolean stopRequested = false;
private int i = 0;
private final Object lock1 = new Object();
private final Object lock2 = new Object();
public void run() {
while(!stopRequested) {
synchronized(lock1) {}
i++;
}
}
public void stop() {
stopRequested = true;
synchronized(lock2) {}
}
// the rest
}
even though the two threads are entering synchronized blocks guarded by different locks. We can even remove the synchronized block guarded by lock2 in stop(), and the program still stops. But not the other way around (that is, if we remove lock1’s synchronized block and leave lock2’s there, again the program will run forever).
So it seems that the thread that reads a variable gets the up-to-date value when it executes a synchronized block, regardless of the lock used to guard the block. Even if the variable is not volatile. Which means it’s possible to have code that used to work when you had System.out.println() calls sprinkled throughout suddenly stops working properly when you remove those calls!
Does volatile cascade to member variables? Array elements? Items in collections?
Now let’s say instead of a primitive boolean, stopRequested is a member variable of another class, like this:
import java.util.concurrent.TimeUnit;
public class TestVolatile implements Runnable {
private A wrapper = new A();
public void run() {
while(!wrapper.stopRequested) {
// do something
}
}
public void stop() {
wrapper.stopRequested = true;
}
public static void main(String[] args) throws InterruptedException {
TestVolatile tv = new TestVolatile();
new Thread(tv, "Neverending").start();
// let main thread sleep for 1 second before requesting
// Neverending to stop
TimeUnit.SECONDS.sleep(1);
tv.stop();
}
}
class A {
boolean stopRequested = false;
}
Run with the -server flag, this one never stops either. But what if we mark wrapper as volatile:
private volatile A wrapper = new A();
Does the “volatility” of the reference cascades to the member variable stopRequested as well? Turned out, looks like the answer is yes. We can either mark wrapper as volatile, or stopRequested as volatile, and the program will terminate in about a second.
I’m not surprised that the program terminates when I mark stopRequested as volatile. But why wrapper as volatile works as well? The same thing happens when we use an ArrayList, like this:
import java.util.ArrayList;
import java.util.List;
public class TestVolatile implements Runnable {
List<Boolean> wrapper = new ArrayList<Boolean>();
public TestVolatile() {
wrapper.add(Boolean.FALSE);
}
public void run() {
while(wrapper.get(0) == Boolean.FALSE) {
// do something
}
}
public void stop() {
wrapper.set(0, Boolean.TRUE);
}
// the rest of the code...
}
Run as it is, it never stops. If we mark the List<Boolean> wrapper as volatile, it stops pretty fast.
I wonder why? The object reference to that List instance itself never changes. We’re only changing an element within the List, and not the List itself. There’s no hidden synchronized block. Why does the Neverending thread sees the up-to-date value of stopRequested?
Above and beyond the call of duty
Before JSR 133, if thread A writes to volatile field f, thread B is guaranteed to see the new value of f only. Nothing else is guaranteed. With JSR 133 though, volatile is closer to synchronization than it was. Reading/writing a volatile field now is like acquiring/releasing a monitor in terms of visibility. As the excellent FAQ says: “… anything that was visible to thread A when it writes to volatile field f becomes visible to thread B when it reads f.”
But still, all these new guarantees for volatile doesn’t answer the question:
public class TestVolatile implements Runnable {
private volatile A wrapper = new A();
public void run() {
while(!wrapper.stopRequested) {
// do something
}
}
public void stop() {
wrapper.stopRequested = true;
}
// the rest
}
Why does this work? In stop() we’re not exactly writing to wrapper. We’re just using it to change the value of stopRequested. So why does the other thread see the change?
Unfortunately, in the examples I’ve seen so far in books and countless articles, a volatile field is always a primitive, so it’s kinda hard to find the answer to my question. So I did the only remaining way I know to proceed: asking The Concurrency Expert himself. And I was pleasantly surprised to find that he replied very quickly! Here’s what Brian Goetz said:
The Java Memory Model sets the minimum requirements for visibility, but all VMs and CPUs generally provide more visibility than the minimum. There’s a difference between “what do I observe on machine XYZ in casual use” and “what is guaranteed.”
So there. The VM in this case just goes above and beyond what it is supposed to do. But there’s no guarantee that on another VM and another CPU, the same thing will happen.
Conclusion
So here’s what I’ve learned from this little experiment:
- If your application is a server application, or will run on a server-class machine, remember to use the -server flag during development and testing as well to uncover potential problems as early as possible. The Server VM performs some optimizations that can expose bugs that do not manifest on the Client VM.
- Just because it works on your machine, doesn’t mean that it’ll work on other machines running other VMs too. It’s important to know what are exactly guaranteed, and code with those minimal guarantees in mind, instead of assuming that other VMs and OSes will be as forgiving as the ones you’re using for development.
- (This is closely related to #2 above.) Because VMs and CPUs generally provide more visibility than the minimum guaranteed by JSR 133, it’s good to know the extra things that they do that may mask a potential bug. For example, at least in some VMs, calling
System.out.println()forces the change to a non-volatile variable to be visible to other threads because it has a synchronized block inside. This can explain a bug that appears after you’ve made a seemingly unrelated change (that removes a synchronized block from the execution path, for instance).
When A Synchronized Class Isn’t Threadsafe
Every Java programmer has heard this advice before: “Prefer ArrayList over Vector. Vector is fully synchronized, and as such you’re paying the synchronization penalty even when you don’t need it.”
ArrayList is not synchronized, so when you need it you need to perform synchronization yourself, or alternatively, as the ArrayList javadoc entry says: “… the list should be “wrapped” using the Collections.synchronizedList method.” Something like this:
List list = Collections.synchronizedList(new ArrayList());
The resulting List will be synchronized, and therefore can be considered safe.
Or is it?
Not really. Consider the very contrived example below.
final List<String> list = Collections.synchronizedList(new ArrayList<String>());
final int nThreads = 1;
ExecutorService es = Executors.newFixedThreadPool(nThreads);
for (int i = 0; i < nThreads; i++) {
es.execute(new Runnable() {
public void run() {
while(true) {
try {
list.clear();
list.add("888");
list.remove(0);
} catch(IndexOutOfBoundsException ioobe) {
ioobe.printStackTrace();
}
}
}
});
}
As long nThreads is 1, everything runs just fine. However, increase the number of nThreads to 2, and you start getting this:
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.RangeCheck(Unknown Source)
at java.util.ArrayList.remove(Unknown Source)
at java.util.Collections$SynchronizedList.remove(Unknown Source)
Changing the synchronized List to Vector doesn't help either. What happened here? Well, individual method calls of synchronized List and Vector are synchronized. But list.add() and list.remove() can be called in any order between the 2 threads. So if you print list.size() after list.add(), the output is not always 1. Sometimes it's 0, sometimes it's 2. Likewise, thread 1 may call list.add(), but before it gets a chance to call list.remove(), thread 2 gets into action and calls list.clear(). Boom, you get IndexOutOfBoundsException.
In that example above, the 3 calls to List's methods have to be atomic. They must happen together as one unit, no interference from other threads, or else we'll get the IndexOutOfBoundsException again. The fact that the individual methods are synchronized is irrelevant. In fact, we can go back to using the non-synchronized ArrayList, and the program will work, as long as we synchronize properly to make the 3 calls happen as one atomic, indivisible unit of execution:
synchronized (list) {
list.clear();
list.add("888");
list.remove(0);
}
The moral of the story is that just because a class is fully synchronized, doesn't mean it's threadsafe (UPDATE: as in doesn't mean your code will be threadsafe from using it--thanks Alex). You still have to be on the look for those sequence of method calls that have to occur atomically, because method level synchronization won't help in this regard. In other words, watch what you're doing. (And yes, we should still prefer ArrayList over Vector.)
Threadsafe Iteration & ConcurrentModificationException
Sometimes it's not so obvious when exactly we're supposed to synchronize our use of Collections. Ever encountered a ConcurrentModificationException before? I bet it's probably because your code looks something like this (a.k.a.: why the for-each loop isn't such a great idea actually):
final List<String> list = new ArrayList<String>();
list.add("Test1");
list.add("Test2");
list.add("Test3");
for(String s : list) {
if(s.equals("Test1")) {
list.remove(s);
}
}
ConcurrentModificationException will be thrown in this case, even when there's only a single thread running. To fix this problem, we can't use the for-each loop since we have to use the remove() method of the iterator, which is not accessible within the for-each loop. Instead we have to do this:
for(Iterator<String> iter = list.iterator(); iter.hasNext();) {
String s = iter.next();
if(s.equals("Test1")) {
iter.remove();
}
}
The point is that iteration is something that we'd probably want to happen atomically--that is, while we're iterating over a collection, we don't want other threads to be modifying that collection. If it happens most probably something is wrong with our design.
This is why if you look into the JDK source code, the implementation for Iterator usually checks the expected modification count (i.e.: how many times is this collection supposed to have been modified?) against the collection's current modification count (how many times this collection has been modified). If they don't tally, the Iterator assumes that another thread has modified the collection while the iteration is going on, so it throws the ConcurrentModificationException. (This is exactly what happens in our single-threaded case about too by the way--the call list.remove() increases the modification count such that it no longer tallies with the one that the iterator holds (whereas iter.remove() resets the mod count so they still tally.) ConcurrentModificationException is a useful exception--it informs us of a probable fault in our design.
When Volatile Fails
However, ConcurrentModificationException is not 100% reliable. The implementation of Iterator.next() may look something like this:
public E next() {
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
// other stuff
That is, ConcurrentModificationException is supposed to be thrown when the mod counts don't tally. But it may not get thrown because the thread on which the modCount check is running is seeing a stale value of modCount. That is, let's say you have thread A iterating through a collection. Whenever you call iter.next(), it checks that modCount == expectedModCount. But modCount may have been modified by thread B, and yet A is still seeing the unmodified value. If you remember, this is what the volatile keyword is about--it is to guarantee that a thread will always see the most recent value of a variable marked as such.
So why didn't Joshua Bloch (or whoever took his place in Sun to take care of the Collections API) mark the modCount volatile? That would at least make the concurrent modification detection mechanism more reliable, yes? Well... no. Actually marking modCount volatile won't help, because although volatile guarantees uptodateness, it doesn't guarantee atomicity.
What does that mean? Well, if you examine the implementation of ArrayList, you'll see that methods that modify the list increment the modCount (non-volatile) variable by one (i.e.: modCount++). So theoretically, if we mark modCount as volatile, whenever thread B says modCount++, thread A should always immediately see the value and throws ConcurrentModificationException.
But there is a problem: the increment operator (++) is not atomic. It is actually a compound read-modify-write sequence. So while thread B is in the middle of executing modCount++, it's entirely possible that the thread scheduler will kick thread B out and decide to run thread A, which then checks for modCount before B has a chance to write back the new value of modCount.
Hidden Iteration
As if things aren't hairy enough as they are, it's not always obvious when an iteration over a collection is happening. Sure, it's probably pretty easy to spot iteration code we've written ourselves, so we can synchronize those. However, much less obvious are the iterations that happen within the harmless-looking methods of the Collections API. If you examine the source code of java.util.AbstractCollection class, for example, you'll see that methods like contains(), containsAll(), addAll(), removeAll(), retainAll(), clear()... practically almost all of them trigger an iteration over the collection. Iteration suddenly becomes a LOT harder to spot!
So What Do We Do?
Sounds pretty hopeless, isn't it? Well... nah. A very, very smart CS professor named Doug Lea has figured it out for the rest of us. He came up with concurrent collection classes, which handle the problems listed above for the most common cases. These concurrent collections have been part of the standard Java API in java.util.concurrent package. For most cases, they are drop-in replacement for the corresponding non-threadsafe classes in java.util package, and if you haven't taken a good look at them, it's time that you do!
Don’t Force Premature Processing in Your Logging Statements
My transformation into a nagging old man is becoming more and more complete everyday I see something like this sprinkled liberally throughout the code:
// some code
log.debug("The result: " + doSomethingReallyExpensive());
// some code
The reason is that even when the log level is not DEBUG and that string ends up not being used at all, doSomethingReallyExpensive() is still evaluated, its result is still being concatenated with the string “The result: “, only to be discarded soon after. In other words, we’re wasting cycles evaluating something that is not going to be used at all, except in DEBUG mode. I’ve worked in a project where fixing these wasteful premature processing improved the performance by more than 30%.
(This doesn’t apply only to Commons Logging, which I used in the examples in this post. Here’s a similar entry in Log4j FAQ. It’s a general case of Java evaluating a method’s arguments first before executing the method itself. The same applies to C# applications using log4net, for instance.)
Fortunately, fixing it is easy. Just check whether we’re in DEBUG mode first:
if(log.isDebugEnabled()) {
log.debug("The result: " + doSomethingReallyExpensive());
}
This way, doSomethingReallyExpensive() is only evaluated when it is needed, that is, in DEBUG mode. (Of course, it’s good to check whether doSomethingReallyExpensive() has side effects first! Or else other parts that depends on its side effects may stop working because it is no longer called when we turn off DEBUG. But then again, anybody who relies on the evaluation of logging arguments to get side effects should really be sent to Timbuktu. No, not the real Timbuktu. The one Donald Duck often goes to.)
10 Eclipse Navigation Shortcuts Every Java Programmer Should Know
Man, I’m such an impatient guy. I cringe whenever I see somebody squint and frown, looking for a JSP file in Eclipse by browsing painfully through the gazillion JSPs in multiple folders in the Package Explorer. I squirm whenever I see somebody looking for a Java class by clicking through packages, one by one, backtracking if it’s the wrong package, and so on, until he sees the correct Java class.
I mean, any resource in the workspace is literally seconds away. Ditto to classes (and interfaces, and members, and so on). Why waste time and brain cycles to wade through countless lines in countless files? I thought that every Eclipse user knows this, in fact, if you’re reading this, most probably you already know this too. But thousands of Eclipse JDT users who never bother to read tech blogs in all probability will also never bother to find out what Eclipse can do for them. And it’s a pity, really, because they’re really missing out a lot. So maybe if you know one, you can forward this to them or something. Make them more productive or something, ya know. 30 seconds saved for every file can add up to really a lot!
So without further ado, let’s say you want to:
- Open any file quickly without browsing for it in the Package Explorer: Ctrl + Shift + R. This shortcut opens a dialog box that accepts the name of the file you’re looking for. It even accepts wildcard characters, yo. Typing *-conversion.properties will give you the list of all files that ends with -conversion.properties. So everytime you want to open a file–stop that hand from going to the mouse, and press Ctrl + Shift + R instead!
- Open a type (e.g.: a class, an interface) without clicking through interminable list of packages: Ctrl + Shift + T. If what you want is a Java type, this shortcut will do the trick. Unlike the previous shortcut, this even works when you don’t have the Java source file in your workspace (e.g.: when you’re opening a type from the JDK).
- Go directly to a member (method, variable) of a huge class file, especially when a lot of methods are named similarly: Ctrl + O. Say, you’re browsing through a file which has 500+ lines of code. How do you look for a method? Don’t use Ctrl + F and then type the method name. Use Ctrl + O, which gives you a list of candidates that match what you’ve typed so far. Select the member you want using the arrow keys, and press Enter. (Alternatively, if you just want to jump from one member to the next (or previous), you can use Ctrl + Shift + ↓ or Ctrl + Shift + ↑, respectively.) UPDATE: As Nick pointed out in the comments section, pressing Ctrl + O again shows the inherited members. Thanks Nick!
- Go to line number N in the source file: Ctrl + L, enter line number. Of course if the stack trace is in the Eclipse console, you can just click the hyperlink. But if it’s in a log file or something, just use this shortcut to go to the line in a jiffy.
- Go to the last edit location: Ctrl + Q for . If you have a big file, it’s annoying to jump from one location in line 1000+ to 2000+ only to realize after looking at line 2017 that you’ve made a mistake in that location near line 1000+ just now. This shortcut brings you right to where you last edited a file. Very handy in a big file. Gone are the days of “let’s see… where did I edit it again… nope, nope… ah there it is”. (This even works when you’re already looking at a different file.)
- Go to a supertype/subtype: Ctrl + T. Before I found this, if I want to go to the superclass of a class, I’d go the the very top of the file, hover my mouse over its superclass, hold Ctrl, and click. Disgusting. Now I just press Ctrl + T and I get this dialog below, which toggles between supertypes and subtypes when you press Ctrl + T again.
- Go to other open editors: Ctrl + E. I know you can cycle through the editors using Ctrl + F6 as well, but I prefer Ctrl + E because Ctrl + F6 has this annoying behaviour of requiring you to keep the Ctrl key down, and the distance between Ctrl and F6 is so far I have to twist my left hand to do that. Just press Ctrl + E, and either use the arrow buttons, or type the name of the file you’re editing.
- Move to one problem (i.e.: error, warning) to the next (or previous) in a file: Ctrl + . for next, and Ctrl + , for previous problem. No need to lift your hands off the keyboard to click on that red or yellow stripe.
- Hop back and forth through the files you have visited: Alt + ← and Alt + →, respectively. I have to admit I don’t find myself using these two often, though.
- Go to a type declaration: F3. Alternatively, you can hold Ctrl down and click the hyperlinked variable or class or whatever it is the declaration of which you want to see–but why lift your hand off the keyboard? Just press F3 and Eclipse will bring you to the declaration of whatever is at the cursor at that moment.
OK, that’s it for this post. There are tons of other Eclipse shortcuts not covered by this article. To see the whole list, just open up your Eclipse (I’m assuming Eclipse 3.2 here–in older or more recent versions this may differ slightly), go to Help → Help Contents → Java Development User Guide → Reference → Menus and Actions. The whole motherload is there, from generating comments, correcting indentations, surrounding with, and so on.
The point I’m trying to get across is: Eclipse has a LOT of shortcuts to make things real easy for you. Java (or heck, any software) development is hard. We shouldn’t make it harder on ourselves by fighting our tools! Let our tools help us as much as possible, so we all can go back on the dot and spend more time with our family, lovers, or whatever it is we want to spend more time on. There’s no honour in working hard inefficiently. Only disgrace.
Y Combinator for Dysfunctional Non-Schemers
A lot of software developers don’t come from a Computer Science background. I think in the long run this doesn’t matter, since I’ve seen a lot of CS grads who have completely forgotten things that are supposed to set them apart from the rest of us (“Lisp? You mean that AI language with a lot of parentheses? Yeah I used it before in uni. So?”). Besides, a lot of CS grads can’t program anyway. Plus, if one really cares about the programming craft, during his/her journey of ever improving his/her efficiency and effectiveness as a software programmer, one tends to come full circle and go back to the root, which is made of the stuff CS grads are forced to read about in university.
Now one of the thing that I find really interesting, yet had baffled me for a long time, is the Y Combinator (no, not Paul Graham’s company). Maybe CS students eat Y Combinator for breakfast. But I graduated as an electrical engineer. It’s only recently, when my programming self-improvement routine brought me to study Lisp, Scheme, and recursion in greater details, that I came across this strange Y thing that so many very smart people, like this guy, this guy, and this guy have written about. Before this I’ve never heard of it in my life. It’s like I’m trying to digest what I think is a very cool and profound concept, and then I come across these mental landmines like “pass the function as the first argument to itself”, and my brain will just explode and I have to restart all over again.
What is Y Combinator, exactly? Why does it work? I do real-world applications in Java/C#/C++/JavaScript or whatever, I don’t do Scheme for a living. What’s in it for me? Is it just a cool idea with no practical applications whatsoever? I find that the best way for me to understand something is to write about it. So here it is.
JavaScript: Lisp in C’s Clothing
All examples will be in JavaScript. Examples in Lisp/Scheme can be difficult to read, especially when you’re not used to the language yet. Writing the examples and illustrations in the familiar C-syntax will make it easier for me (and you, if you’re one of the two or three people who are reading this). The examples won’t be in Java, because Java’s obsession with nouns will make it awkward to write them. The fact that functions are first-class in JavaScript makes things a LOT easier. (If you need a short intro of JavaScript’s true capabilities, I’d shamelessly recommend this article.) So, with that out of the way, let’s start!
The Problem
Let’s start by a simple basic recursive function: factorial. Simplest function in the world, right? It’s like the Hello World of recursion.
function factorial(num) {
if(num < 2) {
return 1;
}
return num * factorial(num - 1);
}
Of course, functions are first-class in JavaScript, so we could’ve written it like this:
var factorial = function(num) {
if(num < 2) {
return 1;
}
return num * factorial(num - 1);
};
But now we have a potential problem, don’t we? The recursive call to factorial within the method only works because we happen to name the variable “factorial” as well. Should we name the variable differently, say, “fact”, instead of “factorial”:
var fact = function(num) {
if(num < 2) {
return 1;
}
return num * factorial(num - 1);
};
Then we get an error, because when we call “factorial(num – 1)”, the name “factorial” is not bound to anything. We can fix it for this case by changing the call to “fact(num – 1)”, of course, but this approach is a quick fix that doesn’t work, because this function can be assigned to any variable of any name.We have a problem that can be summed up thus: a recursive function is a function that calls itself. But an anonymous function has no name. So… how is it supposed to call itself for recursion?
First Attempt
(If you’re thinking: “Just give the bloody function a bloody name so you can make it recursive and get on with your life!”, I can’t say I totally disagree with you at this point. But anyway.)
So what can we do here? Well for one, we can keep the function anonymous, and get the name for recursion from a parameter passed to the anonymous function, like this:
var fact = function(forRec, num) {
if(num < 2) {
return 1;
}
return num * forRec(forRec, num - 1);
};
Then when we want to use it, we just pass the name of the function to itself, like this:
js> fact(fact, 0)
1.0
js> fact(fact, 1)
1.0
js> fact(fact, 2)
2.0
js> fact(fact, 3)
6.0
No matter what the name is, as long as we keep passing the same name as the first parameter, we’ll be OK–the anonymous function will be correctly calling itself.
But… OK, Second Attempt
The solution kinda works. But it’s not nice, requiring your users to your function name twice everytime they want to use it. Besides, now the code becomes less clear–everybody knows factorial, but this self-passing-to-self business is obfuscating the code. I believe we can do better. Let’s try to separate the “passing a function to itself” bit from the “calculate factorial of” bit, by currying it. Like this:
var createFact = function(forRec) {
return function(num) {
if(num < 2) {
return 1;
}
return num * (forRec(forRec))(num - 1);
};
};
(Currying, or Schönfinkelisation, is a lesson to all of us to choose names that are easy to spell, remember, and pronounce. Or else you may invent something and the other guy with the catchier name–who can compete with Curry?–gets the credit.)In the snippet above, the outer anonymous function (the one with forRec as a parameter) returns another anonymous function (the one accepting parameter num). The latter is very similar to our original factorial function (remember that our objective is to separate the passing-function-to-itself bit from the factorial bit), except for the bit in green:
(forRec(forRec))(num - 1);
That line is where the inner function needs to recurse. But instead of requiring a name to recurse, it calls the outer function… which returns the inner function itself. And that returned inner function is in turn called, with “num – 1″ as its argument. There we have our recursion.So now we have a slightly cleaner solution. We can use the outer function to create the inner function like this:
js> var factorial = createFact(createFact);
js> factorial(10)
3628800.0
Note that this is equivalent to this one-liner:
js> createFact(createFact)(10)
3628800.0
Hmmm. The code for the factorial function is still polluted, though. Let’s try to take out the anonymous recursion part from the factorial function entirely.
Third Attempt: Wrap, and wrap, and wrap, and wrap…
Our second attempt is still not as clean as we want it to be. Ideally, we want to separate the part that takes care of the anonymous recursion, from the part that actually does the factorial computation. Let’s see our last function again:
var createFact = function(forRec) {
return function(num) {
if(num < 2) {
return 1;
}
return num * (forRec(forRec))(num - 1);
};
};
The only difference between the inner function and a typical factorial function is the recursive part. Let’s try to take that forRec(forRec) bit out–that is, instead of doing it inside, let’s see if we can do it outside and pass it in as a parameter. Here’s the function again, with the forRec(forRec) taken out of the picture:
var fact = function(rec) {
return function(num) {
if(num < 2) {
return 1;
}
return num * rec(num - 1);
};
};
And like I said above, we do the forRec(forRec) outside, and then pass it to the fact function:
var recur = function(forRec) {
return function(n) {
return fact(forRec(forRec))(n);
}
};
Then we use it like this:js> var factorial = recur(recur);
js> factorial(6)
720.0
Or, as we’ve seen above:
js> recur(recur)(6)
720.0
Did you follow what happened during the recur(recur) call this time? From recur’s definition, a call to recur(recur) returns an anonymous function like this (substituting the parameter with the actual argument):
function(n) {
return fact(recur(recur))(n);
}
Let’s see what happens when this anonymous function is called: it returns the result of calling fact(recur(recur)) with argument n. Now what does fact(recur(recur)) evaluate to? If we go back to the definition of fact, it returns the following anonymous function:
function(num) {
if(num < 2) {
return 1;
}
return num * recur(recur)(num - 1);
};
which does the actual computation of the factorial. And when we reach this line:
return num * recur(recur)(num - 1);
We see that we have recur(recur). Which we have shown above, to eventually evaluate back to this factorial-computing anonymous function itself. In other words, it is calling itself. Ladies and gentlemen, we have recursion!
Okay… So What?
Indeed. So what? In the last attempt, we still have to call recur(recur) before using it? Well, the difference is that we have separated the anonymous recursion mechanism from the factorial logic. So for instance, instead of hard-coding the call to fact inside recur, we can make it a parameter, like this:
var recurWrapper = function(f) {
var recur = function(forRec) {
return function(n) {
return f(forRec(forRec))(n);
}
};
return recur(recur);
};
Then we can use it like this:js> recurWrapper(fact)(6);
720.0
Eh, that’s better! No more of this passing self to self bit (because it’s wrapped inside the wrapper). And now we can tidy up recurWrapper a bit. Shortening parameter names a bit and naming the recur function (instead of assigning the anonymous function to a variable called recur) gives us this:
var recurWrapper = function(f) {
function recur(forRec) {
return function(n) {
return f(forRec(forRec))(n);
}
};
return recur(recur);
};
There is a better name for recurWrapper, and that is Y:
function Y(f) {
function recur(r) {
return function(n) {
return f(r(r))(n);
}
};
return recur(recur);
}
which is really the JavaScript version of the (applicative-order or not? we shall see) Y Combinator. And it works with any single argument anonymous function that is supposed to be recursive. For example, here’s Y with a function to compute the Fibonacci number:
var fibo = Y(function(f) {
return function(n) {
if(n <= 2) {
return 1;
} else {
return f(n - 1) + f(n - 2);
}
}
});
Applicative Order Y Or Not (And What The Heck Is That?)?
Here’s a good explanation. In short, there are two ways of evaluation in programming languages. Applicative order is eager evaluation: arguments to a function are evaluated first before the function itself is executed. Whereas the normal order is lazy. Arguments to functions are evaluated when they need to be evaluated.
As such, there are two flavours of Y as well. The classic Y Combinator works when we’re using normal order of evaluation, but will hang when the evaluation is applicative order (just like in JavaScript, which evaluates the arguments first before a function is called). This normal Y Combinator is defined as such in lambda calculus:
Y = λf·(λx·f (x x)) (λx·f (x x))
which is closer to this:
function normalY(f) {
function recur(x) {
return f(x(x));
}
return recur(recur);
}
Which will hang, obviously, if you think in the applicative order way of thinking. The Y we just derived earlier, on the other hand, is applicative order. Note the difference in lambda calculus definition (the difference is in bold italic):
Z = λf. (λx. f (λy. x x y)) (λx. f (λy. x x y))
and its corresponding JavaScript version:
function applicativeOrderY(f) {
function recur(x) {
return function(y) {
return f(x(x))(y);
}
};
return recur(recur);
}
Right. Okay. So What’s In It For Me?
Yeah. That’s it. What’s in it for me beyond the “oh, neat” factor? Er. Frankly, I’m not sure. In the real world, if I need to write a recursive function, I will just give it a bloody name. Like alucard(). And doing Y combinator in your JavaScript codebase will probably piss off a web designer who has the misfortune of maintaining your code in the future.
I guess the main benefit of this whole exercise is that I feel good about understanding the Y combinator at last. It won’t make me a better programmer, at least in the short run, but heck. Having an iPod also doesn’t make your life any better other than making you feel good. So there.
UPDATE: I was surprised to see a big jump in my blog stat! Turned out that Matt Jaynes submitted this article to Y Combinator Startup News, and then linuxer submitted it to programming subreddit!
Christophe Grand and others in reddit and news.ycombinator pointed out that JavaScript has a built-in way of doing this using arguments.callee (see also Christophe’s comment below for a short example of how this is done). My intention was to derive the Y Combinator using a language with which a lot of people (including myself) are familiar (that is, JavaScript), instead of answering the question “how does one make an anonymous function call itself in JavaScript?”, but thanks anyway, guys!
Chris Rathman pointed out here that I’m still using a named function for my definition of Y(). Here’s my definition of Y again:
function Y(f) {
function recur(r) {
return function(n) {
return f(r(r))(n);
}
};
return recur(recur);
}
This definition is probably easier to understand because it uses JavaScript constructs that are familiar to most people, but like Chris said, we can go for the fully anonymous variant. Douglas Crockford also has a fully anonymous version in his The Little JavaScripter page, but let’s see how we can get to there from the definition we’ve seen in this article.First of all, remember that in JavaScript, we can define and call a function at one go like this:js> var y = function(x) { return x * x; }(2);
js> y
4.0
So with that in mind, we can replace the last line “return recur(recur);” with an anonymous function that wraps around it like this:
var Y = function(f) {
return function(recur) {
return recur(recur);
}( /* we must pass something here */ );
};
And now what’s left, is to call this anonymous function directly, passing the (anonymized) body of the recur function in my original definition of Y. Like this:
var anonY = function(f) {
return function(recur) { return recur(recur); } (
function (r) {
return function(n) {
return f(r(r))(n);
}
}
);
};
Which looks a little different from Douglas Crockford’s version (which is more similar to the one Chris posted in reddit), but they’re really different ways of saying the same thing. Man! This whole thing surely has been real educational for me
(And I hope for you too!) Thanks very much, everyone!







