February 27, 2007

What is the point of macros?

One of the distinguishing features of Lisp and Scheme is the ability to define macros that allow you to extend the base language with new language constructs. Here are two examples what this actually means and why this is a good idea.


Let's take a common idiom from Java: Whenever you want to open a file, process its contents, and close it again afterwards, you have to write code roughly like this:

FileInputStream in = new FileInputStream(filename);
try {
doSomething(in);
} finally {
in.close();
}

Note how the orange code is the code that you are actually interested in: That's the part that states which file you want to open, that you want to open it for reading, and that you want to do something with the file. The rest is just idiomatic code that you have to write just to make the machinery work: You declare a variable outside of the try block (this is important for the visibility of the variable!), and you have to put the call of the close method in a finally block to ensure that the file is always closed even when an exception occurs.

The problem here is that you have to write this code over and over and over again, and everytime you write this code, you have to get the details right. For example, when I was still a Java hacker, I always have put the variable declaration inside the try block, just to get the error message from the compiler that this is wrong. This is annoying - it would be nice if I didn't have to think about this. (I have actually gotten it wrong again in the first version of this posting. Go figure.)

Here is the same code in Common Lisp:

(let ((in (open filename)))
(unwind-protect
(do-something in)
(close in)))

This is not yet an improvement: The orange part of the code is still deep down in other code that just ensures that the machinery for opening and closing the file works. I still have to declare the variable first (here with a let), I still have to use the Lisp-equivalent of try-finally (here it's called unwind-protect), and I still have to ensure that the closing of the file happens as part of the clean-up step of an unwind-protect.

However, Lisp gives us a way to get rid of the idiomatic code. What we would actually like to write is this:


(with-input-file (in filename)
(do-something in))


This code states exactly what we want and nothing else: We want to execute some code with an open file bound to some variable, and we implicitly want to ensure that it is closed again at the end of this code fragment.

Here is a macro that defines this construct:

(defmacro with-input-file ((variable filename) &body body)
`(let ((,variable (open ,filename))
(unwind-protect
(progn ,@body)
(close ,variable))))

Here is an explanation of the code:

  • The macro is defined with defmacro and we name it like the language construct we want to build. It takes two variables: a variable name and a piece of code that will give us the filename in the resulting code. It also takes a body of code for the with-input-file construct.

  • The macro creates a piece of code that the Lisp compiler should use instead of the original code that uses the with-input-file construct: We basically define a code template in which the variable, filename and body spots are filled in with the parameters that the macro receives.

  • Some minor details: The progn groups a number of expressions into one - this is similar to grouping code in a pair of curly braces in Java. The backquote, comma and comma-at notations are there to clarify how the parameters should be inserted - that's something that you can better learn in good Lisp tutorials.


Although writing macros takes a little practice, it is very straightforward once you are used to doing this. Macros help you to avoid typing a lot of idiomatic code: whenever you notice some coding pattern, you can put it in some macro and then use it instead.

Syntactic Abstractions



Let's get a little closer to the heart of what macros actually are: They provide you with a way to define syntactic abstractions. Abstractions are used for hiding away implementation details that client code should not be interested in. For example, in classes you can define private fields and methods for internal use and define a public interface that client code has to go through to ensure that the internal details are always used properly. Likewise, with functional abstractions - closures - you can pass around a function that can refer to variables in its lexical environment without giving anyone direct access to these lexical variables.

Many implementation details can be hidden very well with such abstraction mechanisms. However, others cannot be hidden well, and this is when macros become interesting.

For example, assume that your language doesn't provide any iteration constructs but requires you to use recursion instead. You can relatively easily build your own iteration constructs - for example a while function - like this in Lisp:

(defun while-function (predicate block)
(if (not (funcall predicate)) (return))
(funcall block)
(while-function predicate block))

This function works as follows: It takes a predicate function and a block of code, also provided as a function. When calling the predicate (with funcall) returns false, the while-function returns. Otherwise, the block is executed (again with funcall) and then the while-function calls itself again.

Here is an example of using the while-function:

(let ((i 0))
(while-function (lambda () (< i 10))
(lambda ()
(print i)
(incf i))))

This code fragment increments i from 0 up to 10 and prints i at each step. This piece of code doesn't look very nice. One reason is that Lisp's lambda construct is somewhat lengthy. Smalltalk, Ruby, and other languages have nicer syntax for lambda expressions, and would make this code shorter. However, even then there is a problem here: We still have to write idiomatic code to write a while loop although we are actually not interested in the idiom. Here, the idiomatic element is the use of a lambda expression to ensure that some code is not immediatily executed, but only later under the control of the while-function. However, what we actually want to say is this, which doesn't contain any idiomatic code at all:

(let ((i 0))
(while (< i 10)
(print i)
(incf i)))

And, as you might have guessed, here is a macro to implement this while construct:

(defmacro while (expression &body body)
`(while-function (lambda () ,expression)
(lambda () ,@body)))

This macro uses the while-function in the code that it creates. This is actually one of the typical ways to write Lisp code: We first define the functional (or object-oriented, or imperative, or whatever) abstractions, and then we add some syntactic layer on top to make the code look nicer, and especially to hide away implementation details that we cannot hide otherwise.

Why is it so interesting to hide away implementation details, like the use of lambda expressions to delay evaluation? Well, we could actually decide not to use lambda expressions at all in our expansion. Here is an alternative implementation of the while macro:

(defmacro while (expression &body body)
`(tagbody
start (if (not ,expression) (go end))
,@body
(go start)
end))

Yes, Common Lisp provides the tagbody construct inside which you can use go, i.e., a goto statement! Gotos are not a very good idea to use in your own code, but gotos are very useful for generating efficient code in macros. In this second implementation of the while macro, there is for example no need anymore to allocate space for the closures that the two lambda expressions create, because we can simply jump around the code to delay its execution.

Of course, this is a toy example, so the little gain in efficiency here is probably not worth the effort. However, what is important is that the "API" for the while macro hasn't changed at all compared to the first version. You still write this:

(let ((i 0))
(while (< i 10)
(print i)
(incf i)))

That's one of the important advantages of abstractions: You can change internal implementation details while all the client code can be left unchanged!

And this is the whole point of macros: You can abstract away code idioms that you cannot astract away in any other way, and you effictively have a lot more options to change implementation details without affecting any clients.

Macros are also one of the fundamental reasons for why Lispers like Lisp's strange syntax so much: The syntax is a direct representation of what is elsewhere called the abstract syntax tree, and the fact that the AST is directly available for manipulation as a very simple list data structure makes it so straightforward and convenient to implement and use macros.

If you now have gotten the taste of this, you may want to read some more introductory material about Lisp, like my own opinionated guide, Peter Seibel's excellent book about Common Lisp, or any of the other available tutorials.

16 comments:

Unknown said...

If you put all the functionality in a macro, then when you figure out that the cleanup part of WITH-INPUT-FILE is better written as (WHEN IN (CLOSE IN)), you'll need to recompile all the callers (well, it is better than to patch all similar code fragments :-)). On the other hand, if the macro is just a syntactic sugar for (INVOKE-WITH-INPUT-FILE filename fun), only this function need to be patched.

Reinier Zwitserloot said...

Your java example is incorrect. try:

FileInputStream in = new FileInputStream(filename);
try {
// do stuff with 'in'
} finally {
in.close();
}


That's far more 'correct'. Not only in the sense that your example won't even compile, but also in the sense that a failure to open the file in the first place (e.g. a FileNotFoundException) would be masked in your example by the NullPointerException that occurs as a result of trying to call a method on null. (If you were to fix your example, that is).

Macros are sort-of useful but the try/finally problem can be solved in a number of more or less nice ways, while macros also make everyone re-invent the wheel all day. More specifically, it becomes difficult to 'unwind the macro' - to see the machinery that makes your project tick. In java you're a simple ctrl+click on any identifier away from seeing code and, where available, documentation.

Pascal Costanza said...

Adejneka, you only need to recompile all clients when you use a Lisp compiler. During development and testing of macros you can use a Lisp interpreter where macros are expanded on demand. In that case, a change to a macro definition immediately takes effect.

Reinierz, yes my Java code was incorrect. This actually proves my point. ;) But macros do not make everyone reinvent the wheel everyday, at least not from what I can tell. And seeing what a macro does to your code is also just a matter of right-clicking on a Lisp expression and asking the development environment to either show the macro definition or to show the expansion of this concrete piece of code.

Unknown said...

Closures solve the with-input-file problem quite nicely, and there's a concrete proposal on how to add them to Java to solve (among others) this problem. See: this article for an example almost identical to yours. (Let's hope they make it to Java 7.)

Macros provide additional power, though, because you can bind constructs to your own compile-time code (that has full expressivity). You could, for example, also do structural modifications of the program, like introducing a field, getter and setter from a single declaration. The downside of this expressivity is that the compiler (or other tools) cannot reason about your code until after the expansion of all macros; the fact that the tools work on a "derivative product" of your source (a derivative that you cannot reasonably predict) makes traceability harder. Likewise, it is hard to reason about a macro in general, i.e. for all programs that could employ it. Generally, the best you can say is that the macro expansion will produce correct syntax (if it is a type-checked AST-to-AST transformation like in camlp4). Whether the resulting code makes sense is something you can only verify for a concrete program at hand. If you care about the static guarantees that a compiler could give you, then this is an important issue which might urge you to restrict this expressivity. (For example, Java Generics are less expressive than C++ Templates but, in contrast, you can reason about them without expansion. Because Generics are still able to capture most common cases, you could say this is a good trade-off.)

Nevertheless, from a user standpoint, I still think it is paternalistic not to provide a macro system in languages like Java; it addresses otherwise unsolved problems and the technology is hardly rocket science (i.e. what's obstructing you from doing it now is accidental complexity). And together with the addition of closures, it would give Lisp hackers one less incentive to start blogging... :-)

stromboli said...

@reinierz

It's misleading to say the example won't compile - yours won't compile either for the same reasons.

The usual idiom is to check if a connection is null before you close it, but a code example is allowed to be abbreviated.

Yes, in Java you can track down a variable's type to examine its workings, but the subject here is quite different, namely syntax and how you can extend or adapt it, a much more powerful mechanism for creating new levels of abstraction

Peter Scott said...

Your while macro with a tagbody can explode spectacularly:

(tagbody
(let ((i 0))
(while (< i 10)
(print i)
(go start)
(incf i)))
start)

This is not the sanest example ever created, but it demonstrates my point: your tagbody can interact with others. In this case, it creates an infinite loop.

Here's a way to fix it:

(defmacro while (expression &body body)
(with-gensyms (start end)
`(tagbody
,start
(if (not ,expression) (go ,end))
,@body
(go ,start)
,end)))

I used with-gensyms for convenience; the standard CL way would be to use gensym a couple times.

Pascal Costanza said...

Peter,

Yes, you are illustrating the issue of macro hygiene. The goal of my posting was not to give a comprehensive overview of all the issues involved in macro programming, but to illustrate the essential ideas why macros might be beneficial. The existing tutorials on Common Lisp, for example those that I mention at the end of my posting, discuss these issues in detail.

Unknown said...

Good example, but please change your color scheme.

I couldn't read the blue on black and had to highlight every code block with my mouse to read the thing.

Pascal Costanza said...

dpapathanasiou,

Thanks for your comment - I have changed the colors.

Unknown said...

“When calling the predicate (with funcall) returns true” — should be “[...] returns false I think.

Thanks for the interesting article.

Pascal Costanza said...

Angus,

Right, thanks for spotting this bug. I have just fixed it.

Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
Anonymous said...
This comment has been removed by a blog administrator.
rjm said...

On defmacro with-input-file, a closing parenthesis is missing.
It appears that it's missing on the second macro line.
If I add one there, the macro is complete & correct.
(parenthesis hell...)