Tail call optimisation in (g)awk
or “Wait, what? Tail call optimisation in awk?”
Overview
This post covers tail call optimisation (TCO) behaviour in three common awk implementations1: gawk, mawk and nawk (AKA the one true awk).
None of the three implement full TCO, while gawk
alone provides self-TCO. The bulk of this post will therefore be devoted to gawk’s implementation and related pitfalls.
Initial observations
Let’s begin with a simple awk script that defines a single function, recur
, called from the BEGIN
block:
$ nawk 'function recur() {return recur()} BEGIN {recur()}'
Segmentation fault: 11
$ mawk 'function recur() {return recur()} BEGIN {recur()}'
Segmentation fault: 11
$ gawk 'function recur() {return recur()} BEGIN {recur()}'
# ...runs indefinitely [?]...
Note the difference in behaviour here: nawk and mawk blow the stack and segfault while gawk cheerily continues running. Thanks gawk.
But wait! Gawk is actually dynamically allocating additional stack frames—so long as there’s memory (and swap) to consume, gawk will gobble it up and our script will plod on. Below, the first 30 seconds of (virtual) memory consumption are charted2:
Whoops!
The gawk optimiser
In order to obtain (self-)TCO and spare your poor swap partition, gawk provides the -O
switch,
$ gawk -O 'function foo() {return recur()} BEGIN {recur()}'
# ...runs indefinitely; air conditioning no longer required...
and lo and behold,
Doubling down
What about full TCO? Let’s expand our one liner a little to include a trampoline call:
$ gawk -O 'function go() {return to()} function to() {return go()} BEGIN {go()}'
and chart memory consumption again,
Bugger. So, it looks like gawk isn’t keen on full blown TCO. Time to find out why.
The secret sauce
We’ve just seen that gawk seems to optimise self-calls in tail position when the -O
flag is specified. To better understand this functionality we can dump opcodes from the trampoline case and take a look under the hood:
$ echo 'function go() {return to()} function to() {return go()} BEGIN {go()}' > /tmp/trampoline.awk
$ gawk --debug -O -f /tmp/trampoline.awk
gawk> dump
# BEGIN
[ 1:0x7fc00bd022e0] Op_rule : [in_rule = BEGIN] [source_file = /tmp/trampoline.awk]
[ 1:0x7fc00bd02360] Op_func_call : [func_name = go] [arg_count = 0]
[ :0x7fc00c800f60] Op_pop :
[ :0x7fc00c800e20] Op_no_op :
[ :0x7fc00c800ea0] Op_atexit :
[ :0x7fc00c800f80] Op_stop :
[ :0x7fc00c800e60] Op_no_op :
[ :0x7fc00bd01e00] Op_after_beginfile :
[ :0x7fc00c800e40] Op_no_op :
[ :0x7fc00c800e80] Op_after_endfile :
# Function: go ()
[ 1:0x7fc00bd01f20] Op_func : [param_cnt = 0] [source_file = /tmp/trampoline.awk]
[ 1:0x7fc00bd020a0] Op_func_call : [func_name = to] [arg_count = 0]
[ 1:0x7fc00bd01fb0] Op_K_return :
[ :0x7fc00c800ee0] Op_push_i : Nnull_string [MALLOC|STRING|STRCUR|NUMCUR|NUMBER]
[ :0x7fc00c800f00] Op_K_return :
# Function: to ()
[ 1:0x7fc00bd02130] Op_func : [param_cnt = 0] [source_file = /tmp/trampoline.awk]
[ 1:0x7fc00bd02270] Op_func_call : [func_name = go] [arg_count = 0]
[ 1:0x7fc00bd021f0] Op_K_return :
[ :0x7fc00c800f20] Op_push_i : Nnull_string [MALLOC|STRING|STRCUR|NUMCUR|NUMBER]
[ :0x7fc00c800f40] Op_K_return :
Note the lack of a distinct jump or tailcall opcode; instead, even with the optimiser turned on, go
and to
are performing Op_func_call
s. Hmm, okay; we’ll see a different opcode in our original recur
case, though, right? Wrong:
$ echo 'function recur() {return recur()} BEGIN {recur()}' > /tmp/recur.awk
$ gawk --debug -O -f /tmp/recur.awk
gawk> dump
# BEGIN
[ 1:0x7fc1d0408ef0] Op_rule : [in_rule = BEGIN] [source_file = /tmp/recur.awk]
[ 1:0x7fc1d0408f80] Op_func_call : [func_name = recur] [arg_count = 0]
[ :0x7fc1d0802120] Op_pop :
[ :0x7fc1d0802020] Op_no_op :
[ :0x7fc1d08020a0] Op_atexit :
[ :0x7fc1d0802140] Op_stop :
[ :0x7fc1d0802060] Op_no_op :
[ :0x7fc1d0408bc0] Op_after_beginfile :
[ :0x7fc1d0802040] Op_no_op :
[ :0x7fc1d0802080] Op_after_endfile :
# Function: recur ()
[ 1:0x7fc1d0408ce0] Op_func : [param_cnt = 0] [source_file = /tmp/recur.awk]
[ 1:0x7fc1d0408e60] Op_func_call : [func_name = recur] [arg_count = 0]
[ 1:0x7fc1d0408d70] Op_K_return :
[ :0x7fc1d08020e0] Op_push_i : Nnull_string [MALLOC|STRING|STRCUR|NUMCUR|NUMBER]
[ :0x7fc1d0802100] Op_K_return :
¯\_(ツ)_/¯
Time to dig around gawk’s grammar definition. Here’s return
, defined in awkgram.y
:
| LEX_RETURN
{
if (! in_function)
yyerror(_("`return' used outside function context"));
} opt_exp statement_term {
if ($3 == NULL) {
$$ = list_create($1);
(void) list_prepend($$, instruction(Op_push_i));
$$->nexti->memory = dupnode(Nnull_string);
} else {
if (do_optimize
&& $3->lasti->opcode == Op_func_call
&& strcmp($3->lasti->func_name, in_function) == 0
) {
/* Do tail recursion optimization. Tail
* call without a return value is recognized
* in mk_function().
*/
($3->lasti + 1)->tail_call = true;
}
$$ = list_append($3, $1);
}
$$ = add_pending_comment($$);
}
Take a closer look at the code following that comment:
if (do_optimize
&& $3->lasti->opcode == Op_func_call
&& strcmp($3->lasti->func_name, in_function) == 0
) { /* ... */
($3->lasti + 1)->tail_call = true; /* <--- */
}
In other words, during a return
gawk:
- Checks whether the
do_optimize
flag (-O
) is specified. - If so, it checks whether the previous instruction is an
Op_func_call
. - If that call is to a function with the same name as the current one,
- …the
tail_call
flag is set.
So it goes.
At last, a conclusion
Here’re a few takeaways from the above3:
- Don’t rely on TCO if you’re writing awk.
- Just don’t.
- If you do need TCO, make sure you’re using gawk
- Be sure to specify the
-O
flag otherwise you’ll need to buy a new fan, - and make sure you’re not trampolining as the optimiser won’t be of any help.
- Be sure to specify the
Personally, I’ll be sticking with nawk.
comments powered by Disqus