Tail call optimisation in (g)awk

# May 13, 2016

or “Wait, what? Tail call optimisation in awk?”

Overview

This post covers tail call optimisation (TCO) behaviour in three common awk implementations¹: gawk, mawk and nawk (AKA the one true awk).

None of the three implement full TCO, while gawk alone provides self-TCO. The bulk of this post will therefore be devoted to gawk’s implementation and related pitfalls.

Initial observations

Let’s begin with a simple awk script that defines a single function, recur, called from the BEGIN block:

$ nawk 'function recur() {return recur()} BEGIN {recur()}'
Segmentation fault: 11
$ mawk 'function recur() {return recur()} BEGIN {recur()}'
Segmentation fault: 11
$ gawk 'function recur() {return recur()} BEGIN {recur()}'
# ...runs indefinitely [?]...

Note the difference in behaviour here: nawk and mawk blow the stack and segfault while gawk cheerily continues running. Thanks gawk.

But wait! Gawk is actually dynamically allocating additional stack frames—so long as there’s memory (and swap) to consume, gawk will gobble it up and our script will plod on. Below, the first 30 seconds of (virtual) memory consumption are charted²:

Whoops!

The gawk optimiser

In order to obtain (self-)TCO and spare your poor swap partition, gawk provides the -O switch,

$ gawk -O 'function foo() {return recur()} BEGIN {recur()}'
# ...runs indefinitely; air conditioning no longer required...

and lo and behold,

Doubling down

What about full TCO? Let’s expand our one liner a little to include a trampoline call:

$ gawk -O 'function go() {return to()} function to() {return go()} BEGIN {go()}'

and chart memory consumption again,

Bugger. So, it looks like gawk isn’t keen on full blown TCO. Time to find out why.

The secret sauce

We’ve just seen that gawk seems to optimise self-calls in tail position when the -O flag is specified. To better understand this functionality we can dump opcodes from the trampoline case and take a look under the hood:

$ echo 'function go() {return to()} function to() {return go()} BEGIN {go()}' > /tmp/trampoline.awk
$ gawk --debug -O -f /tmp/trampoline.awk
gawk> dump

	# BEGIN

[     1:0x7fc00bd022e0] Op_rule             : [in_rule = BEGIN] [source_file = /tmp/trampoline.awk]
[     1:0x7fc00bd02360] Op_func_call        : [func_name = go] [arg_count = 0]
[      :0x7fc00c800f60] Op_pop              :
[      :0x7fc00c800e20] Op_no_op            :
[      :0x7fc00c800ea0] Op_atexit           :
[      :0x7fc00c800f80] Op_stop             :
[      :0x7fc00c800e60] Op_no_op            :
[      :0x7fc00bd01e00] Op_after_beginfile  :
[      :0x7fc00c800e40] Op_no_op            :
[      :0x7fc00c800e80] Op_after_endfile    :

	# Function: go ()

[     1:0x7fc00bd01f20] Op_func             : [param_cnt = 0] [source_file = /tmp/trampoline.awk]
[     1:0x7fc00bd020a0] Op_func_call        : [func_name = to] [arg_count = 0]
[     1:0x7fc00bd01fb0] Op_K_return         :
[      :0x7fc00c800ee0] Op_push_i           : Nnull_string [MALLOC|STRING|STRCUR|NUMCUR|NUMBER]
[      :0x7fc00c800f00] Op_K_return         :

	# Function: to ()

[     1:0x7fc00bd02130] Op_func             : [param_cnt = 0] [source_file = /tmp/trampoline.awk]
[     1:0x7fc00bd02270] Op_func_call        : [func_name = go] [arg_count = 0]
[     1:0x7fc00bd021f0] Op_K_return         :
[      :0x7fc00c800f20] Op_push_i           : Nnull_string [MALLOC|STRING|STRCUR|NUMCUR|NUMBER]
[      :0x7fc00c800f40] Op_K_return         :

Note the lack of a distinct jump or tailcall opcode; instead, even with the optimiser turned on, go and to are performing Op_func_calls. Hmm, okay; we’ll see a different opcode in our original recur case, though, right? Wrong:

$ echo 'function recur() {return recur()} BEGIN {recur()}' > /tmp/recur.awk
$ gawk --debug -O -f /tmp/recur.awk
gawk> dump

	# BEGIN

[     1:0x7fc1d0408ef0] Op_rule             : [in_rule = BEGIN] [source_file = /tmp/recur.awk]
[     1:0x7fc1d0408f80] Op_func_call        : [func_name = recur] [arg_count = 0]
[      :0x7fc1d0802120] Op_pop              :
[      :0x7fc1d0802020] Op_no_op            :
[      :0x7fc1d08020a0] Op_atexit           :
[      :0x7fc1d0802140] Op_stop             :
[      :0x7fc1d0802060] Op_no_op            :
[      :0x7fc1d0408bc0] Op_after_beginfile  :
[      :0x7fc1d0802040] Op_no_op            :
[      :0x7fc1d0802080] Op_after_endfile    :

	# Function: recur ()

[     1:0x7fc1d0408ce0] Op_func             : [param_cnt = 0] [source_file = /tmp/recur.awk]
[     1:0x7fc1d0408e60] Op_func_call        : [func_name = recur] [arg_count = 0]
[     1:0x7fc1d0408d70] Op_K_return         :
[      :0x7fc1d08020e0] Op_push_i           : Nnull_string [MALLOC|STRING|STRCUR|NUMCUR|NUMBER]
[      :0x7fc1d0802100] Op_K_return         :

¯\_(ツ)_/¯

Time to dig around gawk’s grammar definition. Here’s return, defined in awkgram.y:

| LEX_RETURN
  {
    if (! in_function)
        yyerror(_("`return' used outside function context"));
  } opt_exp statement_term {
    if ($3 == NULL) {
        $$ = list_create($1);
        (void) list_prepend($$, instruction(Op_push_i));
        $$->nexti->memory = dupnode(Nnull_string);
    } else {
        if (do_optimize
            && $3->lasti->opcode == Op_func_call
            && strcmp($3->lasti->func_name, in_function) == 0
        ) {
            /* Do tail recursion optimization. Tail
             * call without a return value is recognized
             * in mk_function().
             */
            ($3->lasti + 1)->tail_call = true;
        }

        $$ = list_append($3, $1);
    }
    $$ = add_pending_comment($$);
  }

Take a closer look at the code following that comment:

if (do_optimize
  && $3->lasti->opcode == Op_func_call
  && strcmp($3->lasti->func_name, in_function) == 0
) { /* ... */
  ($3->lasti + 1)->tail_call = true; /* <--- */
}

In other words, during a return gawk:

Checks whether the do_optimize flag (-O) is specified.
If so, it checks whether the previous instruction is an Op_func_call.
If that call is to a function with the same name as the current one,
…the tail_call flag is set.

So it goes.

At last, a conclusion

Here’re a few takeaways from the above³:

Don’t rely on TCO if you’re writing awk.
Just don’t.
If you do need TCO, make sure you’re using gawk
- Be sure to specify the -O flag otherwise you’ll need to buy a new fan,
- and make sure you’re not trampolining as the optimiser won’t be of any help.

Personally, I’ll be sticking with nawk.

Probably the most common.↩
Output drawn from ps↩
YMMV↩