Introduction

Tulip is an interpreter for a minimal Lisp-like language.

Tulip works by translating Lisp code into Perl functions via ``eval'' and ``sub''. This is less efficient than you'd think, since all variable lookups are actually done using Perl hashes instead of the real Perl namespace.

I have attempted to document this program reasonably well, so hopefully, this code will all make sense. In general, I've tried to go for simplicity over performance or, say, usefulness.

(Note: Any resemblance between Tulip and a published Lisp or Scheme standard is purely coincidental).


License

Tulip and related files are released under the terms of the new BSD license, as follows:

Copyright (c) 2005, Chris Reuter

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. Neither the name of the developer nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Modules And Pragmas

We start with a few includes. This should all be part of your standard Perl distribution.

(Note that I don't have a Perl version specified. I wrote this using 5.8.3, although it should work on any version recent enough to support the use of ``foreach my $x (...)''. I haven't tested this though, so your mileage may vary.

    
    use strict;
    
    use FileHandle;
    use Getopt::Long;


Arguments

We use global variables for them and parse the argument list with Getopt::Long (which, I should mention, r0xx0rz!!!!)

They are:

$show_trans
Print out generated Perl code.

$show_macro
Print out macro expansions.

$help
Print the help message and quit.

$verbose
Be slightly more verbose.

These are really only useful for debugging.

    
    my $show_trans = 0;
    my $show_macro = 0;
    my $help = 0;
    my $verbose = 0;
    
    GetOptions ('show-trans'        => \$show_trans,
                'show-translated'   => \$show_trans,
                'show-macro'        => \$show_macro,
                'verbose'           => \$verbose,
                'help'              => \$help)
      or do {
        print "Invalid argument.\n";
        $help = 1;
      };
    
    $show_trans and do {$verbose = 1};
    
    if ($help) {
      print "Tulip, a minimal Lisp-like language in Perl
    Options:
    
        --show-trans         -- Show the generated perl code but don't run it.
        --show-macro         -- Print out macro expansions as they're evaluated.
        --verbose            -- Display more messages.
        --help               -- Print this message.
    
    ";
      exit (0);
    }


The Lisp Namespace

We use a hash to hold the global namespace. Actually, this is a bit more complicated than that, as explained below in the section describing the Context Stack.

The variable t is defined here as a 3-element array. I do this because t should be non-nil but not of any other type. That is, t is an object whose type consists only of itself. Internally, we normally just use references to this global.

The variable $EOF is similar but is not visible to Lisp programs. It's used internally to detect end-of-file at read time.

    
    my %globals = ();
    
    my $true = ['t_obj', 't_obj', 't_obj'];
    my $EOF = ['eof', 0, 0];
    $globals{nil} = undef;
    $globals{t} = \$true;


Macros

Macros are simple. They're just Lisp functions that get called by the parser. The built-in function _macro stores a reference to a function in global hash %macros and translate_expr calls it whenever a list begins with a macro name.

The macro function takes one argument--the expression--and its return value is parsed in place of the macro call.

    
    my %macros = ();


Contexts

A context (or ``activation record'' or ``stack frame'') is the collection of variables visible in the current scope. Unlike C (or Perl), we can't just use a stack for them.

Consider the following example:

   (defun+ _init () (count)
     (setq count 0)
     (defun next ()
       (setq count (+ count 1))))
   (_init)
   (printnl (next))
   (printnl (next))
   (printnl (next))

The local variable count needs to hang around as long as the function next exists. Putting it on a stack would make it go away while it was still needed.

Furthermore, Tulip is lexically scoped, so a function doesn't have access to the caller's locals. This means that there are two different ways of organizing contexts, by call and by scope.

Here's how it works:

  1. A context is implemented as a 2-element array, where element 0 is the hash of names-to-values and element 1 is a reference to the outer context:
        return [{}, $outer_context];

  2. Each lambda contains a reference to the context in which it was created.

  3. The global variable @stack contains the call stack. Whenever a function starts up, it first calls push_context which creates a new context and pushes it to the top of the stack. This context takes the lambda's creation context as its outer context. Before returning, it calls pop_context to remove the context from the stack.

  4. Global variables are stored in the context at the bottom of the stack. This is also referenced by the global variable $global_context. Its $outer_context is undef.

  5. When a function tries to access a variable, it checks the current context to see if that name is defined. If not, it tries the context referenced by $context-[1]>, continuing to chase pointers until it either finds a context with the name defined or reaches the end of the chain and flags an error.

In this way, contexts are popped off the stack when their functions return but they stay alive as long as something points to them.

(In other words, we use the Perl garbage collector to do our memory management.)

    
    my @stack = ();
    
    my $global_context = [\%globals, undef];
    push @stack, $global_context;

new_context

Create a new context with the argument as the reference to the outer context.

    sub new_context ( $ ) {
      my $outer_context = shift;
    
      return [{}, $outer_context];
    }

push_context

Create a new context with the argument as the reference to the outer context, then push it onto the call stack.

    sub push_context ( $ ) {
      my $outer_context = shift;
      push @stack, new_context ($outer_context);
    }

pop_context

Pop the top context off the stack.

    sub pop_context () {
      pop @stack;
    }

curr_context

Return the current context.

    sub curr_context () {
      return $stack [-1];
    }

context_with_name

Find the innermost context which as defined $var_name

    sub context_with_name ( $ ) {
      my $var_name = shift;
    
      for (my $curr = curr_context; defined($curr); $curr = $curr->[1]) {
        if (exists($curr->[0]->{$var_name})) {
          return $curr;
        }
      }
    
      return undef;
    }

def_var_in

Define a variable in context $cc with name $name and initial value $value (which may be undef). Raise an error if it's already defined.

    sub def_var_in ( $$$ ) {
      my ($name, $value, $cc) = @_;
    
      die "Variable '$name' already defined in current context."
        if exists($cc->{$name});
    
      $cc->{$name} = $value;
    }

def_var

Define a variable with the name $name in the current context and set $value to it. $value may be undef.

    sub def_var ( $$ ) {
      my ($name, $value) = @_;
      my $cc = curr_context->[0];
    
      def_var_in ($name, $value, $cc);
    }

def_global

Define a variable in the global namespace.

    sub def_global ( $$ ) {
      my ($name, $value) = @_;
    
      def_var_in ($name, $value, $global_context->[0]);
    }

set_var

Set the variable named by $name to $value in the current innermost context, IF $name HAS BEEN DEFINED.

    sub set_var ( $$ ) {
      my $name = shift;
      my $value = shift;
    
      my $context = context_with_name ($name);
    
      die "Undefined variable: $name" unless defined($context);
    
      # Protect the 2 reserved names we've got.
      if ($context == $global_context) {
        if ($name eq 't' || $name eq 'nil') {
          die "Attempted to set read-only variable '$name'.\n";
        }
      }
    
      $context->[0]->{$name} = $value;
    }

lookup_var

Return the value of the variable named by $name if it exists.

    sub lookup_var ( $ ) {
      my $name = shift;
      my $cc = context_with_name ($name);
    
      if (!exists ($cc->[0]->{$name})) {
        die "Unknown name: $name\n";
      }
    
      return $cc->[0]->{$name};
    }

lookup_var_func

Lookup up a variable and raise an error if it's not a function.

    sub lookup_var_func ( $ ) {
      my $result = lookup_var (shift);
    
      die "Expecting function.\n" unless is_lambda ($result);
    
      return $result;
    }


Lists

Tulip implements pairs as 2-element arrays with undef (the Perl undefined value) shoehorned in as the empty list (i.e. nil).

For those that aren't familiar with Lisp lists, here's an overview:

Lisp has a built-in type called a pair or ``cons''. It is a data structure containing two pointers, called (for historical reasons) the ``car'' and ``cdr'' respectively. (I could have gone and called them ``first'' and ``rest'' or something, but that would make it too easy and we can't have that.)

A proper list is either:

  1. nil (i.e. undef)

  2. A bunch of cons cells where the car (first pointer) points to the payload object and the cdr (second pointer) points to either nil or the next cell in the list.

Thus, the cdr of a proper list points to another proper list.

Cool, eh?

    
    sub lforeach ( $$ );

flatten

Convert a Tulip list to a perl array.

    sub flatten ( $ ) {
      my $list = shift;
      my @result = ();
      my $curr;
    
      ensure_cons ($list);
    
      while (defined ($curr = car ($list))) {
        push @result, $curr;
        $list = cdr ($list);
      }
    
      return \@result;
    }

mk_list

Convert a perl array to a lisp list. Takes variable argument list.

    sub mk_list {
      my $result = undef;
      my $curr;
    
      return undef if (scalar (@_) == 0);
    
      do {
        $curr = pop;
        $result = [$curr, $result];
      } while (scalar(@_) > 0);
    
      return $result;
    }

mk_pair

Function to create a pair.

    sub mk_pair ( $$ ) {
      my ($left, $right) = @_;
    
      return [$left, $right];
    }

car

Return the first element of a list or undef (i.e. nil)

    sub car ( $ ) {
      my $list = shift;
    
      ensure_cons ($list);
    
      defined ($list) or return undef;
      return ${$list}[0];
    }

cdr

Return the rest of a list.

    sub cdr ( $ ) {
      my $list = shift;
    
      ensure_cons ($list);
    
      defined ($list) or return undef;
      return ${$list}[1];
    }

cadr

Return the second element of a list

    sub cadr ( $ ) {
      return car (cdr (shift));
    }

caddr

Return the third element of a list

    sub caddr ( $ ) {
      return car (cdr (cdr (shift)));
    }

llength

Return the length of a list.

    sub llength ( $ ) {
      my $list = shift;
      my $count = 0;
    
      ensure_cons ($list);
    
      lforeach $list, sub {$count++};
    
      return $count;
    }

llast

Return the last element in a list.

    sub llast {
      my $list = shift;
    
      ensure_cons ($list);
    
      if (!defined (cdr ($list))) {
        return car ($list);
      }
    
      return llast (cdr ($list));
    }

lforeach

Evaluate a sub over all elements of a list.

    sub lforeach ( $$ ) {
      my $list = shift;        # The list
      my $sub = shift;        # The thing to evaluate over it
    
      return unless defined($list);
    
      for ( ; defined ($list); $list = cdr ($list)) {
        my $is_last = !defined(cdr ($list));
        &{$sub}(car ($list), $is_last);
      }
    }


Types

We implement types using the Perl type system.

All Lisp objects are stored in Perl scalar variables. Under some circumstances, these values are references to Perl objects. From their layout, we can identify their type.

We use the following internal representation:

Cons Cell
Reference to a two-element array:
    [$car, $cdr]

String
Reference to a Perl string.

Symbol
A non-numeric Perl scalar (i.e. a string). Note that unlike the Lisp string type, this is not a reference to a string. The string is in the scalar.

Number
A Perl scalar containing a number. The difference between a number and a symbol is that for $expr, the expression
    $expr != 0 || $expr eq '0'

is true if $expr is a number and false if it is a symbol.

Function (Lambda)
A function (usually called ``a lambda'' from here on) is a reference to an array of (currently 4) elements with the first bieng a code reference.

Note that I may change the number of elements in this array at some later time, so an object is a lambda if it is an array of at least 3 elements and a code reference as the first item.

A lambda looks like this:

    [\&code, $context, \@arg_list, \@local_list]

Context is a reference to the context (i.e. stack frame) in which this lambda was created. The remaining elements are the names (in Perl lists) of formal arguments and local variables.

Quoted Things
A quoted thing is represented as a 1-dimensional array containing the thing quoted as its only element. Evaluating a quoted thing returns the thing being quoted.

For example, suppose we want to pass a list to a function:

    (foo (bar quux baz))

won't work because the evaluator interprets the argument (the list ``(bar quux baz)'') as an expression. To work around this, we have the quote special form:

    (foo '(bar quux baz))

This causes the parser (as such) to parse the thing after the quote but return it as a quoted object, i.e. as a reference to a 1-dimensional array.

You can quote anything but it really only makes sense to quote lists and names.

t
t is implemented as a reference to a specific 3-element array:
    ['t_obj', 't_obj', 't_obj']

Its type isn't really important except in that it's testably different from all of the other types.

t is the value returned by predicates when they succeed. Since every value except nil is considered true, t is only important for not being nil.

$EOF
$EOF is the end-of-file flag returned by read. Like t, its only useful property is that it has a unique type. It is implemented as a 3-element array where the first element is the string 'eof':
    ['eof', 0, 0]

It is not actually visible to user programs. It is mostly used for catching unexpected EOF errors in tulip_read().

Perl objects
Perl objects are just passed around as references. For this to work, they need to be blessed and have a name containing at least one lowercase letter.

Currently, only FileHandle is used in this way.

quote

Given an expression, convert it to a quoted version.

    sub quote ( $ ) {
      my $expr = shift;
      return [$expr];
    }

unquote

Given a quoted thing, unquote it. It is harmless to unquote an unquoted thing.

    sub unquote ( $ ) {
      my $expr = shift;
    
      if (!is_quote_expr ($expr)) {
        return $expr;
      }
    
      return $expr->[0];
    }

Type predicates

This section contains type tests.

is_opaque_obj

Return true if the result is a blessed Perl object. (Actually, this is a bit more picky--any type which is all upper-case is automatically rejected, based on the theory that this is the reserved namespace for built-in types.)

    sub is_opaque_obj ( $ ) {
      my $expr = shift;
      my $type = ref($expr);
    
      if ($type eq "" || $type =~ /^[A-Z]+/) {
        return 0;
      }
    
      return 1;
    }

is_lambda

Return true if the result is a lambda.

    sub is_lambda ( $ ) {
      my $expr = shift;
    
      return
        ref($expr) eq 'ARRAY' &&
        ref($expr->[0]) eq 'CODE' &&
        scalar(@{$expr}) >= 3;
    }

is_number

Return true if expr is a number, false otherwise.

    sub is_number ( $ ) {
      my $expr = shift;
      return ref($expr) eq "" && ($expr != 0 || $expr eq '0');
    }

is_symbol

Determine whether the argument is a name to be looked up (i.e. a bare name--at run time, this will be replaced with whatever it's been set to.)

    sub is_symbol ( $ ) {
      my $expr = shift;
    
      return defined($expr) && (ref ($expr) eq "") && (!is_number ($expr));
    }

is_quoted_symbol

Return true if the argument is a quoted symbol

    sub is_quoted_symbol ( $ ) {
      my $expr = shift;
    
      return is_quote_expr ($expr) && is_symbol ($expr->[0]);
    }

is_quote_expr

Return true if expr is a quoted thing.

    sub is_quote_expr ( $ ) {
      my $expr = shift;
    
      return ref($expr) eq 'ARRAY' && scalar(@{$expr}) == 1;
    }

is_string

Return true if expr is a string. Note that strings and symbols are the same type here.

    sub is_string ( $ ) {
      my $expr = shift;
    
      return ref ($expr) eq "SCALAR";
    }

is_cons

Return true if expr is a cons cell. Nil (i.e. undef) is considered a cons.

    sub is_cons ( $ ) {
      my $expr = shift;
    
      return 1 if (!defined($expr));
      return 1 if (ref($expr) eq 'ARRAY' and scalar(@{$expr}) == 2);
      return 0;
    }

is_proper_list

Determine whether the argument is a proper list. Slow.

    sub is_proper_list ( $ );    # Since the function recurses.
    sub is_proper_list ( $ ) {
      my $expr = shift;
    
      return 0 unless is_cons ($expr);
    
      # The empty list is a proper list.
      return 1 if !defined ($expr);
    
      # It's not a proper list if the cdr sin't a cons.
      return 0 if !is_cons(cdr($expr));
    
      # Otherwise, recurse
      return is_proper_list (cdr ($expr));
    }

is_literal

Determine whether expression is a literal.

    sub is_literal ( $ ) {
      my $expr = shift;
    
      return 1 if is_quote_expr ($expr);
      return 1 if is_number ($expr);
      return 1 if is_string ($expr);
    
      return 0;
    }

is_eof

Return true if $expr is the magical (internal) EOF object.

    sub is_eof ( $ ) {
      my $expr = shift;
    
      return $expr == \$EOF;
    }

Type Assertions

This section contains routines which raise an error if the argument is not of hte expected type.

ensure_cons

Ensure that the argument is a cons cell

    sub ensure_cons ( $ ) {
      my $arg = shift;
      die "Expecting a list, got '@{[printable_string($arg)]}'\n"
        unless is_cons ($arg);
    }

ensure_string

Ensure that the argument is a string

    sub ensure_string ( $ ) {
      my $arg = shift;
    
      die "Expecting a string, got '@{[printable_string($arg)]}'\n"
        unless is_string ($arg);
    }

ensure_symbol

Ensure that the argument is a symbol

    sub ensure_symbol ( $ ) {
      my $arg = shift;
    
      die "Expecting a symbol, got '@{[printable_string($arg)]}'\n"
        unless is_symbol ($arg);
    }

ensure_lambda

Ensure that the argument is a lambda

    sub ensure_lambda ( $ ) {
      my $arg = shift;
    
      die "Expecting a function, got '@{[printable_string($arg)]}'\n"
        unless is_lambda ($arg);
    }

ensure_num

Ensure that the argument is a number.

    sub ensure_num ( $ ) {
      my $arg = shift;
    
      die "Expectinga number, got  '@{[printable_string($arg)]}'\n"
        unless is_number ($arg);
    }


Function Translation

This section contains the code used to translate Lisp into Perl.

The workhorse function here is translate_elem() and that's where you should start looking if you want to understand how this works.

Translation is pretty braindead. Instead of using Perl local variables, we emit calls to lookup_var() to read from variables. (Writing gets done by calls to the built-in functions _set and _define.)

We call Lisp functions by generating calls to the function ``call_func'' with the first argument being the lambda to call and the remaining arguments being translated expressions to produce those.

Arguments to Lisp functions are passed a single proper list.

Probably the easiest way to understand all of this is to run some code through tulip with the --show-translated flag. That will print out the translated code.

translate_quoted_list

Given an unquoted list, return a Perl expression that recreates that list. If $quoted is true, the resulting code produces a quoted list rather than an unquoted one.

    sub translate_quoted_list ( $$$ ) {
      my $list = shift;
      my $quoted = shift;
      my $indent = shift;
    
      my $close = 0;
      my $result = "";
    
      lforeach $list, sub {
        my $elem = shift;
    
        ++$close;
    
        $result .= "[";
        $result .= translate_literal ($elem);
        $result .= ", ";
      };
    
      $result .= 'undef' . (']' x $close);
    
      if ($quoted) {
        $result = "\[$result\]";
      }
    
      $result = ('  ' x $indent) . $result;
      return $result;
    }

make_str_constant

Make the given string into a Perl string constant.

    sub make_str_constant ( $ ) {
      my $arg = shift;
      local $_ = ${$arg};
    
      s/\'/\\'/g;
      return "'$_'";
    }

translate_literal

Given a literal expression, produce a perl expression that reproduces it.

    sub translate_literal ( $$ ) {
      my $expr = shift;
      my $indent = shift;
      my $istr = '  ' x $indent;
      my $quoted = 0;
      my $result = "";
    
      # The argument may be quoted or unquoted and we need to handle both
      # cases.  The Usual Way is to strip off the quote, do the
      # translation and then wrap the result with an array if the quote
      # was there.
      if (is_quote_expr ($expr)) {
        $quoted = 1;
        $expr = unquote ($expr);
      }
    
      # We handle lists here, returning if we've got one.  The quoting
      # gets dealt with by translate_quoted_list since this is a somewhat
      # more complicated case.
      if (is_cons ($expr)) {
        return translate_quoted_list ($expr, $quoted, $indent);
      }
    
      # Otherwise, set $result to the translated value.
      if (is_number ($expr)) {
        $result = $expr;
    
      } elsif (is_string ($expr)) {
        $result = '\\'.make_str_constant($expr);
    
      } elsif (is_symbol ($expr)) {
        $result = "'$expr'";
      } else {
        die "Unknown literal type.\n";
      }
    
      # Requote of necessary.
      if ($quoted) {
        $result = "\[$result\]";
      }
    
      # And fixup the formatting.
      $result = $istr . $result;
    
      return $result;
    }

translate_elem

Translate an element in a list. If $want_func is 1, add code to check to make sure the result is executable. This is the entry point for evaluating a single Lisp thing of unknown nature.

    sub translate_elem ( $$$ ) {
      my $element = shift;
      my $indent = shift;
      my $want_func = shift;
      my $result;
    
      $element = expand_macros ($element);
    
      if (is_symbol ($element)) {
        my $func = $want_func ? "_func" : "";
        $element =~ s.\\.\\\\.g;
        return ('  ' x $indent)."lookup_var$func (\'$element\')";
      }
    
      if (!defined ($element)) {
        return "undef";
      }
    
      if ($want_func) {
        die "Expecting function, got '@{[printable_string($element)]}'\n";
      }
    
      if (is_literal ($element)) {
    
        # Since quotedness is intrinsic to the type, we need to strip that
        # off when passing the element as an argument.
        if (is_quote_expr ($element)) {
          $element = unquote ($element);
        }
        $result = translate_literal ($element, $indent);
      } else {
        $result = translate_expr ($element, $indent);
      }
    
      return $result;
    }

expand_macros

Given a function call (or something that looks like one, anyway), see if it's actually a macro invocation and if so, call the macro function on it and return the result. Otherwise, just return the argument.

    sub expand_macros ( $ ) {
      my $expr = shift;
      my $name;
    
      # Non-lists contain no macros
      if (!is_cons ($expr)) {
        return $expr;
      }
    
      $name = car ($expr);
      if (!is_symbol ($name)) {
        return $expr;
      }
    
      if (!defined ($macros{$name})) {
        return $expr;
      }
    
      if ($show_macro) {
        print "Macro $name.  Before:\n", printable_string($expr), "\n";
      }
    
      $expr = call_func ($macros{$name}, $expr);
    
      if ($show_macro) {
        print "After:\n", printable_string($expr), "\n";
      }
    
      return $expr;
    }

translate_expr

Given a list containing a Lisp expression, translate it into Perl.

    sub translate_expr ( $$ ) {
      my $expr = shift;
      my $indent = shift;
    
      my $result;
    
      my $indent_string = ' ' x ($indent*2);
      my $lookup_expr = translate_elem (car ($expr), $indent+1, 1);
    
      $result  = $indent_string . "call_func (\n";
      $result .= $indent_string . $lookup_expr . ",\n";
    
      lforeach cdr($expr), sub {
        my $element = shift;
        $result .= $indent_string . translate_elem ($element, $indent+1, 0);
        $result .= ",\n";
      };
      $result =~ s/\,\s*$/\)/;
      return $result;
    }

translate_progn

Given a list of lisp expressions, generate a function which evaluates them one after the other in order.

    sub translate_progn ( $$$ ) {
      my $progn = shift;
      my $tmp = shift;
      my $indent = shift;
      my $indent_string = '  ' x $indent;
      my $result = "${indent_string}$tmp = do \{\n";
    
      lforeach $progn, sub {
        my $expr = shift;
        my $is_last = shift;
    
        $result .= translate_elem ($expr, $indent+1, 0);
        $result .= ";\n";
      };
    
      $result .= "$indent_string\};\n";
      return $result;
    }

translate_lambda

Given a lambda expression, produce a perl function that performs its actions.

    sub translate_lambda ( $ ) {
      my $expr = shift;
      my $result;
    
      $result = 'sub {
      my $args = shift;
      my $lambda = shift;
      my $result;
    
      make_locals ($lambda, $args);
    ';
    
      $result .= translate_progn ($expr, '$result', 1);
    
      $result .= '
      pop_context;
      return $result;
    }
    ';
    
      if ($show_trans) {
        print "Lambda translation:\n$result\n";
      }
    
      return $result;
    }

check_args

Make sure that $arg_list is a valid lisp argument list.

    sub check_args ( $ ) {
      my @arg_list = flatten (shift);
    
      if ($arg_list[-2] eq '&rest') {
        $arg_list[-2] = undef;
      }
    
      foreach my $arg (@arg_list) {
        if ($arg eq '&rest') {
          die "\&rest flag needs to be second-last argument.\n";
        }
      }
    }

lambda

This is the meat of the lambda function.

    sub lambda ( $$$ ) {
      my ($args, $locals, $body) = @_;
      my $src = translate_lambda ($body);
    
      check_args ($args);
    
      my $perl_code = eval $src;
      die "Internal complile error: $@\nCode:\n$src\n" if ($@ ne "");
    
      return [$perl_code, curr_context, $args, $locals];
    }

show_translated

Translate and print. This is a debugging aid.

    sub show_translated ( $ ) {
      my $expr = shift;
    
      print translate_elem($expr, 0, 0);
      print "\n";
    }


Runtime Support

These are routines that get called by translated code exclusively.

Uh, not much else to say.

make_locals

Create and initialize the current context. This gets called by generated code.

    sub make_locals {
      my ($lambda, $args) = @_;
      my @flat_args = @{flatten($args)};
      my $rest_flag = 0;
    
      # Create the new context.
      push_context ($lambda->[1]);
    
      # Create the formal arguments and bind the actual arguments to them.
      # I use nested blocks (i.e. '{{' and '}}') here so that the 'next'
      # and 'last' will work (albeit not exactly as expected in the case
      # of last).  If this convuses you, type 'perldoc perlsyn' and skip
      # to the section labeled 'Loop Control'.
      lforeach $lambda->[2], sub {{
        my $formal = shift;
        my $is_last = shift;
    
        if ($formal eq '&rest') {
          $rest_flag = 1;
          next;
        }
    
        if ($rest_flag) {
          die "\&rest must refer to the last item in an argument list."
            unless $is_last;
    
          def_var ($formal, mk_list (@flat_args));
          last;  # This does what we want but only because this is the last item
        }
    
        my $actual = shift @flat_args;
        def_var ($formal, $actual);
      }};
    
      # Create the local variables
      lforeach $lambda->[3], sub {
        my $local_name = shift;
        def_var ($local_name, undef);
      };
    }

call_func

Call the function given by the first argument (a lambda) with the rest of the argument list.

    sub call_func {
      my $lambda = shift;
    
      my $num_args = scalar (@_);
      my @formals = @{flatten ($lambda->[2])};  # Blech! O(n) algorithm!
      my $expected_args = scalar (@formals);
    
      my $plural = $expected_args > 1 ? "s" : "";
    
      # If this function can take a variable number of arguments, make
      # sure there are enough.
      if ($expected_args >= 2 && $formals[-2] eq '&rest') {
        --$expected_args;
        if ($expected_args > $num_args) {
          die "Expecting at least $expected_args argument$plural, got " .
            "$num_args.\n";
        }
      } elsif ($expected_args != $num_args) {
        die "Expecting $expected_args argument$plural, got $num_args.\n";
      }
    
      return &{$lambda->[0]} (mk_list(@_), $lambda);
    }


Eval

This is guts of the Tulip 'eval' function.

Not much to it, actually. It just translates and calls Perl's ``eval'' on the code. Easy!

tulip_eval

Evaluate a tulip list as a function

    sub tulip_eval {
      my $expr = shift;
      my $code = translate_elem ($expr, 0, 0);
    
      print "Translated expression:\n$code\n"   if ($show_trans);
    
      my $result = eval $code;
    
      if ($@ ne "") {
        die "Perl Error: $@\nCode: $code\n\n";
      }
    
      return $result;
    }


The Read Function

The function 'read' is what passes for a parser in Tulip. The actual entry point is called ``tulip_read'' although it's accessible from Lisp by the name ``read''.

There are several hooks into read from Lisp:

_read-line
This reads a line of text from its argument, a FileHandle, and returns it. The default implementation is defined via prim() (see below) and is mostly a wrapper around FileHandle->getline.

_next-token
This reads a token from the argument (a FileHandle) and returns a cons cell where the first element is a symbol and the second a string. The symbol is one of 'punctuation, 'name, 'string or 'number and the string is the text of the token.

These may be overridden by user programs in order to wreak all kinds of havoc on the Lisp language. Note that I haven't tested these features yet.

Globals

These routines define 3 global variables. Yes, this is terrible coding style--sue me.

The variables $line_count and $last_file are used to keep track of where in the file we are so that synerr() can report the location. Since these get updated by _read-line() and load() (see below), too much syntactical fiddling could throw these off.

$line_buffer holds some or all of the current line being tokenized. Yawn.

    
    sub read_tolerantly( $ );
    
    my $line_buffer = undef;    # External cache for one line
    my $line_count = 0;            # Current line number fetched by fill_buffer
    my $last_file = "";            # Last file opened by readEval.

synerr

Die, displaying line number and filename information. This is only meaningful if $last_file and $line_count were both set recently.

    sub synerr ( $ ) {
      my $msg = shift;
    
      chomp $msg;
      die "Read error: $msg at $line_count in $last_file.\n";
    }

fill_buffer

If necessary, read the next line from input and store the result in $line_buffer, calling a the _read-line primitive.

    sub fill_buffer {
      my $fh = shift;
      my $result = undef;
      my $read_line = $globals{'_read-line'};
    
      while ($line_buffer =~ /^\s*$/) {
        $line_buffer = &{$read_line->[0]} (mk_list ($fh));
        ++$line_count;
        if (!defined($line_buffer)) {
          return;
        }
      }
    }

mk_tok_pair

Return a pair containing the first argument (a symbol) and the second converted to a Lisp string object (i.e. a reference). This is here because I don't want to return a reference to $1, since that's magical and (I think) likely to change underneath me.

    sub mk_tok_pair ( $$ ) {
      my ($type, $tok) = @_;
    
      return mk_pair ($type, \$tok);
    }

next_token

Read the next token from $fh. This is callable as a primitive.

    sub next_token {
      my $arglist = shift;
      my $fh = car ($arglist);
      local $_;
      my $token = undef;
    
      fill_buffer ($fh);
    
      # Return undef if we've reached EOF.
      return undef
        if (!defined ($line_buffer));
    
      $_ = $line_buffer;
      s/^\s*//;
    
      # If we've reached a comment, clear the line and try again.
      if (/^\;/) {
        $line_buffer = "";
        return next_token ($arglist);
      }
    
      SWITCH: {
          my $kw_re = '^([-_+-~!%^&*|[:alpha:]]+)';
    
        s{^(\'|\(|\))}{}
          and do {$token = mk_tok_pair ('punctuation', $1); last;};
    
        s/$kw_re//o
          and do {$token = mk_tok_pair ('name', $1);        last;};
    
        s/^\"([^\"]*)\"// and do {
          my $tok = $1;
    
          $tok =~ s/\\n/\n/g;
          $token = mk_pair ('string', \$tok);
          last;
        };
    
        s/^(\d+(\.\d+))// and do {
          $token = mk_tok_pair ('number', $1);      last;
        };
    
        synerr "Invalid token near: $line_buffer.\n";
    
      };
      $line_buffer = $_;
    
      return $token
    }

read_quoted

Read a quoted item. If the item is a list or a symbol, return it quoted. Otherwise, just return it. # Note: I'm almost positive I can remove the tests and just quote everything. I even think it might be more correct, but you know what? It's late and I want to get this thing done and play Diablo II for a while, so I'm not going to bother.

    sub read_quoted ( $ ) {
      my $fh = shift;
      my $rest = read_tolerantly ($fh);
    
      if ($rest == $EOF) {
        synerr "Unexpected end-of-file.\n";
      }
    
      # Quoted names are symbols
      if (is_symbol ($rest)) {
        return quote ($rest);
      }
    
      # If it's a list, return it in an array ref, as per the convention.
      if (is_cons ($rest)) {
        return quote($rest);
      }
    
      return $rest;
    }

read_list

Read a list and return it.

    sub read_list ( $ ) {
      my $fh = shift;
    
      my @result_array;
      my $elem;
    
      while (1) {
        $elem = read_tolerantly ($fh);
    
        if ($elem == $EOF) {
          synerr "Unexpected end-of-file.\n";
        }
    
        if ($elem eq ")") {
          return mk_list (@result_array);
        }
    
        push @result_array, $elem;
      }
    
      # If we get here, we've hit the end-of-file without finding a
      # closing paren.  Oops.
      synerr "Missing ')' at end of file.\n";
    }

read_tolerantly

This is the guts of the read function, only it returns ``)'' instead of treating it as an error, since it may have been called recursively.

    sub read_tolerantly ( $ ) {
      my $fh = shift;
    
      my $result = undef;
      my $next_tok = $globals{'_next-token'};
      my $tok = &{$next_tok->[0]} (mk_list($fh));
      my $tok_text;
    
      if (!defined ($tok)) {
        return $EOF;
      }
      $tok_text = ${cdr ($tok)};
    
      my $toktype = car ($tok);
      if ($toktype eq 'punctuation') {
        if ($tok_text eq ")") {
          return $tok_text;
        }
    
        if ($tok_text eq "'") {
          return read_quoted ($fh);
        }
    
        if ($tok_text eq "(") {
          return read_list ($fh);
        }
      }
    
      # A quoted name is just a string, which can be used as a name.
      if ($toktype eq 'string') {
        return \$tok_text;
      }
    
      if ($toktype eq 'name' or $toktype eq 'number') {
        return $tok_text;
      }
    
      synerr "Unknown input.\n";    # I don't think we can get here.
    }

tulip_read

The lisp read function. This expects to be called at the top level of nesting.

    sub tulip_read {
      my $fh = shift;
      my $result = read_tolerantly ($fh);
    
      synerr "Unexpected ')'"
        if ($result eq ")");
    
      # Users don't see EOF.
      if ($result == $EOF) {
        return undef;
      }
    
      return $result;
    }


Primitives and Support Routines

This section contains all of the primitives and the code used to support them and their definitions.

The function 'prim' is used to define primitives. It takes 3 arguments:

    - The number of formal arguments (or -1 if it's a variable number)
    - The public name of the function
    - A reference to the sub implementing the function.

The sub must take one or two arguments. The first argument is the list of actual arguments passed by the calling Lisp routine. The second argument (which is usually ignored here) is the lambda of the function.

Note that there is structurally no difference between a primitive and a lambda at this stage. Both are a Perl function and some extra information and what they do with it is up to them.

Defining new primitives is simple. Just call prim with a reference to a function taking one argument (the arg. list) and you're done.

prim

Routine to define a primitive.

    sub prim ( $$$ ) {
      my ($num_args, $name, $function) = @_;
      my $formals;
      my $lambda;
    
      if ($num_args < 0) {
        $formals = mk_list (qw'&rest args');
      } else {
        $formals = mk_list (split (/ /, 'x ' x $num_args));
      }
    
      $lambda = [$function, $global_context, $formals, []];
      $globals{$name} = $lambda;
    }

printable_string

Given a lisp object, return a string suitable for printing that describes it.

    sub printable_string ( $ );        # Needed for recursive call to parse
    sub printable_string ( $ ) {
      my $arg = shift;
      my $desc;
    
      if (!defined ($arg)) {
        return "nil";
      }
    
      if (is_number ($arg)) {
        $desc = $arg;
    
      } elsif (is_opaque_obj ($arg)) {
        $desc = "<".ref($arg).">";
    
      } elsif (is_string ($arg)) {
        $desc = "\"${$arg}\"";
    
      } elsif (is_symbol ($arg)) {
        $desc = $arg;
    
      } elsif (is_lambda ($arg)) {
        $desc = "<function>";
    
      } elsif (is_quote_expr ($arg)) {
        return "'" . printable_string (unquote ($arg));
    
      } elsif ($arg == \$true) {
        $desc = "t";
    
      } elsif (is_proper_list ($arg)) {
        $desc = "(";
        lforeach $arg, sub {
          my $item = shift;
          my $is_last = shift;
    
          $desc .= printable_string ($item);
          $desc .= " " unless $is_last;
        };
        $desc .= ")";
    
      } elsif (is_cons ($arg)) {
        $desc = "(cons ";
        $desc .= printable_string (car ($arg));
        $desc .= " ";
        $desc .= printable_string (cdr ($arg));
        $desc .= ')';
      } else {
        $desc = "<unknown>";
      }
    
      return $desc;
    }

tulip_print

The lisp print function. It attempts to return a parseable description, although that's not necessarily going to work correctly.

    sub tulip_print ( $ ) {
      my $arg = car (shift);
      my $line = printable_string ($arg);
    
      print $line;
    
      return undef;
    }
    
    prim 2, 'cons',    sub {
      my $arg = shift;
      return [car ($arg), cadr ($arg)]
    };
    
    prim 1, 'car',        sub {return car (shift->[0])};
    prim 1, 'cdr',        sub {return cdr (shift->[0])};
    prim 1, 'cadr',        sub {return cadr (shift->[0])};
    prim 1, 'caddr',    sub {return caddr (shift->[0])};
    prim 1, 'length',    sub {return llength (shift->[0])};
    prim 1, 'last',        sub {return llast (car (shift))};
    
    prim 2, 'set',        sub {
      my $arg = shift;
      my $name = car ($arg);
      my $value = cadr ($arg);
    
      if (!is_symbol ($name)) {
        die "First argument to 'set must be a symbol.\n";
      }
    
      set_var ($name, $value);
    
      return $value;
    };
    
    prim 1, 'print',    \&tulip_print;
    prim 1, 'printnl',  sub {tulip_print(shift); print "\n"; undef};
    prim 1, 'prints',    sub {
      my $arg = car (shift);
      ensure_string ($arg);
      print ${$arg};
    };
    
    prim -1, 'sprintf', sub {
      my $args = shift;
      my @arg_list;
      my $result;
    
      ensure_string (car ($args));
      my $fmt = ${car($args)};
    
      lforeach cdr ($args), sub {
        my $item = shift;
    
        if (ref($item) eq 'SCALAR') {
          push @arg_list, ${$item};
        } else {
          push @arg_list, $item;
        }
      };
    
      $result = sprintf $fmt, @arg_list;
      return \$result;
    };
    
    prim 2, '_define',  sub {
      my $args = shift;
      my ($name, $value) = @{flatten ($args)};
    
      def_global ($name, $value);
    };
    
    prim 3, '_lambda', sub {
      my $args = shift;
      return lambda (car ($args), cadr ($args), caddr ($args));
    };
    
    prim 2, '_macro', sub {
      my $args = shift;
    
      my $name = car ($args);
      my $function = cadr ($args);
    
      ensure_symbol ($name);
      ensure_lambda ($function) if (defined ($function));
    
      $macros{$name} = $function;
    
      return undef;
    };
    
    prim 1, 'eval', sub {
      my $args = shift;
      my $expr = car ($args);
    
      return tulip_eval ($expr);
    };
    
    prim -1, 'list', sub {return shift};
    
    prim 2, '_while', sub {
      my $arg = shift;
      my $test = car ($arg);
      my $block = cadr ($arg);
    
      while (1) {
        my $tf = tulip_eval ($test);
    
        if (!defined ($tf)) {
          return undef;
        }
    
        tulip_eval ($block);
      }
    
      return undef;  # not reached.
    };
    
    prim 2, '+', sub {
      my $arg = shift;
      return car ($arg) + cadr ($arg);
    };
    
    prim 2, '-', sub {
      my $arg = shift;
      return car ($arg) - cadr ($arg);
    };
    
    prim 2, '*', sub {
      my $arg = shift;
      return car ($arg) * cadr ($arg);
    };
    
    prim 2, '/', sub {
      my $arg = shift;
      return car ($arg) / cadr ($arg);
    };
    
    prim 3, '_if', sub {
      my $arg = shift;
      my $tf = tulip_eval (car ($arg));
    
      if (defined ($tf)) {
        return tulip_eval (cadr ($arg));
      } else {
        return tulip_eval (caddr ($arg));
      }
    };
    
    prim 2, '==', sub {
      my $arg = shift;
    
      my $left = car ($arg);
      my $right = cadr ($arg);
    
      return $left eq $right ? $globals{t} : undef;
    };
    
    prim 2, '<', sub {
      my $arg = shift;
    
      my $left = car ($arg);
      my $right = cadr ($arg);
    
      ensure_num ($left);
      ensure_num ($right);
    
      return $left < $right ? $globals{t} : undef;
    };
    
    prim 1, 'read', sub {
      my $args = shift;
      return tulip_read (car ($args));
    };
    
    
    prim 1, 'open', sub {
      my $args = shift;
      my $filename = car ($args);
    
      my $fh = new FileHandle $filename;
    
      return $fh;
    };
    
    prim 1, 'close', sub {
      my $args = shift;
      my $fh = car ($args);
    
      $fh->close;
    
      return undef;
    };
    
    prim 2, 'fputs', sub {
      my $args = shift;
      my $fh = car ($args);
      my $str = cadr ($args);
    
      ensure_string ($str);
    
      print $fh ${$str};
    
      return undef;
    };
    
    prim 2, 'fget', sub {
      my $args = shift;
      my $fh = car ($args);
    
      return $fh->getc;
    };
    
    
    prim 1, '_read-line', sub {
      my $arg = shift;
      my $fh = car ($arg);
    
      if ($fh->eof) {
        return undef;
      } else {
        return $fh->getline;
      }
    };
    
    prim 1, '_next-token', \&next_token;
    
    prim 1, 'quote', sub {return quote (car (shift))};
    prim 1, 'unquote', sub {return unquote (car (shift))};
    
    prim 1, 'string->symbol', sub {
      my $arg = car (shift);
    
      ensure_string ($arg);
      return ${$arg};
    };


Mainline Loop

This is the mainline routine. It's pretty simple. It goes through the argument list, opens each filename on it and evaluates their contents.

It evaluates each expression as soon as it has read it in, so expressions that change reading behaviour (by, say, overriding _next-token) take effect immediately.

readEval

Read, then evaluate all expressions in the file referenced by the given FileHandle.

    sub readEval ( $ ) {
      my $fh = shift;
    
      while (!$fh->eof) {
        my ($expr, $result);
    
        $expr = tulip_read ($fh);
        next if (!defined ($expr));
    
        $result = tulip_eval ($expr);
        if ($verbose) {
          print printable_string ($result), "\n";
        }
      }
    }

load

Open the given filename for reading, then call readEval on it.

    sub load ( $ ) {
      my $fileName = shift;
      my $fh;
    
      $fh = new FileHandle ("< $fileName")
        or die "Unable to open filename $fileName\n";
      $line_count = 0;
      $last_file = $fileName;
    
      print "Reading in $fileName:\n" if ($verbose);
      eval {        # Comment out the eval to get better backtraces from Perl
        readEval ($fh);
      };
    
      if ($@ ne "") {
        synerr "$@";
      }
    
      $fh->close;
    }

go

Mainline routine

    sub go {
      my $file;
    
      for my $fn (@ARGV) {
        print "Loading $fn:\n" if ($verbose);
        load ($fn);
      }
    }
    
    go;