The Beta Blog

Site update

I have now updated jj (the software that runs this blog) to be slightly less broken. It is also now on http://codeberg.org instead of http://github.com, although the latter repository is still archived.

This will be of interest to the zero people who use this software and are not me.

# Posted 2024-10-20 18:20:38 UTC; last changed 2024-10-20 18:17:15 UTC

Loom: A Programming Language

I'm a programming language ~~nerd~~ enthusiast, and one of the ways this manifests itself is in the occasional urge to design a new language. There have been multiple such attempts in my past and I succumbed to the urge again last year.

Here's the result. It's called Loom.

The initial implementation is available here, along with a language reference and library reference.

If you want a more gentle introduction to it, read on.

This post is mostly about the ideas behind it but it can also serve as a weirdly overthought tutorial.

Overview

Roughly speaking, Loom is a dialect of Smalltalk with C++-style syntax. Its goal is to be:

Purely object-oriented in the Smalltalk sense
Homoiconic
Minimal
Transparent
Easy to implement

The core ideas were stolen from Smalltalk when they left the doors unlocked one night while the syntax was accidentally-on-purpose stolen from sclang. I also shoplifted some useful concepts from Ruby, and a few Lisp ideas may also have somehow found their way into my bag.

The three main ideas behind Loom are:

Everything is an object.
Everything is done by sending a message¹ to an object.
Both compiling and running Loom code are simple enough processes that you can hold them in your head.

Running stuff

If you clone and (successfully) build the sources linked above, you'll have the Loom interpreter. When run with no arguments, it will drop into an interactive session (aka a REPL):

$ ./src/loom
Loom REPL. Hooray!

> 3 + 4
7

(It helps to have rlwrap installed; the script in src/loom will use it if it's available.)

You quit it with an EOF character (CTRL+D on *nix).

And if you run it with a Loom program, it will attempt to execute it, as one does:

$ ./src/loom examples/sieve.loom 40
Solving up to 40...
Primes up to 40: 
    2 3 5 7 11 13 17 19 23 29 31 37

Okay, onward to the language itself. Let's start with some basic stuff:

Basic Stuff

Comments begin with # or // and go to the end of the line:

// comment
# also a comment
2 + 3;  // comment after a statement

(I just couldn't pick a favourite.)

Numbers and strings are as you'd expect from a C-style syntax:

123                     # Decimal integer
420.69                  # Decimal float
0xBADCAFE               # Hex integer
0b1011                  # Binary integer

"Hello!\nworld!\n"      # Some C-style escapes are supported

There's also syntax for creating symbols and vectors. These look like literals but aren't:

[1, 2, 3]               # Vector (i.e. array)
:foo                    # The symbol 'foo'

Symbols represent names within the system, just like in Lisp, Smalltalk, Ruby, and other right-thinking languages.

And Vectors are what I call arrays, because I'm pretentious. And also because I'm reserving the word Array for a possible future type that's more primitive. (Vectors can be resized in place; arrays can't. In the future, I may want to implement Vectors around Array instances so I don't need to use the host system's types. But I digress.)

Names (used for variables and methods) follow the standard C convention:

foo_bar_quux
_fooBarQuux42Baz

That is, anything matching the regexp /^[_a-zA-Z][_a-zA-Z0-9]*$/.

However,

Any variable name beginning with an upper-case letter is a constant:
```
def Pi = 3.14159
```
(This also works for method arguments and locals; the latter isn't useful and is kind of a bug.)
There are a few well-known constants (Self, True, False, Nil, and Here) whose lowercase names (self, true, false, etc.) are reserved by the parser and expanded into their upper-case versions. So (e.g.) nil is just another way of writing Nil.
Any character (almost) can be part of a name if it's quoted with backtick characters (`):
```
`$20, same as in town` = 20;
PoliteObject.new.`please initialize this instance`();
```

This last property leads to some clever hackery we can do with syntax.

Message syntax

Sending a message to an object follows the usual C++-style syntax we know and love:

object.message(arg1, arg2, arg3)

Since this is the only thing you can do, Loom coding (and reading) would normally be a huge slog. We work around this in a number of ways, mostly by fiddling with the syntax.

For example, consider some basic arithmetic:

3.mult(4).add(1);

We can (and do) use backticks and give the methods more operator-like names:

3.`*`(4).`+`(1);

and this is slightly better but still not very readable.

So the parser² treats any token made of operator characters as implying the `` .`...` `` part. And if there's no open parenthesis token (`(`) afterward, it treats this as equivalent to taking the next token and wrapping the parens around that.

Which means the above can be written as

3 * 4 + 1;
3 * (4) + 1;
3 * 4 + (1);

This also means that Loom doesn't need brackets³. If you need to change the order of evaluation, you can use the argument list parens:

a + (b * c) + (d * e);

// Equivalent to
a.`+`(b.`*`(c)).`+`(d.`*`(e));
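
The rewriting rule is mechanical enough to model in a few lines. Here's a rough Python sketch of it. This is not Loom's actual parser: the `desugar` function, the token-list input, and the set of operator characters are all assumptions made for illustration. A nested list stands in for an explicit argument list in parens.

```python
# Assumed set of operator characters; Loom's real set may differ.
OPERATOR_CHARS = set("+-*/<>=&|!%^~")


def is_operator(token):
    """True if the token is a non-empty string made only of operator characters."""
    return (isinstance(token, str) and token != ""
            and all(c in OPERATOR_CHARS for c in token))


def render(token):
    """A nested list is a parenthesised sub-expression; anything else is an atom."""
    return desugar(token) if isinstance(token, list) else str(token)


def desugar(tokens):
    """Rewrite `a op b op c` into explicit message sends, strictly left to right."""
    it = iter(tokens)
    result = render(next(it))
    for op in it:
        if not is_operator(op):
            raise ValueError(f"expected an operator, got {op!r}")
        arg = next(it, None)
        if arg is None:
            raise ValueError(f"operator {op!r} has no argument")
        # The operator becomes a backtick-quoted message; the next token
        # becomes its one-element argument list.
        result = f"{result}.`{op}`({render(arg)})"
    return result


print(desugar(["3", "*", "4", "+", "1"]))
# 3.`*`(4).`+`(1)
print(desugar(["a", "+", ["b", "*", "c"], "+", ["d", "*", "e"]]))
# a.`+`(b.`*`(c)).`+`(d.`*`(e))
```

Note that there is no precedence table anywhere: evaluation order falls out of reading left to right, and the parens are just argument lists.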

We do other things with syntax. If a (non-operator) message has no arguments, it's safe to leave off the trailing parens:

b = a.foo;

So getter methods are basically free. For setters, the parser looks for a trailing = token and if it finds it, first renames the message to have a trailing underscore and then passes the expression after the = as its argument. The following are equivalent:

b.foo = bobo.count + 1;
b.foo_(bobo.count() + 1);

Semi-related, C-style array access syntax gets expanded into at and atPut message sends:

x = a[n + 1];
// is equivalent to
x = a.at(n + 1);

x[n + 1] = 42;
// is equivalent to
x.atPut(n + 1, 42);

So vector access looks the way you'd expect it to, but so does the (very slow) Dictionary class. Anything that implements at and atPut can be accessed with this syntax.
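
If you know Python, this is the same trick as `__getitem__` and `__setitem__`. A small sketch of the idea follows. The `LoomIndexable` and `SlowDictionary` names are made up for this example, and returning `None` for a missing key is an assumption rather than something Loom's Dictionary is known to do.

```python
class LoomIndexable:
    """Mixin that routes Python's [] syntax to at/atPut, the way Loom's parser does."""

    def __getitem__(self, key):
        return self.at(key)

    def __setitem__(self, key, value):
        self.atPut(key, value)


class SlowDictionary(LoomIndexable):
    """A deliberately naive association list with linear search."""

    def __init__(self):
        self._pairs = []

    def at(self, key):
        for k, v in self._pairs:
            if k == key:
                return v
        return None  # assumption: missing keys yield nothing rather than an error

    def atPut(self, key, value):
        for i, (k, _) in enumerate(self._pairs):
            if k == key:
                self._pairs[i] = (key, value)
                return value
        self._pairs.append((key, value))
        return value


d = SlowDictionary()
d["answer"] = 42        # calls d.atPut("answer", 42)
print(d["answer"])      # calls d.at("answer") and prints 42
```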

Okay, onward to the deep end.

Quoting

Loom does Lisp-style quoting. You mostly don't need to worry about it unless you're poking around the internals, but as I intend to do just that, this is necessary.

The syntax for a quoted expression is the expression surrounded by special brackets :( and ). For example:

x = :( foo );

Quotes keep the things between the brackets from being evaluated. So in the above snippet, x gets the symbol foo instead of the value of the variable named foo.

(And yes, the :foo syntax above is just shorthand for :(foo).)

Most Loom objects just evaluate to themselves, so quoting them has no effect. The exceptions are symbols (as above), message send expressions, and quoted expressions themselves.

There's one extra bit of quote-related syntax. A Vector expression prefixed with a colon (:[ instead of [) is equivalent to quoting each element of the vector. The following are equivalent:

:[a, b, c]
[:a, :b, :c]
Vector.with(:a, :b, :c)

Quotes end up being vital for a lot of metaprogramming-related things, and since the underlying machinery of Loom is already based on metaprogramming, we need them.

(I initially tried to avoid adding this feature. I thought I could simply decompose each object into an expression that recreated it, so that (e.g.) :foo would expand to "foo".intern. This might be viable, but debugging any kind of metaprogramming was a nightmare problem that Quote mostly removed.)

Objects and Classes

Now, let's talk about object-oriented programming. Here's a class definition:

def ContactInfo = Object.subclass(:[name, address, work_phone,
                                    home_phone, tags]);

Let's start to the left of the =.

The def keyword defines a global constant, ContactInfo, and assigns the result of the expression after the = to it. (def is syntax that expands to a call to Here.defglobal(...). I'll get to Here later.)

To the right, we see Object. This is the root class which, like all other classes, is an object. Its method subclass creates the new class and its instance variables (aka slots) are defined by the array of symbols subclass receives as its first argument.

Most classes have an initializer method (constructor in C++-speak):

ContactInfo::initialize = { | name_arg |
    name = name_arg;
    tags = [];
};

This is an ordinary method, but it gets called by the class's instantiation method, new, which passes all of its own arguments on to initialize (whose formal arguments are the name(s) between the | characters):

def Ringo = ContactInfo.new("Ringo Starr");

Instance variables are private to the object, so to get at them from outside, we'll need to add a getter and setter method:

ContactInfo::name = { return name };
ContactInfo::name_ = { | new_name | return name = new_name };

(Recall that something like this:

Ringo.name = "Richard Starkey";

gets expanded to a call to name_, the setter.)

Loom actually has built-in shorthand for this (and also the read-only and write-only variants), so you'll rarely need to write them by hand.

ContactInfo.accessible(:address);
ContactInfo.accessible(:work_phone);
ContactInfo.accessible(:home_phone);

Methods can also take variadic arguments:

ContactInfo::tag = {|*all_tags|
  tags = tags + all_tags;
};
Ringo.tag(:ringo, :the_best_drummer_in_liverpool);

They also (obviously) have local variables, declared between a second, optional pair of pipe (|) characters:

ContactInfo::set_field_count = { ||
    | sum |
    sum = tags.size;
    [name, address, work_phone, home_phone].each{|fld|
        fld.is_nil.not .if { sum = sum + 1 }
    };

    return sum;
};

In this case, we need to also specify an empty argument list. However, it's safe to omit empty argument lists if the resulting code is unambiguous. (This is any case except for when there are temporaries but no arguments.)

We can also add methods to individual objects:

Ringo::*is_pete_best = { return false };

This includes classes:

ContactInfo::*new_beatle = { return self.new("Paul McCartney") };
def Paul = ContactInfo.new_beatle;

All of this syntax for defining new methods expands into ordinary message send expressions. For example, this

ContactInfo::dial = { ... }

expands into something like this

ContactInfo.inner_add_method(:dial, ...);

So all of this is available for metaprogramming.

Sending Messages

In addition to the language's message-send syntax, message-based languages typically provide a way to programmatically send a message to an object. This is typically done by method(s) of the base class that take the message name and the message's arguments as their own arguments, then send the message and return the result. This is how Loom does it as well.

In Loom, there are two methods of class Object: send and sendv.

send is a variadic method whose first argument is the message name (a symbol) and the remaining arguments are passed to the message. For example,

3.send(:`+`, 4)             # 7

This is equivalent to either of

3 + 4
3.`+`(4)

But because the message is an argument, we can compute it:

msg = self.select_at_random(:[`+`, `-`, `*`, `/`]);
3.send(msg, 4)              # ???

sendv is like send, but not variadic. Instead, it takes exactly two arguments where the second is a vector containing the message's arguments. With sendv, the above examples would look like this:

3.sendv(:`+`, [4])          # 7

msg = self.select_at_random(:[`+`, `-`, `*`, `/`]);
3.sendv(msg, [4])           # ???

This is important because, while Loom methods can take variadic arguments, there is currently no other way to unpack a vector of arguments into an argument list the way (e.g.) Ruby's * prefix does.

In the future, something like this will probably work

args = [];
// ...append arguments to args...
thing.msg(*args);       // Not implemented yet

but for now, you'll need to use sendv:

args = [];
// ...append arguments to args...
thing.sendv(:msg, args);
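
For comparison, here is roughly the same pair of methods modelled in Python, where `getattr` does the name lookup. This is only an analogy: the `LoomObject` and `Number` classes are invented for the example, and Python can already unpack a list with `*`, which is exactly the thing Loom is missing.

```python
class LoomObject:
    """Toy base class with programmatic message sending."""

    def send(self, name, *args):
        # Variadic: the arguments after the name go straight to the method.
        return getattr(self, name)(*args)

    def sendv(self, name, args):
        # Non-variadic: the arguments arrive pre-packed in a list.
        return self.send(name, *args)


class Number(LoomObject):
    def __init__(self, value):
        self.value = value

    def add(self, other):
        return self.value + other

    def mult(self, other):
        return self.value * other


three = Number(3)
print(three.send("add", 4))       # 7
print(three.sendv("mult", [4]))   # 12

# The message name is just data, so it can be computed at run time.
args = [10]
for msg in ("add", "mult"):
    print(msg, three.sendv(msg, args))
```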

The Machinery of Objects and Classes

Under the hood, the Loom object system is actually (crudely) prototype-based, by which I mean that 1) objects have their own method dictionaries and 2) can delegate method lookup to one or more other objects.

In practice, it isn't a very good prototype system, but there's enough there to use as the basis for a powerful class-based object system.

The core idea behind this is that we have a special kind of object called a trait. Traits are ordinary objects with the usual method dictionary (and delegate list), but they also have a second method dictionary/delegate list pair. (We call these inner methods and delegates.)

If an object has a trait as a delegate, the trait's inner dictionary (and inner delegate list) will be used instead of the usual (outer) one.

This gives us the foundation for classes and the rest is just library code implementing common-sense conventions. In Loom, a class is just a trait that:

Provides the method new (to create new instances).
Provides a method named slots that returns the list of instance variables.
Provides the method subclass to create a subclass.
Is part of the common class hierarchy rooted at Object, using its first inner delegate slot as the superclass.

Items 1, 2 and 3 are provided by the metaclass Class, which serves as the class of all named classes (including Class itself) and item 4 is de facto enforced by method subclass since all objects that provide it are already in the hierarchy.
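
The inner/outer split is easier to see in code. What follows is a minimal Python model of the lookup rule as described above, not the real C++ implementation. The `LoomObject`, `Trait`, and `lookup` names are assumptions, and the search order (own dictionary first, then delegates depth-first in list order) is a guess at the details.

```python
class LoomObject:
    """An object with its own method dictionary and a delegate list."""

    def __init__(self, delegates=()):
        self.methods = {}
        self.delegates = list(delegates)


class Trait(LoomObject):
    """An object with a second, inner method dictionary and delegate list."""

    def __init__(self, delegates=(), inner_delegates=()):
        super().__init__(delegates)
        self.inner_methods = {}
        self.inner_delegates = list(inner_delegates)


def _search(methods, delegates, name):
    if name in methods:
        return methods[name]
    for d in delegates:
        if isinstance(d, Trait):
            # Reached through delegation: a trait answers from its inner side.
            found = _search(d.inner_methods, d.inner_delegates, name)
        else:
            found = _search(d.methods, d.delegates, name)
        if found is not None:
            return found
    return None


def lookup(obj, name):
    """Look up a message sent directly to obj (so obj's outer side is used)."""
    return _search(obj.methods, obj.delegates, name)


# A two-level "class hierarchy" built from traits.
Object_ = Trait()
Object_.inner_methods["describe"] = "Object::describe"

ContactInfo = Trait(inner_delegates=[Object_])        # superclass in the first slot
ContactInfo.inner_methods["name"] = "ContactInfo::name"
ContactInfo.methods["new_beatle"] = "ContactInfo::*new_beatle"  # on the class itself

ringo = LoomObject(delegates=[ContactInfo])           # an "instance"

print(lookup(ringo, "name"))              # ContactInfo::name
print(lookup(ringo, "describe"))          # Object::describe
print(lookup(ringo, "new_beatle"))        # None
print(lookup(ContactInfo, "new_beatle"))  # ContactInfo::*new_beatle
```

Instance methods live in the inner dictionary and class-side methods in the outer one, which is why `ringo` can't see `new_beatle` but `ContactInfo` can.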

Traits also give us mixins (which I call AddonTraits for dumb reasons):

def Boopable = AddonTrait.new;
Boopable::boop = { "Booped.".println };

def BoopableContact = ContactInfo.subclass([], Boopable);

BoopableContact.new("George Harrison").boop;

These can be mixed into new classes by passing them to subclass after the slot list.

Blocks and Control Flow

Loom, like Smalltalk, has easy lambdas (called blocks here⁴), and as in Smalltalk and Lisp, they're used for flow control.

(By lambda, I mean an anonymous function that has access to the (possibly local) scope in which it was defined.)

You normally define a block with braces, just like method bodies, and you invoke it with the call method:

blk = {"***block body***".println};
blk.call();             # "***block body***"

Blocks can (but don't have to) take arguments and define local variables:

add = {|a, b| |result| result = a + b; result};
add.call(3, 4);         # 7

And they capture their local context:

Thing::counter = {||
    |total| 
    total = 0;
    return { total = total + 1; total }
};

def x = Thing.new.counter;
x.call;               # 1
x.call;               # 2
x.call;               # 3
x.call;               # 4

If you're familiar with Lisp, Ruby, or Smalltalk, this is old hat to you. (If not and I just blew your mind, feel free to take a moment.)

Loom uses blocks for nearly all flow control. For example, the if statement is implemented by adding methods to the Boolean types:

def Boolean = Object.subclass([]);
def True = Boolean.new;
def False = Boolean.new;

True::*if = {|body| return body.call()};
False::*if = {|body| return false};

Since all boolean operations return True or False, something like this

a > b .if { "a is bigger!".println };

works as expected. If a > b returns True, it will invoke True's if and that will evaluate the block. If it returns False, it will instead invoke False's if, which does not.
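
The same trick works in any language with first-class functions. Here's a small Python rendition, with lambdas standing in for blocks. The `LoomTrue`, `LoomFalse`, and `greater` names are invented for the example (and `if_` has the underscore only because `if` is a Python keyword).

```python
class LoomTrue:
    def if_(self, body):
        # True's version runs the block and returns its result.
        return body()


class LoomFalse:
    def if_(self, body):
        # False's version ignores the block entirely.
        return False


TRUE = LoomTrue()
FALSE = LoomFalse()


def greater(a, b):
    """Stand-in for Loom's `>`, which answers one of the two Boolean objects."""
    return TRUE if a > b else FALSE


greater(5, 3).if_(lambda: print("a is bigger!"))   # prints
greater(1, 3).if_(lambda: print("a is bigger!"))   # prints nothing
```

There is no conditional in `if_` itself; the branch is chosen by which object receives the message.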

Short-circuited AND and OR operations work in much the same way:

a > b && { self.is_really_better(a, b) } .if { self.do_thing(a) };

(Aside: the parser will treat one or more blocks following an ordinary message send as arguments for that message. So the following are equivalent:

a.b({1}, {2});
a.b({1}) {2};
a.b() {1} {2};
a.b {1} {2};

Which can make the code look a bit cleaner. In the case of the && operator, normal parsing rules apply; there's an implicit pair of parens around the first block.)

The foreach loop's equivalent is provided by the Vector method each (by way of a mixin named Enumerable):

[1,2,3,4,5].each{|n| n.str + "," .print }   # 1,2,3,4,5,

We also have the usual other map/reduce/etc methods:

[1,2,3,4,5].map{|n| n*n}                    # [1, 4, 9, 16, 25]
[1,2,3,4,5].select{|n| n*2 > 4}             # [3, 4, 5]
[1,2,3,4,5].inject(0) {|sum, n| sum + n}    # 15

And the for loop's equivalent is the same thing, but over an object (class Range) that pretends to be an array of increasing integers:

1 -> 5.each{|n| n.str + " " .print }
1 2 3 4 5

And the typical while loop is just as easy. All it needs is... um...

Okay, fine, while is a built-in method of Block written in C++.

You call it like this:

{n < 5} .while { n = n + 1 ; n.str + " " .print }

How Methods Work

As mentioned above, the brace-delimited function syntax ({ ... }) is syntactic sugar expanded by the parser into a set of message sends. It makes some sense to think of it as a fancier form of the quoted array expression. That is, something like this

{ a + 1; b + 2 }

expands to something a lot like the expansion of

:[ a + 1, b + 2 ]

(This is before we talk about the arguments and local variables, of course. Also, the parser treats semicolons as separators but will forgive extras more easily.)

The missing piece of this is what happens when you quote a Loom message send:

:( a + 1 )              # a.+(1)
:( a + 1 ).class        # MsgExpr

That's right, there's a class representing a message send expression. It looks like this:

def MsgExpr = Object.subclass(:[
    receiver,       # The expression to the left of the "."
    message,        # The message, a symbol
    args            # The vector of argument expressions
]);

A method body is just an array of these (or symbols, or other objects), and a trivial Loom interpreter looks something like this:

Evaluator::eval_obj = { | context, obj |
    obj.class == MsgExpr .if { ||
        | receiver, args |
        receiver = self.eval_obj(context, obj.receiver);
        args = obj.args.map{|arg| self.eval_obj(context, arg) };
        return receiver.sendv(obj.message, args);
    };
    obj.class == Symbol .if { return context.lookup_name(obj) };
    return obj;
};

Evaluator::eval_method_body = { | context, method_body |
    method_body.each{|expr| self.eval_obj(context, expr) };
    return context.lookup_name(:Self);
};

There are more fiddly little details to it than that, but this is the core idea.
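
To make the shape of that evaluator concrete, here is a runnable Python toy built the same way. It is a sketch under heavy assumptions: `Symbol`, `MsgExpr`, and `eval_obj` mirror the names above, but the context is a plain dict and message dispatch is faked with a small table of arithmetic functions instead of real method lookup.

```python
import operator
from dataclasses import dataclass


class Symbol(str):
    """A name. Evaluating one looks it up in the context."""


@dataclass
class MsgExpr:
    receiver: object   # expression to the left of the "."
    message: str       # the message name
    args: list         # argument expressions


# Stand-in for method lookup: only a few arithmetic messages exist.
BUILTINS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_obj(context, obj):
    if isinstance(obj, MsgExpr):
        receiver = eval_obj(context, obj.receiver)
        args = [eval_obj(context, arg) for arg in obj.args]
        return BUILTINS[obj.message](receiver, *args)
    if isinstance(obj, Symbol):
        return context[obj]
    return obj  # everything else evaluates to itself


# :( a + 1 ) with a bound to 4
expr = MsgExpr(Symbol("a"), "+", [1])
print(eval_obj({"a": 4}, expr))   # 5

# :( a * (b - 1) )
nested = MsgExpr(Symbol("a"), "*", [MsgExpr(Symbol("b"), "-", [1])])
print(eval_obj({"a": 4, "b": 3}, nested))   # 8
```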

If you quote a block definition,

:( {2+3} )

you'll get something like this:

ProtoMethod.new([], nil, [], :[2.+(3)], nil).make_block(Here)

Which is to say that you're getting a little bit more than just a list of expressions. Block (and method) definitions expand into an instance of class ProtoMethod, which looks like this:

def ProtoMethod = Object.subclass(:[
    args,       # Vector of formal arguments
    restvar,    # nil or the name of the variadic argument list
    locals,     # Vector of local variable names
    body,       # Vector of expressions that make up the method body
    annotation  # nil or a descriptive string intended for error messages
    ]);

The first three arguments get filled from the argument and local variables list and the fourth is the actual method body.

This could be interpreted as a method or block by something like the Evaluator example above. However, in the actual implementation, methods and blocks are opaque internal C++ structures that are easy to access from the actual (C++) evaluator. ProtoMethod serves as the intermediate step. Actual methods are created by a pair of built-in methods, make_block and make_method. This is analogous to how Lisp's lambda converts several lists into a callable function.

So there's nothing stopping you from constructing a ProtoMethod.new(...) expression programmatically and turning it into an executable object.

(You can also get just the ProtoMethod by prefixing your Block declaration with a colon:

:{2+3}          # ProtoMethod([],nil,[],[2.+(3)],nil)

This is occasionally useful.)

There's one subtle gotcha here, though. Loom requires you to declare variables before you use them. This is a guard against typo-based bugs and also makes the scopes of names unambiguous.

So this method definition will result in an error:

def Thing = Object.subclass([]);
Thing::bar = { return some_undefined_variable };

But this one won't:

Thing::foo = { |a| a .if { return another_undefined_variable } }

The reason for this, if you think about it for a few moments⁵, is pretty clear. The inner block expands into an expression like ProtoMethod.new(...).make_block(...). That is, not a function, but an expression that will create the function. So the method doesn't touch any undefined variables at all. It's only when it gets run and tries to define the block that it does something wrong. Which is, of course, far too late for our purposes.

And because the whole thing is just done with ordinary(ish) objects and methods, it's not like I'll always be able to guarantee the name correctness of a block or method. So I've kind of painted myself into a corner, haven't I?

Well, not really. Every brace expression gets expanded into something static enough that it's relatively straightforward to search it for undefined names⁶. So this is what we do.

If you're doing something clever with ProtoMethods like creating them programmatically, the system (probably) won't help you, but at that point, undefined names are the least of your worries. For ordinary blocks and methods, Loom will give you a warning (upgradeable to error) if you get a name wrong.

`Here`, or How Variable Assignment Works

The thing I've mostly skirted around so far is how variable assignment works in Loom. You'll recall that <reverb>Everything Is Done With Message Sends</reverb>. Most things are easy enough to do that way, but variables aren't objects so you can't send them messages.

In Smalltalk (and Lisp), variable assignment is one of the few things that still needs to be done by its own top-level thing instead of calling a method or function. Finding a way to do this was a shower problem for me for a while, and when I hit on this idea, it was enough to inspire me to actually build a language around it.

It goes like this:

Each context has a local constant named Here (aliased to here) that references the context itself⁷. Here's class (Context) provides methods to access its own names or those of outer scopes according to the expected scoping rules. The method set assigns a value to a name located this way.

The parser simply expands conventional variable assignments into here.set(...) message sends:

foo = bar + 1;                  // This...
here.set(:foo, bar + 1);        // ...becomes this.

here.set follows the same scoping rules that the evaluator uses when looking up variables (current block, outer blocks, method, object, and global) and stores the value in the appropriate namespace and slot.
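
A stripped-down Python model of that lookup might look like the sketch below. It only handles a chain of nested scopes; the object and global levels are left out, and the `declare` method and `outer` attribute are assumptions made for the example rather than Loom's real Context API.

```python
class Context:
    """One scope: a table of declared names plus a link to the enclosing scope."""

    def __init__(self, outer=None):
        self.names = {}
        self.outer = outer

    def declare(self, name, value=None):
        self.names[name] = value

    def _find(self, name):
        # Walk outward until some scope has declared the name.
        ctx = self
        while ctx is not None:
            if name in ctx.names:
                return ctx
            ctx = ctx.outer
        raise NameError(f"undeclared variable: {name}")

    def has(self, name):
        try:
            self._find(name)
            return True
        except NameError:
            return False

    def get(self, name):
        return self._find(name).names[name]

    def set(self, name, value):
        # Store into whichever scope owns the name, not necessarily this one.
        self._find(name).names[name] = value
        return value


method = Context()
method.declare("total", 0)
block = Context(outer=method)

# total = total + 1, evaluated inside the block
block.set("total", block.get("total") + 1)
print(method.get("total"))       # 1
print("total" in block.names)    # False
```

Because `set` writes to the owning scope, the closure-based counter from earlier works without any special support.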

As with blocks, this means that you can defeat the compile-time checks for undefined names if you're overly clever:

here.set("unknown_" + "variable" .intern, 42)

And that's fine. The name checking really only cares about likely accidents, which means the boring infix-style assignment you get from the syntactic sugar. That's where the name typos you don't expect will come from.

But having here as the way to access your local scope gives you all kinds of extra flexibility. Consider this little debug printf method you can monkeypatch onto Context:

Context::pvar = {|name|
    self.has(name) .if {
        name.str + "=" + (self.get(name).str) .println
    };
};

Now if you want to print a variable, you can just do

Bar::do_thing = {
    // ...
    here.pvar(:a);
    // ...
}

and you'll get a nicely-formatted message.

Odds and Ends

How `return` works

The final bit of syntactic magic is the return statement. Like everything else, it's syntactic sugar wrapping a message send. Specifically, it invokes Context::return.

Here's a typical method with a return statement:

Bar::thing = { |a| return a + 1 }

The return a + 1 part expands to:

here.method_scope.return(a + 1);

method_scope returns the Context belonging to the current method call⁸. This is important because we expect return to operate at the method level:

ContactInfo::dial = {
    self.location.time_of_day < (Time.noon) .if { return nil };
    return self.really_dial;
};

That is, we expect the return after the if to cause dial to return before the next expression (calling really_dial). If return nil had expanded to here.return(nil), it would only have exited from the block itself and not the method.

This mechanism can be (ab)used in clever ways. For example, this method

Thing::quux = {
    {
        {
            Here.outer.return(42);
            return "nope";        // skipped
        }.call;
        return "also nope";       // skipped
    }.call;
    return "Yup";                 // run
};

will return the string "Yup" because the innermost return will cause the outer two blocks to also exit and let control flow fall to the next statement.

Doing stuff like this is generally a bad idea, but it illustrates how powerful Context::return can be. Future versions of Loom may add extra control statements (e.g. break and continue) built on this stuff.
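
One way to picture a return aimed at a specific context is as an exception that unwinds until it reaches its target. The Python sketch below models the quux example that way. This is an illustration of the idea, not a description of how the C++ implementation does it, and the `run`, `return_`, and `_Return` names are made up.

```python
class _Return(Exception):
    """Carries a return value to one specific context."""

    def __init__(self, target, value):
        super().__init__()
        self.target = target
        self.value = value


class Context:
    def __init__(self, outer=None):
        self.outer = outer

    def return_(self, value):
        raise _Return(self, value)


def run(body, outer=None):
    """Call body in a fresh context; catch only returns aimed at that context."""
    here = Context(outer)
    try:
        return body(here)
    except _Return as r:
        if r.target is here:
            return r.value
        raise  # aimed at some enclosing context: keep unwinding


def quux():
    def method_body(m):
        def outer_block(o):
            def inner_block(i):
                i.outer.return_(42)      # Here.outer.return(42)
                return "nope"            # skipped
            run(inner_block, o)
            return "also nope"           # skipped
        run(outer_block, m)
        return "Yup"                     # run
    return run(method_body)


print(quux())   # Yup
```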

Exceptions and Ensure

Loom also has exceptions. They got added late in the process, just because it made it so much easier to write tests for failing conditions.

Initially, Loom had a Context method named fail, which quit the program with a message. That worked well enough for a while, but the tests got increasingly awkward so I added catchable exceptions.

Here's an example:

{   
    here.throw("Some error")
}.catch(String) {|e| 
    "Caught exception '" + e + "'" .println;
};

And it does pretty much what you expect. Block::catch is like call, except that if Context::throw is called with an object whose class matches⁹ catch's first argument, catch calls its second argument with that object and execution continues from there.

It probably would have been possible to write this in pure Loom using Context::return, but currently it's just two native C++ functions with about 25 lines of code.

In an earlier draft of this post, the next couple of paragraphs talked about how this exception system was pretty weak overall. The underlying problem is that there's no way to guarantee that cleanup code will run after an exception the way Java does with finally or Ruby does with ensure, leading to all kinds of hard-to-track-down errors.

But then, I asked myself how hard it would really be to just fix that rather than document the failings. So I tried it, and it took maybe half a day to implement.

Here's an example of the feature:

{
    fh = File.new(filename);
    self.process(fh);
}.callAndEnsure {
    fh.close();
}

The block argument (in this case, containing fh.close()) is always called after the receiving block exits, regardless of how. It can throw an exception or do a return or just run to the end.

You can also combine it with exception catching, as one does:

{
    fh = File.new(filename);
    self.process(fh);
}.catchAndEnsure(ProcessingException) { |e|
    return nil;
} {
    fh.close();
}

Both of these methods are written in pure Loom, by the way. The underlying machinery is provided by built-in method Context::ensure. This takes a block and evaluates it just before the context returns.

Here's the source code for catchAndEnsure to illustrate this:

Block::catchAndEnsure = {|klass, handler, ensure_block|
    here.ensure(ensure_block);
    return self.catch(klass, handler);
};

The ensure block gets attached to the method's here instead of the call to self, but that's good enough. After self.catch(...) exits, here will also always return so ensure_block will also be evaluated.
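
Here's the same arrangement modelled in Python, where `try`/`finally` plays the part of the built-in machinery. All of the names (`Context`, `call_method`, `catch`, `catch_and_ensure`) are stand-ins invented for this sketch, and running the ensure blocks in reverse registration order is an assumption.

```python
class Context:
    def __init__(self):
        self._ensure_blocks = []

    def ensure(self, block):
        """Register a block to run just before this context returns."""
        self._ensure_blocks.append(block)


def call_method(body):
    """Run body with a fresh context, then run its ensure blocks no matter what."""
    here = Context()
    try:
        return body(here)
    finally:
        for block in reversed(here._ensure_blocks):
            block()


def catch(block, klass, handler):
    """Like calling block, but hand matching exceptions to handler."""
    try:
        return block()
    except klass as e:
        return handler(e)


def catch_and_ensure(block, klass, handler, ensure_block):
    def body(here):
        here.ensure(ensure_block)            # attached to this method's context
        return catch(block, klass, handler)
    return call_method(body)


log = []


def process():
    log.append("processing")
    raise KeyError("bad record")


result = catch_and_ensure(process, KeyError,
                          lambda e: "recovered",
                          lambda: log.append("closed"))
print(result, log)   # recovered ['processing', 'closed']
```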

Bypassing Overridden Methods (i.e. super)

I ended up writing a lot of Loom code before the first time I needed to be able to call a superclass's version of a method the current object had overridden. Which surprised me; I'd assumed that I'd need it much sooner than that¹⁰.

But I did need it, and it was unexpectedly tricky to figure out how to do it without any magic.

tl; dr, I ended up adding it to Context as a method named super_send.

This works just like self.send but the method search starts at the superclass of the class that defined this method. (Not self; it's possible that it's already inherited this method, so the method determines the starting point.)

Here's an example:

Thing::blatt = {|x| return here.super_send(:blatt, x); }

And, symmetrically with Object::send, there's a sendv version:

Thing::blatt = {|x| return here.super_sendv(:blatt, [x]); }

The reason it belongs to Context is that, at the time of the super_send call, here is the only well-known object that knows both self and the current method.
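
The point about starting from the defining class rather than from self is easy to demonstrate in Python, whose two-argument `super` takes exactly those two pieces of information. The `super_send` helper and the `Base`, `Thing`, and `SubThing` classes below are hypothetical; in Loom the defining class comes from the context instead of being passed by hand.

```python
def super_send(defining_class, receiver, name, *args):
    """Send name to receiver, starting the search above defining_class."""
    return getattr(super(defining_class, receiver), name)(*args)


class Base:
    def blatt(self, x):
        return f"Base({x})"


class Thing(Base):
    def blatt(self, x):
        # The search starts above Thing (where this method lives),
        # not above whatever class self happens to be.
        return "Thing>" + super_send(Thing, self, "blatt", x)


class SubThing(Thing):
    pass  # inherits blatt from Thing


print(Thing().blatt(1))      # Thing>Base(1)
print(SubThing().blatt(1))   # Thing>Base(1)
```

If the search started at the superclass of self's class instead, `SubThing().blatt(1)` would find `Thing.blatt` again and recurse forever.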

Final Thoughts

Loom is the first language I've designed that I actually want to use.

Most of my experiments in language design are successful in that they produce a result, but that result has usually been, That wasn't a good idea after all. Loom didn't do that.

The tooling is awful, libraries are nonexistent, and the whole thing runs at geological speeds. And yet, it's fun to write Loom code. Writing runtime code was almost always easier in Loom than in C++. This despite the fact that I have an extremely good C++ toolchain with astoundingly good debugger support.

It's been fun even when I did really complex things. Figuring out if a Block uses an undefined name was difficult and required a lot of thought and iterative design, but doing that in Loom was not only possible but easier.

So that's how I'll end it. Loom doesn't suck. I'm as surprised by this as you are.

¹ If you're unfamiliar with the Smalltalk concept of sending a message to an object, just mentally replace the term with calling a method. That's close enough for our purposes.
² Calling it a parser is perhaps overly generous, but it's the thing that turns text into internal data structures, so there we go.
³ I had planned to add brackets and full BEDMAS infix evaluation order rules, but I found that just this and the other little bits of syntax were enough.
⁴ This term was also stolen from Smalltalk.
⁵ This is totally something I saw coming from the start and didn't catch me by surprise long after the basic Loom system was up and running.
⁶ Fun fact: the code that does this is written in Loom itself. It also serves as an optimizer because it will evaluate the ProtoMethod.new(...) expressions ahead of time.
⁷ Smalltalk also has Here (named thisContext, though) but doesn't take the next step of using it for variable assignment.
⁸ Or nil, if there isn't one. That's not really possible on the current C++ implementation, though, since each input expression gets turned into its own weird mutant unnamed method.
⁹ By which I mean, is an instance of the class passed to catch or one of its subclasses.
¹⁰ You would think that decades of writing OOP code would have been the hint, but nope.

# Posted 2023-11-19 20:25:49 UTC; last changed 2023-11-19 21:35:40 UTC

Low-effort Retrocomputing

So the other day, I wondered if anyone had put up pre-installed disk images for any of the really old Linux distros. I found installation media at archive.org and this blog post with (excellent) instructions on how to do it, but nobody had done the work (so I wouldn't have to) and put it up for download.

So I took a run at it and installed Slackware 3.0 (from 1995) on a QEMU disk image:

Screenshot of the Slackware 3.0 disk configurator

You can download it here. The archive's sha256 hash is

28be1e75f8c5b8e9338f18589549ebc871e6daae3d4a7433826684d0cae446d1

(Please be considerate of my bandwidth.)

The image has two accounts: root and a user account named bob; both have the password slack. It boots into X11 but you can get to a text console by pressing Ctrl+Alt+1 if you need to. Note that the console's termcap seems to be a bit messed up.

Networking works, but there's no web browser or ssh client. There's a C compiler though (I installed everything) so you should be able to build period-appropriate ssh from sources if you need to. There are also Linux builds of Netscape 3 on the 'net, although I have no idea if they'll run on Slackware.

Configuring X took some hand-fiddling. You can see my work in /etc/XF86Config and compare it to the original generated config file /etc/XF86Config.orig if you want.

Anyway, feel free to download and play with it if you're curious or nostalgic or want to do a pre-1995-tech dev jam.

# Posted 2023-01-19 16:43:08 UTC; last changed 2023-01-19 16:46:42 UTC

Your Sucks Programming Language Favourite

Much to my chagrin, I've found myself lately becoming a defender of C++. People who a) know me and b) appreciate irony should feel free to smirk right about now.

To be fair, modern C++ has improved significantly, reaching rarefied heights of not-badness only dreamed of fifteen years ago. But that's kind of beside the point. When you choose a programming language for a project, the quality of the language itself is often less important than external stuff: the quality of available implementations, tools, research, etc.

If I'm going to bet my (hypothetical) business on investing a zillion dollars to write a program that I can then sell, I want to know that:

The development tools aren't going to rot or disappear because the vendor lost interest (e.g. Visual Basic).
I'll be able to hire skilled developers whenever I need to.
Good quality tools, books, training, etc. will all be available when I need them.

(And as a developer, I want to bet my non-hypothetical livelihood on developing the skills that are most likely to keep me employed. Being a really badass Haskell programmer doesn't really do much for my job search¹.)

So let's concede that Rust (for example) is a better language than C++. C++ will still be a better choice for most commercial ventures in that space because it has:

Multiple high-quality implementations, two of which are FOSS²
A huge selection of high-quality third-party tools
An enormous community of developers with whom you can exchange knowledge
Roughly four decades of concerted research on how to use it effectively

C++ sucks in a variety of ways but we know exactly how it sucks and how to work around it. Rust's suckage is still unknown, and I want the thing that keeps me from being homeless³ to have a really good track record.

And this principle applies to Scala-vs-Java, Zig-vs-C, Haskell-vs-anything, anything-please-anything-vs-PHP or any other language debate. $FAVORITE_LANGUAGE may be a better language than $CHOSEN_LANGUAGE but that doesn't mean it's going to get the job done better, faster, cheaper or more reliably. All of that depends on the entirety of the language's ecosystem.

Note that I'm not saying don't use $FAVORITE_LANGUAGE. Just be aware of what's riding on that decision. For a hobby project or an in-house tool that took a month to write, it's going to be fine. But for the hundred person-year project the business depends on? I mean, I'd really like $FAVORITE_LANGUAGE to be viable in ten years but I'm not going to bet the mortgage on it.

Also, you should go out and learn all kinds of programming languages--especially weird ones that will never fly in Industry--because it will make you a better programmer. I got a lot of benefit as a C programmer from asking myself, "How would I do this in Smalltalk?"

I'm a programming language nerd. I've spent a lot of time thinking about how languages work and how they make people think about programming. I learn new languages for fun and I've designed and implemented several. So I absolutely get the desire to use better languages and the frustration of having to deal with the broken status quo. In a perfect world, we'd all be using Smalltalk.

Unfortunately, our world is fallen and so C++ is a necessary evil.

¹ Okay, that's an exaggeration. A good hiring process will recognize that Haskell skills are often transferable to whatever the company is using. Unfortunately, a lot of otherwise-fine employers have terrible hiring processes and will reject any résumé not listing the exact version of their preferred web framework. As those companies have money they will exchange for relatively pleasant work, I would like to retain the option of working there. ↩
² But if $FAVORITE_LANGUAGE is FOSS, that means it will be available forever! No, not really. If nobody else is working on it, you'll find yourself having to maintain the toolchain by yourself. At that point, it's almost always easier to just rewrite your program in something else. ↩
³ Yeah, yeah, I know; the real problem is Capitalism. ↩

# Posted 2021-07-04 17:16:50 UTC; last changed 2021-07-04 18:09:16 UTC

Getting the Singleton Class of a BasicObject in Ruby

Ruby objects provide the method singleton_class which returns the object's singleton class. Unfortunately, BasicObject doesn't have this because it's Object's superclass. So to get it, we need to be somewhat clever.

And having spent way too much time figuring out how to do this, I'm writing it here so that a) I don't lose it again and b) others will have less trouble than I did. (I'm not on Medium so, um, hello from the fourth page of your Google search results.)

TL; DR, How do I do it?

For a single instance, you'd do something like this:

obj = BasicObject.new
obj.instance_exec(obj) {
  class << self
    lself = self
    self.define_method(:my_singleton_class) { lself }
  end
}

Notice how I copy self to lself on line 4. That's because self will have changed when the method is called but the block that forms the body of my_singleton_class captures the local variable.

Also: this won't work on older Ruby versions (before 2.5) because define_method was private until then; see your version's Module documentation for define_method for a hacky workaround if yours is too old.
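One workaround that should work regardless of version (offered as a sketch; it may not be the one the documentation describes) is to call define_method without an explicit receiver, since Ruby always allows private methods to be called that way. Everything else is the same as the snippet above:

```ruby
obj = BasicObject.new
obj.instance_exec {
  class << self
    lself = self   # inside class << self, self is the singleton class
    # No explicit receiver, so it doesn't matter whether define_method
    # is private in the running Ruby.
    define_method(:my_singleton_class) { lself }
  end
}

sc = obj.my_singleton_class   # the singleton class of obj
```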

Doing this with a BasicObject subclass is even simpler:

class Thingy < BasicObject
  def my_singleton_class
    class << self
      return self
    end
  end
end

What's it good for?

Any case where you want an object to handle a method call by doing something other than calling a method. For example, a DSL or a proxy object that forwards the call to something else.

Typically, you'd create a class with no methods, then implement method_missing to catch the failing method lookups and do the right thing with them.

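class Proxy
  def initialize(target)
    @target = target
  end

  # Forward every call the proxy doesn't define itself to the target.
  def method_missing(name, *args, &block)
    warn "Called #{name} with #{args}"   # Kernel#warn stands in for a real logger
    @target.send(name, *args, &block)
  end
end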

BasicObject is the ideal base class for this because it has very few methods. If that's not enough (if you need to get rid of those few as well), you can always override most of them with a method that calls method_missing directly. This is straightforward when creating a subclass but there are times when it's necessary or easier to add methods to the object instead, and for that you need to get the singleton class.
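Here's a minimal sketch of the same proxy idea built on BasicObject. The class name RecordingProxy and the __calls__ accessor are made up for this illustration. Because BasicObject lacks even inspect and class, nearly every call lands in method_missing.

```ruby
# RecordingProxy and __calls__ are hypothetical names for illustration.
class RecordingProxy < BasicObject
  def initialize(target)
    @target = target
    @calls = []
  end

  # Defined on the proxy itself, so it is NOT forwarded.
  def __calls__
    @calls
  end

  # Everything BasicObject doesn't define ends up here.
  def method_missing(name, *args, &block)
    @calls << name
    @target.__send__(name, *args, &block)
  end
end

proxy = RecordingProxy.new([3, 1, 2])
proxy.sort        # => [1, 2, 3]   (forwarded to Array#sort)
proxy.class       # => Array       (forwarded too: BasicObject has no #class)
proxy.__calls__   # => [:sort, :class]
```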

In my case, I'm writing a DSL where every method whose name starts with a letter is valid; this means they all need to turn into calls to method_missing.

(Handling the case where the user uses method_missing as a name in the DSL is left as an exercise to the reader.)
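Putting the pieces together, here's a rough sketch of that kind of DSL object. It is not the actual DSL code: the method_missing body just returns what it was given so the routing is visible, and the list of overridden names covers only BasicObject's public methods whose names start with a letter.

```ruby
dsl = BasicObject.new
dsl.instance_exec {
  class << self
    # Stand-in handler: report what was called instead of doing real work.
    def method_missing(name, *args)
      [name, args]
    end

    # Route BasicObject's letter-named public methods through
    # method_missing as well, so the DSL owns those names too.
    [:equal?, :instance_eval, :instance_exec].each do |name|
      define_method(name) { |*args| method_missing(name, *args) }
    end
  end
}

dsl.frobnicate(1, 2)   # => [:frobnicate, [1, 2]]
dsl.equal?(1)          # => [:equal?, [1]]
```

Note that the overrides are installed from inside the instance_exec block, which is the last point at which the original instance_exec is still available on this object.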

What's a singleton class anyway?

So normally in OOP, an object is an instance of a class and this is the case with Ruby as well:

[]                          # => []
[].class                    # => Array
[].class.class              # => Class

But when Ruby creates an object from a class, it also first creates another anonymous class called the singleton class. This gets inserted in the new object's inheritance hierarchy: that is, the singleton class becomes a subclass of the new object's class and the object becomes an instance of the singleton instead of the original class.

x = []                          # => []
x.class                         # => Array
x.singleton_class               # => #<Class:#<Array:0x00007fbc862d41b0>>
x.singleton_class.superclass    # => Array

This is how you can add methods to individual Ruby objects: you're actually defining them in the object's singleton class.
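A quick sketch of what that looks like in practice (the method name shout is arbitrary): a method defined on one array shows up in that array's singleton class and nowhere else.

```ruby
x = []
y = []

# Define a method on x alone; it lands in x's singleton class.
def x.shout
  "I have #{size} elements!"
end

x.shout                                     # => "I have 0 elements!"
x.singleton_methods                         # => [:shout]
x.singleton_class.instance_methods(false)   # => [:shout]
y.respond_to?(:shout)                       # => false
```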

Fun fact: singleton classes are also objects and thus have their own singleton classes:

x.singleton_class
    # => #<Class:#<Array:0x00007fbc869b0060>>
x.singleton_class.singleton_class
    # => #<Class:#<Class:#<Array:0x00007fbc869b0060>>>
x.singleton_class.singleton_class.singleton_class
    # => #<Class:#<Class:#<Class:#<Array:0x00007fbc869b0060>>>>

This can go as deeply as you want it to.

The reason Ruby doesn't immediately fill up all available RAM with singleton classes and then die is that they are not created until the first time a program uses them. As a result, most objects don't have singleton classes at all.

Isn't this whole singleton class thing kind of overkill?

Not really.

See, Ruby is a language where everything is an object (in the OOP sense of the term), and so this means that classes are also objects. But since all objects have classes, that means each class is also an instance of a class. And so is that class. And this is if we ignore the singleton classes, which we are for the moment.

So how does this end? Well, it's pretty boring actually. Each class is an instance of the class named Class, including Class itself. Class is an instance of itself and that's all we really need.

[]                          # => []
[].class                    # => Array
[].class.class              # => Class
[].class.class.class        # => Class
[].class.class.class.class  # => Class

But wait! How do we do class methods or class instance variables?

class Thing
  def self.instance
    @instance = Thing.new unless @instance
    return @instance
  end
  # ...etc...
end

In Smalltalk, this gets done by giving each class object its own distinct class (the metaclass) to hold the methods and variable declarations. Metaclasses are unnamed but you can get one with the class method, just like in Ruby. The metaclass inheritance tree mirrors the class tree (i.e. if Item is derived from Thing, then Item.class is derived from Thing.class) with class Class as the abstract base class of the hierarchy.

t class.                                => Thing
t class superclass.                     => Object
t class superclass superclass.          => nil

t class class.                          => Unnamed class ('Thing class')
t class class superclass.               => Unnamed class ('Object class')
t class class superclass superclass.    => Class

All metaclasses are instances of the class Metaclass:

t class.                                => Thing
t class class.                          => Unnamed class ('Thing class')
t class class class.                    => Metaclass

This includes Metaclass itself, which is how the loop closes:

Metaclass class                         => 'Metaclass class'
Metaclass class class                   => Metaclass

(Disclaimer: I've somewhat simplified the above. I also haven't run it.)

In Ruby, each Class instance (i.e. class) has a singleton class that holds the class methods and variables. That is, singleton classes serve as metaclasses. The nice thing about this is that it's a generalization of what Smalltalk does for classes, and it gives you per-object methods for free.
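To see this from Ruby itself, here's a small sketch reusing Thing from above plus an Item subclass (the same names as in the Smalltalk example, otherwise arbitrary). It shows the class method living in Thing's singleton class, and the singleton classes mirroring the ordinary inheritance tree the way Smalltalk's metaclasses do.

```ruby
class Thing
  def self.instance
    @instance ||= Thing.new
  end
end

class Item < Thing; end

# The class method is an ordinary instance method of the singleton class.
Thing.singleton_class.instance_methods(false)              # => [:instance]

# Singleton classes of classes mirror the class hierarchy.
Item.singleton_class.superclass == Thing.singleton_class   # => true

# The class instance variable belongs to the Thing object itself.
Thing.instance
Thing.instance_variables                                   # => [:@instance]
```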

This is not to say that it's necessarily a better way than Smalltalk's. There are advantages and disadvantages to each approach but I'm far too lazy to write about them here.

# Posted 2021-05-07 01:55:31 UTC; last changed 2021-05-07 01:57:49 UTC

The Beta Blog

I have a cunning plan...

Site update

Loom: A Programming Language

Overview

Running stuff

Basic Stuff

Message syntax

Quoting

Objects and Classes

Sending Messages

The Machinery of Objects and Classes

Blocks and Control Flow

How Methods Work

`Here`, or How Variable Assignment Works

Odds and Ends

How `return` works

Exceptions and Ensure

Bypassing Overridden Methods (i.e. super)

Final Thoughts

Low-effort Retrocomputing

Your Sucks Programming Language Favourite

Getting the Singleton Class of a BasicObject in Ruby

TL; DR, How do I do it?

What's it good for?

What's a singleton class anyway?

Isn't this whole singleton class thing kind of overkill?
