Lambda Signatures as Interfaces or Abstract Superclasses
Comments: 12 - Date: December 1st, 2008 - Categories: Deca
So I’ve got this great programming language all worked out. It’s called Deca, and it’s the systems-programming language I’ve always wanted. It has bloody everything I like: a single-inheritance object system based on generic functions and multimethods, tuples-as-structures-as-arrays-as-objects, lexical scoping with Scheme-style LET statements, exceptions, a parametric module system for generic programming, local type inference with static typing, even an absolutely-most-definitely-guaranteed-safe alias/reference type that gives me processor-level mutexes for free and enables me to compile the language down to machine code with no runtime library or garbage collection. Oh, and there’s an “unsafe” dialect for writing modules that need pointers and inline assembly. Sounds fricking awesome, right? Why, I should probably go ahead and implement the compiler ASAP so I can claim my world fame, right? Actually, I still need to design a complete syntax for all this, but I could still write it in S-expressions until I work out a better syntax.
Well no. Because I want to add first-class functions to this language. Therein lies the rub, because any function passed around as a first-class value could, and often does, have a closure environment of some sort with it. However, the signature of the function, which would normally be its whole and entire type in a functional language (Function of x, y, and z returning an a and carrying a closure pointer that may be null…), cannot possibly be its whole type in Deca, because Deca has to run without (language-supplied) garbage collection and therefore the size of every single data object passed into or out of a function must be known in such a way that the compiler can theoretically (that is, before optimizations like passing arguments in registers) allocate that object on the stack. As I’ll explain below, this is a problem.
So far I’ve been using the idea from D (as in Digital Mars’ D) for compiling closure expressions: create an anonymous class to contain the variables the compiler infers as closed-over by the function definition, with a “call” or “operator()” method that (other than its closure-object argument, which the compiler automatically passes like a “this” pointer despite Deca being a multiple-dispatch language) has an identical signature to the function as the programmer defined it. Calling an instance of such a class (ie: “f()” where f is of a function type) calls the correct (as determined by runtime dispatch) “operator()” method with the given arguments and the implicit instance of the closure-object. To make all this more than mere syntactic sugar for anonymous classes, all “closure classes” are direct subtypes (meaning no closure class is a child of another closure class) of an abstract supertype (with no closed-over variables at all) that simply represents an old-fashioned, C-style function pointer with no closures involved.
By now you probably see the problem yourself. If I have to define the type of an argument to or return value from a function and I define it as a function type, how much space should the compiler allocate on the stack to pass or return such an “object”?
For arguments I can simply use pass-by-reference, of course. The compiler allocates space for a pointer, does its type-checking and uses old-fashioned RTTI to determine what the function actually got at runtime. But I anticipate that programmers (not in the least myself) will actually want to pass a function by value at some point. Actually, even old C has variadic functions that determine at compile-time how much memory is needed on the stack for their arguments, but I’m sketching this part out to show how much harder the hard case is in contrast.
What about return values of function types? Or more generally, what happens when I declare a function as returning an instance of a certain class and then attempt to actually return an instance of one of its subclasses? How do I avoid resorting to pointer semantics here? C++ seems to handle it somehow (see the attached C++ source file to test the language’s behavior), so I guess I should start by disassembling my C++ binary. If anyone has an idea that doesn’t involve disassembling a binary file originally written in the Shub-Niggurath of programming languages (I didn’t wear my Magen David when I did it, so my soul may or may not remain in my body), please feel free to leave it in a comment so I can steal it!
Happy hacking to all! Here’s the source of the program I wrote to see how C++ handles this stuff. It compiles under Linux with g++, and uses weird “flds/z/cw”, “fucompp”, “fnstsw”, “fistpl” and “sahf” x86 assembly instructions that I’ve never seen before. TO THE INTEL BIBLE!
class Parent {
public:
virtual int foo(int bar) {
return bar + 1;
}
};class Child: public Parent {
int quux;
public:
int foo(int bar) {
return bar + quux;
}
Child(int i) {
quux = i;
}
};Parent baz(float x) {
if(x < 0.0)
return Child((int)x);
else
return Parent();
}int main(int argc,char** argv) {
return 0;
}
Yeah, gorram Wordpress doesn’t let me attach .cpp files. They don’t meet its security guidelines ;-).
EDIT: It turns out those funny instructions are floating-point instructions. Made x an integer now to get rid of them. The code still looks weird, but interpretable.
AGAIN EDIT: It appears the language feature I’m looking for is called return-type covariance. So yeah, I need some way to return values of covariant type to my declared return type on the stack.
Comment by Barry Kelly - December 2, 2008 @ 12:09 am
First-class functions implemented using closures that support lifetime-extended variable capture must use some kind of polymorphism, and must also either use garbage collection or force users to free the activation records manually.
C++ “implements” returning a subclass as a superclass via slicing: it discards every field of the subclass that isn’t defined in the superclass. This is almost never what you want, so you can’t find the solution to your problem in C++.
Instead, you need to either use compile-time polymorphism (e.g. generics like C#/Java) or runtime polymorphism.
When I implemented anonymous methods (i.e. first-class functions) in the non-GC language of Delphi, I used reference counting for method references (the type that can polymorphically refer to anonymous methods); in other words, lacking full precise GC, I fell back to reference-counted GC. The usability win is too large to ignore.
The main alternative, C++ 0x-style lambdas, is to use functors (overloaded `()`) with compile-time polymorphism, combined with a variadic template ancestor for cases where runtime polymorphism is used. However, variable capture is limited; there are only two choices, capture by copy and capture by reference, and neither extends a *variable*’s lifetime, though the first extends values’ lifetimes. In order to take advantage of runtime polymorphism, though, the user needs to `delete` or otherwise manage the lifetime of the value, due to functor classes being used as the implementation strategy.
Comment by Adrian Quark - December 2, 2008 @ 12:36 am
C++ does not handle that like you think it does. When the “Child” object is passed as the return value (or assigned to a variable of type “Parent”), it is truncated and effectively becomes an instance of the parent class. See http://burks.bton.ac.uk/burks/language/cpp/cppfaq/valuevsr.htm#[28.8]
Comment by Adrian Quark - December 2, 2008 @ 12:55 am
To answer your question: lexical scoping is fundamentally incompatible with static memory management. Lexical scoping implies that values can escape their dynamic scope (via closures), and the abstraction power of closures comes from the fact that you can use one without knowing its free variables. But you must know a closure’s free variables in order to manage their memory explicitly.
If you feel like this won’t be a problem (maybe you only need so-called “downward funargs”), then it seems like you can just use a special stack discipline to handle variable-sized values: include the size of the function object somewhere in its representation, and respect this when pushing or popping the stack.
Comment by egottlie - December 2, 2008 @ 12:56 am
Well shit, slicing is downright EVIL. Way to reinforce my opinion about you, C++!
Now as to the issue of garbage collection and memory management for the lambdas… that issue is solved through a bondage-and-discipline approach. I could write another whole post detailing how Deca alias types and anonymous variables work, but suffice to say that the language enforces a semantics which enables it to treat aliases as Plain Old Data for closure-variable-capture purposes without creating aliasing bugs. Think of uniqueness types, but done at runtime to avoid the need for a huge type-inference engine. Since aliases, the only reference type in the language, are Plain Old Data and everything that isn’t a reference type is either Plain Old Data or recursively defined in terms of aliases and Plain Old Data… everything is Plain Old Data for variable-capture purposes. Just copy it into a closure-object and go, basically. The closure-object then gets deleted manually later on. Like I said, the exact weird bondage-and-discipline I enforce on aliasing to make this work is worth its own post some other day.
And don’t nobody go telling me that this kind of thing is impossible or senseless.
I suppose I *could* simply outlaw returning functions from functions and require that they be passed as out arguments, but that ends up running into the same problem of either A) slicing or B) passing by reference, which requires allocating all my closure-objects on the heap (NOT a pleasant thing to do in a manual-allocation language, particularly for the perceived syntax and semantics of lambdas). There should be *some* way to return a covariantly-typed value on the stack!
Comment by egottlie - December 2, 2008 @ 12:58 am
Also, I’m assuming that class instances carry with them a *very* basic level of RTTI, such as their own size in bytes. This gives me an idea…
Comment by Barry Kelly - December 2, 2008 @ 2:26 am
Egottlie, you *must* either use compile-time polymorphism or runtime polymorphism in order to write code that works generally with lambdas of a particular arity and type signature. There is no other way out (apart from C++, i.e. discarding the troublesome data).
If you don’t use generics or templates, then you must use polymorphism. Whether you describe the lump of data being referred to as being POD or not, it will conceptually be a type, because in order for it to act as a lambda, it must be invokable and contain state; since the state may vary in size depending on the amount of captured data, the very minimum of type description - size - is necessary. (Size is also the only type known by assembler. Perhaps you would be more comfortable with that?)
Comment by egottlie - December 2, 2008 @ 2:45 am
Yeah, I’m actually using run-time polymorphism. I thought that was implied in the original post — run-time polymorphism based on the representation of every object/function (since this works nicely for both) carrying some kind of “class ID” that identifies its run-time type in some fashion sufficient to tell the size of the object at run-time.
And as to the issue of state capture due to lexical scoping, I must admit that since my lambdas have copy semantics for closure construction, they Are Not True Lexical Closures in the case that one returns multiple closures over the same state. However, in that case I believe that an object (ie: instance of a class) will work well enough. It may not be The Right Thing sometimes, but it will work and it lets me have a somewhat functional language without GC (and therefore usable for low-level programming).
Comment by egottlie - December 2, 2008 @ 2:45 am
Will write up ideas in a full blog-post tomorrow.
Comment by Marijn Haverbeke - December 2, 2008 @ 5:59 am
Also notice that *if* you manage to find some way to reliably copy closures around, you’ll either have to make the closed-over variables read-only, or deal with some very weird behaviour — assigning to them in one copy of the closure will not affect other copies.
Comment by somejan - December 2, 2008 @ 6:16 am
A ’simple’ solution (at least not requiring too much type theory) would be to allow to declare the maximum size of a variable separately from it’s type. So adding to the example in the post:
Parent p;
p = baz(-1.0);
would become something like
Parent __size__(sizeof(Child)) p; /*I’ll leave it to you to think of better syntax*/
p = baz(-1.0)
Something like this should then inform the compiler to reserve enough space on the stack for Child objects. You could probably call this ‘manual polymorphism’
This would probably require the compiler to track separately the interface and max size of every variable, so it would require quite a bit of rewriting in the typechecker. Also, it makes using inheritance more difficult and less modular than a fully general solution. You could end up with an error that ‘variable x cannot hold objects with size of DerivedClass’ when you change the definition of DerivedClass, forcing you to recompile all kinds of unrelated modules for which you may not have the source code (well, assuming Deca will actually be used in the wild).
Or an other way would be to have each activation record start with a table of pointers to variables, followed by those variables taking as much space as they need. The space needed by the variables would grow dynamically. But this would require indirection for all local variable accesses.
Comment by Sandro Magi - December 2, 2008 @ 10:18 am
The system programming language you’re looking for is BitC, and it sounds like BitC is further along than your own effort at the moment. Their design docs are worth looking at, and they discuss the GC issues.
“Downward” closures are possible to compile without GC (Ada 2005 now has them), but upward closures cannot without some tradeoffs. For instance, copying closure environments instead of passing by reference, and distinguishing function pointers without an environment from closures. As for how to implement this copying, well you compile your higher-order language to a first-order intermediate language which exposes representation for the code generation phase.
Another alternative is to give closures linear types, such that the client is forced to free them at the appropriate time.
Barry Kelly, reference counting is garbage collection, so your implementation was GC’d.
Comment by Edward Kmett - December 2, 2008 @ 10:34 am
As a devil’s advocate you could always allocate the closure out on the heap when you need to pass it in an interchangeable way.
Then you could pass around a reference to it that has linear ownership semantics, so that when your reference goes out of scope you can delete the target object. This requires that copying the reference copies the object on the heap, keeping the system of closures acyclic.
With suitable optimizations, you should be able to avoid constructing that heap based closure most of the time.
Copying costs become proportional to the size of the environment this way, and while you could go with a lame reference counting solution, that blows up in your face when you want fast multithreaded performance and start dealing with cache thrashing and may inject cycles and space-leaks if you aren’t careful.
Leave a comment