Thursday, June 07, 2007

It is more blessed to yield than to return

Yield statements in Freya must be associated with expressions with the same type as the return value from the iterator:
BinaryTree[X].TreeNode = class
public
property Value: X;
property Left, Right: TreeNode;

iterator InOrder: X;
begin
if Left <> nil then
for var
v in Left.InOrder do
yield
v;
yield Value;
if Right <> nil then
for var
v in Right.InOrder do
yield
v;
end;

// ...
end;
Since InOrder returns an X in each step, each yield expression must also belong to this type. Let's see what happens if we allow attaching an IEnumerable[X] to yield:
BinaryTree[X].TreeNode = class
public
property Value: X;
property Left, Right: TreeNode;

iterator InOrder: X;
begin
if Left <> nil then
yield
Left.InOrder;
yield Value;
if Right <> nil then
yield
Right.InOrder;
end;

// ...
end;
I think it's easy to tell both uses of yield. When the yield expression has type IEnumerable[X], instead of X, the compiler translates the statement into a for/in loop, saving the programmer a lot of keystrokes.
Finally, let's take a second look on the code that dropped from the iterator:
for var v in Right.InOrder do
yield
v;
It's easy to see that the compiler can generate better code for this special kind of loops: there's no need to allocate a explicit variable to hold the value returned by the (nested) iterator in each step. In other words, the above fragment can be translated like this (ignoring the final call to Dispose):
var en := Right.InOrder.GetEnumerator;
while en.MoveNext do
yield
en.Current;
Our compiler now detects this case and generates efficient code for it. You should realize that, though the loop variable v looks like a local variable, it is implemented via fields, since the code is part of an iterator.

Labels: ,

Friday, June 01, 2007

A whiter shade of a functional flavor

Take a look at this example:
// Yes, this is still Freya!
method Factorial(n: Integer): LongInt =>
if n <= 1 then 1 else n * Factorial(n - 1);
There are two new features in those three lines. The first one is inline implementation of executable members via expressions. The second one is the conditional expression. The former leads to the latter, so that's the order I will follow for explaining it.
First of all, why do we need inline implementation? Freya started as a Delphi/Pascal mutation, and adopted the splitting style: you must group your declarations in one place, and all executable code is written in a different section of the same file. This approach has its pros and cons. It's easy to extract a "contract form" from the source, without sophisticated edition tools. Bad news has to do with projects featuring a lot of small classes: the syntactic overhead may be greater than the real code. The following Ray class is a good example:
public

Ray = sealed class
public

Origin, Direction: Vector;
property Items[Time: Double]: Vector;
end;

implementation for Ray is

property
Items(Time: Double): Vector;
begin
Result.X := Origin.X + Time * Direction.X;
Result.Y := Origin.Y + Time * Direction.Y;
Result.Z := Origin.Z + Time * Direction.Z;
end;

end;
A first step could be moving the implementation part inside the own Ray class. Gains are small, however:
Ray = sealed class
public

Origin, Direction: Vector;
property Items[Time: Double]: Vector;

implementation

property
Items(Time: Double): Vector;
begin
Result.X := Origin.X + Time * Direction.X;
Result.Y := Origin.Y + Time * Direction.Y;
Result.Z := Origin.Z + Time * Direction.Z;
end;

end;
This still looks more Delphic than Eiffel-like. And I must confess I love Eiffel. So, we can reunite the Items implementation with its declaration:
Ray = sealed class
public

Origin, Direction: Vector;

property Items[Time: Double]: Vector;
begin
Result.X := Origin.X + Time * Direction.X;
Result.Y := Origin.Y + Time * Direction.Y;
Result.Z := Origin.Z + Time * Direction.Z;
end;

end;
This explains the inline implementation bit. Now take a look at this record:
Vector = record
public
property
X, Y, Z: Double;

property Length: Double;
begin
Result := (X.Sqr + Y.Sqr + Z.Sqr).Sqrt;
end;

end.
X, Y and Z are field-based properties, since they have no explicit implementations. Our vector type features another read-only property, Length, which we have already implemented inline. The getter body is very simple: a single statement, actually a return statement. Déjà vu, anyone?
... yes, this is a situation we will find again when dealing with anonymous methods. C# 2 introduced anonymous methods, but the syntax was too heavy for simple methods like our get_Length. For this reason, C# 3 introduced lambda expressions: a simplified syntax for writing anonymous methods with only a return statement in their bodies. Of course, C# 3's lambdas has more interesting features than merely this syntactic trick (more powerful type inference, for instance).
C# has a Spartan syntax, but Pascal heirs are a lot more verbose, so this simplification is even more important for us. Now ask yourself, why limit the benefits of the lambda style to writing anonymous methods:
Vector = record
public
property
X, Y, Z: Double;

property Length: Double =>
(X.Sqr + Y.Sqr + Z.Sqr).Sqrt;
end.
We have "shaven" all the begin/end unnecesary junk, and the assumed assignment to Result. The final code is a lot more readable and less error prone. Now you know what does "expression-based inline implementation" meant.
There remains a question: why did we add such a filthy nasty feature such as a stinky conditional expression inspired by C? In my defense, please note that our conditional expression smells more like Haskell or Miranda than good old C. There's also a good hidden signal: the LALR generator did not complain for the duplicated use of if/then/else. Chances to confuse the expression and the statement are, thus, minimal. But the main reason is power: once we have single expression lambdas, we want to make this feature as powerful as possible. Thanks to this, we have been able to reduce Factorial's implementation to a single inlined expression, as already shown.
Of course, there are more things that we could do for enhancing expressions. This method returns a "normalized" copy of a vector:
method Normalized: Vector;
begin
var
L := Self.Length;
Result := if L = 0
then Self
else new Vector(X / L, Y / L, Z / L);
end;
We are already using a conditional expression, but that's not enough to reduce this simple implementation to a single expression. What we need are "real" lambdas: I mean, inline anonymous functions... wait... that's what C# 3 now provides, isn't it? Not exactly: now we can embed lambdas inside the "imperative" side of C#, but not inside its brand-new, functional-like personality. I want something like this (provisional syntax):
method Normalized: Vector =>
if L = 0 then Self
else new Vector(X / L, Y / L, Z / L)
where L := Self.Length;
Why do I have used now an assignment instead of the => keyword? To make short a long story: an assignment means a single evaluation, no matter how many times the defined symbol is used. On the contrary, => means substituting the symbol with its definition each time. This is not a concern for a functional programming language, since they have, or they shouldn't have any side effects. See this:
static I: Double;

property A: Double = I + 1;
property B: Double => I + 1;
A is a field based property (a read/write one, by the way), and the underlying field is initialized as I + 1 when an instance of the declaring class is created. On the other hand, B is a read only, inline implemented property. Each time you read from B, the expression I + 1 is reevaluated. Something similar happens with interface delegations:
MyClass = class(IEnumerable)
// ...
implementation
interface
IEnumerable = GetArray();
end.
For the above class, the compiler creates a hidden IEnumerable field, and initializes it with the value return by the call to GetArray. This is an alternative:
MyClass = class(IEnumerable)
// ...
implementation
interface
IEnumerable is GetArray();
end.
Each time we apply a method from IEnumerable on a MyClass instance, GetArray is reevaluated, and the method call is redirected to the value returned by GetArray.

Labels: , ,