Monday, August 17, 2009

Lambda, Linq, Dynamic Proxies

In the current project I've introduced some closure/function definitions: functions, actions, the usual stuff. (For reference, look at the Scala API, Google Collections, the LINQ Func delegates, or Functional Java.)

One problem with the current state of closures in Java is that, even though objects are closures and can be realized with anonymous classes (that's how the Scala compiler does it), we still have an interface impedance mismatch: my function definitions (interfaces) don't match Google's, which don't match Scala's, which don't match Functional Java's. So we'll have to write adapters (Scala reduces the adapter overhead).
The other problem is that writing closures with anonymous classes is a pain; that's probably why Doug Lea is looking into Scala. In my project's context we mostly have operations on collections, a sort of Google Collections DSL. The only 'ugly' part is the function definition: how the predicates/transformers are defined.
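
To make the adapter and anonymous-class pain concrete, here is a minimal sketch. Fn and Transformer are hypothetical stand-ins for "my" function interface and some other library's; neither is taken from a real API.

```java
import java.util.ArrayList;
import java.util.List;

class AdapterSketch {
    /** Hypothetical function type owned by "my" project. */
    interface Fn<A, B> { B apply(A a); }

    /** Hypothetical function type standing in for another library's definition. */
    interface Transformer<F, T> { T transform(F from); }

    /** Adapter: wraps a Transformer so it can be used where an Fn is expected. */
    static <A, B> Fn<A, B> adapt(final Transformer<A, B> t) {
        return new Fn<A, B>() {
            @Override public B apply(A a) { return t.transform(a); }
        };
    }

    /** A tiny collection operation written against Fn only. */
    static <A, B> List<B> map(List<A> in, Fn<A, B> f) {
        List<B> out = new ArrayList<B>();
        for (A a : in) out.add(f.apply(a));
        return out;
    }

    /** The 'ugly' part: a one-line function needs a whole anonymous class. */
    static final Transformer<String, Integer> LENGTH = new Transformer<String, Integer>() {
        @Override public Integer transform(String s) { return s.length(); }
    };
}
```

Every pair of mismatched interfaces needs its own adapt method, which is the overhead mentioned above.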

It reminds me of Scala Query & Formal Language Processing in Scala.

I had thought of creating proxies in order to define a function, e.g.
Funcs.on(A.class).getB().create() would create a proxy, record the method call, and create a function which could be called on A objects. I didn't implement that because it seemed like overkill to create a proxy just to have a closure.
But now I see that it has been implemented in lambdaj (for querying collections) and, for JPA, in liquidform.
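
A rough sketch of the proxy idea, using only the JDK's java.lang.reflect.Proxy. This is not how lambdaj or liquidform work internally; ProxyFuncs, Recorder and Person are names made up for illustration. Since getB() returns a B, the chain from the text cannot be typed as written, so the sketch splits it into a recording step and a create() step. JDK proxies only cover interfaces, so A must be an interface here.

```java
import java.lang.reflect.Array;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Function;

class ProxyFuncs {
    /** Hypothetical sample type; JDK proxies only work for interfaces. */
    public interface Person { String getName(); }

    /** Hypothetical recorder: remembers the last method invoked on its proxy. */
    static final class Recorder<A> {
        private final A proxy;
        private Method recorded;

        Recorder(Class<A> type) {
            InvocationHandler handler = (self, method, args) -> {
                recorded = method;                       // record the call, do not execute it
                return defaultValue(method.getReturnType());
            };
            proxy = type.cast(Proxy.newProxyInstance(
                    type.getClassLoader(), new Class<?>[] { type }, handler));
        }

        /** The recording proxy; call exactly one no-arg method on it. */
        A proxy() { return proxy; }

        /** Turns the recorded call into a function applicable to real A objects. */
        Function<A, Object> create() {
            final Method m = recorded;
            if (m == null) throw new IllegalStateException("no method call recorded");
            return target -> {
                try {
                    return m.invoke(target);
                } catch (ReflectiveOperationException e) {
                    throw new RuntimeException(e);
                }
            };
        }
    }

    static <A> Recorder<A> on(Class<A> type) { return new Recorder<A>(type); }

    /** A proxy must return a non-null value for primitive return types. */
    static Object defaultValue(Class<?> t) {
        if (!t.isPrimitive() || t == void.class) return null;
        return Array.get(Array.newInstance(t, 1), 0);
    }
}
```

Usage would look like: Recorder<Person> r = ProxyFuncs.on(Person.class); r.proxy().getName(); Function<Person, Object> f = r.create();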

What I missed is that through dynamic proxies we might provide enough information to build queries à la LINQ (i.e. the information which the C#/VB.NET compiler provides when parsing the query could be built manually, by means of dynamic proxies, in Java).
---
Update: Looking at LINQ/query frameworks for Java:
- one does bytecode analysis in order to generate the query expression
- one uses dynamic proxies
- one uses APT to generate query expressions from entities (something which people wished for in JPA 2.0)

Overall I fear/see that the syntax used to build a query-expression-tree might get rather clunky. Maybe in this area Scala could shine.

Wednesday, August 05, 2009

Rx Framework

A new addition to the .NET platform: the Rx framework, aka 'reactive programming', aka 'LINQ to Events'.

Here is the lang.net presentation, the theoretical discussion/explanation/proof about the relation between the Iterator & Observer pattern, and another very good explanation.
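
As a rough illustration of the Iterator/Observer relation (a simplified Java sketch, not the Rx API itself): the Observer below mirrors each thing an Iterator can do, with the direction of the call reversed. The method names follow .NET's IObserver; DualitySketch and push are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class DualitySketch {
    /** Push-side counterpart of Iterator: the producer calls the consumer. */
    interface Observer<T> {
        void onNext(T value);             // mirrors next() returning a value
        void onError(RuntimeException e); // mirrors next() throwing
        void onCompleted();               // mirrors hasNext() returning false
    }

    /** Drives an Observer from an Iterator, turning pulls into pushes. */
    static <T> void push(Iterator<T> source, Observer<T> observer) {
        try {
            while (source.hasNext()) observer.onNext(source.next());
        } catch (RuntimeException e) {
            observer.onError(e);
            return;
        }
        observer.onCompleted();
    }

    /** Records what the observer sees for a two-element source. */
    static List<String> demo() {
        final List<String> log = new ArrayList<String>();
        push(Arrays.asList(1, 2).iterator(), new Observer<Integer>() {
            @Override public void onNext(Integer v) { log.add("next " + v); }
            @Override public void onError(RuntimeException e) { log.add("error"); }
            @Override public void onCompleted() { log.add("done"); }
        });
        return log;
    }
}
```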

The .Net approach is very elegant and reminded me of Glazed Lists and of F# Reactive Programming.

Other thought associations were to F# async workflows and to Ruby GUI frameworks: Bowline, based on Titanium, and Monkeybars, based on Swing.

Enjoy!

Tuesday, August 04, 2009

Variations on the visitor pattern

So far in the current project I've written at least 20 class hierarchies (plus their visitors) representing ADTs.

The common scenario is that we have some data which is created by one module and consumed by others. Since the data is shared by 2+ modules, most operations on it require several services from several modules.
So we're out of luck with the Interpreter pattern; we use the Visitor pattern/double dispatch instead:

abstract class Expression {
....static class Number extends Expression {
........public final int n;
........Number(int n) { this.n = n; }
....}
....static class Plus extends Expression {
........// since the data is immutable, it makes no sense to create getters
........public final Expression left;
........public final Expression right;
........Plus(Expression left, Expression right) { this.left = left; this.right = right; }
....}

....// toString, equals, hashCode are implemented with apache commons-lang, based on reflection up to the Expression class
}

The classic Visitor pattern says:

interface Visitor {
....void visit(Number n);
....void visit(Plus plus);
}

and

abstract void Expression#accept(Visitor v);
....void Number#accept(Visitor v) { v.visit(this); }
....void Plus#accept(Visitor v) { v.visit(this); }
// notice the copy-paste!? (imagine doing that 40+ times...)

In this example the visitor is a stateful closure which encapsulates a computation, its initial data, and its final (accumulated) result.
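
Put together as something that compiles, the classic variant might look like the sketch below. Evaluator is a hypothetical example visitor; everything is nested in one class only to keep the example self-contained.

```java
class ClassicVisitorSketch {
    interface Visitor {
        void visit(Number n);
        void visit(Plus plus);
    }

    static abstract class Expression {
        abstract void accept(Visitor v);
    }

    static final class Number extends Expression {
        final int n;
        Number(int n) { this.n = n; }
        @Override void accept(Visitor v) { v.visit(this); }
    }

    static final class Plus extends Expression {
        final Expression left;
        final Expression right;
        Plus(Expression left, Expression right) { this.left = left; this.right = right; }
        @Override void accept(Visitor v) { v.visit(this); }
    }

    /** Stateful visitor: the accumulated result lives in a field. */
    static final class Evaluator implements Visitor {
        int result;

        @Override public void visit(Number n) { result = n.n; }

        @Override public void visit(Plus plus) {
            plus.left.accept(this);
            int leftValue = result;      // save before the right side overwrites it
            plus.right.accept(this);
            result = leftValue + result;
        }
    }

    static int eval(Expression e) {
        Evaluator v = new Evaluator();   // one instance per computation, because of the state
        e.accept(v);
        return v.result;
    }
}
```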

The problem which might arise is how to handle adding more data types (i.e. classes) to our hierarchy. See the Expression Problem, Extensible Visitor/book (more on Google ;).
The proposed solution is to drill a hole in the type safety in order to make old code compatible with new code:

interface Visitor {
....void visit(Number n);
....void visit(Plus plus);
....void otherwise(Expression x);
}
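
A sketch of how the hole gets used, assuming a hypothetical Times class that is added after the Visitor interface has been frozen. Its accept can only call otherwise, so old visitors keep compiling and decide for themselves how to react.

```java
class OtherwiseSketch {
    interface Visitor {
        void visit(Number n);
        void visit(Plus plus);
        void otherwise(Expression x);
    }

    static abstract class Expression {
        abstract void accept(Visitor v);
    }

    static final class Number extends Expression {
        final int n;
        Number(int n) { this.n = n; }
        @Override void accept(Visitor v) { v.visit(this); }
    }

    static final class Plus extends Expression {
        final Expression left;
        final Expression right;
        Plus(Expression left, Expression right) { this.left = left; this.right = right; }
        @Override void accept(Visitor v) { v.visit(this); }
    }

    /** Hypothetical later addition: the Visitor has no visit(Times) method. */
    static final class Times extends Expression {
        final Expression left;
        final Expression right;
        Times(Expression left, Expression right) { this.left = left; this.right = right; }
        @Override void accept(Visitor v) { v.otherwise(this); }
    }

    /** Old visitor, written before Times existed. */
    static final class Printer implements Visitor {
        final StringBuilder out = new StringBuilder();

        @Override public void visit(Number n) { out.append(n.n); }

        @Override public void visit(Plus plus) {
            out.append('(');
            plus.left.accept(this);
            out.append(" + ");
            plus.right.accept(this);
            out.append(')');
        }

        /** Unknown node types are rendered as a placeholder instead of failing. */
        @Override public void otherwise(Expression x) {
            out.append('<').append(x.getClass().getSimpleName()).append('>');
        }
    }

    static String print(Expression e) {
        Printer p = new Printer();
        e.accept(p);
        return p.out.toString();
    }
}
```

The price is that the compiler no longer tells us which visitors need updating when a class is added.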

Now we want to make the visitor stateless; maybe some visitors could be injectable services, e.g. ExpressionPrinter, ExpressionEvaluator, etc. In this scenario we want to extract the initial data and the result from the visitor:

interface Visitor<A, R> {
....R visit(Number n, A args);
....R visit(Plus plus, A args);
....R otherwise(Expression x, A args);
}

We also need to change the signature of Expression#accept:

abstract <A, R> R Expression#accept(Visitor<A, R> v, A args);
....<A, R> R Number#accept(Visitor<A, R> v, A args) { return v.visit(this, args); }
....<A, R> R Plus#accept(Visitor<A, R> v, A args) { return v.visit(this, args); }
// notice the copy-paste!? (imagine doing that 40+ times...)

In this scenario we can pass the initial data as args; the Expression passes it on to the visitor, which computes the final result R.
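
A compilable sketch of the stateless variant. Printer and its use of args as the operator symbol are assumptions made for illustration; the point is only that a single shared instance can serve every call because nothing is kept in fields.

```java
class StatelessVisitorSketch {
    interface Visitor<A, R> {
        R visit(Number n, A args);
        R visit(Plus plus, A args);
        R otherwise(Expression x, A args);
    }

    static abstract class Expression {
        abstract <A, R> R accept(Visitor<A, R> v, A args);
    }

    static final class Number extends Expression {
        final int n;
        Number(int n) { this.n = n; }
        @Override <A, R> R accept(Visitor<A, R> v, A args) { return v.visit(this, args); }
    }

    static final class Plus extends Expression {
        final Expression left;
        final Expression right;
        Plus(Expression left, Expression right) { this.left = left; this.right = right; }
        @Override <A, R> R accept(Visitor<A, R> v, A args) { return v.visit(this, args); }
    }

    /** Stateless printer: args is the (hypothetical) symbol used to render Plus. */
    static final class Printer implements Visitor<String, String> {
        static final Printer INSTANCE = new Printer(); // safe to share, it has no fields

        @Override public String visit(Number n, String op) {
            return Integer.toString(n.n);
        }

        @Override public String visit(Plus plus, String op) {
            return "(" + plus.left.accept(this, op) + " " + op + " "
                    + plus.right.accept(this, op) + ")";
        }

        @Override public String otherwise(Expression x, String op) {
            throw new UnsupportedOperationException("cannot print " + x.getClass());
        }
    }
}
```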

Another variation is when the visitor returns an element of the type which it visited. For that we need to modify the Expression:

interface Expression {
....<A, E extends Expression> E accept(Visitor<A, E> v, A args);
....static class Number implements Expression { ... }
}

In this case we might use the A args to accumulate results other than the specified E type (we use A as an accumulator).

If we want to really keep things clean and separate input from output, we could return on accept a tuple[R, E], but that's not needed in 90% of the cases.
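
A sketch of this variation under a few assumptions: the visitor returns an Expression, and args is a list that accumulates notes about what was rewritten. ZeroRemover is a hypothetical example that rewrites x + 0 to x.

```java
import java.util.List;

class RewritingVisitorSketch {
    interface Visitor<A, E> {
        E visit(Number n, A args);
        E visit(Plus plus, A args);
        E otherwise(Expression x, A args);
    }

    interface Expression {
        <A, E> E accept(Visitor<A, E> v, A args);
    }

    static final class Number implements Expression {
        final int n;
        Number(int n) { this.n = n; }
        @Override public <A, E> E accept(Visitor<A, E> v, A args) { return v.visit(this, args); }
    }

    static final class Plus implements Expression {
        final Expression left;
        final Expression right;
        Plus(Expression left, Expression right) { this.left = left; this.right = right; }
        @Override public <A, E> E accept(Visitor<A, E> v, A args) { return v.visit(this, args); }
    }

    /** Returns a rewritten tree; the notes list is the accumulator passed as args. */
    static final class ZeroRemover implements Visitor<List<String>, Expression> {
        @Override public Expression visit(Number n, List<String> notes) {
            return n;
        }

        @Override public Expression visit(Plus plus, List<String> notes) {
            Expression l = plus.left.accept(this, notes);
            Expression r = plus.right.accept(this, notes);
            if (r instanceof Number && ((Number) r).n == 0) {
                notes.add("dropped + 0");   // side result, separate from the returned tree
                return l;
            }
            return new Plus(l, r);
        }

        @Override public Expression otherwise(Expression x, List<String> notes) {
            return x;                       // unknown nodes are left untouched
        }
    }
}
```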

Compare that with pattern matching on case classes:
* no need for toString, equals, hashCode since they are generated by the compiler
* no need for visitors due to pattern matching
** no issues with stateful visitors
* no need for manually written constructors since they are generated by the compiler
* we get roughly 50% less code to maintain.
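
The comparison above is about Scala. To keep the examples in one language, here is the closest Java analogue I can sketch: records plus instanceof patterns. This assumes Java 16 or later, which did not exist when this post was written, and it is a swapped-in technique rather than case classes themselves.

```java
class RecordSketch {
    interface Expression {}

    // The compiler generates the constructor, accessors, toString, equals and hashCode.
    record Number(int n) implements Expression {}
    record Plus(Expression left, Expression right) implements Expression {}

    /** No visitor and no accept methods: the operation matches on the shape directly. */
    static int eval(Expression e) {
        if (e instanceof Number num) return num.n();
        if (e instanceof Plus plus) return eval(plus.left()) + eval(plus.right());
        throw new IllegalArgumentException("unknown expression: " + e);
    }
}
```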

Now scale that to 20+ class hierarchies...