Monday, August 17, 2009

Lambda, Linq, Dynamic Proxies

In the current project I've introduced some closure/function definitions: functions, actions, the usual stuff. (For reference, look at the Scala API, Google Collections, the LINQ Func delegates, or Functional Java.)

One problem with the current state of closures in Java is that, even though objects are closures and can be realized with anonymous classes (that's how the Scala compiler does it), we still have an interface impedance mismatch: my function definitions (interfaces) don't match Google's, which don't match Scala's, which don't match Functional Java's. So we'll have to write adapters (Scala reduces the adapter overhead).
The other problem is that writing closures with anonymous classes is a pain; probably that's why Doug Lea is looking into Scala. In my project's context we mostly have operations on collections, a sort of Google Collections DSL. The only 'ugly' part is the function definition: how the predicates/transformers are defined.

It reminds me of Scala Query & Formal Language Processing in Scala.

I thought of creating proxies in order to define a function, e.g.
Funcs.on(A.class).getB().create() would create a proxy, record the method call, and create a function which could be called on A objects. I didn't implement that because it seemed like overkill to create a proxy just to get a closure.
But now I see that it has been implemented in lambdaj (querying collections), and for JPA in LiquidForm.
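
The record-then-replay idea can be sketched with a plain JDK dynamic proxy. This is only an illustration under assumptions: Funcs, Person and FuncsDemo are hypothetical names, java.util.function.Function stands in for the project's own function interface, and the recorder is kept explicit instead of reproducing the chained Funcs.on(A.class).getB().create() form. It is not how lambdaj or LiquidForm are implemented.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Function;

// Hypothetical helper, named after the Funcs.on(...) idea in the post.
final class Funcs<A> {
    private final Class<A> type;
    private Method recorded; // the last getter call captured by the proxy

    private Funcs(Class<A> type) { this.type = type; }

    static <A> Funcs<A> on(Class<A> type) { return new Funcs<>(type); }

    // Returns a proxy that only records which method was called on it.
    // Sketch limitation: the getter must return a reference type, because
    // the handler answers every call with null.
    A proxy() {
        Object p = Proxy.newProxyInstance(
                type.getClassLoader(),
                new Class<?>[] { type },
                (self, method, args) -> { recorded = method; return null; });
        return type.cast(p);
    }

    // Turns the recorded call into a function that replays it on real objects.
    @SuppressWarnings("unchecked")
    <B> Function<A, B> create() {
        final Method m = recorded;
        if (m == null) throw new IllegalStateException("no method call recorded");
        return a -> {
            try {
                return (B) m.invoke(a);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        };
    }
}

// Hypothetical domain interface; JDK proxies require an interface.
interface Person { String getName(); }

class FuncsDemo {
    public static void main(String[] args) {
        Funcs<Person> f = Funcs.on(Person.class);
        f.proxy().getName();                       // recorded, not executed
        Function<Person, String> name = f.create();
        Person ada = () -> "Ada";
        System.out.println(name.apply(ada));       // prints Ada
    }
}
```

The recorded java.lang.reflect.Method is exactly the kind of information a query builder could collect instead of replaying the call, which is what makes the proxy approach interesting beyond plain closures.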

What I missed is that through dynamic proxies we might provide enough information to build queries a la LINQ (i.e. the information which the C#/VB.NET compiler provides when parsing the query could be built manually, by means of dynamic proxies, in Java).
---
Update: looking at LINQ-like query frameworks for Java:
- one does byte code analysis in order to generate the query expression
- one uses dynamic proxies
- one uses APT to generate query expressions from entities (something which people wished for in JPA 2.0)

Overall I fear/see that the syntax used to build a query-expression-tree might get rather clunky. Maybe in this area Scala could shine.

Wednesday, August 05, 2009

Rx Framework

A new addition to the .Net platform, the Rx framework, aka 'reactive programming', aka 'linq to events'.

Here is the lang.net presentation, the theoretical discussion/explanation/proof about the relation between the Iterator & Observer pattern, and another very good explanation.

The .Net approach is very elegant and reminded me of Glazed Lists and of F# Reactive Programming.

Other thought associations were to F# async workflows, and ruby gui frameworks: bowline based on Titanium, and MonkeyBars based on swing.

Enjoy!

Tuesday, August 04, 2009

Variations on the visitor pattern

In the current project I've written so far at least 20 class hierarchies (+ their visitors) representing ADTs.

The common scenario is that we have some data which is created by one module and consumed by others. Since the data is shared by 2+ modules, most operations on it require several services from several modules.
So we're out of luck with the Interpreter Pattern, and we use the Visitor Pattern/Double Dispatch instead:

abstract class Expression {
    static class Number extends Expression {
        public final int n;
        Number(int n) { this.n = n; }
    }
    static class Plus extends Expression {
        // since the data is immutable, it makes no sense to create getters
        public final Expression left;
        public final Expression right;
        Plus(Expression left, Expression right) { this.left = left; this.right = right; }
    }

    // toString, equals, hashCode are implemented with apache commons-lang, based on reflection up to the Expression class
}

The classic Visitor pattern says:

interface Visitor {
    void visit(Number n);
    void visit(Plus plus);
}

and

abstract void Expression#accept(Visitor v);
    void Number#accept(Visitor v) { v.visit(this); }
    void Plus#accept(Visitor v) { v.visit(this); }
// notice the copy-paste!? (imagine doing that 40+ times...)

In this example the visitor is a stateful closure which encapsulates a computation, its initial data, and its final (accumulated) result.

The problem which might arise is how to handle adding more data types (i.e. classes) to our hierarchy. See the Expression Problem, Extensible Visitor/book (more on google ;).
The proposed solution is to drill a hole in the type safety, in order to make old code compatible with new code:

interface Visitor {
    void visit(Number n);
    void visit(Plus plus);
    void otherwise(Expression x);
}

Now we want to make the visitor stateless; some visitors could then be injectable services, e.g. ExpressionPrinter, ExpressionEvaluator etc. In this scenario we want to extract the initial data & the result from the visitor:

interface Visitor<A, R> {
    R visit(Number n, A args);
    R visit(Plus plus, A args);
    R otherwise(Expression x, A args);
}

We also need to change the signature of Expression#accept:

abstract <A, R> R Expression#accept(Visitor<A, R> v, A args);
    <A, R> R Number#accept(Visitor<A, R> v, A args) { return v.visit(this, args); }
    <A, R> R Plus#accept(Visitor<A, R> v, A args) { return v.visit(this, args); }
// notice the copy-paste!? (imagine doing that 40+ times...)

In this scenario we can pass the initial data as args; the Expression will pass that to the visitor, which computes the final result R.
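
Put together, the stateless variant could look like the following compilable sketch. The Expression, Number and Plus names follow the snippets above; Evaluator and Printer are hypothetical visitors added only to show one visitor that ignores args and one that uses args as an accumulator.

```java
// Hierarchy with a generic accept: the visitor carries no state of its own.
abstract class Expression {
    abstract <A, R> R accept(Visitor<A, R> v, A args);

    static final class Number extends Expression {
        final int n;
        Number(int n) { this.n = n; }
        @Override <A, R> R accept(Visitor<A, R> v, A args) { return v.visit(this, args); }
    }

    static final class Plus extends Expression {
        final Expression left;
        final Expression right;
        Plus(Expression left, Expression right) { this.left = left; this.right = right; }
        @Override <A, R> R accept(Visitor<A, R> v, A args) { return v.visit(this, args); }
    }
}

interface Visitor<A, R> {
    R visit(Expression.Number n, A args);
    R visit(Expression.Plus plus, A args);
    R otherwise(Expression x, A args); // escape hatch for classes added later
}

// Hypothetical stateless evaluator: needs no input, so A is Void.
final class Evaluator implements Visitor<Void, Integer> {
    public Integer visit(Expression.Number n, Void args) { return n.n; }
    public Integer visit(Expression.Plus p, Void args) {
        return p.left.accept(this, args) + p.right.accept(this, args);
    }
    public Integer otherwise(Expression x, Void args) {
        throw new UnsupportedOperationException("unknown expression: " + x);
    }
}

// Hypothetical stateless printer: the StringBuilder passed as args accumulates the output.
final class Printer implements Visitor<StringBuilder, StringBuilder> {
    public StringBuilder visit(Expression.Number n, StringBuilder out) { return out.append(n.n); }
    public StringBuilder visit(Expression.Plus p, StringBuilder out) {
        out.append('(');
        p.left.accept(this, out);
        out.append(" + ");
        p.right.accept(this, out);
        return out.append(')');
    }
    public StringBuilder otherwise(Expression x, StringBuilder out) {
        return out.append('?');
    }
}
```

Because neither visitor has fields, a single instance can be shared or injected; everything that varies per call travels through args and the return value.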

Another variation is when the visitor returns an element of the type it visited. For that we need to modify the Expression:

interface Expression {
    <A, E extends Expression> E accept(Visitor<A, E> v, A args);
    static class Number implements Expression { ... }
}

In this case we might use the A args in order to accumulate other results than the specified E type (we use the A as an accumulator).

If we want to really keep things clean and separate input from output, we could return on accept a tuple[R, E], but that's not needed in 90% of the cases.

Compare that with pattern matching on case classes:
* no need for toString, equals, hashCode since they are generated by the compiler
* no need for visitors due to pattern matching
** no issues with stateful visitors
* no need for manually written ctors since they are written by the compiler
* we get roughly 50% less code to maintain.

Now scale that to 20+ class hierarchies...

Monday, July 20, 2009

Why Scala

I wanted to write a post 'Why Scala is such a nice/great language', but after seeing Martin Odersky's presentations at Google and at FOSDEM 2009, Jonas Bonér's 'Pragmatic Real-World Scala' at QCon 2009, Evan Weaver's 'Improving Running Components at Twitter', James Strachan's (of Groovy fame) post, and an impressive collection of Scala learning resources, I had the impression that there is nothing more to be said (that hasn't been said before).

There is even an online and offline ScalaTour.

Nevertheless, we could/should mention

Java-Basics:

* seamless java integration
* IDE support, build support (maven, ant, sbt, buildr)
* java code coverage

OO-Basics:
* objects all the way down (see ScalaOverview) (the Smalltalk way)
** primitives are efficiently wrapped & handled by the compiler
* closures/functions are objects too
* multiple trait inheritance (e.g. utility-belt, logging, ...)

Fun-Stuff:
* extensible pattern matching (i.e. switch on steroids on objects)
* functional idioms allowed
* type inference (local, not full Hindley-Milner)
* generics with Manifests as a partial workaround for JVM type erasure (not truly reified)
* type classes
* elegant solution to the Expression Problem

Monday, July 13, 2009

Moved Code Snippets to Mercurial

I've just moved the tip of the svn trunk to mercurial.
In order to push without being asked for a password, you'll need to modify the .hgrc (specify your Google Code username and password):

[paths]
thought-tracker = https://username:password@thought-tracker.googlecode.com/hg/

In order to get it:
hg clone https://thought-tracker.googlecode.com/hg local_folder

In order to push/publish it (after a local hg ci -Am"some msg"):
hg push thought-tracker

Thursday, July 09, 2009

DVCS(hg) vs SVN workflows

Suppose we work in an international company with remote sites and a not-so-good connection between those sites.
We have the following scenarios:

1. We have to do a rather big refactoring, which will take several weeks.
- With svn: we create a 'feature branch' where we'll do the refactoring. We update daily from trunk, we push to trunk regularly/at the end. The feature branch is allowed to be CI-red. (Due to an svn issue, merging back and forth became a problem for svn merge tracking, and we had to do a tree-diff on 2 local sandboxes.)
- With hg: you just create a clone-branch, you work on it, and you can merge with<->from trunk regularly, without any merging problems.
- Guerrilla tactics: shadow the svn trunk/branch with some hg repositories and do the merges with hg (no tree-diff); at the end push the changes to svn.

2. Junior (remote) developer needs help fixing a unit-test.
- With svn: do a branch, svn switch to it, push the code with the broken tests, fix them, merge back to trunk.
- With webex: do a remote session and explain the fix.
- With hg: just pull the changes, make a fix, push them back to the remote developer. (hint: hg is more network-friendly than svn)

3. An interface between 2 components is changed radically. Both the user & the implementation must be changed.
- With svn: do a branch, do all the changes, merge to trunk
- With hg: change the interface in a cloned repository, then share this repository between the user & implementer. They can work in parallel, the last one finishing pulls the changes from the other one and does the integration.

4. Junior developer, friday@17.00, before a 3 week holiday, wants to commit his _weekly_ changes. He cannot create a local ci build within 1 hour so he doesn't commit anything.
- He shouldn't integrate friday@17.00. (but he did)
- With svn: create a branch, svn switch, push your changes to the server
- With hg: a colleague could pull the changes and do the integration + push the changes, without the need to go through the server.

5. Always keep the code green.
- With svn: run the CI on the pre-commit hook (TeamCity does it). The code gets into the repository only if the build is green.
- With hg: maintain 2 repositories: we push the changes to the 1st one, where they are picked up by the CI, validated, and on success pushed to the other repository. (or just do a rollback; much easier with hg)

A lot of hg's power comes from the fact that it is changeset-based (unlike svn) and that merges are really easy. Another big plus is the distributed nature of hg: it enables more flexible workflows, whereas in a centralized VCS all communication goes through the server.

links: hg, git

Monday, May 04, 2009

15 Min. IoC

From my series, '15 minutes Lessons Learned', a short introduction to Inversion of Control

Monday, February 23, 2009

Programming Abstractions

Great post by Brian Hurt: Programming Doesn't Suck! Or At Least It Shouldn't.

Abstraction helps up the percentage of interesting code vs. boring grunt work code, by minimizing the amount of boring grunt work code. If you can turn five lines of code, replicated ten thousand times, into one line of code, replicated ten thousand times, you’ve just turned a 50 KLOC project into a 10 KLOC project, doable at least 5 times faster (and probably more like 25 times faster). Get the crap work out of the way fast, so you can get on to the more interesting stuff.

I’m not even talking about leaving your language of choice, I’m talking about thinking outside the box, or even just thinking. But learning new languages is an advantage in making programming not suck, because if nothing else, it gives you new tools for your tool box, new ways to consider abstracting the code. And some languages are better than others, and the more you know, the more likely you are to know a good one. And if you don’t know the better language, you can’t use it.

My initial response to the DailyWTF post was that it was another classic example of the starving child in Africa syndrome. You’ve seen the pictures. A child in some hell-hole of a third world, generally Africa but parts of Asia and South America qualify as well. The child is hungry, maybe starving, disease ridden, bug infested, and destined for a life that was nasty, brutish, and short. But they’re smiling away, happy as can be, simply because they don’t know any better. Everyone they’ve ever known, seen, or even heard of has been in exactly the same boat they are. That’s just what life is like. I often times think that many programmers are just like that starving child. They put up with, indeed don’t even see anything wrong with, having to write half a dozen lines of code just to set some UI properties, because they’ve never known any better. Bug ridden, virus infested, bloated, slow, impossible to maintain, that’s just what programming is.

The second post is convincing me that the situation is much, much worse than that. That it’s not just ignorance is the problem. It’s not just that many programmers don’t know any better, it’s that many programmers don’t want to know any better, and are looking for an excuse, any excuse, to stay ignorant. It’s as if that starving child doesn’t want food, and would reject food if offered.

The reason that post is causing me to think this is because I’ve heard this argument before. All of them. See, I lived through the last great paradigm shift, when the industry moved from Procedural to Object Oriented- and heard the dying remnants of the one before that, from unstructured to procedural programming. I’ve heard all these arguments, or should I say excuses, before- from people trying to avoid learning C++ and Java.


Slightly related: Erik Meijer @ JAOO2008: 'Why Functional Programming (Still) Matters'

Friday, February 06, 2009

Failure, take two

Regarding the Agile/Scrum failure, there has been a lot of buzz in the community:

James Shore - The Decline and Fall of Agile
Martin Fowler - Flaccid Scrum
Ron Jeffries - Context My Foot
Want to succeed at software? Then it can’t be business as usual.

Wednesday, January 28, 2009

pet project idea

Implement a 'bean-mapper' a la Dozer, but instead of using the xml files, use dynamic proxies, a la LiquidForm.

Tuesday, January 27, 2009

Failure

"The only failure is the failure to learn from failure" -- Kevin Everett FitzMaurice

A friend of mine noticed that in a big company which recently decided to embrace agile and use Scrum, the Scrum practices are applied in a mechanical way. He actually called it 'theater', and said that it is the 'agile principles' that are important. A colleague had a similar observation: Scrum is applied like a rain-dance ceremony, with the expectation that the results will fall out of the sky.

Maybe this is related to the 'Shu Ha Ri' training cycle:
TheThreeExtremos lately seem to be recommending a ShuHaRi approach to XP: First, follow all the practices. Then, realize TheyreJustRules, and change them (i.e. break some of the original rules). Finally, you don't need to think about the rules anymore. -- GeorgePaci

But in day-to-day business, it is just sloppiness. People don't understand that they are on a learning journey; they don't understand why they are following the practices/rules. Some don't want to learn, they just want to get the job done and go home. Many people are just happy with Ship-It. We've done it, we are happy, let's go home. No retrospection, no how-can-we-do-it-better-next-time, no 5-whys. There is no striving towards elegance, to do things better. No passion, no motivation. Probably that's why retrospectives are the first to be dropped/disregarded.

Does this mean that Agile/(Scrum?) is a culture-thing? How can we change a 'not-my-problem' culture? How can we stop sinking in a pool of problems? How can we start caring?

ps. via rangawald's delicious, I found Taming Perfectionism.

Friday, January 16, 2009

The Membrane

Suppose we have 4 coupled/chained objects: A, B, C, D, accessible through the properties A.B.C.D, and the objects have the following properties:

A: B, A1, A2
B: C, B1, B2, B3
C: D, C1, C2
D: D1, D2, D3

Usually within a certain area/component we don't need the whole object tree, but just a projection/subset of it, e.g. IView { A1, B1, C1, D1 }.
In practice, the problem is that most people will scatter these properties all over their code, leading to maintenance monsters:
- Navigation is all over the place, violating the 'Law of Demeter'.
- Testing gets a lot more difficult: in order to test a user of IView you need to instantiate A, B, C, D; these might have good constructors which force you to instantiate other X objects, etc.
- The amount of code using chained objects/train wrecks will crush future refactorings: if you want to reverse the navigation from B.C to C.B, you'll have to change all the usage points; the same goes for changing a data type.

Possible solutions:
a. Define the projection as required by that component/use case, eg IView { A1, B1, C1, D1 } (properties, or getter/setter, etc.)
- Define a repository service IRepository.get(...) -> IView
- Work against your abstraction IRepository/IView

The IView/IRepository could be built through:
- DTOs mapping (a la dozer & co.)
- wrap & lazy fetch:
View {
    ctor(A a) { _a = a; }
    A1 { _a.A1 }
    B1 { _a.B.B1 }
    C1 { _a.B.C.C1 }
    D1 { _a.B.C.D.D1 }
}
(a pretty slick solution might be built with LINQ expressions)
In this way we have isolated our component from external changes; we depend only on our IRepository/IView, and future refactorings will affect only that area.
The IRepository/IView abstractions represent the membrane to the external world. (Ralf refers to the pattern as 'external adapter', probably referring to Alistair Cockburn's 'Hexagonal Architecture' paper; I prefer Alan Kay's metaphor.)
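
As a rough Java sketch of the wrap & lazy fetch variant: the classes A, B, C, D below are hypothetical stand-ins reduced to one property each, the String property types are an assumption, and Report is an invented consumer that only knows IView.

```java
// Hypothetical chain A -> B -> C -> D, reduced to the fields the view needs.
final class D { final String d1; D(String d1) { this.d1 = d1; } }
final class C { final D d; final String c1; C(D d, String c1) { this.d = d; this.c1 = c1; } }
final class B { final C c; final String b1; B(C c, String b1) { this.c = c; this.b1 = b1; } }
final class A { final B b; final String a1; A(B b, String a1) { this.b = b; this.a1 = a1; } }

// The projection the component actually works against.
interface IView {
    String a1();
    String b1();
    String c1();
    String d1();
}

// Wrap & lazy fetch: the navigation through the chain lives in one place only.
final class View implements IView {
    private final A a;
    View(A a) { this.a = a; }
    public String a1() { return a.a1; }
    public String b1() { return a.b.b1; }
    public String c1() { return a.b.c.c1; }
    public String d1() { return a.b.c.d.d1; }
}

// The membrane: the component asks for views and never sees A, B, C, D.
interface IRepository {
    IView get(String id);
}

// Invented consumer: depends on IView only, so a test can pass in a stub view.
final class Report {
    static String render(IView v) {
        return v.a1() + "/" + v.b1() + "/" + v.c1() + "/" + v.d1();
    }
}
```

If the navigation is later reversed from B.C to C.B, only View has to change; Report and every other IView client stay untouched.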

b. Instead of defining the projection IView we could add the required operations on A. This has the benefit of aggregating/reusing common operations on A (usually a root entity).
In some cases it might lead to the pollution of A: all the properties/operations beneath A will emerge into A, transforming the tree into a list. Class A will become too big, containing its data and n+ operations (C# has region folding, Smalltalk has protocols for organizing big classes).

Final thoughts:
- it is not about DDD/Tell don't Ask vs EJB/DTOs vs XXX, it is about isolation/visibility.
- (Unit) Testing asap helps a lot in identifying architecture/design smells.

Friday, January 09, 2009

toString or not toString

Suppose we have an Id class, which is used as a 'pointer' to some persistent entities.

abstract class Id<T> {
    T _raw;
    Id(T raw) { _raw = raw; }
    @Override public String toString() { return _raw.toString(); }
}

And the Object#toString() documentation says:
/**
* Returns a string representation of the object. In general, the
* #toString method returns a string that
* "textually represents" this object. The result should
* be a concise but informative representation that is easy for a
* person to read.
* It is recommended that all subclasses override this method.
*
* The #toString method for class Object
* returns a string consisting of the name of the class of which the
* object is an instance, the at-sign character `@', and
* the unsigned hexadecimal representation of the hash code of the
* object. In other words, this method returns a string equal to the
* value of:
* getClass().getName() + '@' + Integer.toHexString(hashCode())
**/

What is the problem?

* we heavily use these Ids in a *lot* of places
* id.toString() is used both for Id -> String conversion, as a part of our design and as textual representation for debugging/logging
* two subclasses AID and BID will have the same textual representation, making future debugging/logging more confusing than necessary
* if we enrich the Id with more data (e.g. a timestamp, or entity version, etc.) we can no longer add this extra data to its textual representation (e.g. for logging), since toString() is also used for conversion
** it is very hard to find all the usages of the toString method, since every IDE (and the compiler too) considers this method as belonging to Object and not to the abstraction Id (that's why the @Override is there)
*** if we had worked in a TDD kind-of way, we probably would have started with an interface, IUniqueId, to which we probably would have added a method asString()

concluding advice: when you model your own domain language/design, be careful what you override.
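
A minimal sketch of the alternative hinted at above, assuming an IUniqueId interface with an asString() method as the only conversion used by the design. The AId and BId subclasses and the debug format are hypothetical.

```java
import java.util.Objects;

// Conversion is part of the domain abstraction, so its usages are easy to find.
interface IUniqueId {
    String asString();
}

abstract class Id<T> implements IUniqueId {
    private final T raw;

    protected Id(T raw) { this.raw = Objects.requireNonNull(raw); }

    // Id -> String conversion used by the design; stays stable.
    @Override public final String asString() { return raw.toString(); }

    // Debug/logging representation; free to grow (timestamp, version, ...).
    @Override public String toString() {
        return getClass().getSimpleName() + "(" + raw + ")";
    }
}

// Hypothetical subclasses: same raw value, distinguishable in logs.
final class AId extends Id<Long> { AId(long raw) { super(raw); } }
final class BId extends Id<Long> { BId(long raw) { super(raw); } }
```

With this split, new AId(42).asString() and new BId(42).asString() both yield "42", while their toString() results differ ("AId(42)" vs "BId(42)"), so logging can be enriched without touching the conversion.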