Pentaho first impressions

I've just been to a Pentaho presentation given by Xpand IT.
It was the first opportunity I had to get some answers from someone close to the project and I must say that, in general, it is what I was expecting.

The presentation was a bit technical, but I suppose it could not be otherwise, since Pentaho itself is technical. What I mean is that it does not address one of the most important issues in an ETL project: functional mapping.
Technically, it's all there, but I do feel the transformations could, and should, be easier to define and implement. Point and click is cool when one wishes to sell the product to management, but for real work it slows down the development process.
There are just too many clicks involved, even for the simplest task, and some could be avoided with a GUI revision focused on user productivity. For instance, remembering previous tree filters, instead of making the user retype the filter text every time, would save the end user a lot of time.

Another weak point is the lack of rule mapping management and documentation. There is no clean or fast way to see field mappings. If one wishes to see which fields are mapped or which transformation rules are implemented, one has to manually search all the transformations and click on their graphical representations in order to find them.
That's a lot of clicking when one has hundreds of tables and thousands of transformation rules defined.
And, obviously, this lack of management is reflected in the project documentation, because one cannot access or generate rule mapping reports.
This is a serious issue Pentaho should solve. Rule definition is the core of an ETL process, and it is totally unacceptable that one cannot access it in a simple, fast and clear way.

On the strong side, Pentaho has a lot of connectors and operators already available. It's entirely written in Java, supports scripting languages and it's open source, through the Kettle project. All this means that it can be easily expanded, either through customization of its core or through the development of plugins.
It can work from the command line, which is excellent because it can be included in a shell script, and it has its own J2EE server, also excellent because it provides out-of-the-box integration solutions. For instance, one can write a transformation that starts when a web service is called or when a JMS queue receives a message.
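To illustrate the command-line side: Kettle ships pan.sh for running transformations (and kitchen.sh for jobs), so a nightly load can be driven from cron. A minimal sketch, where the install path and the .ktr file name are assumptions for illustration:

```shell
#!/bin/sh
# Run a Kettle transformation from a plain shell script.
# PDI_HOME and load_customers.ktr are hypothetical; adjust to your install.
PDI_HOME="${PDI_HOME:-/opt/pentaho/data-integration}"
KTR="/etc/etl/load_customers.ktr"

if [ -x "$PDI_HOME/pan.sh" ]; then
    # -file selects the transformation, -level the log verbosity;
    # Pan's exit code reports success or failure, so cron can react to it.
    "$PDI_HOME/pan.sh" -file="$KTR" -level=Basic && echo "transformation finished"
else
    echo "pan.sh not found at $PDI_HOME"
fi
```

Because the exit code is propagated, the same script can be chained with other steps (dumps, notifications) in a larger migration pipeline.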
It comes with some simple fuzzy functions that help to clean data, but don't expect too much of them.
It seems to scale well, mainly through parallelization, but orchestration can only be achieved manually.

In short, Pentaho can still evolve a lot, especially when it comes to the functional part of the ETL. But even technically, it lacks an orchestrator to help with job orchestration.
Currently, as a data migration expert, I'm still not convinced that Pentaho can be used on "hard core ETL" projects where functional mapping management, development time and the data migration time window are critical points.

./M6