I've just been to a Pentaho presentation given by Xpand IT.
It was the first opportunity I had to get some answers from someone close to the project and I must say that, in general, it was what I was expecting.
The presentation was a bit technical, but I think it could not be otherwise since Pentaho is technical. What I mean is that it does not address one of the most important issues on an ETL project: functional mapping.
Technically, it's all there, but I do feel the transformations could, and should, be easier to define and implement. Point and click is cool when one wishes to sell the product to management, but for real work it slows down the development process.
There are just too many clicks involved even for the simplest task. Some could be avoided with a GUI revision focused on user productivity. For instance, remembering the user's previous tree filters, instead of making them type the word every time they wish to filter something, would save the end user a lot of time.
Another weak point is the lack of rule mapping management and documentation. It has no clean or fast way to see field mappings. If one wishes to see which fields are mapped or what transformation rules are implemented, one has to manually search all the transformations and click on each one's graphical representation in order to find them.
That's a lot of clicks when one has hundreds of tables and thousands of transformation rules defined.
And, obviously, this lack of management reflects on the project documentation, because one can neither access the rule mappings directly nor generate rule mapping reports.
This is a serious issue Pentaho should solve. Rule definition is the core of an ETL process and it is totally unacceptable that one cannot access it in a simple, fast and clear way.
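Since Kettle saves each transformation as an XML file, a rough mapping report can at least be pulled out with a script. The sketch below is only an illustration: it assumes every step is a `<step>` element with a `<name>` child, and that renamed fields sit under `<fields>/<field>` with `<name>` and `<rename>` children. The function name and the sample transformation are made up for the example, and real files may need more cases.

```python
import xml.etree.ElementTree as ET


def list_field_renames(ktr_xml):
    """Return (step name, source field, target field) tuples.

    Assumed layout: <step><name/><fields><field><name/><rename/>...
    """
    root = ET.fromstring(ktr_xml)
    rows = []
    for step in root.iter("step"):
        step_name = step.findtext("name", default="")
        for field in step.iterfind("fields/field"):
            source = field.findtext("name", default="")
            # A missing or empty <rename> means the field keeps its name.
            target = field.findtext("rename") or source
            rows.append((step_name, source, target))
    return rows


# Hypothetical transformation, inlined so the example is self-contained.
SAMPLE = """<transformation>
  <step>
    <name>Rename customer fields</name>
    <type>SelectValues</type>
    <fields>
      <field><name>cust_id</name><rename>customer_id</rename></field>
      <field><name>email</name><rename/></field>
    </fields>
  </step>
</transformation>"""

if __name__ == "__main__":
    for step_name, source, target in list_field_renames(SAMPLE):
        print(f"{step_name}: {source} -> {target}")
```

Run over a whole directory of transformation files, something like this would give a crude field mapping list, but it is a workaround, not a replacement for proper mapping management in the tool.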
On the strong side, Pentaho has a lot of connectors and operators already available. It's entirely written in Java, supports scripting languages and it's open source, through the Kettle project. All this means that it can be easily extended, either through customization of its core or through the development of plugins.
It can work from the command line, which is excellent because it can be included in a shell script, and it has its own J2EE server, also excellent because it provides out of the box integration solutions. For instance, one can write a transformation that starts when a web service is called or a JMS queue receives a message.
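As an illustration of the command line side, here is a minimal sketch that launches a transformation through Kettle's Pan launcher (`pan.sh`). The paths are hypothetical, the helper names are mine, and I'm assuming the `-file=` and `-level=` options and a zero exit code on success. The same command can be placed directly in a shell script.

```python
import subprocess


def build_pan_command(pan_script, ktr_path, log_level="Basic"):
    """Build the argument list for one transformation run."""
    return [pan_script, f"-file={ktr_path}", f"-level={log_level}"]


def run_transformation(pan_script, ktr_path):
    """Run the transformation; True when the launcher exits with code 0."""
    result = subprocess.run(build_pan_command(pan_script, ktr_path))
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical install location and transformation file.
    print(" ".join(build_pan_command("/opt/kettle/pan.sh", "/etl/load_customers.ktr")))
```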
It comes with some simple fuzzy functions that help to clean data, but don't expect too much out of them.
It seems to scale well, mainly through parallelization, but orchestration can only be achieved manually.
In short, Pentaho can evolve a lot, especially when it comes to the functional part of the ETL. But even technically, it lacks an orchestrator to coordinate jobs.
Currently, as a data migration expert, I'm still not convinced that Pentaho can be used on "hard core ETL" projects where the functional mapping management, the development time and the data migration time window are critical points.
./M6
Drupal makes people's lives harder.
Out of the box Drupal is not very good.
Actually, it is quite oversimplistic...
When I see a piece of software, I instantly assume that it will be helpful to some people, but Drupal's philosophy is quite the opposite: it deliberately makes people's lives harder.
Most people who install a Drupal site will not get a solution, they'll just get more problems to solve.
Drupal claims to be, and I quote, "an open source content management", but after it has been installed, it's totally useless since no one can really manage any content except text. And let's be fair, in the 21st century, supporting only text is extremely limited.
So, in order to support other types of content, one has to install plugins, the same plugins that should already come with the Drupal core system.
Another mistake Drupal makes is its lack of support for common SEO stuff. In such a competitive world, it's a total fail not to support meta tags out of the box. Again, another plugin is required.
Only after a considerable set of plugins has been installed, activated and configured does Drupal start to be useful.
I consider such a philosophy a real mistake because Drupal misses the whole point of software, which is making people's lives easier. With such a plug-in world, Drupal makes people's lives harder.
Sometimes it actually states "go and use Joomla!".
./M6