Science Charlatanry

Natural Language Processing (NLP) is a scientific area that I'm very interested in. NLP per se is a vague term, since it embraces specific areas such as automatic translation, speech recognition, syntax processing, etc.
Given my interest in this area, I was thrilled when a friend pointed me to a /. discussion about charlatanry in forensic speech science.

Professors Eriksson and Lacerda have shown scientifically that the so-called "lie detectors", based on voice stress analysis and psychological stress evaluation, are really a lie themselves.

The article, published in the International Journal of Speech, Language and the Law in 2007, can currently be found here, and it is giving lots of headaches to both the lie detector vendors and the professors.

I find this discussion most interesting because science must always be serious; unfortunately, sometimes it is not. Sometimes it is just used as a basis to lend credibility to something that is neither credible nor reliable.

We are all used to seeing marketing claims about software that uses "cutting edge" algorithms and techniques that no one knows about, because the software vendor keeps the "cutting edge" stuff all to itself.
I wonder how "cutting edge" some of those algorithms and techniques really are...

./M6

Windows XP SP3 Installation Error

I was installing Service Pack 3 for Windows XP, Portuguese version, and I got this weird message:

Este Service Pack requer a ligação à electricidade de rede antes de iniciar o programa de configuração.

Translated into English, it reads something like "This Service Pack requires a connection of the machine to the electricity of the network before initializing the configuration program".
What is "a connection of the machine to the electricity of the network"? It does not make any sense!
After thinking a bit about this weird message, I finally figured out what it meant: "the machine must be plugged into mains power". I was using a laptop and it was running on battery, so SP3 refused to install.

The Portuguese internationalization team made a serious translation mistake, since the message does not make any sense, any sense at all, in Portuguese.

./M6

i18n RCP applications

I'm developing an RCP application, also known as an Eclipse application, with a couple of friends, and it is internationalized (i18n) for English, Portuguese and, currently unofficially and incompletely, Spanish.

It is currently being developed on Europa (3.3.2), and after fetching the Eclipse i18n packages for Portuguese, I was unable to use the i18n files within the application.
What this means is that the application-specific stuff was internationalized but the RCP core was not. For instance, the menus were in Portuguese, but the welcome page navigation buttons, and all the JFace stuff and the like, were in English.

I've dug around for a while in the official RCP documentation, FAQs, How-Tos, and unofficial sources like newsgroups, forums and blogs. I've made some posts in somewhat official RCP forums/newsgroups, but I haven't got a single answer.

After almost two weeks fighting with this problem, I've finally solved it!
It's actually quite easy and all it requires is a couple of mouse clicks.

Here's what it takes to i18n the core of an RCP application:
  1. Download the language packages from the Babel project using the Eclipse install/update mechanism.
  2. Open the RCP application ".product" file.
  3. Go to the "Configuration" section.
  4. Press the "Add Required Plug-ins" button.
This last step is the "magical" one: it adds the i18n plugins to the RCP application.
You can identify the language plugins by their name. They come in the "<plugin>.nl_<language>" format. For instance, the Portuguese JFace translation plugin is "org.eclipse.jface.nl_pt".
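If you want to double-check which locale the runtime actually resolved, and therefore which language plugins should kick in, here's a tiny sanity check; this is just a sketch, assuming an Eclipse RCP context, with class and method names of my own choosing:
import org.eclipse.core.runtime.Platform;

// Minimal sketch: print the locale the Eclipse runtime resolved.
// If this prints "pt", the *.nl_pt plugins should be picked up,
// provided they were included in the product configuration.
// Note that the -nl launch argument overrides the system default.
public class LocaleCheck {
    public static void printLocale() {
        System.out.println("Runtime locale: " + Platform.getNL());
    }
}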

I sure hope this information helps others and saves them some time.

./M6

R on Kubuntu

I'm using R for my MSc thesis, which is about a new clustering algorithm, and I found out that I was unable to install it via Adept.
It may be my fault, but there are other things I can't seem to find in Adept either.
So, I had to do it "manually". It was easy to find the official information, and it was also easy to install and use.

To get R, just update the repositories and then get r-base:
$ sudo apt-get update
$ sudo apt-get install r-base

I'm not sure that I really need it, but since I'm going to develop a package, I've also installed the development stuff:
$ sudo apt-get install r-base-dev r-recommended

I also wanted an R editor, so I used JGR. In order to use it, I followed the official installation instructions for Linux.
Java is required, so if there is none, Sun's Java should be installed:
$ sudo apt-get install sun-java6-jdk

To install JGR, it's necessary to enable Java support in R:
$ sudo R CMD javareconf

And then install the JGR library:
$ sudo R
> install.packages('JGR')

Once installed, run it:
> library(JGR)
> JGR()

JGR has a small problem, though: it has been installed as root and I seem unable to use it with any other user, so root it will be.

So, I've automated the startup of R with JGR using a simple shell script, named runr.sh:
#!/bin/sh
sudo R -f /home/m6/msc/jgr.r
The jgr.r file contains the JGR start commands:
library(JGR)
JGR()

Now, running ./runr.sh starts R and JGR all in one. It's still necessary to run it as root, though.


There are alternatives to JGR that require a lot less stress. Tinn-R and R Commander are two of them; in fact, Tinn-R is the editor I chose for R on Windows.
For now I'll just stick with JGR, because I like to explore alternatives in order to decide which is best.

./M6

JDeveloper Sucks

During 2008 I've been using Oracle JDeveloper 10.1.3.3 for J2EE, or Java EE as Sun has renamed it, using ADF.
I can only say that JDeveloper sucks... And by the way, ADF sucks too...

JDeveloper deteriorates with usage, meaning that if you use it a lot, as I do on a daily basis, it will start behaving weirdly, like crashing when it's started or refusing to close a file that has been edited: even if the file is no longer accessible through a tab, it is still loaded, since it is reachable through the window list.
Degradation is not uncommon in such IDEs; sometimes Eclipse also deteriorates, especially when it has loads and loads of plugins. But Eclipse can be "restarted": just start it with the -clean option and its stability will come back. JDeveloper does not have such a parameter, and one has to reinstall it (overwriting will not work) and reinstall all the necessary plugins, or "clean" it manually, which is neither easy nor fast to do.
Degradation itself is bad enough, but it's not the only bad feature it has.
Some simple and common functionality is really badly implemented, so badly that it can ruin one's work, as it has already done with mine. A "simple" rename of a variable, through the refactor functionality, will perform a textual search for the variable's name in all the files. The result can be catastrophic, since it can (and will) match that name as a substring of unrelated variable and method names... Yep, its default refactoring will "blindly" perform a textual find-and-replace in all your project's .java files... So be careful and check the preview option before doing it.
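To make the hazard concrete, here's a contrived sketch (the identifiers are made up, not from any real project) of what a textual rename can do:
// A textual rename of "id" also matches "id" inside unrelated names.
class RenameHazard {
    void example() {
        int id = 42;
        int width = 100;      // "width" contains "id"
        boolean valid = true; // "valid" contains "id"
        // Textually renaming id -> key would yield:
        //   int key = 42;
        //   int wkeyth = 100;      // corrupted identifier
        //   boolean valkey = true; // corrupted identifier
    }
}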
The Subversion (SVN) plugin sucks too... It's not really useful. I was expecting to commit/update over a whole project, but that is not possible. One can do such operations over files and some directories, but if your project has more than one module, you need to go to "special" directories/packages and perform the commit/update on each module...
The debugger is also not very good. When one watches a variable or an expression, most of the time it cannot retrieve/evaluate its value.

These are just some examples of how bad JDeveloper is. I knew it was not state-of-the-art, but I was not expecting something this bad...
Here's a hint for Oracle JDeveloper product management: how about using Eclipse RCP as the base of JDeveloper?

./M6

BEA Weblogic Integration

One of my projects in late 2008 and early 2009 involved the integration of several systems, including data transformations, orchestration, DB2 data access, Oracle data access and web service calls.
For the job, BEA's Weblogic Integration 9.2 was the tool selected.
I had never worked with Weblogic Integration (WLI) before, and I had to learn it (all by myself) in order to use it and to teach it to the other team members.

WLI seems pretty easy to use, as any marketing documentation about it will say. What they don't say is that it does not deliver what is promised.

I, obviously, did some tests before starting the real project integration code, and the tests started to reveal some problems that, with the help of local BEA support, I found were just my "fault", since I was still learning WLI.
The proof of concept went well, and then, again with the help of local BEA support, we defined how the system should work. Here it is, grosso modo:
  • There is an extraction process, triggered by a temporal event, that reads DB2 information.
  • The DB2 information will be composed into batches, each having n DB2 rows.
  • The extraction process will call the integration process with a batch as its input.
  • The integration process is triggered by a web service call.
  • The integration process processes each batch row using the WLI transactional mechanism.
  • The transactional mechanism is configured to freeze if an error occurs.
Please note that both the extraction and the integration processes are part of the same WLI application; they are siblings, they do know about each other's existence and they do communicate.
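Just to make the batching idea concrete, here's a rough sketch of the extraction side in plain JDBC; this is not actual WLI code, and the table, column and integration call are all made up:
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Sketch of the extraction process: read DB2 rows and hand them to
// the integration process in batches of BATCH_SIZE rows each.
public class ExtractionSketch {
    private static final int BATCH_SIZE = 100; // the "n" rows per batch

    public static void extract(Connection db2) throws SQLException {
        List<String> batch = new ArrayList<String>();
        Statement st = db2.createStatement();
        ResultSet rs = st.executeQuery("SELECT payload FROM source_table");
        while (rs.next()) {
            batch.add(rs.getString("payload"));
            if (batch.size() == BATCH_SIZE) {
                callIntegration(batch); // in WLI, a web service call
                batch.clear();
            }
        }
        rs.close();
        st.close();
        if (!batch.isEmpty()) {
            callIntegration(batch); // flush the last, partial batch
        }
    }

    private static void callIntegration(List<String> batch) {
        // Placeholder: the real system invoked the integration process here.
    }
}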

I started to develop the processes and soon I was getting errors. I developed workarounds for the errors in order to continue developing the application, but there came a time when I really had to request BEA local support again.
I had so many problems and so many workarounds that the whole process started to become a nightmare, and WLI went from "solution" to "problem".
As I was showing the errors to the BEA local support guy, he was totally surprised and kept saying "that has to work" to each error I was showing him.

Here are some of the major problems I found in WLI:
  • XQuery transformations failed at runtime but went well in the tests performed at design time, even when using the same inputs.
  • Transactions crashed the system. When the extraction process calls the integration process and the latter has a transaction, the system crashes.
  • Web service calls with transformations fail. When calling a web service, the mapping of the source data onto the web service input schema failed: the application created some files, but the transformation editor, with the transformation itself, failed to load.
  • Automatically generated code sometimes breaks.
Some of the problems were reported to BEA support as critical problems.
I would love to post here the solutions for the problems I've found, but unfortunately, there are none... BEA just gave me some workarounds, some different from the ones I had implemented, and that's it. Instead of a solution, I ended up with a "workaround" implementation.
Here are the workarounds for the problems reported:
  • The major XQuery transformation bug: I performed the transformations in Java code instead, while BEA told me to change my schema to fit its product! Yep, the same schema that is shared with several web services... Yep, it seems that BEA's solution was not to correct the problem but to adapt my inputs, with major impacts on other systems, to their bugs.
  • Transactions stopped crashing the system when I used JPDProxy instead of the regular process call. I told this to BEA support and, at the time, I got no other solution from the support team.
  • The failure of web service calls with transformations was really tricky, since there isn't really a pattern to it. Nevertheless, I found out that:
    1. Creating an empty transformation and adding its source fields afterwards sometimes worked well.
    2. Wrapping all the source fields in a single class worked well most of the time (see the sketch after this list).
  • Automatically generated code breaks because no code formatting can be applied to it. This means that any line of code that has been automatically generated must stay on a single line. Yep, if you have a web service call with 10 parameters, which formatted would make 3 lines of code breaking at column 80, you must have it on a single 214-column line.
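Here's roughly what the wrapper workaround from point 2 looked like; this is just a sketch, with made-up field names: the idea is to expose a single composite bean as the transformation source instead of many loose fields:
import java.io.Serializable;
import java.math.BigDecimal;

// Hypothetical wrapper: instead of mapping many loose source fields
// into the web service input schema, expose one composite object.
public class BatchRowWrapper implements Serializable {
    private String customerId;
    private String accountNumber;
    private BigDecimal amount;

    public String getCustomerId() { return customerId; }
    public void setCustomerId(String customerId) { this.customerId = customerId; }

    public String getAccountNumber() { return accountNumber; }
    public void setAccountNumber(String accountNumber) { this.accountNumber = accountNumber; }

    public BigDecimal getAmount() { return amount; }
    public void setAmount(BigDecimal amount) { this.amount = amount; }
}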
I left before BEA responded to all the problems, but my guess is that BEA support will stick to the workarounds the team has found, or will propose their own, but will not solve the problems by releasing a patch that fixes the bugs.

Later, I chatted with a colleague who has used WLI with success, and he was telling me how great it is... Unfortunately, I had such a bad experience with WLI that my professional opinion is: stay away from it.

./M6

How to do nothing and get paid for it

In February 2008 I was working as an IT consultant and my client was the Portuguese tax services area.
When I started there, about 15 months earlier, I was given a desktop with a dual core processor, 1 GB of RAM, a 70 GB hard disk and a lousy CRT.
The computer was not state of the art, but it was OK for the job. Back then I had the usual software kit (MS Office, McAfee, email, etc.) and developed J2EE applications, using Spring, Eclipse, BEA Weblogic and Oracle 9i, running on Windows XP SP2, English version.
I had power user privileges on my computer and executed everything locally, except for the database, which I accessed remotely.

I then needed to work with other things, like BEA NetUI and BEA Workshop Integrator, so I got a 1 GB memory upgrade. This allowed me to continue to be as productive as I was when I first started there.

But suddenly, in December, a "brilliant" mind from the system administration department executed one of the most stupid ideas I have seen in IT: replace all the IT department desktops, which were somewhat like mine, with virtual machines.
So, I started January with a desktop similar to my first computer: 2 GB of RAM, a dual core processor, a 60 GB hard disk and the same lousy CRT, running Windows XP SP2, Portuguese version.
I log into the system, using a domain, let's call it X, and I get a "fantastic" desktop, with a picture of the building where I work, plus:
  • Access to domain X but with no privileges whatsoever on the system; I can't even save a bookmark in IE;
  • MS Office is installed, but I cannot keep files locally on the computer;
  • Access to a shared network drive;
  • McAfee anti-virus running, with a "paranoid" configuration which cannot be changed;
  • MS Outlook for email, which works fine, but I must use the web interface due to a stupid policy;
  • MS Virtual PC with Windows XP SP2, English version.
Since I cannot work in this system, I must start Virtual PC with the XP image provided by the system administration.
After the virtual system boots up, I must log into it using domain X (yes, the same domain), but now I have administrator privileges. Here's what I get from the virtual system:
  • Access to domain X with power user privileges on the system; I can install, uninstall and kill applications;
  • MS Office is not installed and cannot be installed, due to a stupid policy, so I must copy-paste Office files between the real and the virtual systems whenever I need to read a Word document or check an Excel file;
  • McAfee anti-virus running, with a "paranoid" configuration which cannot be changed;
  • 1 GB to 1.5 GB of memory available; the computer has 2 GB, but it is not possible to use much more than 1.5 GB.
Here are my first conclusions:
  1. Yes, I do have completely different privileges on the same domain, depending on whether I'm working on the real or on the virtual system.
  2. Yes, I do have MS Office on the "wrong" system: I am forced to check out files from the version control repository in the virtual system, copy the files to the real system, edit the files in the real system, copy the files back to the virtual system and then check in the files, also in the virtual system.
  3. Yes, I do have two anti-viruses, which I cannot configure to be less intrusive, and which sometimes scan the same files at the same time.
  4. Yes, I do have two operating systems running, but one is actually resting or doing nothing at all.
  5. Yes, I do have 2 GB of memory but cannot use more than 1.5 GB, because the operating system that is doing nothing needs 0.5 GB.
  6. Yes, I do have a standard hard disk but cannot use it, since I cannot save files onto it.

I thought that was stupid enough and that it couldn't possibly get worse. I was completely wrong; the nightmare had not even started...

I had to reinstall everything on the virtual system, and that's when I really understood how bad this architecture is.
When I installed the first application, it took ages... It's not difficult to understand why; a simple explanation is: any disk or memory access must go through the virtual anti-virus, the virtual operating system and the virtual disk, and then it must go through the real anti-virus, the real operating system and the real disk. I do know that not all accesses are this bad, but the overall outcome is bad enough.

I made some measurements, and here are the conclusions: this new system architecture is 4 times slower! Here's an example: a typical BEA Weblogic 8.1.5 installation takes about 5 minutes on a regular computer; it took 15 minutes. But the record goes to BEA Weblogic Integration 9.2: it takes 15 minutes on a regular computer, but it took 60 minutes to install on the virtual system.
For the skeptics: yes, I did use a chronometer for the time measurements.

An installation procedure is done once in a while, so it was annoying, but I could live with that.
But on a daily working basis, I cannot live with such a slow system.
I continued my measurements and concluded that each consultant doing the same job as I do has seen a productivity decrease of 90 minutes per day. Yes, each consultant is one hour and thirty minutes slower than before, per day...
That makes 33 hours lost per month (1.5 hours times 22 working days); against a 40-hour work week, it's almost one week of vacation per month... The client I'm working for has over 40 people doing the same job as I do...
And we're not cheap...

When I showed these numbers to my superior, I asked him if I could use my laptop and go home 1h30m earlier each day... He and I laughed. :)
When I asked him the killer question of why our desktops were migrated to such a stupid architecture, he could not tell, and he has also not received any answers from the client.
I can point out a lot more reasons not to adopt such a stupid architecture, but I have not found a single reason for doing so...

The client has shot itself in the foot: it should give us the best environment possible so that we can be high performers, but actually it is slowing us down and paying us for it...

So, if you want to do nothing and get paid for doing it, just find a client which is at least as stupid as this one. Find a killer combination like the one I have found: stupid system administrators and a dumb organization that cannot even fight back, even when people from the organization complain...


This is definitely one of the stupidest ideas I've ever seen from system administration (and believe me, I've seen a few)... But then again, it may be just me; after all, I do have a bad temper...
But the story of the virtual machines continues...


I have complained to my manager about the 90-minute performance decrease per day per developer, and I have also reported it to the client in the monthly report.

One Thursday we had a meeting with the client, and one of the topics was the virtual machines' performance.
My client had been warned about the 90-minute performance decrease (my manager had done his job right) and wanted to know what was going on.
She started by performing a simple calculation, multiplying the 90 minutes by all the resources she has developing on the virtual machines and by 22 working days, and came up with 1600 hours (at 33 hours per consultant per month, that's roughly 48 consultants).
1600 hours per month that she's paying us for doing nothing! She freaked out.
I explained to her that the system is running two operating systems and two anti-viruses, that we can only use 75% of the available RAM, that the processor is never 100% available and that any disk I/O operation is extremely slow.
The guy responsible for that 1600-hour loss had no words to justify why she was paying for 1600 hours of inactivity per month.

After some discussion, the guy came up with the following solution: double the RAM of all the machines. This solution is totally in line with the "brilliant" idea of the virtual machines implementation: it is totally stupid!
Instead of admitting that the virtual machines did not work and reverting the process, he decided to buy more RAM for each machine. This will increase the client's costs, and even with 4 GB of RAM, he admitted that:

1. The developers will not be able to use the full capabilities of the machine.
2. The anti-virus will still be intrusive and will still decrease the developers' performance, since nothing else will change.

Lesson learned: if you make a mistake, do not fix it; just make slight variations of the mistake and eventually everybody will drop the subject...

./M6

Welcome to M6 on Software

Welcome. :)

I've started this blog mainly for knowledge sharing. As a software developer, I sometimes have information that may be useful to others as well. Therefore, I will share information mainly about development, covering programming topics such as Java and Eclipse RCP, but also the software business and any software-related topic I find interesting enough to write about.

If the name of this blog sounds familiar, that's because I've "copy-pasted" it from Joel on Software, one of my first, and favourite, software development blogs (all the good blog names were already taken :D).
You may also be familiar with The Business of Software, from Eric.Weblog(), which covers software development from the business point of view, a very interesting topic to me as well.
If you're not familiar with either of the blogs above, I encourage you to take a peek and read them; there's a lot of interesting stuff there.

I hope that you like what you read and find the information useful.
./M6