New Adventures in Software


Commons Lang - Why?

Posted in Java by Dan on June 29th, 2007


A lot of useful software has come out of the Apache Foundation and its Jakarta project, but I’ve never been a fan of the Commons libraries. Aside from the questionable premise of Commons Logging, it was a disappointment with Commons Math that inspired the domain name for this site. It’s fair to say that Sun’s enhancements over the years to the standard libraries and Java language (e.g. generics, auto-boxing and enums) have obsoleted many of the Commons classes.

Today, at work, I discovered that some of the code in my current project depends on Commons Lang. This led me to investigate what exactly this library provides. I’ve been looking through the docs, through the source code and scratching my head in puzzlement at the utter wrong-headed pointlessness of much of this beautifully-commented folly. The stuff that’s not mind-bogglingly trivial is suspect in many ways.

The builder package is of dubious utility. The major IDEs will auto-generate efficient implementations of equals, hashCode and toString. The Commons solution for equals and hashCode means writing more code with additional runtime overhead. The security-violating, reflection-based option for implementing the equals method is especially daft.

The BooleanUtils class provides the following brilliantly redundant quartet of static utility methods: isTrue, isFalse, isNotTrue and isNotFalse. We’re just missing isEitherTrueOrFalse and isBothTrueAndFalse.
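For reference, here is a rough sketch of what the four checks boil down to using nothing but the JDK. It assumes the methods are null-safe tests on a Boolean wrapper, which is how I read the Javadoc; the class name BooleanChecks is a hypothetical stand-in, not the Commons Lang source.

```java
final class BooleanChecks {
    // Each check is a single expression; a null argument is neither true nor false.
    static boolean isTrue(Boolean value) {
        return Boolean.TRUE.equals(value);
    }

    static boolean isNotTrue(Boolean value) {
        return !Boolean.TRUE.equals(value);
    }

    static boolean isFalse(Boolean value) {
        return Boolean.FALSE.equals(value);
    }

    static boolean isNotFalse(Boolean value) {
        return !Boolean.FALSE.equals(value);
    }

    public static void main(String[] args) {
        System.out.println(isTrue(null));     // false
        System.out.println(isNotTrue(null));  // true
        System.out.println(isFalse(null));    // false
        System.out.println(isNotFalse(null)); // true
    }
}
```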

These criticisms are about things that are mostly harmless. The Commons Lang treatment of random numbers is less benign. A utility class provides static versions of most (but not all) of the useful methods from java.util.Random. These methods delegate to the very special Commons Lang RNG.

The Commons Lang RNG (JVMRandom) is remarkable in that it proves that the concept of “adding value” can be applied even when value is a negative quantity. JVMRandom is a subclass of java.util.Random. Each of the interesting methods is overridden. Some are overridden to throw UnsupportedOperationException (and therefore prevent access to potentially useful functionality). Those that do still do something are overridden to generate all random values via Math.random() (Roedy Green has an explanation of why this isn’t sensible). The random boolean method is re-implemented to be (very slightly) biased in favour of false. In short, JVMRandom extends the imperfect but functional default RNG and makes it do less, more slowly and less correctly.
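To make the contrast concrete, here is a minimal sketch of the two styles of generating a bounded random integer. This is not the JVMRandom source; the class and method names are hypothetical and exist only for illustration.

```java
import java.util.Random;

final class RandomRangeDemo {
    private static final Random RNG = new Random();

    // Scales a random double into an int range: the Math.random() style.
    static int viaMathRandom(int bound) {
        return (int) (Math.random() * bound);
    }

    // Asks java.util.Random directly for an int in [0, bound).
    static int viaNextInt(int bound) {
        return RNG.nextInt(bound);
    }

    public static void main(String[] args) {
        System.out.println(viaMathRandom(6));
        System.out.println(viaNextInt(6));
    }
}
```

Both return a value between 0 and bound - 1. The second asks the generator for an integer directly rather than going through an intermediate double.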

BigDecimal Gotchas and the need for Overloaded Operators

Posted in Java by Dan on June 25th, 2007


…or why Java sucks for arbitrary-precision arithmetic.

For many applications floating-point binary is bad. But BigDecimal isn’t much fun either…

I came across this discussion yesterday on the evilness of BigDecimal’s double constructor. It’s an important point to be aware of when using the BigDecimal class. In choosing to use BigDecimal instead of double, you probably wanted an exact value but, if you use that constructor, you’re unlikely to get it. I’d quite like an IDEA inspection to warn me if I inadvertently invoke that constructor with a floating-point literal.
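A small sketch of the problem, comparing three ways of constructing what looks like 0.1. The class and method names are hypothetical; the BigDecimal calls are standard JDK.

```java
import java.math.BigDecimal;

final class DoubleConstructorDemo {
    // Captures the exact binary value of the double literal, which is not 0.1.
    static BigDecimal fromDouble() {
        return new BigDecimal(0.1);
    }

    // Parses the decimal text, so the value is exactly 0.1.
    static BigDecimal fromString() {
        return new BigDecimal("0.1");
    }

    // Goes via Double.toString, so it also yields exactly 0.1.
    static BigDecimal fromValueOf() {
        return BigDecimal.valueOf(0.1);
    }

    public static void main(String[] args) {
        // prints 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(fromDouble());
        System.out.println(fromString());  // prints 0.1
        System.out.println(fromValueOf()); // prints 0.1
    }
}
```

If you are starting from a literal, the String constructor is the one you almost certainly wanted.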

Unfortunately, that issue is not the only thing that you have to keep in mind when using BigDecimal. Unless you’ve read the Javadocs diligently, you may be surprised that the equals method does not consider 2.0 to be equivalent to 2.00. This is because it takes into account the scale as well as the value. If you want to compare values irrespective of scale, you need to use the compareTo method:

if (firstValue.compareTo(secondValue) == 0) 
{
     // Do something...
}

Not particularly readable. An idle glance may wrongly interpret this as a check for zero.
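One way to ease the readability problem is to hide the comparison behind a well-named method. The sketch below shows the equals/compareTo difference and a hypothetical helper, sameValue, which is my own invention and not part of the JDK.

```java
import java.math.BigDecimal;

final class ScaleComparisonDemo {
    // Hypothetical helper: states the intent that "compareTo(...) == 0" obscures.
    static boolean sameValue(BigDecimal first, BigDecimal second) {
        return first.compareTo(second) == 0;
    }

    public static void main(String[] args) {
        BigDecimal firstValue = new BigDecimal("2.0");
        BigDecimal secondValue = new BigDecimal("2.00");

        System.out.println(firstValue.equals(secondValue));    // false: scales differ
        System.out.println(sameValue(firstValue, secondValue)); // true: same numeric value
    }
}
```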

Of course, this means that the implementation of Comparable is inconsistent with equals. This has unpleasant consequences:

It is strongly recommended (though not required) that natural orderings be consistent with equals. This is so because sorted sets (and sorted maps) without explicit comparators behave “strangely” when they are used with elements (or keys) whose natural ordering is inconsistent with equals. In particular, such a sorted set (or sorted map) violates the general contract for set (or map), which is defined in terms of the equals method.
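Here is a short sketch of the “strange” behaviour that passage describes: the same two values added to a hash-based set and to a sorted set. The class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

final class SetConsistencyDemo {
    // Adds 2.0 and 2.00 to the given set and reports how many elements it kept.
    static int sizeAfterAdding(Set<BigDecimal> set) {
        set.add(new BigDecimal("2.0"));
        set.add(new BigDecimal("2.00"));
        return set.size();
    }

    public static void main(String[] args) {
        // HashSet relies on equals/hashCode, so the two values are distinct.
        System.out.println(sizeAfterAdding(new HashSet<BigDecimal>())); // 2
        // TreeSet relies on compareTo, so the second value is a duplicate.
        System.out.println(sizeAfterAdding(new TreeSet<BigDecimal>())); // 1
    }
}
```

Swapping one Set implementation for another changes the contents, which is not something you would normally expect.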

On top of all this, the BigDecimal class is not very user-friendly. Consider the equation a = b - c * d. Using doubles this would look something like this:

double a = b - c * d;

That’s pretty similar to the mathematical representation. The Java programming language doesn’t provide operators for BigDecimals, so we have to use instance methods. Converting the above to use BigDecimals might look something like this:

BigDecimal a = b.subtract(c).multiply(d);

Not only is this verbose, it’s also wrong. The rules of precedence have changed. With chained method calls like this, evaluation is strictly left-to-right. Instead of subtracting the product of c and d from b, we are multiplying the difference between b and c by d. We would have to rewrite it to be equivalent to the double example:

BigDecimal a = b.subtract(c.multiply(d));

This does not present an insurmountable mental challenge, but it does increase the cognitive load slightly by virtue of being different to what we are used to. We could split the computation into separate stages with intermediate variables. This might make the behaviour clearer but it’s not exactly concise.
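To put numbers on the precedence trap, here is a sketch of the chained form, the nested form and the intermediate-variable form side by side. The class and method names are hypothetical; the values b = 10, c = 2 and d = 3 are just an example.

```java
import java.math.BigDecimal;

final class PrecedenceDemo {
    // Evaluated left to right: (b - c) * d, which is NOT b - c * d.
    static BigDecimal chained(BigDecimal b, BigDecimal c, BigDecimal d) {
        return b.subtract(c).multiply(d);
    }

    // Nesting restores the usual precedence: b - (c * d).
    static BigDecimal nested(BigDecimal b, BigDecimal c, BigDecimal d) {
        return b.subtract(c.multiply(d));
    }

    // Same result as nested, spelled out with an intermediate variable.
    static BigDecimal staged(BigDecimal b, BigDecimal c, BigDecimal d) {
        BigDecimal product = c.multiply(d);
        return b.subtract(product);
    }

    public static void main(String[] args) {
        BigDecimal b = new BigDecimal("10");
        BigDecimal c = new BigDecimal("2");
        BigDecimal d = new BigDecimal("3");
        System.out.println(chained(b, c, d)); // 24
        System.out.println(nested(b, c, d));  // 4
        System.out.println(staged(b, c, d));  // 4
    }
}
```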

The final point to make about the BigDecimal class is that it is immutable and, as such, each of the “operator” methods returns a new instance. Compare the add method of a Calendar or a List with the add method of a BigDecimal. One modifies its target, the other returns an entirely new object. This ought not be a problem (immutability has many advantages) but a common error is to forget to use the result of the method. IDEA has an inspection for this. Alternatively, you can use FindBugs. In the absence of this type of tool, these are the kind of issues that will be picked up by your extensive unit test suite.
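The forgotten-result mistake looks something like the sketch below. It compiles without complaint, which is why it is so easy to make. The class and method names are hypothetical.

```java
import java.math.BigDecimal;

final class ImmutabilityDemo {
    // Bug: add() returns a new object and the result is thrown away.
    static BigDecimal forgottenResult(BigDecimal total, BigDecimal amount) {
        total.add(amount);
        return total; // still the original value
    }

    // Correct: keep the instance that add() returns.
    static BigDecimal usedResult(BigDecimal total, BigDecimal amount) {
        total = total.add(amount);
        return total;
    }

    public static void main(String[] args) {
        BigDecimal total = new BigDecimal("10");
        BigDecimal amount = new BigDecimal("5");
        System.out.println(forgottenResult(total, amount)); // 10
        System.out.println(usedResult(total, amount));      // 15
    }
}
```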

Conclusions

There is not much we can do about the first two problems. The double constructor won’t be deprecated since it does serve a purpose when performing type conversions from primitive variables. Also, it seems reasonable that BigDecimal should provide a method to check equality with respect to both value and scale. Whether that method should have been the equals method is a judgement call that is not going to be changed now.

The other issues, those of readability, verbosity and bug-prone patterns of usage, can all be improved by overloading the common arithmetic operators (+, -, *, /, %, +=, -=, *=, /=, %=, -- and ++).

Overloaded operators for BigDecimals are apparently one of the language enhancements being considered for Java 7. Whether this would cover the comparison operators is not clear (I cannot find any definitive information on the proposal). The less-than, greater-than, less-than-or-equals and greater-than-or-equals operators would be straightforward. Overloading the equality operator (==) would be more problematic. Existing code may break. Also, would it be consistent with the equals method or the compareTo method?

What’s happening in Java 7?

Posted in Java by Dan on June 16th, 2007


Alex Miller has put together this very useful page that aggregates the relevant information about changes planned or proposed for Java 7. The page includes dozens of links to the various JSRs and to articles discussing the new features.

Scholarpedia: Wikipedia with better standards?

Posted in Evolutionary Computation, Software Development by Dan on June 16th, 2007


I’ve just stumbled upon Scholarpedia, a MediaWiki-based encyclopedia. The key difference between it and Wikipedia is its focus on peer-review of content. Of course, this immediately means that it has substantially less content than Wikipedia but less is more, right? Contributors are nominated and voted in by the public based on their reputation in their area of expertise, most being notable academics.

At present Scholarpedia seems to have quite a narrow focus (most current articles are about various kinds of adaptive systems in computer science). It will be interesting to see how the project progresses. Perhaps the most promising aspect is the quality of authors who have apparently signed up to write various sections. For example, if you could ask anybody to explain Genetic Algorithms, it would probably be John Holland. And who better to write about Hopfield Networks than John Hopfield himself?