Watchmaker Framework at JavaOne – Using Java and Genetic Algorithms to Beat the Market
Those of you who will be attending JavaOne this year (2nd – 6th October in San Francisco) might be interested in Matthew Ring’s BoF session on genetic algorithms in Java. Matthew tells me that he will be presenting his software that uses my Watchmaker Framework for Evolutionary Computation as part of his session Using Java and Genetic Algorithms to Beat the Market. Here’s the abstract:
Predicting the future is dangerous business. Is Java up to the task? This session discusses and demonstrates a fun and understandable stock-picking system built with Java 6 and various Java-based open source projects. The discussion ranges from machine learning concepts (specifically genetic algorithms) and core Java development with specific code examples to trading system back-testing. Attendees are sure to come away excited, ready to tackle that next hard big-money problem, and confident that Java is a capable partner.
ReportNG 1.1.3
I’ve just pushed a new version of ReportNG (an HTML reporting plug-in for TestNG) to GitHub. This is a minor enhancement release that incorporates a few patches submitted by users. Thanks to Jeff Weiss, ReportNG should now load custom stylesheets from the classpath as well as the file system and it should have less broken dependencies if you are using it with Maven. By default ReportNG will now also not generate the annoying Velocity log file (if you want the log file you can re-enable it via a system property). This change is thanks to C. Baldwin.
Using ReportNG with Maven
I’m not a Maven user, so I’ve never made any effort to use ReportNG from Maven. Maven users were left to figure it out for themselves. Well, Marcin Zajączkowski has figured it out for himself and has documented what’s involved.
Uncommons Maths 1.2.2
Yesterday I moved Uncommons Maths (random number generators, probability distributions, combinatorics and a few other numbery type things) from Java.net to GitHub. My other projects have been on GitHub for a while but this one hadn’t been moved mostly because it hasn’t seen any recent activity. I also took this opportunity to push out another release incorporating a couple minor fixes (serialisation and thread safety issues) that have been sitting in the Subversion repository for almost a year.
Download version 1.2.2, or get it from Java.net’s Maven repository.
Android LVL Obfuscation Pitfalls
The recently-introduced Android License Verification Library (LVL) survived a disappointingly short time in the wild before somebody figured out how to circumvent its protection. Google responded to this development by stating that it was the lack of obfuscation that had made the crack so straightforward and then followed up with further advice on achieving security through obscurity, effectively acknowledging that there is no robust way of doing licence verification that can’t be bypassed.
Unfortunately, the LVL is pretty much the only option for protecting apps sold via the Android Market (other app stores have their own protection mechanisms). The choice for Android developers is mediocre protection or no protection at all (Google apparently intends to withdraw its previous copy protection mechanism). You can thwart the most casual Android pirates and make things harder for the determined but that’s about as good as it gets.
Obfuscating LVL-protected Apps with Proguard
Assuming that you choose to add the LVL to your app, you’ll no doubt want to follow Google’s advice and obfuscate your code, probably using Proguard. If you do, you might hit a couple of problems that prevent the LVL from working properly.
I added the LVL source to my application and built it as a single entity but it is also possible to incorporate the LVL as a library project.
Disable Aggressive Overloading
The first problem I encountered was a limitation in the Dalvik VM. Unlike the JVM, it does not (did not?) permit static fields of different types to have the same name. In normal Java development this situation never arises but Proguard has an option to aggressively overload names to make decompilation more difficult. For Android development this option should not be used otherwise you will get VerifyErrors.
Avoid Renaming the ILicensingService Intent
The checkAccess method of the LVL’s LicenseChecker class binds to the licensing service as follows:
boolean bindResult = mContext.bindService( new Intent(ILicensingService.class.getName()), this, // ServiceConnection. Context.BIND_AUTO_CREATE);
The problem with calling ILicensingService.class.getName() is that, after obfuscation, the class has a different name and therefore the wrong intent is created. This will show up in the log as an error with the message “Could not bind to service” and the licence check will always fail.
You could fix this by modifying the Proguard configuration to avoid renaming the ILicensingService interface or, even more straightforwardly, you could just modify the code in LicenseChecker and hard-code the appropriate action:
boolean bindResult = mContext.bindService( new Intent("com.android.vending.licensing.ILicensingService", this, // ServiceConnection. Context.BIND_AUTO_CREATE);
ReportNG 1.1.2
ReportNG version 1.1.2 is available for download. ReportNG is an HTML/XML reporting plugin for TestNG.
There are only two small changes in this release. The first is a fix for an avoidable NullPointerException if the target directory does not exist. The second updates ReportNG to work with this change in TestNG 5.13.1. Thanks to Nalin for providing a patch for this issue.
Android Adventures
Late to get involved as ever, I’ve been developing with Android for the last couple of weeks. On the whole it has been a positive experience (thanks to Richard for pointing me in the right direction with my queries).
My first impression is “what the hell were Sun doing for the last decade?” Android shows what J2ME/JavaME could have been. A proper Java environment for mobile devices (the fact that it uses Dalvik is largely inconsequential). J2ME may have first arrived when mobile phones were far less capable than they are today but, like Java in general in the last few years, it never really progressed. Worse still, with its numerous device profiles and JSRs it betrayed the “Write Once, Run Anywhere” philosophy that is so fundamental to Java. With J2ME it’s more like “right once, wrong somewhere else”. You don’t really write Java at all, you write some horribly restricted, out-dated subset of it. That’s not to say that J2ME was not a success of sorts despite its numerous short-comings. I don’t have the figures to hand but I believe it’s true to say there are vastly more J2ME-enabled devices in circulation than iOS and Android devices combined.
Pain-Free Mobile Development
Android is a complete reboot of Java for mobile devices. No longer do you have to worry about which devices support floating-point arithmetic or how to do anything useful with an anaemic UI toolkit. No more making do with Vectors and without generics, or wondering what happened to half the useful classes and methods that you take for granted with JavaSE.
Of course, nothing’s perfect. The API documentation, while generally pretty good, is not quite up to Sun’s high standard, the UI toolkit is either slightly buggy or slightly unintuitive (probably both), and a couple of things I wanted to achieve were less straightforward than I might have hoped. These however are minor gripes. The resources framework seems to do a pretty good job of dealing with the differences in platform versions and screen sizes. As Sun no doubt learned from J2ME, there is a real risk of fragmentation when different manufacturers want to deliver different capabilities. So far it seems under control.
The documentation favours Eclipse throughout but fortunately it’s pretty painless to develop Android code without it. The Ant tools are fine and, if necessary, it seems flexible enough that I could move things around to employ more sophisticated builds. I’m not sure how easy it is to use TestNG in preference to JUnit when using the Android tools. At the very least it would be possible to just bypass the Android test project stuff and test your Java classes the way you normally would. Given that Cedric was heavily involved in Android it’s mildly surprising that TestNG support is not included out-of-the-box.
Android Market
Yesterday, after paying the $25 registration fee, I pushed my first two (pretty basic) apps to the Android Market. This is an area where Google has room for improvement. It’s all a bit simple at the moment. As I’m in the UK my prices are set in pounds. There is no way to specify different prices for different currencies, nor does the market automatically convert prices for users. That means that, as a user, you are presented with application prices in at least four different currencies (US dollars, Euros, Sterling and Yen) and have to do the necessary conversions yourself. I would like to be able to specify that if my application costs £1.99 in the UK, then it should be $2.99 in the US and €2.49 in the Eurozone.
The other thing you can’t do is respond to user comments. Somebody left a comment that my app did not work properly for them. Unfortunately they provided no details to enable me to track down the problem. Worse, they wrongly assumed that the functionality was simply missing and stated as much. As the only comment on the app so far, that will likely dissuade other people from buying.
The good thing about the Market is that you have the freedom to upload pretty much whatever you want and to have it immediately available to users. Unlike the iPhone, you are not subject to the whims of an inconsistently applied approval process.
My First Apps
So what did I make? My first app is a simple utility for performing a multi-stage fitness test (a.k.a “Beep Test” or “Bleep Test”). It’s a one-button UI that emits the necessary beeps at the appropriate intervals and displays status information such as current speed and cumulative distance.
My second app is an educational game called Flagpole. All you have to do is identify flags. You get one point for each correct answer and it’s game over when you get one wrong. It’s divided into challenges of increasing difficulty, so you might start with South America, for which there are only 14 flags to identify. Eventually you’d progress to “The Whole World”, which would require you to correctly identify 232 flags without making any mistakes.
I intend to release further, more sophisticated applications in the coming weeks. These apps are sold via Rectangular Software, which is the new home for my commercial software development activities (contract/freelance development and now applications). My open source projects remain here at uncommons.org.
Moving Projects from Java.net to GitHub
How to move your project from Subversion on Java.net to Git on GitHub without losing the change history.
Cloning the Subversion Repository
The normal way to duplicate a Subversion repository with full history is to use the svnadmin dump and load commands. Unfortunately most SVN hosting services, including Java.net, do not provide access to svnadmin commands or direct access to the file system.
Fortunately there is another way to clone a repository, complete with all its history, that requires only read access to the repository: svnsync.
The first step is to create a local SVN repository into which you will mirror the remote Java.net repository.
svnadmin create myprojectBefore cloning the contents it is important that you add a pre-revprop-change hook to your new local repository. This is a script that must complete successfully (exit code 0) before any changes to revision properties are accepted. The easiest way to do this to add an empty script and make it executable.
echo '#!/bin/sh' > myproject/hooks/pre-revprop-change chmod +x myproject/hooks/pre-revprop/change
Bearing in mind that we want to preserve tags and branches too, not just the trunk, we can then mirror the top-level of the remote repository.
svnsync init file:///pathto/myproject https://myproject.dev.java.net/svn/myproject svnsync sync file:///pathto/myproject
This may take a while if the repository is large and/or your connection is slow.
Stripping Java.net Web Content from the Repository
Java.net uses the project SVN repository to manage the project website, with the files stored under trunk/www. When migrating from Java.net you probably don’t want to continue with this approach. If that’s the case then you’ll want to remove all traces of the www directory from the repository. The usual way of deleting a file – removing it from the working copy and then committing – will not purge its history so instead we use the svndumpfilter command.
First dump the mirrored repository:
svnadmin dump myproject > dump
We can then remove all traces of the www directory. Any commits that only touched files under www are now empty and are dropped completely. All remaining revisions are renumbered to avoid gaps in the sequence.
svndumpfilter --drop-empty-revs --renumber-revs exclude trunk/www < dump > filtered
The final step is to restore the filtered dump in place of our local repository.
rm -rf myproject svnadmin create myproject svnadmin load myproject < filtered
Migrating the Repository from SVN to Git
Now that the local SVN repository contains only what we wish to keep, we are ready to migrate it to Git. To achieve this I follow Paul Dowman’s instructions.
The first step is to import the SVN repository into a new Git repository.
git svn clone file:///pathto/myproject --no-metadata -A authors.txt -t tags -b branches -T trunk myproject-git
The authors.txt file maps SVN users to Git users. Your Java.net repository may have commits attributed to the users root and httpd. You should probably just map these to your own user name. There should be one entry for each committer:
root = Your Name <you@example.com> httpd = Your Name <you@example.com> you = Your Name <you@example.com> other = Someone Else <other@example.com>
Refer to Paul’s full instructions if you have branches and tags to maintain.
Pushing to GitHub
Create a new repository on GitHub and then add this as a remote for your local Git repository.
git remote add origin git@github.com:username/myproject.git
And then push:
git push origin master --tagsJob done.
ReportNG 1.1 – i18n, Gradle fix, chronological ordering
Over the last 6 weeks or so I seem to have taken an unintentional extended break from programming (and from posting here). It’s time to get back into the swing of things and top of my TODO list was putting the finishing touches to ReportNG 1.1 (ReportNG being an alternative HTML reporting plug-in for TestNG).
This new release fixes a problem people have been having using ReportNG with Gradle. It also adds internationalisation support. So now, as well as the default English text, there is also support for French and Portuguese. The Portuguese translation was contributed by Felipe Knorr Kuhn. The French text is just something I added as a proof of concept. It is likely to be offensively bad to a native French speaker and I welcome any corrections. I’d also appreciate any translations for other languages (just open an issue in the issue tracker and attach your translated version of this file).
The other major change is the addition of the “Chronology” page. This is something that exists in the default TestNG reports that I originally decided to leave out of ReportNG. I didn’t really have a use for it but several people have asked for something similar so I have added it.
Practical Evolutionary Computation: Island Models
Some time ago, I promised more details about the support for island models that was added to the Watchmaker Framework for Evolutionary Computation in version 0.7.0. I’ve finally completed a first draft of some documentation for this feature, attempting to cover both the motivation for this approach to evolutionary algorithms and the practicalities of implementing this idea in your own programs that use Watchmaker. If anything is unclear or wrong please leave suggestions for improvements in the comments.
Island Models
In the natural world, populations of organisms might be separated by geography. Left to evolve in isolation over millions of years, vastly different species will occur in different locations. Consider Australia, an island continent protected by its seas. With little opportunity for outside organisms to interfere, and few opportunities for its land-based organisms to migrate to other land masses, Australian wildlife evolved to be distinctly different from that of other continents and countries. The majority of Australia’s plant and animal species, including 84% of its mammals, are endemic. They occur nowhere else in the world.
Australia is not the only island to exhibit such levels of endemism. It was a visit to the Galápagos Islands in 1835 that started Charles Darwin on the path to formulating his theory of evolution. Darwin noticed the pronounced differences between the species of mocking birds and tortoises present on the different islands of the archipelago and began to speculate on how such variations might have occurred.
In the world of evolutionary computation we can mimic this idea of having multiple isolated populations evolving in parallel. Having additional populations would increase the likelihood of finding a solution that is close to the global optimum. However, it is not just a question of having a larger global population. A system of 10 islands each with a population of 50 individuals is not equivalent to a single island with a population of 500. The reason for this is that the island system partitions the search. If one island prematurely converges on a sub-optimal solution it does not affect the evolution happening on the other islands; they are following their own paths. A single large population does not have this in-built resilience.
Migration
There is of course no real difference between evolving 10 completely separate islands in parallel and running the same single-population evolution 10 times in a row, other than how the computing resources are utilised. In practice the populations are not kept permanently isolated from each other and there are occasional opportunities for individuals to migrate between islands.
In nature external species have been introduced to foreign ecosystems in several ways. In an ice age the waters that previously separated two land masses might freeze providing a route for land animals to migrate to previously unreachable places. Microorganisms and insects have often strayed beyond their usual environment by hitching a ride with larger species.
The effect of introducing a foreign species to a new environment can vary. The new species might be ill-adapted to its new surroundings and quickly perish. Alternatively, a lack of natural predators may cause it to flourish, often to the detriment of indigenous species. One such example is the introduction of rabbits to Australia. Australia was a land without rabbits until the arrival of European settlers. An Englishman named Thomas Austin released 24 rabbits into the wild of Victoria in October 1859 with the intention of hunting them. If rabbits are famous for one thing it is for reproducing prodigiously. The mild winters allowed year-round breeding and the absence of any natural rabbit predators, such as foxes, allowed the Australian rabbit population to explode unchecked. Within 10 years an annual cull of two million rabbits was having no noticeable effect on rabbit numbers and the habitats of some native animals were being destroyed by the floppy-eared pests. Today there are hundreds of millions of rabbits in Australia, despite efforts to reduce the population, and the name of Thomas Austin is widely cursed for his catastrophic lack of foresight.
While such invasions of separate species provide a useful analogy for what can happen when we introduce migration into island model evolutionary algorithms, we are specifically interested in the effects of migration involving genetically different members of the same species. This is because, in our simplified model of evolution, all individuals are compatible and can reproduce. The island model of evolution provides the isolation necessary for diversity to thrive while still providing opportunities for diverse individuals to be combined to produce potentially fitter offspring.
In an island model, the isolation of the separate populations often leads to different traits originating on different islands. Migration brings these diverse individuals together occasionally to see what happens when they are combined. Remember that, even if the immigrants are weak, cross-over can result in offspring that are fitter than either of their parents. In this way, the introduction to the population of new genetic building blocks may result in evolutionary progress even if the immigrants themselves are not viable in the new population.
Islands in the Watchmaker Framework
The Watchmaker Framework for Evolutionary Computation supports islands models via the IslandEvolution class. Each island is a self-contained EvolutionEngine just like those we have been using previously for single-population evolutionary algorithms. The evolution is divided into epochs. Each epoch consists of a fixed number of generations that each island completes in isolation. At the end of an epoch migration occurs. Then, if the termination conditions are not yet satisfied, a new epoch begins.
The IslandEvolution supports pluggable migration strategies via different implementations of the Migration interface. An island version of the string evolution example from this previous article might look something like this:
IslandEvolution engine = new IslandEvolution(5, // Number of islands. new RingMigration(), candidateFactory, evolutionaryOperator, fitnessEvaluator, selectionStrategy, rng); engine.evolve(100, // Population size per island. 5, // Elitism for each island. 50, // Epoch length (no. generations). 3, // Migrations from each island at each epoch. new TargetFitness(0, false));
We can add listeners to an IslandEvolution object, just as we can with individual EvolutionEngines. We use a different interface for this though, IslandEvolutionObserver, which provides two call-backs. The populationUpdate method reports the global state of the combined population of all islands at the end of each epoch. The islandPopulationUpdate method reports the state of individual island populations at the end of each generation.
Advanced Usage
In the example code above we specified how many islands we wanted to use and the IslandEvolution class created one GenerationalEvolutionEngine for each island. Using this approach all of the islands have the same configuration; they use the same candidate factory, evolutionary operator(s) and selection strategy. This is the easiest way to create an island system but it is also possible to construct each island individually for ultimate flexibility.
List> islands = new ArrayList>(); // Create individual islands here and add them to the list. // ... IslandEvolution engine = new IslandEvolution(islands, new RingMigration(), false, // Natural fitness? rng);
One reason you might choose to construct the islands explicitly is that it makes it possible to configure individual islands differently. You may choose to have different islands use different parameters for evolutionary operators, or even to use different evolutionary operators all together. Alternatively, you could use the same evolutionary operators and parameters but have different selection strategies so that some islands have stronger selection pressure than others. You should generally use the same fitness function for all islands though, otherwise you might get some strange results.
Another possible reason for creating the islands explicitly is so you don’t have to use the standard GenerationalEvolutionEngine for the islands. You can choose to use any implementation of the EvolutionEngine interface, such as the SteadyStateEvolutionEngine class or the EvolutionStrategyEngine class. You can even use a mixture of different island types with the same IslandEvolution object.



