A Java Programmer’s Guide to Random Numbers, Part 1: Beyond java.util.Random
More than you ever wanted to know about randomness in Java programs.
This is the first in a series of articles about random numbers in Java programs. The series covers the following topics and also serves as an introduction to the random numbers package of the Uncommons Maths library:
- True Random Numbers and Pseudorandom Numbers
- Statistical Quality (making sure the dice are not loaded)
- Performance (because sometimes you really do need half a million random numbers a second)
- Different kinds of randomness (because not everything can be modelled by tossing a coin)
- Degrees of Freedom (is every possible outcome actually possible?)
- Security and Integrity (when not losing money depends on nobody knowing what happens next)
Part 1: Beyond java.util.Random
Random numbers are useful in a wide variety of software applications. They provide a crucial element of uncertainty in an otherwise deterministic world. Without random numbers in computers, the hugely popular online poker industry would not exist, video games would be boring and predictable, iTunes would have no shuffle, cryptography would be much more difficult, and many innovative algorithms, such as those used in artificial intelligence and evolutionary computation, simply couldn’t work.
True Random Numbers and Pseudorandom Numbers
“Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.” - John von Neumann
Before continuing, it is important to make the distinction between so-called “true” random numbers and pseudorandom numbers. Though it may often seem otherwise to us as programmers, computer systems are infuriatingly deterministic. They are generally incapable of doing things randomly. To get a computer to behave in a way that is truly random, it is necessary to introduce some non-deterministic input. We could have somebody constantly rolling dice or flipping coins and typing in the results, but that is not very practical. A slightly more feasible approach is to construct a device that observes some real world phenomenon that is known to be unpredictable, such as radioactive decay or atmospheric noise. Data extracted from these events can be used a source of entropy for our applications. You could purchase a device that plugs into a serial port or USB port. To access these devices from Java you’d probably have to use C/C++ and JNI. Alternatively, you could get true random numbers indirectly - from an online service such as Random.org or Hotbits.
Since we can get truly unpredictable random numbers from this kind of hardware, why don’t we satisfy all of our randomness requirements in this way? Well the primary problem is throughput. These devices are quite limited in the quantity of randomness they can produce. They simply aren’t fast enough for many uses.
Pseudorandom numbers are not really random at all. They are the result of deterministic mathematical formulae. The best of these algorithms have been devised so that their output is statistically indistinguishable from true random numbers. PRNGs start with a single numeric seed value. The algorithm is applied to this seed to generate the output and a new, updated seed that is used to generate the next value. The mathematics involved is beyond the scope of this article - the definitive guide is the second volume of Donald Knuth’s The Art of Computer Programming.
An interesting property of this approach is that if you always start with the same seed value, you will always get the same sequence of “random” numbers. Though this can occasionally be useful, you would normally strive to avoid using the same seed value in the interests of unpredictability. A simple approach that is sufficient in many cases is to seed the PRNG from the current value of the system clock.
Aside from speed, another advantage of pseudorandom number generators (PRNGs) over true random number generators (TRNGs) is that they are more predictably unpredictable. That is, the statistical properties of PRNGs are more reliable than is the case with TRNGs.
Statistical Quality
Why not java.util.Random?
The Java API helpfully provides a PRNG that we can use, the java.util.Random class. If you are a Java programmer you will almost certainly have used it at some point. For non-critical random numbers, e.g. adding some unpredictability to a game, it’s fine. It’s even pretty quick. Unfortunately, it’s not random enough - not by the standards required for more serious applications.
The problem of deciding whether a sequence of numbers is random is not an easy one to solve. You can’t simply look at the output and decide that it doesn’t look random enough. After all, if you toss a coin ten times, it could randomly come up heads every time, even though the probability of this sequence is pretty small. To get any kind of meaningful evaluation requires a large sample of the RNG output (perhaps millions of generated values). This sample can then be subjected to sophisticated statistical analysis.
Probably the best known test suite for random number generators is George Marsaglia’s Diehard Battery of Tests of Randomness. Diehard says that java.util.Random is not sufficiently random. But you don’t have to interpret Diehard’s complicated reports to see for yourself. This applet demonstrates it visibly.
Performance
So if not java.util.Random, how about Java’s other RNG, java.security.SecureRandom? SecureRandom is built for cryptography so it is specifically designed to avoid any such flaws. Diehard reports no issues with its output. Unfortunately, this quality comes at a high price - performance. In benchmarks, SecureRandom can be up to 60 times slower at generting random numbers than java.util.Random. This is bearable if you are only generating random values infrequently, but if your program relies on generating random numbers non-stop and as fast as possible (as many simulations do) then it’s a show-stopping bottleneck.
The good news is that, beyond the core API, there are random number generators as fast as, or faster than, java.util.Random with statistical properties as good as SecureRandom. The Uncommons Maths project provides a comprehensive open source random numbers package. Uncommons Maths provides three Random Number Generators, with different properties, for a wide variety of applications. Unlike java.util.Random, each of these RNGs completes the Diehard suite without any problems. Additionally, each of the Uncommons Maths RNGs is a sub-class of java.util.Random, which means any of them can be used as a drop-in replacement with minimal effort.
A good general-purpose PRNG is the MersenneTwisterRNG class. It is a pure Java port of Makoto Matsumoto and Takuji Nishimura’s proven and ultra-fast Mersenne Twister PRNG for C. Even faster is CellularAutomatonRNG, the Java port of Tony Pasqualoni’s experimental cellular automaton PRNG.
In the next article: Beyond dice and coins - using the right probability distribution.

on April 4th, 2008 at 7:32 am
I really like the Mersenne Twister (Kenta Cho used it in his games for example), but most implementations seem to use some really awkward license. Apache is a good one though.
CellularAutomatonRNG sounds intriguing. I’ll have to take a closer look.
I wrote a lengthy article about randomness in games (shuffling in particular). Eventually it’s of some use for the follow up article.
http://kaioa.com/node/53
on April 6th, 2008 at 12:03 am
[...] A Java Programmer’s Guide to Random Numbers, Part 1: Beyond java.util.Random [...]
on April 7th, 2008 at 12:08 pm
[...] A Java Programmer’s Guide to Random Numbers, Part 1: Beyond java.util.Random [...]
on April 10th, 2008 at 9:42 pm
[...] A Java Programmer’s Guide to Random Numbers, Part 1: Beyond java.util.Random [...]
on May 19th, 2008 at 10:21 am
Hi Dan,
The following code took 60 ms to execute on my Pentium 4, 3.0 GHZ 5 year old PC.
long t1 = System.currentTimeMillis();
byte[] holder = new byte[1024/8];
SecureRandom secureRandom = SecureRandom.getInstance(”SHA1PRNG”);
secureRandom.nextBytes(holder);
System.out.println(”new BASE64Encoder().encode(holder) = ” + new BASE64Encoder().encode(holder));
long t2 = System.currentTimeMillis();
System.out.println(t2-t1);
Which makes me wonder why you call SecureRandom slow?
on May 19th, 2008 at 8:41 pm
Swapnonil, it’s all relative. Your code is only generating 256 bytes of random data (that’s only 64 32-bit integers). Any RNG is going to be fast enough for that (unless this routine is in some critical loop).
Some applications need to generate millions of random numbers as quickly as possible. When you benchmark SecureRandom in this context it is many times slower than the alternatives.
The RNGs in Uncommons Maths were originally written for use in evolutionary computation. When I was running evolutionary simulations for my masters degree (without these RNGs), a single experiment would take about 12 hours to complete, and the time spent generating random values was a significant chunk of this.
on May 20th, 2008 at 6:55 pm
Well I see your point.
Keep up the good work.
on July 23rd, 2008 at 2:18 am
[...] each time it runs. Basically, it does the same thing around a billion times with different random inputs for each iteration. I calculated that, for my first working version of the program, it would take [...]