Programming the Semantic Web and Beautiful Data
As I’ve mentioned previously, I’m a big fan of Toby Segaran’s book Programming Collective Intelligence. It introduces several cutting-edge algorithms for building intelligent web applications through a well-chosen set of compelling example programs. A different author might have made the book a dull, overly mathematical ordeal but Segaran manages to inspire the reader to find ways to apply these exotic techniques in their own projects. I was therefore interested to discover that he has since collaborated on two new books that will both be released in July.
For Programming the Semantic Web, Segaran has teamed up with Colin Evans and Jamie Taylor. I was unable to find a table of contents for this book but the publisher’s blurb suggests that it will follow the same pragmatic, hands-on formula that worked so well for Programming Collective Intelligence:
With this book, the promise of the Semantic Web — in which machines can find, share, and combine data on the Web — is not just a technical possibility, but a practical reality. Programming the Semantic Web demonstrates several ways to implement semantic web applications, using existing and emerging standards and technologies. With this book, you will learn how to incorporate existing data sources into semantically aware applications and publish rich semantic data.
Programming the Semantic Web will help you:
- Learn how the semantic web allows new and unexpected uses of data to emerge
- Understand how semantic technologies promote data portability with a simple, abstract model for knowledge representation
- Be familiar with semantic standards, such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL)
- Make use of semantic programming techniques to both enrich and simplify current web applications
- Learn how to incorporate existing data sources into semantically aware applications
Each chapter walks you through a single piece of semantic technology, and explains how you can use it to solve real problems. Whether you’re writing a simple “mashup” or maintaining a high-performance enterprise solution, this book provides a standard, flexible approach for integrating and future-proofing systems and data.
Toby has clearly been keeping himself busy because he’s also found time to co-edit the latest installment in O’Reilly’s Beautiful Code series. In 2007 the original Beautiful Code book presented an eclectic mix of 33 essays about elegance in software design and implementation, each written by a different well-known programmer. The success of this anthology has resulted in O’Reilly issuing three companion volumes in 2009: Beautiful Architecture, Beautiful Security and the forthcoming Beautiful Data: The Stories Behind Elegant Data Solutions (edited by Toby Segaran and Jeff Hammerbacher).
Beautiful Data follows the same format as the other books in the series, with each chapter authored by different expert practitioners. One of these chapters covers the making of the video for Radiohead’s House of Cards single, while another is about data processing challenges faced by NASA’s Mars exploration program.
Clearly I haven’t read either of these books because they are not available yet, so I can’t make any informed recommendations, but they do both look like they could be interesting.
First Qualifying Solution Submitted for $1 Million Netflix Prize
The word on the street (well, Reddit actually) is that the BellKor’s Pragmatic Chaos team today submitted the first qualifying solution for the Netflix Prize. If nobody submits a better solution within the next 30 days then they will claim the $1 million reward that has so far eluded the best efforts of thousands of programmers and researchers since the competition was launched in October 2006.
Netflix is a US-based online DVD rental service. One of their features is that they make movie recommendations to customers based on their previous viewing history. In order to improve their recommendations system, Netflix has been offering a million dollar reward to any individual or team that is able to develop software that increases the accuracy of these recommendations by at least 10%.
The financial rewards and intellectual challenge of the Netflix Prize have encouraged almost 50,000 individuals and teams to attempt to solve the problem using a vast array of different AI and data-mining techniques.
The BellKor team have overcome such obstacles as the Napoleon Dynamite problem and will no doubt have the champagne on ice while they nervously wait to see if anybody else is able to surpass their results within the next month.
SICP – The most divisive book in Computer Science?
Structure and Interpretation of Computer Programs (universally referred to as SICP) seems to be mentioned whenever people are discussing the great/classic/essential Computer Science books. It typically generates a mixed response.
Somebody recently sent a copy (anonymously?) to Python creator Guido van Rossum, apparently as a comment on his supposed ignorance (incidentally, this is an incredibly arsey thing to do).
It seems that SICP is a real love-it-or-hate-it kind of book. Depending on who you listen to, it’s either a mind-bending classic through which true enlightenment can be achieved, or it’s dull, obvious and poorly written.
The distribution of the reviews for SICP on Amazon (UK) is striking:
If you haven’t already read it, you can decide for yourself. The whole thing is available online. I didn’t get very far the one time I started to read it. I quickly got bored with the introductory stuff, but I intend to give it another go sometime. I’ve seen several people recommend the associated video lectures, which may be a better entry point.
5 Ways to Become a Famous Programmer (Probably)
How do ordinary programmers become famous programmers? Since I am not already a famous programmer I can’t speak from experience but, from scientific observations of those programmers who are well-known, I have been able to identify the following five strategies for becoming a “thought leader”:
1. Do Great Things
Build software that everybody uses and you’ll become famous. Easy.
This is the Torvalds method. Everybody knows who Linus is because of Linux. And, just to prove it wasn’t a fluke, he followed up by creating Git. This approach is not fool-proof though. Not all great projects become famous and not all famous projects have well-known developers.
2. Talk the Talk
Start a blog, get 100,000 readers, retire on the Adsense revenue.
The best known programmers aren’t necessarily known for their programming. They may still be good programmers but they made their mark by being excellent communicators. Joel Spolsky is the poster child for this category. If it weren’t for Joel’s blog who knows whether Fog Creek Software would be in business at all? By writing regular common sense about software development and management, and by evangelising the programmer’s utopia that he’s building in New York, Joel keeps his company, its products and its offices in the minds of his hundreds of thousands of readers.
Joel’s StackOverflow business partner, Jeff Atwood, is another example of programmer-turned-A-list-blogger. I would love to know how to get from a 200-reader blog to a 100k-reader blog. I’m sure that in Jeff’s case it has a lot to do with the regular posting schedule that he has maintained over a number of years. Easy-to-read articles, with a hint of danger, posted several times a week lead to an extensive archive of articles that no doubt brings in a huge amount of Google traffic.
Steve Yegge is somebody else you could look to emulate as a blogger, but if you find yourself writing articles so long that they need a full-page table of contents, it’s probably time to lay off the dope.
3. Write the Book
Next time you are in a job interview or pitching for some consulting work and somebody asks “do you know anything about technology XYZ?”, wouldn’t it be great to be able to respond “actually, I wrote the book Professional XYZ in Action for Dummies in a Nutshell”?
There are thousands of software development books published every year. Technology books have short life spans and publishers are always on the lookout for potential authors to write a book on the next big thing. If you can demonstrate basic literacy and sufficient technical knowledge, it could be you. Just don’t expect to get rich from the royalties. If you divide the amount that you make by the number of hours you spend writing, you’ll be lucky to come out ahead of minimum wage levels.
It’s a big time commitment for modest direct financial rewards but, if you’re playing the long game, writing a book can lead to other opportunities such as speaking at conferences, providing expert consultancy and more. You also get the satisfaction of seeing your book on Amazon and, better still, you get to harass local bookstores by re-enacting the J.R. Hartley Yellow Pages ad.
4. Become a Cult Figure
Jon Skeet was a respected member of the programming community before the advent of StackOverflow but now, as the number one ranked user by some distance, Jon Skeet is a cult.
5. Work at Google
I don’t know whether being a well-known developer gets you a job at Google or if getting a job at Google makes you a well-known developer but either way there are a lot of famous programmers at Google.
Working at Google is like an extra stamp of credibility. If you don’t work at Google and you say something stupid, people think you are dumb. If you do work at Google and you say something stupid, people think that you know something that they don’t.
Software Naming Revisited
What do I know about naming software projects? Maybe it’s not such a good idea to give your project a name which is a common typo of a common word?
Google Search suggests that it’s a mistake:

TestNG has achieved sufficient popularity to overcome that problem. ReportNG is not there yet.
Google Alerts doesn’t like it much either. I have alerts for each of my projects so I can see where they are being used and respond to any queries that may arise elsewhere on the web. An alert for “ReportNG” results in an e-mail every time somebody somewhere on the web misspells “reporting”. Not particularly helpful. So I tried “ReportNG AND TestNG”. Now I get an e-mail for every slack-fingered typist who manages to make two separate typos on the same page.
Practical Evolutionary Computation: Elitism
In my previous article about evolutionary computation, I glossed over the concept of elitism. The Watchmaker Framework’s evolve methods require you to specify an elite count. I told you to set this parameter to zero and forget about it. This brief article ties up that loose end by explaining how to use elitism to improve the performance of your evolutionary algorithm.
In an evolutionary algorithm (EA), sometimes good candidates can be lost when cross-over or mutation results in offspring that are weaker than the parents. Often the EA will re-discover these lost improvements in a subsequent generation but there is no guarantee of this. To combat this we can use a feature known as elitism. Elitism involves copying a small proportion of the fittest candidates, unchanged, into the next generation. This can sometimes have a dramatic impact on performance by ensuring that the EA does not waste time re-discovering previously discarded partial solutions. Candidate solutions that are preserved unchanged through elitism remain eligible for selection as parents when breeding the remainder of the next generation.
NOTE: One potential downside of elitism is that it may make it more likely that the evolution converges on a sub-optimal local maximum.
The Watchmaker Framework supports elitism via the second parameter to the evolve method of an EvolutionEngine. This elite count is the number of candidates in a generation that should be copied unchanged from the previous generation, rather than created via evolution. Collectively these candidates are the elite. So for a population size of 100, setting the elite count to 5 will result in the fittest 5% of each generation being copied, without modification, into the next generation.
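As a rough illustration of the idea (written in plain Python, not against the Watchmaker API), elitism can be sketched as below. The function name `next_generation` and the `fitness` and `breed` callbacks are hypothetical, introduced only for this example, and parents are picked uniformly at random for brevity where a real EA would use a fitness-based selection strategy.

```python
import random

def next_generation(population, fitness, breed, elite_count, rng=random):
    """Build a new generation, copying the fittest `elite_count` candidates unchanged.

    `fitness` maps a candidate to a score (higher is better) and `breed` turns
    two parents into one offspring. Both are placeholders for this sketch.
    """
    ranked = sorted(population, key=fitness, reverse=True)
    elite = ranked[:elite_count]  # preserved without modification
    offspring = []
    while len(elite) + len(offspring) < len(population):
        # Elite candidates remain eligible as parents (uniform choice, for brevity).
        parent1, parent2 = rng.choice(ranked), rng.choice(ranked)
        offspring.append(breed(parent1, parent2))
    return elite + offspring

# Toy usage: candidates are integers and the fitness is the value itself.
population = [3, 9, 1, 7, 5]
new_population = next_generation(
    population,
    fitness=lambda x: x,
    breed=lambda a, b: (a + b) // 2,
    elite_count=2,
)
print(new_population[:2])  # the two fittest candidates, copied unchanged: [9, 7]
```

With an elite count of zero the function degenerates to breeding the whole generation, which matches the setting used in the previous article.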
If you run the Hello World example from the previous article both with and without elitism, you will see that it completes in fewer generations with elitism enabled (22 generations vs. 40 when I ran it – though your mileage may vary due to the random nature of the evolution).
Source code for the Hello World example (and several other, more interesting evolutionary programs) is included in the download.
Waiting for the Magic – The Great Twitter Experiment, Day 1
Getting Started
So I started a Twitter account. Signing-up was not without its problems. The sign-up form has some AJAX functionality for checking whether a user name is in use or not. Except that functionality is just absent in Opera, and it turns out that in Twitterland I am not the only person called Dan Dyer. There is not even an error message when the registration fails. You just get returned to the sign-up form. Very annoying. It wasn’t until I tried it in Safari that I figured out what was happening. You are also limited to a 15-character username, so I could not register as “newadventuresinsoftware”.
That initial hurdle overcome, I proceeded to post my first “tweet”, a link back to the blog post announcing this little adventure. The translation of the URL to a Tiny URL tripped over the brackets that I used to surround the link and messed up my tweet. Lesson learned for next time.
The use of these shrunk and obfuscated URLs is necessitated by the strict 140 character limit. It’s not done very cleverly though. I tried to post a tweet that would be 140 characters after the link had been shrunk, but the Twitter website would not permit this.
In all, the Twitter website is pants. It’s a bad way to post messages and a bad way to follow other people. If it were the only way to interact with the service I would have aborted my trial already. Most of the Twitter pros are using some kind of desktop or mobile client. I’ve tried using the Opera widget and that’s an improvement. Next step is to settle on one of the more full-featured desktop clients.
Technical Problems
Twitter’s technical issues are legendary. I hadn’t heard much about them recently, so I assumed things had got better. But I’m only a few hours in and I’ve already experienced my first brief outage:

Day One Summary
As well as getting signed-up and posting my first tweets, I’ve also attracted my first disciples, sorry, followers (thanks). I don’t really know who are the best people to follow, so as well as the few people I have picked out, I’m following everybody who follows me. That raises a question though: if I’m following you, and you’re following me, are we both lost?
At the moment I’m underwhelmed by the whole experience, but I didn’t expect to achieve enlightenment on day one. I’m just doing the ground work for the epiphany that will surely occur at some point in the next 2 weeks. Right now I’m just sitting back waiting for the magic to happen.
User Experience Fail – Am I wrong to expect better from the ACM?
I received the following e-mail today:
On February 9, 2009 ACM will be replacing some book titles in our Safari Online Books Collection with new titles, including titles that have been requested by ACM Members. In choosing which titles to remove, we look for the ones that are used the least often. Unfortunately, according to a recent usage report, some of these titles were on your bookshelf.
*** Please remove these titles before February 9, 2009. ***
================================================
Effective Java™: Programming Language Guide
================================================
If you fail to remove the titles by the deadline, you will notice that the "slots" for the removed books will still be counted against your bookshelf, but you will no longer be able to access the books. At that point, we will need to refer your case to the Safari Customer Support desk.
That’s from the Association for Computing Machinery (“Advancing Computing as a Science & Profession”). Firstly, I should state that the ACM’s online books facility is an excellent service that justifies the membership fee on its own. But surely there is a better way for them to perform this update than requiring potentially every single user to log on and manually perform this task?
I don’t know if it’s the ACM or O’Reilly who would be responsible but, whoever it is, they already know which users are affected and which books are involved. I refuse to believe that this process could not be automated.
The reason they are removing Effective Java from the library is that they are replacing it with Effective Java 2nd Edition. The path to full customer satisfaction ends with them just swapping one for the other on my bookshelf. I shouldn’t need to get involved.
I particularly like how they make it sound like it would be my fault if they had to refer me to the Safari Customer Support desk. They also do a good job of making that fate sound a lot more sinister than it should.
Practical Evolutionary Computation: An Introduction
Software is normally developed in a very precise, deterministic way. The behaviour of a computer is governed by strict logical rules. A computer invariably does exactly what it is told to do.
When writing a program to solve a particular problem, software developers will identify the necessary sub-tasks that the program must perform. Algorithms are chosen and implemented for each task. The completed program becomes a detailed specification of exactly how to get from A to B. Every aspect is carefully designed by its developers who must understand how the various components interact to deliver the program’s functionality.
This prescriptive approach to solving problems with computers has served us well and is responsible for most of the software applications that we use today. However, it is not without limitations. Solutions to problems are constrained by the intuition, knowledge and prejudices of those who develop the software. The programmers have to know exactly how to solve the problem.
Another characteristic of the prescriptive approach that is sometimes problematic is that it is best suited to finding exact answers. Not all problems have exact solutions, and some that do may be too computationally expensive to solve. Sometimes it is more useful to be able to find an approximate answer quickly than to waste time searching for a better solution.
What are Evolutionary Algorithms?
Evolutionary algorithms (EAs) are inspired by the biological model of evolution and natural selection first proposed by Charles Darwin in 1859. In the natural world, evolution helps species adapt to their environments. Environmental factors that influence the survival prospects of an organism include climate, availability of food and the dangers of predators.
Species change over the course of many generations. Mutations occur randomly. Some mutations will be advantageous, but many will be useless or detrimental. Progress comes from the feedback provided by non-random natural selection. For example, organisms that can survive for long periods without water will be more likely to thrive in dry conditions than those that can’t. Likewise, animals that can run fast will be more successful at evading predators than their slower rivals. If a random genetic modification helps an organism to survive and to reproduce, that modification will itself survive and spread throughout the population, via the organism’s offspring.
Evolutionary algorithms are based on a simplified model of this biological evolution. To solve a particular problem we create an environment in which potential solutions can evolve. The environment is shaped by the parameters of the problem and encourages the evolution of good solutions.
The field of Evolutionary Computation encompasses several types of evolutionary algorithm. These include Genetic Algorithms (GAs), Evolution Strategies, Genetic Programming (GP), Evolutionary Programming and Learning Classifier Systems.
The most common type of evolutionary algorithm is the generational genetic algorithm. The basic outline of a generational GA is as follows (most other EA variants are broadly similar). A population of candidate solutions is iteratively evolved over many generations. Mimicking the concept of natural selection in biology, the survival of candidates (or their offspring) from generation to generation in an EA is governed by a fitness function that evaluates each candidate according to how close it is to the desired outcome, and a selection strategy that favours the better solutions. Over time, the quality of the solutions in the population should improve. If the program is successful, we can terminate the evolution once it has found a solution that is good enough.
An Example
Now that we have introduced the basic concepts and terminology, I will attempt to illustrate by way of an example. Suppose that we want to use evolution to generate a particular character string, for example “HELLO WORLD”. This is a contrived example in as much as it assumes that we don’t know how to create such a string and that evolution is the best approach available to us. However, bear with me as this simple example is useful for demonstrating exactly how the evolutionary approach works.
Each candidate solution in our population will be a string. We’ll use a fixed-length representation so that each string is 11 characters long. Each character in a string will be one of the 27 valid characters (the upper case letters ‘A’ to ‘Z’ plus the space character).
For the fitness function we’ll use the simple approach of assigning a candidate solution one point for each position in the string that has the correct character. For the string “HELLO WORLD” this gives a maximum possible fitness score of 11 (the length of the string).
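A minimal sketch of this scoring scheme in Python (the names `TARGET` and `fitness` are chosen here purely for illustration):

```python
TARGET = "HELLO WORLD"

def fitness(candidate, target=TARGET):
    """Score one point for every position that holds the correct character."""
    return sum(1 for c, t in zip(candidate, target) if c == t)

print(fitness("HELLO WORLD"))  # 11, the maximum possible score
print(fitness("JELLO WORLD"))  # 10, only the first character is wrong
print(fitness("AAAAAAAAAAA"))  # 0, no position matches
```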
The first task for the evolutionary algorithm is to randomly generate the initial population. We can use any size population that we choose. Typical EA population sizes can vary from tens to thousands of individuals. For this example we will use a population size of 10. After the initialisation of the population we might have the following candidates (fitness scores in brackets):
1. GERZUNFXCEN (1)
2. HSFDAHDMUYZ (1)
3. UQ IGARHGJN (0)
4. ZASIB WSUVP (2)
5. XIXROIUAZBH (1)
6. VDLGCWMBFYA (1)
7. SY YUHYRSEE (0)
8. EUSVBIVFHFK (0)
9. HHENRFZAMZH (1)
10. UJBBDFZPLCN (0)
None of these candidate solutions is particularly good. The best (number 4) has just two characters out of eleven that match the target string (the space character and the ‘W’).
The next step is to select candidates based on their fitness and use them to create a new generation. One technique for favouring the selection of fitter candidates over weaker candidates is to assign each candidate a selection probability proportionate to its fitness.
If we use fitness-proportionate selection, none of the candidates with zero fitness will be selected and the candidate with a fitness of 2 is twice as likely to be selected as any of the candidates with a fitness of 1. For the next step we need to select 10 parents, so it is obvious that some of the fit candidates are going to be selected multiple times.
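One way to sketch fitness-proportionate selection in Python is to pass the fitness scores as weights to `random.choices` from the standard library. The helper name `select_parents` is hypothetical, and the scores are the ones listed for the initial population above. Note that `random.choices` raises an error if every weight is zero, so a fuller implementation would need a fallback for that case.

```python
import random

# Initial population with the fitness scores from the example above.
scored_population = {
    "GERZUNFXCEN": 1, "HSFDAHDMUYZ": 1, "UQ IGARHGJN": 0, "ZASIB WSUVP": 2,
    "XIXROIUAZBH": 1, "VDLGCWMBFYA": 1, "SY YUHYRSEE": 0, "EUSVBIVFHFK": 0,
    "HHENRFZAMZH": 1, "UJBBDFZPLCN": 0,
}

def select_parents(scored, count, rng=random):
    """Pick `count` parents, each with probability proportional to its fitness."""
    candidates = list(scored)
    weights = [scored[c] for c in candidates]
    # Selection is with replacement, so fit candidates can be chosen repeatedly.
    return rng.choices(candidates, weights=weights, k=count)

parents = select_parents(scored_population, 10, random.Random(42))
for parent in parents:
    print(parent, scored_population[parent])
```

Because the weights are the raw scores, a candidate with zero fitness has zero probability of being chosen.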
Now that we have some parents, we can breed the next generation. We do this via a process called cross-over, which is analogous to sexual reproduction in biology. For each pair of parents, a cross-over point is selected randomly. Assuming that the first two randomly selected parents are numbers 2 and 4, if the cross-over occurs after the first four characters, we will get the following offspring:
Parent 1: HSFDAHDMUYZ
Parent 2: ZASIB WSUVP
Offspring 1: HSFDB WSUVP
Offspring 2: ZASIAHDMUYZ
This recombination has given us two new candidates for the next generation, one of which is better than either of the parents (offspring 1 has a fitness score of 3). This shows how cross-over can lead towards better solutions. However, looking at the initial population as a whole, we can see that no combination of cross-overs will ever result in a candidate with a fitness higher than 6. This is because, among all 10 original candidates, there are only 6 positions in which we have the correct character.
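The cross-over above, and the upper bound of 6, can both be checked with a few lines of Python. This is only a sketch: the function name `crossover` and the variable names are made up for this illustration.

```python
TARGET = "HELLO WORLD"

def crossover(parent1, parent2, point):
    """Single-point cross-over: swap the tails of two equal-length strings."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

child1, child2 = crossover("HSFDAHDMUYZ", "ZASIB WSUVP", 4)
print(child1)  # HSFDB WSUVP
print(child2)  # ZASIAHDMUYZ

initial_population = [
    "GERZUNFXCEN", "HSFDAHDMUYZ", "UQ IGARHGJN", "ZASIB WSUVP", "XIXROIUAZBH",
    "VDLGCWMBFYA", "SY YUHYRSEE", "EUSVBIVFHFK", "HHENRFZAMZH", "UJBBDFZPLCN",
]

# Count the positions where at least one candidate already has the right character.
# Cross-over never changes which character sits at a given position, so this is
# the best fitness that recombination alone can reach.
reachable = sum(
    any(candidate[i] == TARGET[i] for candidate in initial_population)
    for i in range(len(TARGET))
)
print(reachable)  # 6
```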
This can be mitigated to some extent by increasing the size of the population. With 100 individuals in the initial population we would be much more likely to have the necessary building blocks for a perfect solution, but there is no guarantee. This is where mutation comes in.
Mutation is implemented by modifying each character in a string according to some small probability, say 0.02 or 0.05. This means that any single individual will be changed only slightly by mutation, or perhaps not at all.
By applying mutation to each of the offspring produced by cross-over, we will occasionally introduce correct characters in new positions. We will also occasionally remove correct characters but these bad mutations are unlikely to survive selection in the next generation, so this is not a big problem. Advantageous mutations will be propagated by cross-over and selection and will quickly spread throughout the population.
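A simple per-character mutation operator might look like the following Python sketch. The names `ALPHABET` and `mutate` are assumptions for illustration. For simplicity the replacement character is drawn from the full alphabet, so a "mutated" position will occasionally end up unchanged.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "  # the 27 valid characters

def mutate(candidate, probability=0.02, rng=random):
    """Independently replace each character with a random one, with the given probability."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < probability else ch
        for ch in candidate
    )

print(mutate("HSFDB WSUVP", probability=0.05, rng=random.Random(7)))
```

With a probability of 0.02 and 11 characters per string, most offspring pass through unmodified and the rest typically differ in a single position.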
After repeating this process for dozens or perhaps even hundreds of generations we will eventually converge on our desired solution.
This is a convoluted process for finding a string that we already knew to start with. However, as we shall see later, the evolutionary approach generalises to deal with problems where we don’t know what the best solution is and therefore can’t encode that knowledge in our fitness function.
The important point demonstrated by this example is that we can arrive at a satisfactory solution without having to enumerate every possible candidate in the search space. Even for this trivial example, a brute force search would involve generating and checking approximately 5.6 quadrillion strings.
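The size of the search space follows directly from the representation: 27 possible characters in each of 11 positions. A quick check in Python (variable names are illustrative):

```python
ALPHABET_SIZE = 27   # 'A' to 'Z' plus the space character
STRING_LENGTH = 11   # length of "HELLO WORLD"

search_space = ALPHABET_SIZE ** STRING_LENGTH
print(f"{search_space:,}")  # 5,559,060,566,555,523
```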
The Outline of an Evolutionary Algorithm
1. Genesis – Create an initial set (population) of n candidate solutions. This may be done entirely randomly or the population may be seeded with some hand-picked candidates.
2. Evaluation – Evaluate each member of the population using some fitness function.
3. Survival of the Fittest – Select a number of members of the evaluated population, favouring those with higher fitness scores. These will be the parents of the next generation.
4. Evolution – Generate a new population of offspring by randomly altering and/or combining elements of the parent candidates. The evolution is performed by one or more evolutionary operators. The most common operators are cross-over and mutation. Cross-over takes two parents, cuts them each into two or more pieces and recombines the pieces to create two new offspring. Mutation copies an individual but with small, random modifications (such as flipping a bit from zero to one).
5. Iteration – Repeat steps 2-4 until a satisfactory solution is found or some other termination condition is met (such as the number of generations or elapsed time).
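Putting the steps of this outline together for the “HELLO WORLD” example gives something like the Python sketch below. It is an illustration under stated assumptions rather than a reference implementation: all function names and default parameter values are invented here, selection is fitness-proportionate, cross-over is single-point, no elitism is used, and the population size is assumed to be even so that parents pair up cleanly.

```python
import random
import string

TARGET = "HELLO WORLD"
ALPHABET = string.ascii_uppercase + " "  # the 27 valid characters

def fitness(candidate):
    """One point for each position holding the correct character."""
    return sum(1 for c, t in zip(candidate, TARGET) if c == t)

def random_candidate(rng):
    return "".join(rng.choice(ALPHABET) for _ in TARGET)

def crossover(parent1, parent2, rng):
    """Single-point cross-over at a random position."""
    point = rng.randrange(1, len(TARGET))
    return parent1[:point] + parent2[point:], parent2[:point] + parent1[point:]

def mutate(candidate, probability, rng):
    return "".join(
        rng.choice(ALPHABET) if rng.random() < probability else ch
        for ch in candidate
    )

def evolve(population_size=100, mutation_probability=0.02,
           max_generations=1000, seed=None):
    """Return (best candidate, generation count). Assumes an even population_size."""
    rng = random.Random(seed)
    # Step 1: genesis, a completely random initial population.
    population = [random_candidate(rng) for _ in range(population_size)]
    for generation in range(max_generations + 1):
        # Step 2: evaluation.
        scores = [fitness(c) for c in population]
        best = max(population, key=fitness)
        # Step 5: stop on a perfect match or when the generation limit is hit.
        if fitness(best) == len(TARGET) or generation == max_generations:
            return best, generation
        # Step 3: fitness-proportionate selection (uniform if every score is zero).
        weights = scores if any(scores) else None
        parents = rng.choices(population, weights=weights, k=population_size)
        # Step 4: evolution via cross-over followed by mutation.
        offspring = []
        for parent1, parent2 in zip(parents[0::2], parents[1::2]):
            for child in crossover(parent1, parent2, rng):
                offspring.append(mutate(child, mutation_probability, rng))
        population = offspring

best, generations = evolve(seed=1)
print(best, generations)
```

The number of generations needed varies from run to run because every step after evaluation is random. The generation limit is there so that the loop always terminates, even when no perfect match is found.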
When are Evolutionary Algorithms Useful?
Evolutionary algorithms are typically used to provide good approximate solutions to problems that cannot be solved easily using other techniques. Many optimisation problems fall into this category. It may be too computationally-intensive to find an exact solution but sometimes a near-optimal solution is sufficient. In these situations evolutionary techniques can be effective. Due to their random nature, evolutionary algorithms are never guaranteed to find an optimal solution for any problem, but they will often find a good solution if one exists.
One example of this kind of optimisation problem is the challenge of timetabling. Schools and universities must arrange room and staff allocations to suit the needs of their curriculum. There are several constraints that must be satisfied. A member of staff can only be in one place at a time, they can only teach classes that are in their area of expertise, rooms cannot host lessons if they are already occupied, and classes must not clash with other classes taken by the same students. This is a combinatorial problem and known to be NP-Hard. It is not feasible to exhaustively search for the optimal timetable due to the huge amount of computation involved. Instead, heuristics must be used. Genetic algorithms have proven to be a successful way of generating satisfactory solutions to many scheduling problems.
Evolutionary algorithms can also be used to tackle problems that humans don’t really know how to solve. An EA, free of any human preconceptions or biases, can generate surprising solutions that are comparable to, or better than, the best human-generated efforts. It is merely necessary that we can recognise a good solution if it were presented to us, even if we don’t know how to create a good solution. In other words, we need to be able to formulate an effective fitness function.
Engineers working for NASA know a lot about physics. They know exactly which characteristics make for a good communications antenna. But the process of designing an antenna so that it has the necessary properties is hard. Even though the engineers know what is required from the final antenna, they may not know how to design the antenna so that it satisfies those requirements.
NASA’s Evolvable Systems Group has used evolutionary algorithms to successfully evolve antennas for use on satellites. These evolved antennas (pictured) have irregular shapes with no obvious symmetry. It is unlikely that a human expert would have arrived at such an unconventional design. Despite this, when tested these antennas proved to be extremely well adapted to their purpose.
Other Examples of Evolutionary Computation in Action
- Evolving the Mona Lisa. How well can you approximate Leonardo da Vinci’s Mona Lisa using only 50 polygons?
- Evolving a buggy to ride a randomly-generated landscape (Flash movie).
- Evolving clocks (video).
- Evolving Lego bridges.
- Solving Sudoku with evolution (Java applet).
Pre-requisites
There are two requirements that must be met before an evolutionary algorithm can be used for a particular problem. Firstly, we need a way to encode candidate solutions to the problem. The simplest encoding, and that used by many genetic algorithms, is a bit string. Each candidate is simply a sequence of zeros and ones. This encoding makes cross-over and mutation very straightforward, but that does not mean that you cannot use more complicated representations. In fact, most of the examples listed in the previous section used more sophisticated candidate representations. As long as we can devise a scheme for evolving the candidates, there really is no restriction on the types that we can use. Genetic programming (GP) is a good example of this. GP evolves computer programs represented as syntax trees.
The second requirement for applying evolutionary algorithms is that there must be a way of evaluating partial solutions to the problem – the fitness function. It is not sufficient to evaluate solutions as right or wrong; the fitness score needs to indicate how right or, if your glass is half empty, how wrong a candidate solution is. So a function that returns either 0 or 1 is useless. A function that returns a score on a scale of 1 – 100 is better. We need shades of grey, not just black and white, since this is how the algorithm guides the random evolution to find increasingly better solutions.
On the Stupidity of People
The big news in the UK today is the mysterious destruction of a wind turbine in Lincolnshire. The 300ft high turbine lost one of its three blades and suffered damage to a second at about 4am on Sunday morning. Based on careful analysis of the facts, most of the nation’s media has attributed the incident to a UFO. The Sun “newspaper” felt that this incident was sufficiently important to dedicate its front page to the story:
Dorothy, of Louth, said: “The lights were moving across the sky towards the wind farm. Then I saw a low flying object. It was skimming across the sky towards the turbines.”
Hours later there was an almighty smash.
Only “hours later”? I’m ready to believe already, but there’s more. The BBC corroborates this evidence with a quote from a spokesman for the prestigious Flying Saucer Bureau:
Russ Kellett, from the Flying Saucer Bureau, said witnesses had told him of activity in the area.
“One saw what they at first thought was a low-flying aircraft on the Saturday evening and another heard a loud banging in the early hours of Sunday,” he said.
A low flying aircraft on Saturday, a bang on Sunday, how can they not be linked? Dale Vince, a spokesman for the turbine’s owners Ecotricity, helpfully suggested to the Today programme on Radio 4 that “something the size and weight of a cow would do it” (which itself suggests an appropriate soundtrack for the incident).
If you’re still sceptical about the involvement of extra-terrestrials, possibly the most compelling evidence comes from witness John Harrison:
John Harrison, another witness, described how he looked out of his landing window and saw a “massive ball of light with tentacles going right down to the ground” over the wind farm. He said: “It was huge. With the tentacles it looked just like an octopus.”
Unfortunately, the journalists at the Guardian don’t exhibit the same imagination as John. They scandalously suggest that John and other witnesses might actually have been observing the fireworks display just down the road from the wind farm. It’s an easy mistake to make (at least compared to this).
Now I’m not a wind turbine engineer, but I’m not yet ready to rule out the possibility of mechanical failure. It wouldn’t be the first time. This particular turbine had only been operational since April and this week experienced its lowest temperatures so far. Perhaps there’s a link there? Or maybe I’m just jumping to ridiculous conclusions?
So what the hell does all this have to do with software development? Not much really, except it provides an opportunity to mention Occam’s Razor, which is as applicable to debugging as it is to debunking. The idea is that you should favour the explanation that fits the facts and relies on the fewest assumptions. Next time you hear yourself uttering the fateful phrase “it must be a compiler bug”, think of the good people of Lincolnshire. Likewise, if there are 100,000 other developers successfully using a given library and it doesn’t work with your program, you shouldn’t be looking at the library’s source until you’ve proved the correctness of your own.
The alternative to Occam’s Razor is to believe that the Earth is flat, that all the space programmes are fakes (because those photos of a spherical Earth can’t be real) and that the destruction of Tower 7 was an inside job to destroy the evidence of the US government’s involvement in this spherical conspiracy.



