New Adventures in Software


More thoughts on Stackoverflow.com

Posted in Software Development,The Internet by Dan on September 26th, 2008

Since my previous post on the subject, Stackoverflow.com has moved from private beta to public beta.  I’ve had more time to use the site and have some more thoughts.  The criticisms here are meant to be constructive.  Hopefully the feedback from users will help the Stackoverflow team to make a good site even better.

Performance

First the good news.  The site has transitioned from private to public very well.  Jeff and his team seem to have got it right in terms of architecture and infrastructure because, even with the increased load, it remains blindingly fast.

Front Page

In terms of usability, I think there’s more that could be done to help me find the content that I’m interested in.  The default front page is, to be honest, not very useful.  New questions are coming in so fast and on so many topics that displaying the most recent questions is just noise.

I would prefer to have a personalised home page that shows me relevant questions based on my previous answering/voting history.  I realise that this is major new functionality and I’m not criticising the Stackoverflow team for not having this in the initial version; it makes sense to get the site up and running first.  However, it would be great if this could be implemented at some point.  I’m not alone on this one: it’s the second most popular requested feature at the moment.

Presently I’m finding stuff that I want to look at by going to the tags page and clicking on interesting topics.  But I’m sure I’m missing out on questions that would be of interest if only I could find them.

Tag Cloud

The tag cloud on the right of the front page isn’t very helpful either.  It’s ordered with the most recent first.  If I just want to view questions tagged “html”, I’m going to struggle to find the tag in the cloud.  An alphabetical ordering would be more usable.  Unfortunately, this has already been suggested and rejected.

Voting and Reputation

I outlined my concerns on the voting mechanism previously.  In the interests of being constructive, rather than just a whiny blogger, I’ve opened new issues on the Stackoverflow Uservoice page.  If you agree with me, please vote on these issues:

Addressing each of these will help in resolving The Fastest Gun in the West Problem (currently the number one voted-on issue).  The problem is that early answers get the votes and later, better answers are largely ignored.  Removing the penalty for down-voting will encourage more down votes where they are deserved (so an early answer that is later shown to be wrong is less likely to retain a high score).  Also, if a down vote was as powerful as an up vote, people might be more careful in crafting good answers as opposed to quick answers.

Source Control and Backups – More than just a good idea

Posted in Software Development by Dan on September 25th, 2008

Are there really software development teams out there that don’t use any form of proper source control at all, even the bad kind?  I’d like to think that it wasn’t the case but I’m not so naive.

There’s a reason that “Do you use source control?” is the first question on the Joel Test.  It’s because it’s the most important.  If you answer “no” to this question you shouldn’t be allowed to answer subsequent questions.  Even if the rest of your process is perfect, you score zero.  You failed at software development.  I could say that if your team doesn’t use source control it is a disaster waiting to happen, but more likely the disaster already happened and you haven’t noticed yet.

Of course, you and I aren’t nearly dumb enough to try developing anything more complex than “Hello World” without version control in place.  I’m sure I’m preaching to the converted.  The kind of people who read obscure software development blogs probably already know a few things about effective software development.

But how good are your back-ups?

You do have a back-up, don’t you?

If you don’t have a back-up you are one accidental key-stroke or one hardware failure away from scoring zero on the Joel Test (under my rules)… and failing at software development.  Hardware will fail, people will screw-up, disgruntled former employees will set fire to the building.  None of these is a problem but a failure to anticipate and prepare is.

How often do you back-up?

There is only one right answer to this: every day.  Weekly back-ups are too costly.  Can you really afford to have your whole team redo an entire week’s work?  The first time you lose a week’s work you will switch to daily back-ups, so why not just do it now?

A melted back-up is no back-up at all

Off-Site Storage. You could physically take tapes to another location or you could upload files to a remote server.  Just don’t leave them here.

Does it actually work?

Honestly, have you ever tried restoring your source control back-up onto a different machine?  The most comprehensive back-up plan imaginable is useless if you can’t restore the back-ups.  If you haven’t seen it working (recently) then it doesn’t work.  There’s a good time and a bad time to find out that your back-ups don’t work.  15 minutes after your source control server spontaneously combusted is the bad time.

Are you still here?  You should be checking those back-up tapes…

UPDATE: The good people of Stackoverflow are discussing what could possibly be a good excuse for not using source control.

Stackoverflow.com – First Impressions

Posted in Software Development,The Internet by Dan on September 12th, 2008

Over the last few days I’ve been playing with the beta of Stackoverflow.  In case you are unaware, Stackoverflow is a joint venture between Jeff Atwood of Coding Horror and Joel Spolsky of Joel on Software fame.  It’s basically a question-and-answer site for software developers.  A mixture of Experts Exchange, Proggit and Wikipedia.  The site is scheduled to come out of beta on Monday when it will open its doors to everyone.

From initial impressions I think it’s fair to say that the site will be a success, initially at least.  Being A-list bloggers (and now podcasters too), Jeff and Joel have been able to generate a lot of exposure for their project.

Like Jeff’s blog, the minimalist site design is clean and bold, and so far the whole system is very responsive (we’ll see if that’s still the case when the traffic spikes on Monday :) ).  The beta audience are already posting thousands of questions, almost all of which generate extremely prompt answers (of varying quality).

However, I think the site suffers a little from the ambitions of trying to be too many different things (is it a programming forum, or is it a Wiki?).  There are a lot of different ideas in the implementation that interact via a quite complicated set of rules that have evolved over the course of the beta.

Reputation & Badges

Stackoverflow has two mechanisms for measuring a user’s standing within the community.  Firstly, each user has a reputation score.  This starts at 1 and increases as you make positive contributions (posting questions and answers that get voted up).  As you reach various milestones, you get more privileges within the community, such as being able to vote on answers or tag other people’s questions.

Your reputation can be diminished if you get voted down or reported for abuse, but it can’t go below 1 and on the whole it’s heavily biased in favour of upward movement.

The second incentive for users to contribute is the ability to collect “badges”.  This works exactly like the Cub Scouts.  Some badges are easy to achieve (just fill in your profile or post your first question), and others are much harder to obtain (get 100 up votes for one of your answers).

Voting

Voting is one area of the site that I think could do with an overhaul.  It’s unbalanced and not transparent enough.  If your answer gets voted up, you gain 10 points of reputation.  But if your answer gets voted down, you only lose 2 points.  So if you post something that sounds plausible to the uninformed masses but is actually wrong, you could get 5 up votes and 6 down votes for a net score of -1 yet still gain a 38-point reputation boost.  An up vote should have equal weight to a down vote, just like on DZone or Slashdot.  It also might be better to show both the number of up votes and the number of down votes (as on DZone) rather than just the net total.  This would make it easier to identify controversial content (something with 10 down votes and 12 up votes is not quite the same as something with no down votes and 2 up votes).

Another problem with the voting is that down votes penalise the voter as well as the user whose answer is being voted on.  So if you post something wrong like “Java passes objects by reference”, I can either ignore it or lose 1 point of reputation for giving you the down vote that you deserve (even then it will take five of us to fully cancel out the one up vote that you got from someone who didn’t know better).
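The arithmetic in the last two paragraphs is easy to check in code.  What follows is only a sketch of the rules as I’ve described them (+10 per up vote, -2 per down vote, -1 for the down-voter); the class and method names are made up for illustration, and it ignores the daily caps and the floor of 1.

```java
// Sketch of the scoring rules as described in this post. Names are
// illustrative only; the daily caps and the minimum reputation of 1 are ignored.
class ReputationSketch {
    static final int UP_VOTE_REWARD = 10;    // author gains this per up vote
    static final int DOWN_VOTE_PENALTY = 2;  // author loses this per down vote
    static final int DOWN_VOTER_COST = 1;    // each down-voter loses this

    /** Score displayed next to the answer: up votes minus down votes. */
    static int netScore(int upVotes, int downVotes) {
        return upVotes - downVotes;
    }

    /** Reputation change for the answer's author under the asymmetric weights. */
    static int authorReputationChange(int upVotes, int downVotes) {
        return upVotes * UP_VOTE_REWARD - downVotes * DOWN_VOTE_PENALTY;
    }

    /** Down votes needed to fully cancel the given up votes (ceiling division). */
    static int downVotesToCancel(int upVotes) {
        return (upVotes * UP_VOTE_REWARD + DOWN_VOTE_PENALTY - 1) / DOWN_VOTE_PENALTY;
    }

    /** Total reputation given up by the people casting the down votes. */
    static int totalVoterCost(int downVotes) {
        return downVotes * DOWN_VOTER_COST;
    }

    public static void main(String[] args) {
        System.out.println(netScore(5, 6));               // -1
        System.out.println(authorReputationChange(5, 6)); // 38
        System.out.println(downVotesToCancel(1));         // 5
        System.out.println(totalVoterCost(5));            // 5
    }
}
```

With equal weights (set both constants to the same value) the author’s reputation change would always have the same sign as the net score, which is the property I’m arguing for.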

When I queried the justification for penalising down-voters, I was told that it was to combat attempts to game the system.  Apparently, earlier in the beta, users were posting answers to questions and then voting down everybody else’s answers so that their answer would appear at the top.  The idea was that by making users pay to vote down this behaviour would be discouraged.  A better solution to this problem would have been to remove the conflict of interest by not allowing users to answer and vote on the same question (which is how Slashdot’s mod points work), rather than punishing all down votes across the whole site.

The net effect of this voting system is that everybody’s reputation increases pretty quickly.  Beyond the minimum score required to get full privileges the numbers can become meaningless.  To avoid having to rename the site Integeroverflow there are a couple of artificial limits that restrict the number of votes you can cast and the number of reputation points you can earn each day.

Other Thoughts

Aside from my reservations about the voting, my impressions of Stackoverflow are mostly positive. The fact that it has already attracted hundreds of enthusiastic participants suggests that it has genuinely found a niche. However, I do feel that it is probably more elaborate than it needs to be (I don’t really get the need for the Wiki functionality).

Further Reading

Denton Gentry and Sara Chipps have also written about their impressions of Stackoverflow.  Or, on a less positive note, you could try Crapoverflow.

Battling iTunes 8

Posted in Mac,The Internet by Dan on September 11th, 2008

Frustrated by iTunes…

Apple recently dumped iTunes 8 on us, quietly removing a couple of configuration options in the process.  The new iTunes, with its not-so-clever “Genius” function, seems to be a not very subtle attempt to sell more music through the iTunes Music Store.

In iTunes 7 you could hide the “Genre” column in the 3-column browser pane and instead show just the “Artist” and “Album” columns.  In iTunes 8, “Genre” is back and there is no way to get rid of it from the preferences.  Likewise, the annoying arrow links to the iTunes Music Store, which could previously be turned off, are back.

Fortunately, on the Mac at least, the old preferences can still be tweaked, just not from the iTunes GUI.  You have to run the following commands from Terminal (thanks to Mac OS X Hints for the information):

defaults write com.apple.iTunes show-genre-when-browsing -bool FALSE
defaults write com.apple.iTunes show-store-arrow-links -bool FALSE

…impressed by BBC Radio Labs

While on the subject of iTunes, Matthew Wood at the BBC’s Radio Labs has come up with a way to access the BBC’s on-demand radio content, and associated programme information, from iTunes.

Unfortunately, the BBC’s live streaming is still not available in iTunes because it uses RealPlayer.  It would be great to have a single, useful front end for all Internet radio stations rather than having different applications and web players for different broadcasters.

On the plus side, I like the new Radio Pop site that the Radio Labs team have built using Ruby on Rails.  The name’s a bit misleading – it sounds like it’s a new pop music radio station.  It’s actually a social radio site.  A bit like Last.fm but for BBC Radio (and potentially other radio stations in the future).  Perhaps they can hook it up to the real Last.fm so that your radio listening is combined with your CD/MP3 listening?  In fact, they already have something along those lines.  They also already automatically scrobble all tracks broadcast (see the BBC 6Music page on Last.fm for an example).

I’ve been following the Radio Labs blog for a while now.  It seems like a great place to work.  Lots of interesting experimental projects and cutting-edge technology.

Software Project Names – We have a winner

Posted in PHP,Software Development by Dan on September 5th, 2008

After an exhaustive search, my quest to find the best software project name has struck gold.  Following in the footsteps of Ruby on Rails, Groovy on Grails and a whole host of similarly monikered web frameworks, I give you… PHP on Crutches.  It’s so much more than a name though.  It also has a quality logo and comes with the following helpful advice:

PHP is hazardous to your health. Use something else if you can.

Avoid NIO, Get Better Throughput

Posted in Java by Dan on September 3rd, 2008

The Java NIO (new/non-blocking I/O) API introduced in Java 1.4 is arguably the most arcane part of the standard library.  With channels, selectors, byte buffers and all the associated flipping, marking, compacting, event-handling and registering/de-registering of read/write interest, it’s an entirely different level of complexity to the old-fashioned, straightforward blocking I/O.  And if you want to use SSL with NIO then it’s a whole new world of pain.

Few have mastered NIO.  For most it provides an opportunity to really get to know your debugger.  “Should this buffer be flipped before I pass it to this method, or should the method flip it?”.  BufferOverflowExceptions and memory leaks abound.
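For anyone who hasn’t had the pleasure, here is a minimal sketch of what the flipping question is about.  It uses only the standard java.nio.ByteBuffer API; the class name, method names and the 16-byte capacity are arbitrary choices for the example.  After writing into a buffer, flip() sets the limit to the current position and resets the position to zero so that the written bytes can be read back.  Forget it and you read the unwritten tail instead.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: shows why a buffer must be flipped between writing and reading.
class FlipSketch {
    static final int CAPACITY = 16; // arbitrary size chosen for the example

    /** Returns a buffer left in "write mode" with the text already put into it. */
    static ByteBuffer filled(String text) {
        ByteBuffer buffer = ByteBuffer.allocate(CAPACITY);
        // put() throws BufferOverflowException if the text needs more than CAPACITY bytes
        buffer.put(text.getBytes(StandardCharsets.UTF_8));
        return buffer;
    }

    /** Writes, flips, then reads back exactly what was written. */
    static String roundTrip(String text) {
        ByteBuffer buffer = filled(text);
        buffer.flip(); // limit = position, position = 0: ready for reading
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    /** Without flip(), "remaining" is the unwritten space, not the data. */
    static int readableWithoutFlip(String text) {
        return filled(text).remaining();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));           // hello
        System.out.println(readableWithoutFlip("hello")); // 11 (the empty tail)
    }
}
```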

So, in the spirit of doing the simplest thing that could possibly work, writing your own NIO code is usually best avoided unless you have a compelling reason.  Fortunately, some masochistic individuals have done a lot of the hard work so that we don’t have to.  Projects such as Grizzly and QuickServer provide proven, reusable non-blocking server components.

However, in most instances, maybe non-blocking I/O is not necessary at all? In fact, maybe it is detrimental to performance?

That’s the point that Paul Tyma makes.  He attacks some of the received wisdom about the relative merits of blocking and non-blocking servers in Java.  The characteristics of JVMs and threading libraries change as new advances are made.  Good advice often becomes bad advice over time, demonstrating the importance of making your own measurements rather than falling back on superstitions.

Paul’s experiments show that higher throughput is achieved with blocking I/O, that thread-per-socket designs actually scale well, and that the costs of context-switching and synchronisation aren’t always significant.  Paul’s slides from his talk “Thousands of Threads and Blocking I/O: The Old Way to Write Java Servers Is New Again (and Way Better)” are well worth a look.
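To show how little code the thread-per-socket style needs, here is a rough sketch of a blocking echo server.  This is my own illustration, not code from Paul’s talk: the class name and the echoOnce helper are invented for the example, and a real server would want a thread pool, timeouts and proper logging.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Illustrative thread-per-connection server using plain blocking I/O.
class BlockingEchoServer {
    /** Accept loop: blocks in accept(), then gives each connection its own thread. */
    static void serve(ServerSocket server) {
        while (true) {
            try {
                Socket client = server.accept();
                Thread worker = new Thread(() -> handle(client));
                worker.setDaemon(true);
                worker.start();
            } catch (IOException e) {
                return; // accept() fails once the server socket is closed
            }
        }
    }

    /** Echoes each line back until the client disconnects. */
    static void handle(Socket client) {
        try (Socket s = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) { // blocks until a line arrives
                out.println(line);
            }
        } catch (IOException e) {
            // connection dropped; nothing more to do for this client
        }
    }

    /** Hypothetical helper for trying it out: sends one line and returns the reply. */
    static String echoOnce(String message) {
        try (ServerSocket server = new ServerSocket(0)) { // port 0: let the OS pick a free port
            Thread acceptor = new Thread(() -> serve(server));
            acceptor.setDaemon(true);
            acceptor.start();
            try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true)) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("ping")); // ping
    }
}
```

There are no selectors, no interest sets and no buffer bookkeeping; each handler reads top to bottom like ordinary sequential code.  Whether it also performs better on your JVM and workload is something to measure rather than assume.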

If you are writing your own multi-threaded servers in Java, Esmond Pitt’s Fundamental Java Networking and Java Concurrency in Practice by Brian Goetz et al. are essential reading.

Real World Haskell

Posted in Books,Haskell by Dan on September 2nd, 2008

The book Real World Haskell by Bryan O’Sullivan, Don Stewart, and John Goerzen, will be available to buy from November.

The content is also freely available online already and is well worth a look if, like me, you are keen to learn more about developing actual useful programs with Haskell.

I first mentioned Real World Haskell last year.  At the time I also highlighted GHC’s LGPL problems as an obstacle that could potentially discourage the wider adoption of Haskell.  It seems that some progress is being made on that front.  At present though, GHC will still statically link to GMP, which means that developers who distribute GHC-compiled binaries are distributing “derivative works” as defined by the LGPL.

Naming Software Projects

Posted in Software Development by Dan on September 1st, 2008

“What’s in a name?” asked William Shakespeare, but then he wasn’t a software developer.  If he had been, his plays might have had mutually-recursive titles or Henry V might have been called YAPAK.

“How do you name your software projects?” was a question posed on Reddit recently.  Coming up with good names is not easy.  I’m reminded of the quote that letting software developers name products is like letting the marketing people code them.  But in the absence of focus groups, or even a marketing professional, we try our best anyway.


© 2004-2007 Jeffrey Rowland/overcompensating.com

Common Strategies

Many developers will default to the tried-and-tested naming algorithm popular with the Linux crowd:

  1. Take a single word that loosely indicates what the software does, such as “mail”, “reader”, “writer”, “player”, etc.
  2. Then choose a single-letter prefix according to the following rules:
    • Is it a GNOME application or other GNU-related program?  Then it should begin with the letter ‘g’.
    • Is it a KDE application?  Then it should begin with the letter ‘k’.
    • Is it an X-Windows application?  Then it should begin with the letter ‘x’.
    • Is it written in Java?  Then it should begin with the letter ‘J’.
    • Is it written in Ruby?  Then it should begin with the letter ‘r’.
    • Do you want to attract the attention of Apple’s legal team?  Then it should begin with the letter ‘i’.
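For those who would rather automate the whole process, the rules above can be sketched in a few lines of Java.  The enum and method names are entirely my own invention, and the prefixes are exactly the ones listed; no marketing department was consulted.

```java
// Tongue-in-cheek sketch of the naming algorithm above. All names are illustrative.
class ProjectNamer {
    enum Kind {
        GNOME_OR_GNU("g"),
        KDE("k"),
        X_WINDOWS("x"),
        JAVA("J"),
        RUBY("r"),
        APPLE_BAIT("i");

        final String prefix;

        Kind(String prefix) {
            this.prefix = prefix;
        }
    }

    /** Step 1 supplies the word; step 2 picks the single-letter prefix. */
    static String name(Kind kind, String word) {
        return kind.prefix + word;
    }

    public static void main(String[] args) {
        System.out.println(name(Kind.KDE, "mail"));         // kmail
        System.out.println(name(Kind.X_WINDOWS, "player")); // xplayer
        System.out.println(name(Kind.JAVA, "Reader"));      // JReader
    }
}
```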

Another naming strategy, for the creatively-challenged, is the Ronseal approach.  This involves choosing an entirely accurate but wholly unimaginative name (often abbreviated to an acronym).  I’ve been guilty of using this technique in the past.  Occasionally, you can derive a mildly amusing acronym.

If you have the Web 2.0 cool you can take a word and miss a letter out, brilliantly disguising the fact that the domain name that you really wanted wasn’t available.

My Own Efforts

I didn’t have to think too long over the name for ReportNG.  Unexciting as it is, it was a fairly obvious choice due to its relation to TestNG.  It also has a built-in barometer for success.  When Google stops asking “Did you mean reporting?” I’ve made it.

On the other hand, I spent ages trying to come up with a name for the Watchmaker Framework, and a couple of years later I’m not entirely satisfied with it, though I haven’t come up with anything better in the meantime.  All of the obvious evolution-related words were already being used for other evolutionary computation projects.  Eventually I settled on “Watchmaker”, an allusion to the Watchmaker Analogy and, by extension, to Richard Dawkins’ The Blind Watchmaker, but it’s probably a confusing project name to those who aren’t familiar with the analogy.

Thinking of a good name can be difficult.  You have to wait for inspiration to strike and you have to try to avoid Firefox-esque cock-ups.  I find it very difficult to start on a new project idea until I have thought of a name for it.  And yet sometimes you have a great idea for a project name when you’re not looking for one and then you need to find a project to apply it to.

(Dis)Honourable Mentions

Which software projects have the best and worst names?  I think Hibernate is a good name (although I often refer to that software in my own more colourful terms).  Lisp and Smalltalk aren’t exactly positive words.  Names like Python, Java and Ruby are much cooler, regardless of the relative merits of the languages.

What other projects have particularly good or bad names?