
Last week I came across this article about the results of analyzing more than 2 million builds in Travis CI.
As I started reading, the author caught my attention with some interesting data comparing statically typed languages with dynamic languages, so I ended up downloading the full report to read on.
I must say I have always been a Java developer myself, and apart from the irrational attachment I feel for this language, there are some good reasons why I still stick to it. This should be enough to warn you, readers, that my opinions may be biased. So take them with care and use your own judgement.
Shedding Light
The following figure shows the percentage of builds resulting in each state (passed, failed, errored, canceled). I would like to focus on the failed builds. Errored builds are builds broken by infrastructure issues, and canceled builds, I assume, are the result of human intervention, perhaps someone recognizing a build that would end in error. So errored and canceled builds signal bad builds, not bad code.
Failed builds account for 13.4% of all builds in a statically typed language such as Java, whereas in a dynamic language like Ruby the figure is 21.3%. That is a difference of 7.9 percentage points in test failure rate (and a 9.7-point difference in the success rate of Java builds vs Ruby builds).
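To make the comparison concrete, here is a minimal sketch of the arithmetic behind the failure figures. Only the two percentages come from the report. The function names are mine, and the relative ratio is an extra view I am adding, not a number the report states.

```python
# Failed-build shares quoted above, as percentages of all builds.
FAILED_PCT = {"Java": 13.4, "Ruby": 21.3}


def point_gap(lower: float, higher: float) -> float:
    """Absolute gap between two percentages, in percentage points."""
    return round(higher - lower, 1)


def relative_ratio(lower: float, higher: float) -> float:
    """How many times larger the higher share is than the lower one."""
    return higher / lower


gap = point_gap(FAILED_PCT["Java"], FAILED_PCT["Ruby"])
ratio = relative_ratio(FAILED_PCT["Java"], FAILED_PCT["Ruby"])

print(f"Gap in failed builds: {gap} percentage points")
print(f"Ruby fails about {ratio:.2f} times as often as Java")
```

The two views answer different questions: the point gap says how much of all builds is affected, while the ratio says how much more often a Ruby build fails relative to a Java build.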
Another significant difference is the number of tests run by projects in Java and Ruby, with Ruby projects having on average 10 times more tests than Java projects. The report explains this difference:
The lack of a type system in Ruby might force developers to write more tests for what the compiler can check automatically in the case of Java
This in turn may explain why Ruby builds fail more often than Java builds: the more tests a build runs, the more chances it has to fail.
Ruby projects have a four-times higher likelihood for their tests to fail in the CI environment than Java projects
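The intuition that more tests mean more chances of failure can be sketched with a toy model. It rests on assumptions that are mine and not the report's: every test fails independently with the same small probability, and a build fails as soon as one test fails. The probability and the test counts below are hypothetical, chosen only to mirror the tenfold difference in test count.

```python
def p_build_fails(p_test_fails: float, n_tests: int) -> float:
    """Probability that at least one of n independent tests fails."""
    return 1.0 - (1.0 - p_test_fails) ** n_tests


# Hypothetical values: the same per-test failure chance for both projects,
# with the second project running ten times as many tests as the first.
P_TEST_FAILS = 0.001
FEW_TESTS = 100
MANY_TESTS = 1000

few = p_build_fails(P_TEST_FAILS, FEW_TESTS)
many = p_build_fails(P_TEST_FAILS, MANY_TESTS)

print(f"{FEW_TESTS} tests: build fails with probability {few:.3f}")
print(f"{MANY_TESTS} tests: build fails with probability {many:.3f}")
```

Real test suites are not independent coin flips, so this model is not meant to reproduce the numbers in the report. It only shows the direction of the effect.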
More Questions Than Answers
This report raises some questions in the battle of statically vs dynamically typed languages.
One of the main arguments against statically typed languages is boilerplate code. But this report suggests that the time you save writing code, you end up spending on writing tests. If I had an oracle, I'd ask whether the properties of dynamically typed languages really pay off in terms of time-to-market.
Another question is cost. Several studies reflect on how the cost of correcting an error escalates the later it is discovered in the software life cycle. This report from NASA shows that fixing an error discovered in the testing phase is 50 times more expensive than fixing one discovered during requirements gathering. Between the development and test phases, the cost of fixing an error differs by a factor of 40.
Based on these figures, choosing Java over Ruby to develop the same project could drive the cost of fixing errors discovered by tests down by 7.7025%. The next question I'd ask my oracle would be: what is the overall cost difference between development stacks?
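The derivation of the 7.7025% figure is not spelled out here. One reading that reproduces it, offered as an assumption and not as a documented method, is to take the 7.9-point gap in failed builds and multiply it by the share of cost saved when an error is fixed during development instead of during testing, using the factor of 40 between the two phases. The names below are mine.

```python
# Figures quoted in this post.
FAILURE_GAP_POINTS = 7.9      # Ruby vs Java gap in failed builds
TEST_VS_DEV_COST_RATIO = 40   # cost of a fix in test relative to development


def saving_fraction(cost_ratio: float) -> float:
    """Share of the fix cost avoided by catching the error one phase earlier."""
    return 1 - 1 / cost_ratio


# Assumed reading: every failure in the gap would have been caught in development.
estimate = FAILURE_GAP_POINTS * saving_fraction(TEST_VS_DEV_COST_RATIO)

print(f"Saving per fix: {saving_fraction(TEST_VS_DEV_COST_RATIO):.1%}")
print(f"Estimated cost reduction: {estimate:.4f}%")
```

Read this way, the figure is an upper bound: it assumes a compiler would have caught every failure in that gap, which is a strong assumption.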
Conclusion
It's time to stop debating languages based only on likes and dislikes. We now have data in open repositories and cloud development tools from which to draw conclusions about technology stacks, conclusions that could drive real improvements in software engineering practices.
We need to understand which questions to ask, and maybe provide additional tools to complete the picture and get the right answers.
Until then, I guess we will have to keep the conversation based on opinions, or avoid it altogether because, at least in my experience, it has proven unconstructive on those terms.
I originally posted this article on LinkedIn: The Hidden Cost of Speed: Statically vs Dynamically typed languages