About a month ago, Nat Friedman, the CEO of GitHub, welcomed Semmle to GitHub with the words:

Human progress depends on the open source community.

Around that time, I was looking at an old presentation by Sonatype that provides some data on the explosion of open source usage, claiming that:

In a typical application 90% of the code is from 3rd parties.

I set out to find a possible explanation for such strong statements, which I will briefly share in this story.


Repair vs Replace: The Old Hardware Story

In my experience, when you gather 50 people in one room and ask them if they would take their broken 2+ year old phone to a repair shop, you are likely to get fewer than 5 people who raise their hand. An explanation for this observation is that the cost of hardware is very low (even if you keep paying more and more for your iPhone). For example, the producer price index for semiconductors since 1975 looks like this:

Granted, you typically don’t replace an iPhone with another of the same model, but the current price of hardware is so low that it makes more sense to replace it than to repair it. The discrepancy in price could partly be explained by an inflated price for the difference in functionality between the two models.

To understand the battle of repair vs replace, I like to use the formulas:

repair = cost of identification + cost of fix

replace = cost of equivalent replacement

I don’t know if it is just the cost of equivalent replacement that went down because of the lower cost of production. It is fair to suspect that the cost of repair also went up with the increasing complexity of hardware over time. Nevertheless, probably everyone would agree that, from a cost perspective, repair > replace.
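The two formulas above can be sketched as a small comparison. This is only an illustration: the class name `CostEstimate`, its fields, and the phone figures are hypothetical and not taken from any data in this story.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostEstimate:
    """Hypothetical cost figures for one broken item, in any single currency."""
    identification: float  # cost of finding out what is broken
    fix: float             # cost of fixing it once it is found
    replacement: float     # cost of an equivalent replacement

    @property
    def repair(self) -> float:
        # repair = cost of identification + cost of fix
        return self.identification + self.fix

    @property
    def replace(self) -> float:
        # replace = cost of equivalent replacement
        return self.replacement

    def cheaper_option(self) -> str:
        # Ties go to "repair", since nothing is gained by replacing.
        return "replace" if self.replace < self.repair else "repair"


# Made-up numbers for a broken two-year-old phone.
phone = CostEstimate(identification=40.0, fix=120.0, replacement=150.0)
print(phone.repair, phone.replace, phone.cheaper_option())
```

With these made-up numbers, repair costs more than replacement, which is the hardware situation described above. Raising either the identification cost or the fix cost pushes the decision further towards replace.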


Repair vs Replace: The (New?) Software Story

Let’s now focus on software. By just looking at the salaries of software developers, we have reason to believe that the cost of software production is still rising:

In fact, when you compare hardware vs software costs over time you get something like this:

Image taken from here.

We also definitely know that the complexity of software is increasing dramatically. An example that always boggles my mind comes from Google’s article on their monorepo experience:

The number of Google software developers has steadily increased, and the size of the Google codebase has grown exponentially.

With a linear increase in the number of software developers in the world (who are likely to produce larger codebases than Googlers), we can only expect the size of the world’s codebase to grow to fantastic numbers. I believe it is safe to conclude that at least the cost of identification is rising dramatically.

Another data point regarding the cost of repair (more towards the cost of fixing) that I also find very revealing comes from Stripe’s Report on The Developer Coefficient:

So, it seems that in the battle of repair vs replace for software, the cost of repair is also increasing at a fast pace.

At the same time, when you consider general algorithms and technologies that are battle tested, we have plenty of readily available options. In other words, for many (if not all) problems for which a team of developers can produce a software solution from scratch, we have a basis for an alternative equivalent solution under the umbrella of open source software.

It is then possible that the explosion of open source usage is explained by the fact that the repair cost of in-house software is higher than replacing it with a modification of an open source solution.


A Possible Future

Even if we are not there yet, we are probably fast approaching the replace phase, and we should expect a major disruption in the daily lives of developers.

Open source software is only a solution to fight the costs of repair. I find it ironic that the repair of open source software is actually paid for with the after-work time of software developers who are increasing the costs of repair in their companies. Actually, that’s not really true, and the situation is worse: according to this report, of the top 10 projects by number of contributors on GitHub, only npmjs is not under the umbrella of Big Tech. So, from the perspective of a small company, not only are you running on their infrastructure, but your software is also becoming more crucially dependent on software they actively maintain and control.

It is only a matter of time until this cost becomes outrageous and the security implications too serious to use these solutions. Going back to the beginning of this story, it would probably be more accurate to say that, in the very short term, human progress depends on the open source community.

The truth might be that we are just waiting for an economically viable technology that can efficiently store and retrieve code based on some high-level query language. Once a sketch solution for any problem is available and easily accessible in some gigantic database of code, the next step is the emergence of Software as a Service paid by the hour in some marketplace. Then, we can expect automation to take over the process of gluing together these solutions. The end result will be a machine-produced solution given to a human for final touches. A major consequence of such a scenario is that software development would fundamentally shift towards software checking: instead of writing code, we could become expert reviewers of code produced by machines.