Blog atrasado con gajes del oficio

Linux Market Share Pitfalls: data manipulation and non-existent ranking markers

A while ago, someone I came across on a technical forum vouched so strongly for one of the most popular Linux distributions that he/she gave the impression of fully endorsing the distro in question, without further proof.

I say "without further proof" because every distribution consists of many pieces of software that together make up the whole system. So it would be gullible to think that, because one particular distro employs Free Software, everything else its developers go ahead and ship is, as a result, licensed within the framework of what, for example, the Free Software Foundation accepts.

This is not the case. Most Linux distros, and even more so the most popular ones, are not very clear about the terms of the programs that their developers include and then ship for mass use. Only a handful of distributions out there are.

The problem this poses for the adoption of Free Software licenses is that many of these distributions do not uphold the licenses as they should: their developers have included nonfree elements such as firmware blobs, have written ambiguous documentation, or have provided download venues that in turn officially endorse nonfree programs. More information about this can be found in the article Explaining Why We Don’t Endorse Other Systems 1 posted on the Free Software Foundation website.

So the poor statements this user was making about one of the most popular Linux distributions at the time could not hold up against the licensing principles that the Free Software Foundation stands for.

For further information, one can visit the Free Software Foundation website to get a glimpse of what this is about.

But to make matters even worse, the user who was trying to make this silly point brought up one of the most inaccurate and misleading sources of Linux market share rankings out there: Distrowatch, a website that lists the most popular GNU/Linux distributions.

Which takes me to the second rant: why the above cannot be taken as a statistical absolute. Distrowatch is just a community-driven, heavily edited, bureaucratically biased review hub that sits on the web as layers of HTML markup under a registered domain. Let me emphasize once again its biased, one-sided view of certain distros' Linux market share. This matters because it is, and will always be, imprecise to measure the use of GNU/Linux distributions across the spectrum, whether community or commercially based, and to compare them with many of the best-known operating systems without reliable methods, going only by simple likes and preferences, which in the end are nothing more than a preconceived notion of which distro should and should not be at the top.

Because of this, the Distrowatch ranking is vague, to say the least.

According to the author of Stack Frames: A Look From Inside:

Each website has its own list of the most popular distributions, updated on the basis of different criteria, hence with different results. To give one example, Linux Mint was the most popular distribution in 2013 and in the first half of 2014 according to Distrowatch, but is ranked twelfth on the LinuxCounter and LWN. 2

And this is important because the numbers, when the distributions are compared across LinuxCounter 3 and LWN 4, differ greatly from the ranking on Distrowatch. The differences would be even more pronounced if other websites tried to gather these numbers, as long as their lists were not heavily edited or maintained by users who manipulate the data, as in the case of Distrowatch.
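One way to put a number on how much two such lists disagree is Spearman's rank correlation, which is 1 when two orderings match exactly and -1 when one is the reverse of the other. The sketch below is only an illustration under stated assumptions: the distro names and both orderings are made-up placeholders, not data taken from Distrowatch, LinuxCounter or LWN, and the function names are hypothetical.

```python
def rank_positions(ranking):
    """Map each name to its 1-based position in an ordered list."""
    return {name: pos for pos, name in enumerate(ranking, start=1)}


def rank_shifts(ranking_a, ranking_b):
    """Per-item change in position going from ranking_a to ranking_b.

    A negative value means the item sits closer to the top in ranking_b.
    """
    pos_a = rank_positions(ranking_a)
    pos_b = rank_positions(ranking_b)
    return {name: pos_b[name] - pos_a[name] for name in ranking_a}


def spearman_rho(ranking_a, ranking_b):
    """Spearman rank correlation of two orderings of the same items.

    Uses the simple formula that assumes there are no ties:
    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    """
    if len(set(ranking_a)) != len(ranking_a) or set(ranking_a) != set(ranking_b):
        raise ValueError("both rankings must list the same items exactly once")
    n = len(ranking_a)
    if n < 2:
        raise ValueError("need at least two items to compare rankings")
    shifts = rank_shifts(ranking_a, ranking_b)
    sum_sq = sum(d * d for d in shifts.values())
    return 1 - 6 * sum_sq / (n * (n * n - 1))


# Placeholder data: invented names and orderings, for illustration only.
site_one = ["distro_a", "distro_b", "distro_c", "distro_d", "distro_e"]
site_two = ["distro_e", "distro_c", "distro_a", "distro_b", "distro_d"]

if __name__ == "__main__":
    print(rank_shifts(site_one, site_two))
    print(spearman_rho(site_one, site_two))
```

A value near zero or below would say the two sites barely agree on the order at all. Note that this only compares the orderings; it says nothing about which site, if any, reflects actual usage.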

In the case of the LWN site, the author of Stack Frames: A Look From Inside also erred and may have made some subtle mistakes with the rankings of the distros. This shows in some of the inconsistencies in the top market share of the distros in use. That is, the chronological order of the distributions, as reflected in the book and as the author pointed out, is certainly troublesome. The list on LWN is arranged alphabetically and does not necessarily offer numbers that could be used as a better measurement or indicator in this case. It does show, however, the mainstream distributions, or the ones that have been around the longest in the computing world, both commercial and community based.

While reading about this pervasive problem with the market share of GNU/Linux programs and of the distributions that include them, along with the biases at play on community sites such as Distrowatch, I came across a blog on the Medium platform whose author said:

I notice when they have an axe to grind against a particular distribution bad reviews are allowed to flourish but if I or others submit positive reviews they tend to get ignored and there are more than a handful of distributions where I have noticed that trend

But what is unfortunate is that such misleading bits of information circulate nowadays, especially within the culture of the GNU/Linux operating system, which includes the kernel, without which the vast body of GNU software accomplishes nothing. Most importantly, distributions are listed as so-called ‘popular’ when most of the ‘popular’ ones do not even adhere to the Free Software Foundation guidelines for Free Software.

The author of that blog stated that Distrowatch is biased towards free-as-in-beer distributions, but I disagree, because I think the bias is there regardless of whether the distribution in question is commercially oriented or whether it implements free software. Community-driven distros do not necessarily implement the licenses any better than commercially based distros, but that makes no difference in the case of this website. The users responsible for updating the counter list, or in this case for deciding which distro should appear at the top, are undoubtedly misrepresenting real market share data without any repercussions. And by the looks of it, what matters is the ad-driven income that the owner of Distrowatch derives from this data manipulation.

I have no idea whether the GNU/Linux user who commented on how that Linux distro fared in comparison with the rest was aware of these facts, but Distrowatch has been around for a while, and I wonder whether it has affected the overall objectives and goals of the Free Software Foundation guidelines. No doubt the reviews need further revision, but they are being handled while the site promotes its own favorites, without regard for the particular distro and whether it is community-driven or not.

The Linux Distribution Timeline, in turn, charts the progress of the Linux distributions, and although it is a great project in its own right, it does not take on the difficult task where all these sites have fallen short: providing a better market share model for the Linux world.

The website for this project is located at futurist.se

Sources

1- Explaining Why We Don’t Endorse Other Systems

2- Stack Frames: A Look From Inside

3- www.linuxcounter.net/statistics/distributions

4- lwn.net/Distributions