Re: PostgreSQL infographics

From: "Jonathan S(dot) Katz" <jonathan(dot)katz(at)excoventures(dot)com>
To: damien clochard <damien(at)dalibo(dot)info>
Cc: pgsql-advocacy(at)postgresql(dot)org
Subject: Re: PostgreSQL infographics
Date: 2012-07-20 15:12:21
Message-ID: 5F1AC852-573C-4A50-BB00-40EBC1CB220D@excoventures.com
Lists: pgsql-advocacy

On Jul 20, 2012, at 4:55 AM, damien clochard wrote:

>>> I'm interested in helping build such infographics for the PostgreSQL
>>> project but I believe this should be a collective work, with a long term
>>> perspective. The numbers are not that hard to gather (page hits on the
>>> website ? , mail traffic on pgsql-hackers ? git stats ?) and we can find
>>> even more stats from services like ohloh
>>> (http://www.ohloh.net/p/postgres)... But I think the hard part will be
>>> keeping the numbers up-to-date and releasing a new version on a regular
>>> basis (maybe once every semester ?)
>>>
>>> Is anyone else interested in creating this infographic ?
>>
>> I am interested, but I would want to augment the dataset that we are showing. While I think it is interesting to see the history of PostgreSQL development shown visually, I think it would also be beneficial if we could show statistics that indicate adoption and usage as well. Number of downloads over time would be a good place to start, but determining what percentage of the RDBMS market or the total database market we hold would be great for indicating how much more PostgreSQL is being adopted. Even an estimate of how much data is being stored in Postgres globally would be insightful.
>>
>> I know it is difficult to collect that data but I think it could significantly help advocacy efforts by easily displaying adoption statistics.
>>
>
> Imho, we can build several infographics. Right now I'm talking about
> showing the rapid progress of the community (lines of code,
> contributors, ML traffic, etc.) because it's easier to start with that
> and I think it's important to make a quick and easy first step in order
> to launch a project like that.

Sure - `git shortlog -s` :-) That would only show committers, not contributors, but `git log` would provide a lot of the information we need: we can extract it from the commits and group it by time.
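As a rough sketch of the "group them by time" idea, the Python below counts commits per month from `git log` output. The function names and the choice of monthly buckets are assumptions for illustration only; the sketch relies on `git log --pretty=format:%ad --date=short`, which prints one `YYYY-MM-DD` date per commit.

```python
import subprocess
from collections import Counter


def commits_per_month(log_output):
    """Count commits per YYYY-MM, given one YYYY-MM-DD date per line."""
    counts = Counter()
    for line in log_output.splitlines():
        line = line.strip()
        if line:
            # The first 7 characters of YYYY-MM-DD are the year and month.
            counts[line[:7]] += 1
    # Return the months in chronological order.
    return dict(sorted(counts.items()))


def read_commit_dates(repo_path="."):
    """Run git log in repo_path (hypothetical helper) and return its output."""
    result = subprocess.run(
        ["git", "log", "--pretty=format:%ad", "--date=short"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `commits_per_month(read_commit_dates("/path/to/postgres"))` on a local clone (the path is a placeholder) would give a month-to-count mapping that could feed a chart. Keeping the parsing separate from the `git` call makes it easy to check the counting logic on a few sample lines first.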

> The numbers you're talking about are important. Everyone is looking for
> some market share stats.... I think it's a very difficult and probably
> endless quest :-) And even if that's possible, I tend to believe that
> the "market share" approach is a trap for PostgreSQL. I mean what are we
> talking about ? How do you express that ? In terms of "Sales Revenue" ?
> The revenues of PostgreSQL itself will always be zero... You can
> aggregate the revenue of each PostgreSQL related company in the world
> but even if it was realistic, we'd still be far below Oracle.

Well that's specifically why I said we should display stats that show adoption and usage, like number of downloads.

Also cloud-based and dedicated Postgres hosts would have some numbers to show usage trends - perhaps by reaching out to them they would be willing to share some statistics that we could use to benefit the community.

It's actually okay if our numbers lag behind Oracle, as long as the key metrics we pick show that Postgres usage is growing. There is definitely a herd mentality in open-source technology adoption, for better or worse, and if we can effectively demonstrate that more people continue to adopt Postgres, I think more technologists would be willing to try it out.

> And if we're talking about "Unit market share", in terms of number of
> servers installed then MySQL will be ahead for a long long time. So
> that's not a key indicator of market competitiveness either.

Which is why I also suggested we look at how much data is stored across Postgres databases, which I acknowledge would be an NP-hard task to even get a reasonable estimate for :-) But it would be nice to claim "Postgres installs around the world hold XYZ amount of data." We don't even need it for comparison purposes - just to demonstrate that "Yes, a lot of the world's data goes into a Postgres installation." However, I do also understand that even having such a number might not be the most useful metric for the decision makers.

> More and more, I try to avoid putting the words "percentage" and
> "market" in the same sentence :-) Simply because in order to fully
> understand the meaning of that kind of number, you need to understand
> first the difference between the economic models of PostgreSQL and other
> RDBMS. This might change in the near future but for now most managers I
> meet don't have that understanding and therefore giving market share
> percentage to them is like shooting myself in the foot :-)

And that is a fair point :-) Based on that, I think it would be good to try to get some of the metrics suggested above over a time period to show growth in Postgres usage, and worry about market comparisons later.

Jonathan
