Re: [PATCH] pg_stat_toast

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Gunnar Nick Bluth <gunnar(dot)bluth(at)pro-open(dot)de>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Subject: Re: [PATCH] pg_stat_toast
Date: 2022-04-06 16:24:20
Message-ID: CA+TgmoY0+rVu04Cxs1dF6HuaKzYPjuwO+Hxgkh6vq5XKDd2t4A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 6, 2022 at 12:01 PM Gunnar "Nick" Bluth
<gunnar(dot)bluth(at)pro-open(dot)de> wrote:
> Fair enough. At that point, a lot of things become unexpectedly painful.
> How many % of the installed base may that be though?

I don't have statistics on that, but it's large enough that the
expense associated with the statistics collector is a reasonably
well-known pain point, and for some users, a really severe one.

Also, if we went out and spun up a billion new PostgreSQL instances
that were completely idle and had no data in them, that would decrease
the percentage of the installed base with high table counts, but it
wouldn't be an argument for or against this patch. The people who are
using PostgreSQL heavily are both more likely to have a lot of tables
and also more likely to be interested in more obscure statistics. The
question isn't - how likely is a random PostgreSQL installation to
have a lot of tables? - but rather - how likely is a PostgreSQL
installation that cares about this feature to have a lot of tables? I
don't know either of those percentages but surely the second must be
significantly higher than the first.

> I'm far from done reading the patch and mail thread Andres mentioned,
> but I think the general idea is to move the stats to shared memory, so
> that reading (and thus, persisting) pg_stats is required far less often,
> right?

Right. I give Andres a lot of props for dealing with this mess,
actually. Infrastructure work like this is a ton of work and hard to
get right and you can always ask yourself whether the gains are really
worth it, but your patch is not anywhere close to the first one where
the response has been "but that would be too expensive!". So we have
to consider not only the direct benefit of that work in relieving the
pain of people with large database clusters, but also the indirect
benefits of maybe unblocking some other improvements that would be
beneficial. I'm fairly sure it's not going to make things so cheap
that we can afford to add all the statistics anybody wants, but the
current situation is so painful that even modest relief would be more
than welcome.

> > However, experience has taught me that a lot of skepticism is
> > warranted when it comes to claims about how cheap extensions to the
> > statistics system will be.
>
> Again, fair enough!
> Maybe we first need statistics about statistics collection and handling? ;-)

Heh.

--
Robert Haas
EDB: http://www.enterprisedb.com
