Re: [PATCH] pg_stat_toast

From: "Gunnar \"Nick\" Bluth" <gunnar(dot)bluth(at)pro-open(dot)de>
To: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Subject: Re: [PATCH] pg_stat_toast
Date: 2022-04-07 07:20:39
Message-ID: 83538137-f40b-5ead-7966-9b91a4af1fac@pro-open.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Am 06.04.22 um 18:55 schrieb Andres Freund:
> Hi,
>
> On 2022-04-06 12:24:20 -0400, Robert Haas wrote:
>> On Wed, Apr 6, 2022 at 12:01 PM Gunnar "Nick" Bluth
>> <gunnar(dot)bluth(at)pro-open(dot)de> wrote:
>>> Fair enough. At that point, a lot of things become unexpectedly painful.
>>> How many % of the installed base may that be though?
>>
>> I don't have statistics on that, but it's large enough that the
>> expense associated with the statistics collector is a reasonably
>> well-known pain point, and for some users, a really severe one.
>
> Yea. I've seen well over 100MB/s of write IO solely due to stats files writes
> on production systems, years ago.

Wow. Yeah, I tend to forget there's systems like ads' out there ;-)

>> I'm fairly sure it's not going to make things so cheap that we can afford to
>> add all the statistics anybody wants, but it's so painful that even modest
>> relief would be more than welcome.
>
> It definitely doesn't make stats free. But I'm hopefull that avoiding the
> regular writing out / readin back in, and the ability to only optionally store
> some stats (by varying allocation size or just having different kinds of
> stats), will reduce the cost sufficiently that we can start keeping more
> stats.

Knock on wood!

> Which is not to say that these stats are the right ones (nor that they're the
> wrong ones).

;-)

> I think if I were to tackle providing more information about toasting, I'd
> start not by adding a new stats view, but by adding a function to pgstattuple
> that scans the relation and collects stats for each toasted column. An SRF
> returning one row for each toastable column. With information like
>
> - column name
> - #inline datums
> - #compressed inline datums
> - sum(uncompressed inline datum size)
> - sum(compressed inline datum size)
> - #external datums
> - #compressed external datums
> - sum(uncompressed external datum size)
> - sum(compressed external datum size)
>
> IIRC this shouldn't require visiting the toast table itself.

But it would still require a seqscan and quite some cycles. However,
sure, something like that is an option.

> Perhaps also an SRF that returns information about each compression method
> separately (i.e. collect above information, but split by compression method)?
> Perhaps even with the ability to measure how big the gains of recompressing
> into another method would be?

Even more of the above, but yeah, sound nifty.

>>>> However, experience has taught me that a lot of skepticism is
>>>> warranted when it comes to claims about how cheap extensions to the
>>>> statistics system will be.
>>>
>>> Again, fair enough!
>>> Maybe we first need statistics about statistics collection and handling? ;-)
>>
>> Heh.
>
> I've wondered about adding pg_stat_stats the other day, actually :)
> https://postgr.es/m/20220404193435.hf3vybaajlpfmbmt%40alap3.anarazel.de

OMG LOL!

--
Gunnar "Nick" Bluth

Eimermacherweg 106
D-48159 Münster

Mobil +49 172 8853339
Email: gunnar(dot)bluth(at)pro-open(dot)de
__________________________________________________________________________
"Ceterum censeo SystemD esse delendam" - Cato

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-04-07 07:24:54 Re: Showing I/O timings spent reading/writing temp buffers in EXPLAIN
Previous Message Peter Eisentraut 2022-04-07 07:19:14 Remove error message hints mentioning configure options