Re: PG15 beta1 sort performance regression due to Generation context change

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PG15 beta1 sort performance regression due to Generation context change
Date: 2022-05-31 16:03:44
Message-ID: CA+TgmoZoYxFBN+AEJGfjJCCbeW8MMkHfFVcC61kP2OncmeYDWA@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 31, 2022 at 11:09 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Yeah, we don't have any hard data here. It could be that it's a win to
> switch to a rule that chunks must present an offset (instead of a pointer)
> back to a block header, which'd then be required to contain a link to the
> actual context, meaning that every context has to do something like what
> I proposed for generation.c. But nobody's coded that up let alone done
> any testing on it, and I feel like it's too late in the v15 cycle for
> changes as invasive as that. Quite aside from that change in itself,
> you wouldn't get any actual space savings in aset.c contexts unless
> you did something with the chunk-size field too, and that seems a lot
> messier.
>
> Right now my vote would be to leave things as they stand for v15 ---
> the performance loss that started this thread occurs in a narrow
> enough set of circumstances that I don't feel too much angst about
> it being the price of winning in most other circumstances. We can
> investigate these options at leisure for v16 or later.
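[Editorial sketch, not part of the original mail: the quoted proposal — chunks carrying an offset back to a block header, with the block header holding the link to the owning context — can be illustrated roughly as below. All names (ChunkWithOffset, BlockHeader, context_of) are hypothetical; PostgreSQL's actual structures differ.]

```c
#include <stddef.h>
#include <stdint.h>

typedef struct MemoryContextData MemoryContextData;   /* opaque here */

/* Status quo style: every chunk carries a full pointer back to its
 * owning context, so pfree(ptr) can dispatch to the right free method.
 * That costs pointer-sized overhead on every single chunk. */
typedef struct
{
    MemoryContextData *context;
} ChunkWithPointer;

/* Proposed style: the chunk stores only a small offset back to the
 * block header; the block header, shared by many chunks, holds the
 * one link to the context. */
typedef struct
{
    uint32_t block_offset;      /* bytes back to this chunk's block header */
} ChunkWithOffset;

typedef struct
{
    MemoryContextData *context; /* one context link per block, not per chunk */
} BlockHeader;

/* Recovering the owning context from a bare chunk pointer under the
 * offset scheme: walk back to the block header, then follow its link. */
static inline MemoryContextData *
context_of(const ChunkWithOffset *chunk)
{
    const BlockHeader *block =
        (const BlockHeader *) ((const char *) chunk - chunk->block_offset);

    return block->context;
}
```

As the quoted text notes, this only saves space in aset.c contexts if the chunk-size field is also shrunk, and every context type would have to lay out blocks so that the offset walk is valid.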

I don't think it's all that narrow a case, but I think it's narrow
enough that I agree that we can live with it if we don't feel like the
risk-reward trade-offs are looking favorable. I don't think tuples
whose size, after rounding to a multiple of 8, is exactly a power of 2
are going to be terribly uncommon. It is not very likely that many
people will have tuples between 25 and 31 bytes, but tables with lots
of 57-64 byte tuples are probably pretty common, and tables with lots
of 121-128 and 249-256 byte tuples are somewhat plausible as well. I
think we will win more than we lose, but I think we will lose often
enough that I wouldn't be very surprised if we get >0 additional
complaints about this over the next 5 years. While it may seem risky
to do something about that now, it will certainly seem a lot riskier
once the release is out, and there is some risk in doing nothing,
because we don't know how many people are going to run into the bad
cases or how severely they might be impacted.
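[Editorial sketch, not part of the original mail: the losing case described above — tuples whose size rounds to exactly a power of 2 — comes from aset.c rounding the whole request, payload plus chunk header, up to the next power of 2. The 8-byte header below is an assumed figure for illustration, not PostgreSQL's actual overhead.]

```c
#include <stddef.h>

#define CHUNK_HDR 8     /* assumed per-chunk overhead, for illustration */

/* palloc payloads are rounded up to a multiple of 8 */
static size_t
round8(size_t n)
{
    return (n + 7) & ~(size_t) 7;
}

/* An aset.c-style allocator rounds the full request (payload + header)
 * up to the next power of 2 to pick a freelist size class. */
static size_t
power2_chunk(size_t payload)
{
    size_t request = round8(payload) + CHUNK_HDR;
    size_t size = 8;

    while (size < request)
        size <<= 1;
    return size;
}
```

With these assumptions, a 55-byte tuple rounds to 56, fits with its header in a 64-byte chunk, and wastes little; a 57-to-64-byte tuple rounds to 64, needs 72 bytes with the header, and gets bumped into a 128-byte chunk, so nearly half the chunk is wasted. The same jump happens at 121-128 and 249-256 bytes.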

I think the biggest risk in your patch as presented is that slowing
down pfree() might suck in some set of workloads that we can't
easily identify now. I don't really know how to rule out that
possibility. In general terms, I think we're not doing ourselves any
favors by relying partly on memory context cleanup and partly on
pfree. The result is that pfree() has to work for every memory context
type and can't be unreasonably slow, which rules out a lot of really
useful ideas, both in terms of how to make allocation faster, and also
in terms of how to make it denser. If you're ever of a mind to put
some effort into really driving things forward in this area, I think
figuring out some way to break that hard dependence would be effort
really well-spent.

--
Robert Haas
EDB: http://www.enterprisedb.com
