Re: POC: GROUP BY optimization

From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>, Белялов Дамир Наилевич <d(dot)belyalov(at)postgrespro(dot)ru>
Subject: Re: POC: GROUP BY optimization
Date: 2023-09-26 05:51:16
Lists: pgsql-hackers

On 20/7/2023 18:46, Tomas Vondra wrote:
> 2) estimating quicksort comparisons - This relies on ndistinct
> estimates, and I'm not sure how much more reliable we can make those.
> Probably not much :-( Not sure what to do about this, the only thing I
> can think of is to track "reliability" of the estimates and only do the
> reordering if we have high confidence in the estimates. That means we'll
> miss some optimization opportunities, but it should limit the risk.
Regarding this issue, I see two options:
1. Go through the grouping column list and find the most reliable one.
If we have a unique column, or a column whose ndistinct statistics show
significantly more distinct values than the other grouping columns, we
can place this column first in the grouping. That should guarantee the
reliability of such a decision, shouldn't it?
2. If we have extended statistics on distinct values and these
statistics cover some leading set of columns in the grouping list, we
can optimize these positions. That also looks reliable.
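
To make the two options concrete, here is a rough sketch in Python rather
than planner C code. Everything in it is hypothetical: the GroupCol type,
the function names, and the dominance factor of 10 are illustrative
assumptions, not existing PostgreSQL structures or settings. It also
assumes, conservatively, that a column without statistics blocks the
reordering, and that an extended statistics object is usable for a prefix
of two or more columns when its column set contains that prefix.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class GroupCol:
    """Hypothetical stand-in for one GROUP BY column and its statistics."""
    name: str
    ndistinct: Optional[float]  # None means no ndistinct statistics
    is_unique: bool = False


def pick_reliable_leader(cols: List[GroupCol],
                         dominance: float = 10.0) -> Optional[int]:
    """Option 1: return the index of the column to place first, or None.

    The dominance factor is an assumed threshold for "significantly more".
    """
    # A unique column is the most reliable choice.
    for i, col in enumerate(cols):
        if col.is_unique:
            return i
    # Assumption: be conservative, any column without stats blocks reordering.
    if not cols or any(col.ndistinct is None for col in cols):
        return None
    best = max(range(len(cols)), key=lambda i: cols[i].ndistinct)
    others = [c.ndistinct for i, c in enumerate(cols) if i != best]
    if all(cols[best].ndistinct >= dominance * nd for nd in others):
        return best
    return None


def reorder(cols: List[GroupCol], dominance: float = 10.0) -> List[GroupCol]:
    """Move the reliable leader first; keep the remaining order unchanged."""
    leader = pick_reliable_leader(cols, dominance)
    if leader is None:
        return list(cols)
    return [cols[leader]] + [c for i, c in enumerate(cols) if i != leader]


def covered_prefix_len(group_cols: List[str],
                       ext_stats: List[Set[str]]) -> int:
    """Option 2: length of the longest leading run of grouping columns
    (at least two) contained in a single extended statistics column set."""
    for k in range(len(group_cols), 1, -1):
        prefix = set(group_cols[:k])
        if any(prefix <= stat_cols for stat_cols in ext_stats):
            return k
    return 0
```

For example, with ndistinct estimates of 10, 5000 and 20 for columns a, b
and c, reorder() moves b to the front, while estimates of 100 and 300
leave the original order alone because neither column dominates.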

Any thoughts?

Andrey Lepikhov
Postgres Professional
