Re: POC: GROUP BY optimization

From: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>, Белялов Дамир Наилевич <d(dot)belyalov(at)postgrespro(dot)ru>
Subject: Re: POC: GROUP BY optimization
Date: 2023-12-29 03:23:59
Message-ID: 3a34029e-a485-4861-ae90-54dbe730debc@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 28/12/2023 18:29, Alexander Korotkov wrote:
> On Thu, Dec 28, 2023 at 10:22 AM Andrei Lepikhov
> <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
>> But arrangement with an ORDER BY clause doesn't work:
>>
>> DROP INDEX abc;
>> explain SELECT x,w,z FROM t GROUP BY (w,x,z) ORDER BY (x,z,w);
>>
>> I think the reason is that the sort_pathkeys and group_pathkeys are
>> physically different structures, and we can't just compare pointers here.
>
> I haven't yet looked into the code. But this looks strange to me.
> Somehow, optimizer currently matches index pathkeys to ORDER BY
> pathkeys. If GROUP BY pathkeys could be matched to index pathkeys,
> then it should be possible to match them to ORDER BY pathkeys too.
Oh, I found the mistake: I got used to using GROUP BY and ORDER BY on
many columns with round brackets. In the case of the grouping list, it
doesn't change anything. But ordering treats it as a WholeRowVar and
breaks group-by arrangement. Look:
explain (COSTS OFF) SELECT relname,reltuples FROM pg_class
GROUP BY relname,reltuples ORDER BY reltuples,relname;

Group
Group Key: reltuples, relname
-> Sort
Sort Key: reltuples, relname
-> Seq Scan on pg_class
But:
explain (COSTS OFF) SELECT relname,reltuples FROM pg_class
GROUP BY relname,reltuples ORDER BY (reltuples,relname);

Sort
Sort Key: (ROW(reltuples, relname))
-> Group
Group Key: relname, reltuples
-> Sort
Sort Key: relname, reltuples
-> Seq Scan on pg_class

So, let's continue to work.

--
regards,
Andrei Lepikhov
Postgres Professional

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-12-29 03:50:52 Re: Track in pg_replication_slots the reason why slots conflict?
Previous Message Jeff Davis 2023-12-29 02:57:16 Re: Built-in CTYPE provider