Re: POC: GROUP BY optimization

From:	Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To:	jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Richard Guo <guofenglinux(at)gmail(dot)com>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>
Subject:	Re: POC: GROUP BY optimization
Date:	2024-05-16 07:47:01
Message-ID:	5cd9b44a-5ece-441a-8cc2-89d250f180aa@postgrespro.ru
Lists:	pgsql-hackers

On 24.04.2024 13:25, jian he wrote:
> hi.
> I found an interesting case.
>
> CREATE TABLE t1 AS
> SELECT (i % 10)::numeric AS x, (i % 10)::int8 AS y, 'abc' || i % 10 AS z,
>        i::int4 AS w
> FROM generate_series(1, 100) AS i;
> CREATE INDEX t1_x_y_idx ON t1 (x, y);
> ANALYZE t1;
> SET enable_hashagg = off;
> SET enable_seqscan = off;
>
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,z,y,w;
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,w,y,z;
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,z,w,y;
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,w,z,y;
> the above part will use:
> -> Incremental Sort
> Sort Key: x, $, $, $
> Presorted Key: x
> -> Index Scan using t1_x_y_idx on t1
>
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY z,y,w,x;
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY w,y,z,x;
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,z,x,w;
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,w,x,z;
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,x,z,w;
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY y,x,w,z;
>
> these will use:
> -> Incremental Sort
> Sort Key: x, y, $, $
> Presorted Key: x, y
> -> Index Scan using t1_x_y_idx on t1
>
> I guess this is fine, but not optimal?
This looks like a bug: the current implementation does not distinguish
between these orderings. So:
1. With all the patches from this thread applied (the ones I proposed in
answer to Tom Lane's last rebuke), is the behavior still the same?
2. Could you try to find the reason?
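
For illustration only, here is a minimal sketch of why the two plan shapes
quoted above differ in quality. It is not PostgreSQL code; the function name
presorted_prefix is hypothetical, and it assumes the presorted keys of an
Incremental Sort are simply the leading sort keys that match the index column
order position by position.

```python
def presorted_prefix(index_keys, sort_keys):
    """Return the leading sort keys that the index order already provides.

    Hypothetical helper: compares the two key lists position by position
    and stops at the first mismatch.
    """
    prefix = []
    for index_col, sort_col in zip(index_keys, sort_keys):
        if index_col != sort_col:
            break
        prefix.append(sort_col)
    return prefix


# Column order of t1_x_y_idx from the quoted example.
index_keys = ["x", "y"]

# GROUP BY order kept as written: only x is presorted.
print(presorted_prefix(index_keys, ["x", "z", "y", "w"]))

# GROUP BY keys reordered to follow the index: x and y are presorted.
print(presorted_prefix(index_keys, ["x", "y", "z", "w"]))
```

Under that assumption, a sort key list that starts with x, y leaves less work
for the Incremental Sort than one that starts with x followed by a non-index
column, which is the difference reported in the quoted plans.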

--
regards,
Andrei Lepikhov
Postgres Professional

In response to

Re: POC: GROUP BY optimization at 2024-04-24 06:25:22 from jian he

Responses

Re: POC: GROUP BY optimization at 2024-05-20 08:54:56 from jian he
