From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Richard Guo <guofenglinux(at)gmail(dot)com>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru> |
Subject: | Re: POC: GROUP BY optimization |
Date: | 2024-04-29 02:50:00 |
Message-ID: | CACJufxH2pG930ntB=0dFZvL3sNZmamxuU0ORFE=i34qVKvAhLA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Apr 24, 2024 at 2:25 PM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> hi.
> I found an interesting case.
>
> CREATE TABLE t1 AS
> SELECT (i % 10)::numeric AS x,(i % 10)::int8 AS y,'abc' || i % 10 AS
> z, i::int4 AS w
> FROM generate_series(1, 100) AS i;
> CREATE INDEX t1_x_y_idx ON t1 (x, y);
> ANALYZE t1;
> SET enable_hashagg = off;
> SET enable_seqscan = off;
>
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,z,y,w;
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,w,y,z;
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,z,w,y;
> EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,w,z,y;
> the above part will use:
> -> Incremental Sort
> Sort Key: x, $, $, $
> Presorted Key: x
> -> Index Scan using t1_x_y_idx on t1
We can make these cases also `Presorted Key: x, y`.
in
`if (path->pathkeys && !pathkeys_contained_in(path->pathkeys,
root->group_pathkeys))` branch
we can simple do
- infos = lappend(infos, info);
+ infos = lcons(info, infos);
similar to what we did at plancat.c (search lcons).
get_useful_group_keys_orderings returns a list of PathKeyInfo,
then the caller function just iterates each element.
so for the caller, order of the returned list element from
get_useful_group_keys_orderings
does not matter.
for path Incremental Sort:
function make_ordered_path will return the same cost for different
numbers of presorted keys.
for example:
EXPLAIN (COSTS OFF) SELECT count(*) FROM t1 GROUP BY x,z,y,w;
make_ordered_path cost is same for:
`info->pathkeys: x,y,z,w`
`info->pathkeys:x,z,y,w`
if we arrange `info->pathkeys: x,y,z,w` before `info->pathkeys:x,z,y,w`
in get_useful_group_keys_orderings.
then with the same cost, we will choose the first one
(`info->pathkeys: x,y,z,w`),
if we use IncrementalSort, then we use `Presorted Key: x, y`.
From | Date | Subject | |
---|---|---|---|
Next Message | vignesh C | 2024-04-29 03:21:55 | Re: Race condition in FetchTableStates() breaks synchronization of subscription tables |
Previous Message | David G. Johnston | 2024-04-29 01:45:30 | Re: pg_input_error_info doc 2 exampled crammed together |