Re: POC: GROUP BY optimization

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: GROUP BY optimization
Date: 2021-03-09 23:05:55
Message-ID: 22c44f98-bfa8-8630-62b5-5155e11eb284@enterprisedb.com
Lists: pgsql-hackers

Hi,

I took a look at the patch today. Attached is v12 and a separate
patch with some comment tweaks and review comments.

1) I see the patch results in some plan changes in postgres_fdw. I
assume it's somehow related to the sort costing changes, but I find it a
bit suspicious. Why should the plan change from this:

  Foreign Scan
    Remote SQL: ... ORDER BY ... OFFSET 100 LIMIT 10;

to

  Limit
    Foreign Scan
      Remote SQL: ... ORDER BY ...

But even if this is due to the changes in ORDER BY costing, postponing
the limit seems like a net loss - the sort happens before the limit, and
by postponing the limit until after the foreign scan we need to transfer
100 more rows. So what's causing this?

The difference in plan cost seems fairly substantial (old: 196.52, new:
241.94). But maybe that's OK and expected.

2) I wonder how much more expensive (in terms of CPU) the new sort
costing code is. It's iterative, and it calls functions to calculate the
number of groups etc. What's the impact on simple queries?

3) It's not clear to me why we need the "fake var" protection. Why
would estimate_num_groups() fail on that?

4) In one of the places a comment says DEFAULT_NUM_DISTINCT is not used
because we're dealing with multiple columns. That makes sense, but maybe
we should use DEFAULT_NUM_DISTINCT at least to limit the range. That is,
with N columns we should restrict the nGroups estimate by:

min = DEFAULT_NUM_DISTINCT
max = Min(pow(DEFAULT_NUM_DISTINCT, ncolumns), tuples)

Also, it's not clear what the logic behind this formula is:

nGroups = ceil(2.0 + sqrt(tuples) *
list_length(pathkeyExprs) / list_length(pathkeys));

A comment explaining that would be helpful.
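
To make the clamping suggestion concrete, here's a rough standalone
sketch in C. The function and parameter names are made up (this is not
the patch's actual code); DEFAULT_NUM_DISTINCT is 200, per selfuncs.h:

  #include <math.h>

  #define DEFAULT_NUM_DISTINCT 200    /* as in selfuncs.h */

  /*
   * nExprs stands in for list_length(pathkeyExprs), nKeys for
   * list_length(pathkeys), i.e. the number of grouping columns.
   */
  static double
  estimate_ngroups_clamped(double tuples, int nExprs, int nKeys)
  {
      /* the formula from the patch (whose rationale needs a comment) */
      double  nGroups = ceil(2.0 + sqrt(tuples) * nExprs / nKeys);

      /* the proposed clamping to a sane range */
      double  minGroups = DEFAULT_NUM_DISTINCT;
      double  maxGroups = fmin(pow(DEFAULT_NUM_DISTINCT, nKeys), tuples);

      return fmax(minGroups, fmin(nGroups, maxGroups));
  }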

5) The compute_cpu_sort_cost comment includes two references (in quite
a mixed-up way), and then there's another reference in the code. I suggest
we list all of them in the function comment.

6) There's a bunch of magic values (various multipliers etc.). It's not
clear which values are meant to be equal, etc. Let's introduce nicer
constants with reasonable names.

7) The comment at get_cheapest_group_keys_order is a bit misleading,
because it talks about "uniqueness" - but that's not what we're looking
at here (i.e. whether the column is unique or not). We're looking at the
number of distinct values in the column, which is a different thing.
Also, it'd be good to roughly explain what get_cheapest_group_keys_order
does - how it calculates the order (permutations for up to 4 keys, etc.).
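
FWIW here's how I understand the permutation part, as a standalone
sketch (the names and the cost callback are made up, this is not the
patch's code): with at most 4 keys we can afford to cost all
permutations (4! = 24) and keep the cheapest one:

  #include <string.h>

  /* cost of sorting with the keys in the given order (hypothetical) */
  typedef double (*order_cost_fn) (const int *keys, int nkeys);

  /*
   * Recursively generate all permutations of keys[first..nkeys-1] and
   * remember the cheapest one in best_keys / best_cost (the caller is
   * expected to initialize *best_cost to DBL_MAX).
   */
  static void
  try_permutations(int *keys, int first, int nkeys, order_cost_fn cost,
                   double *best_cost, int *best_keys)
  {
      if (first == nkeys)
      {
          double  c = cost(keys, nkeys);

          if (c < *best_cost)
          {
              *best_cost = c;
              memcpy(best_keys, keys, sizeof(int) * nkeys);
          }
          return;
      }

      for (int i = first; i < nkeys; i++)
      {
          int     tmp = keys[first];

          /* place key i first, recurse, then undo the swap */
          keys[first] = keys[i];
          keys[i] = tmp;

          try_permutations(keys, first + 1, nkeys, cost,
                           best_cost, best_keys);

          keys[i] = keys[first];
          keys[first] = tmp;
      }
  }

Anything above 4 keys has to fall back to a heuristic, of course, since
the number of permutations grows factorially.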

8) The comment at compute_cpu_sort_cost should also explain basics of
the algorithm. I tried doing something like that, but I'm not sure I got
all the details right and it probably needs further improvements.

9) The main concern I have is still about the changes in planner.c, and
I think I've already shared it before. The code takes the grouping
pathkeys, and just reshuffles them to what it believes is a better order
(cheaper, ...). That is, there's an input pathkey list, and it gets
transformed into another pathkey list. The problem is that even if this
optimizes the sort, it may work against some optimization in the upper
part of the plan.

Imagine a query that does something like this:

  SELECT a, b, count(*) FROM (
    SELECT DISTINCT a, b, c, d FROM x
  ) GROUP BY a, b;

Now, from the costing perspective, it may be cheaper to do the inner
grouping by "c, d, a, b" for example. But that works directly against
the second grouping, which would benefit from having the input sorted by
"a, b". How exactly would this behaves depends on the number of distinct
values in the columns, how expensive the comparisons are for each
column, and so on. But you get the idea.

I admit I haven't managed to construct a reasonable query that'd be
significantly harmed by this, but I haven't spent a lot of time on it.

I'm pretty sure I used this trick in the past, when doing aggregations
on large data sets, where it was much better to "pipe" the data through
multiple GroupAggregate nodes.

For this reason I think the code generating paths should work a bit like
get_useful_pathkeys_for_relation and generate_useful_gather_paths.

* group_keys_reorder_by_pathkeys should not produce just one list of
pathkeys, but multiple interesting options. At the moment it should
produce at least the input grouping pathkeys (as specified in the
query), and the "optimal" pathkeys (by lowest cost).

* the places calling group_keys_reorder_by_pathkeys should loop over
the result, and generate a separate path for each option.

I'd guess in the future we'll "peek forward" in the plan and see if
there are other interesting pathkeys (that's the expectation for
get_useful_pathkeys_for_relation).
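
In pseudo-C, the overall shape I have in mind is something like this
(all names are made up, this is not the actual planner API - it's just
meant to illustrate the control flow):

  /* one candidate ordering of the group keys */
  typedef struct KeyOrder KeyOrder;

  /*
   * Would return multiple interesting orderings: at least the ordering
   * from the query, and the cheapest one by sort cost.
   */
  extern KeyOrder **get_interesting_orderings(int *norders);

  /* would build one grouping path, sorted by the given ordering */
  extern void add_grouping_path_for_order(KeyOrder *order);

  static void
  build_grouping_paths(void)
  {
      int       norders;
      KeyOrder **orders = get_interesting_orderings(&norders);

      /*
       * Build a path per ordering and let the usual add_path logic keep
       * the winners, based on cost and on pathkeys that may be useful
       * higher up in the plan.
       */
      for (int i = 0; i < norders; i++)
          add_grouping_path_for_order(orders[i]);
  }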

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-v12-20210309.patch text/x-patch 100.9 KB
0002-review-20210309.patch text/x-patch 8.3 KB
