Re: [HACKERS] WIP: Aggregation push-down

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] WIP: Aggregation push-down
Date: 2019-02-21 20:18:03
Message-ID: 11172.1550780283@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Antonin Houska <ah(at)cybertec(dot)at> writes:
> Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>> Latest patch set fails to apply, so moved to next CF, waiting on
>> author.

> Rebased.

This is in need of rebasing again :-(. I went ahead and pushed the 001
part, since that seemed fairly uncontroversial. (Note that I changed
estimate_hashagg_tablesize's result type to double on the way; you
probably want to make corresponding adjustments in your patch.)

I did not spend a whole lot of time looking at the patch today, but
I'm still pretty distressed at the data structures you've chosen.
I remain of the opinion that a grouped relation and a base relation
are, er, unrelated, even if they happen to share the same relid set.
So I do not see the value of the RelOptInfoSet struct you propose here,
and I definitely don't think there's any value in having, eg,
create_seqscan_path or create_index_path dealing with this stuff.

I also don't like changing create_nestloop_path et al to take a PathTarget
rather than using the RelOptInfo's pathtarget; IMO, it's flat out wrong
for a path to generate a tlist different from what its parent RelOptInfo
says that the relation produces.

I think basically the way this ought to work is that we generate baserel
paths pretty much the same as today, and then we run through the baserels
and see which ones are potentially worth doing partial aggregation on,
and for each one that is, we create a separate "upper relation" RelOptInfo
describing that. The paths for this RelOptInfo would be
partial-aggregation paths using paths from the corresponding baserel as
input. Then we'd run a join search that only considers joining grouped
rels with plain rels (I concur that joining two grouped rels is not worth
coping with, at least for now).

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-02-21 20:44:14 Re: NOT IN subquery optimization
Previous Message David Rowley 2019-02-21 19:49:26 Re: Delay locking partitions during INSERT and UPDATE