Re: An improvement on parallel DISTINCT

From:	Richard Guo <guofenglinux(at)gmail(dot)com>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: An improvement on parallel DISTINCT
Date:	2024-02-05 01:42:01
Message-ID:	CAMbWs4_=f9gr+DyYoP9WtGbtfJTHM_CZmqrUkYhQ+TuOjhs9qQ@mail.gmail.com
Lists:	pgsql-hackers

On Fri, Feb 2, 2024 at 7:36 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> Now for the other stuff you had. I didn't really like this part:
>
> + /*
> + * Set target for partial_distinct_rel as generate_useful_gather_paths
> + * requires that the input rel has a valid reltarget.
> + */
> + partial_distinct_rel->reltarget = cheapest_partial_path->pathtarget;
>
> I think we should just make it work the same way as
> create_grouping_paths(), where grouping_target is passed as a
> parameter.
>
> I've done it that way in the attached.

The change looks good to me.

BTW, I doubt that 'create_partial_distinct_paths' is an accurate
function name given what it actually does. It not only generates
distinct paths based on input_rel's partial paths, but also adds
Gather/GatherMerge on top of these partially distinct paths, followed by
a final unique/aggregate path to ensure uniqueness of the final result.
So maybe 'create_parallel_distinct_paths' or something like that would
be better?

I ask because I noticed that create_partial_grouping_paths() only
generates partially aggregated paths, and any subsequent
FinalizeAggregate step is added by the caller.

Thanks
Richard
