Re: [PATCH] Erase the distinctClause if the result is unique by definition

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Erase the distinctClause if the result is unique by definition
Date: 2020-03-02 20:25:51
Message-ID: CAKU4AWoVRfFbWmxeQ9vACh2U=LXV-6FoscVrFSEPXM5ZAbOXrQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 3, 2020 at 1:24 AM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:

> Thank you Tom for the review!
>
> On Mon, Mar 2, 2020 at 4:46 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> writes:
>> > Please see if you have any comments. Thanks
>>
>> The cfbot isn't at all happy with this. Its linux build is complaining
>> about a possibly-uninitialized variable, and then giving up:
>>
>> https://travis-ci.org/postgresql-cfbot/postgresql/builds/656722993
>>
>> The Windows build isn't using -Werror, but it is crashing in at least
>> two different spots in the regression tests:
>>
>>
>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.81778
>>
>> I've not attempted to identify the cause of that.
>>
>>
> Before I submit the patch, I can make sure "make check-world" is
> successful, but
> since the compile option is not same, so I didn't catch
> the possibly-uninitialized
> variable. As for the crash on the windows, I didn't get the enough
> information
> now, I will find a windows server and reproduce the cases.
>
> I just found the link http://commitfest.cputube.org/ this morning, I will
> make sure
> the next patch can go pass this test.
>
>
> At a high level, I'm a bit disturbed that this focuses only on DISTINCT
>> and doesn't (appear to) have any equivalent intelligence for GROUP BY,
>> though surely that offers much the same opportunities for optimization.
>> It seems like it'd be worthwhile to take a couple steps back and see
>> if we couldn't recast the logic to work for both.
>>
>>
> OK, Looks group by is a bit harder than distinct since the aggregation
> function.
> I will go through the code to see why to add this logic.
>
>

Can we grantee any_aggr_func(a) == a if only 1 row returned, if so, we
can do
some work on the pathtarget/reltarget by transforming the Aggref to raw
expr.
I checked the execution path of the aggregation call, looks it depends on
Agg node
which is the thing we want to remove.

>
>> * There seem to be some pointless #include additions, eg in planner.c
>> the added code doesn't look to justify any of them. Please also
>> avoid unnecessary whitespace changes, those also slow down reviewing.
>>
>
fixed some typo errors.

That may be because I added the header file at time 1 and then refactored
the code but forget to remove the header file when it is not necessary.
Do we need to relay on experience to tell if the header file is needed or
not,
or do we have any tool to tell it automatically?

regards, Andy Fan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jesse Zhang 2020-03-02 20:45:21 Re: Use compiler intrinsics for bit ops in hash
Previous Message Tom Lane 2020-03-02 19:11:10 Re: Allowing ALTER TYPE to change storage strategy