Re: [PATCH] Erase the distinctClause if the result is unique by definition

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Erase the distinctClause if the result is unique by definition
Date: 2020-03-18 01:56:08
Message-ID: CAApHDvrNqgOgrLfhApdUH5P9fnwOuO35CPwNG7=ybH5EMBugaA@mail.gmail.com
Lists: pgsql-hackers

On Mon, 16 Mar 2020 at 06:01, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
>
> Hi All:
>
> I have re-implemented the patch based on David's suggestion/code. It looks like it
> works well. The updated patch mainly includes:
>
> 1. Maintain the not_null_colno in RelOptInfo, which includes the not null from
> catalog and the not null from vars.

What about non-nullability that we can derive from means other than
NOT NULL constraints? Where will you track that now that you've
removed the UniqueKeySet type?
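
To illustrate the kind of derived non-nullability meant here, below is a
minimal standalone sketch, not planner code. It assumes that a column
referenced by a strict, top-level AND'ed qual cannot be NULL in any row
that passes the qual (the planner's find_nonnullable_vars() works along
similar lines). AttnumSet, QualInfo and both function names are
hypothetical stand-ins, not existing PostgreSQL types or functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of attnums; covers attnum 1..63 only. */
typedef uint64_t AttnumSet;

static inline AttnumSet
attnum_bit(int attnum)
{
    return (AttnumSet) 1 << attnum;
}

/* Hypothetical summary of one top-level, AND'ed restriction qual. */
typedef struct QualInfo
{
    AttnumSet   vars;       /* attnums the qual references */
    bool        is_strict;  /* qual cannot return true if any input is NULL */
} QualInfo;

/*
 * Start from the columns declared NOT NULL in the catalog and add every
 * column referenced by a strict qual: a row where such a column is NULL
 * cannot pass that qual, so the column is non-null in the rel's output.
 */
static inline AttnumSet
derive_notnull_attnums(AttnumSet catalog_notnull,
                       const QualInfo *quals, int nquals)
{
    AttnumSet   result = catalog_notnull;

    for (int i = 0; i < nquals; i++)
    {
        if (quals[i].is_strict)
            result |= quals[i].vars;
    }
    return result;
}

static inline bool
attnum_is_notnull(AttnumSet set, int attnum)
{
    return (set & attnum_bit(attnum)) != 0;
}
```

With attnum 1 declared NOT NULL and a strict qual on attnum 2, both end
up in the derived set, while a column that only appears in a non-strict
qual does not.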

Traditionally we use attno or attnum rather than colno for variable
names containing attribute numbers.

> 3. Postpone the propagate_unique_keys_to_joinrel call to populate_joinrel_with_paths,
> since we know the jointype at that time, so we can handle the semi/anti join specially.

OK, but the join type was already known where I was calling the
function from. It just wasn't passed to the function.

> 4. Add the rule I suggested above: if both of the 2 relations yield a unique result,
> the join result will be unique as well. The UK can be ( (rel1_uk1, rel1_uk2).. )

I see. So basically you're saying that the joinrel's uniquekeys should
be the Cartesian product of the unique rels from either side of the
join. I wonder if that's a special case we need to worry about too
much. Surely it only applies for clauseless joins.
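
As a rough standalone sketch of that Cartesian-product rule as I read
it: every unique key of the outer side is paired with every unique key
of the inner side, and each pair forms one unique key of the join. The
ColSet bitmask and cross_unique_keys() are hypothetical; the patch
itself stores keys as Lists of Expr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: bit i stands for column i in the join's combined column space. */
typedef uint64_t ColSet;

/*
 * Pair every outer unique key with every inner unique key.  Each result
 * is the union of the two column sets.  Returns the number of keys
 * written, which is always n_outer * n_inner.
 */
static inline size_t
cross_unique_keys(const ColSet *outer, size_t n_outer,
                  const ColSet *inner, size_t n_inner,
                  ColSet *out, size_t max_out)
{
    size_t      n = 0;

    for (size_t i = 0; i < n_outer; i++)
    {
        for (size_t j = 0; j < n_inner; j++)
        {
            assert(n < max_out);    /* caller must size the output array */
            out[n++] = outer[i] | inner[j];
        }
    }
    return n;
}
```

The number of combined keys is the product of the two input counts, so
it grows with every join level unless the keys are pruned.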

> 5. If the unique key is impossible to be referenced by others, we can safely ignore
> it in order to keep the (join)rel->uniquekeys short.

You could probably have an equivalent of has_useful_pathkeys() and
pathkeys_useful_for_ordering().
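
A minimal sketch of what such a usefulness test could look like,
assuming a unique key is only worth keeping when all of its columns are
still referenced above the rel. The names uniquekey_is_useful() and
prune_uniquekeys() are made up for illustration, as is the ColSet
bitmask; none of them exist in the tree.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: bit i stands for column i of the rel. */
typedef uint64_t ColSet;

/* A key is useful only if every column in it is still referenced above. */
static inline bool
uniquekey_is_useful(ColSet key, ColSet referenced_above)
{
    return (key & ~referenced_above) == 0;
}

/*
 * Compact the array in place, keeping only the useful keys and
 * preserving their order.  Returns the number of keys kept.
 */
static inline size_t
prune_uniquekeys(ColSet *keys, size_t nkeys, ColSet referenced_above)
{
    size_t      kept = 0;

    for (size_t i = 0; i < nkeys; i++)
    {
        if (uniquekey_is_useful(keys[i], referenced_above))
            keys[kept++] = keys[i];
    }
    return kept;
}
```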

> 6. I only consider the not null check/opfamily check for the uniquekey which comes
> from UniqueIndex. I think that should be correct.
> 7. I defined each uniquekey as List of Expr, so I didn't introduce new node type.

Where will you store the collation Oid? I left comments to mention
that needed to be checked but just didn't wire it up.
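
To make the collation concern concrete, here is a standalone sketch of
a unique key item that carries the equality semantics it was proven
under. The assumption is that uniqueness established by an index under
one opfamily and collation should only be trusted for a comparison that
uses the same ones. UniqueKeyItem and uniquekey_item_matches() are
hypothetical, and Oid is redefined locally so the example compiles on
its own.

```c
#include <stdbool.h>

/* Local stand-in for PostgreSQL's Oid type. */
typedef unsigned int Oid;

/* Hypothetical unique key item: a column plus its equality semantics. */
typedef struct UniqueKeyItem
{
    int         attnum;     /* column the key item refers to */
    Oid         opfamily;   /* operator family of the unique index column */
    Oid         collation;  /* collation of the unique index column */
} UniqueKeyItem;

/*
 * The item proves uniqueness for a comparison only if the column, the
 * operator family and the collation all agree.
 */
static inline bool
uniquekey_item_matches(const UniqueKeyItem *item,
                       int attnum, Oid opfamily, Oid collation)
{
    return item->attnum == attnum &&
           item->opfamily == opfamily &&
           item->collation == collation;
}
```

A plain List of Expr has no obvious slot for the opfamily and collation
fields, which is the gap the question above points at.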
