Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, rushabh(dot)lathia(at)gmail(dot)com
Subject: Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date: 2020-06-05 04:20:58
Message-ID: CAKU4AWoJXCyh3LOShOb5bRUM2bguGDT=HQ6Wadnd2B0L-ohtsQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 5, 2020 at 10:57 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> On Fri, 5 Jun 2020 at 14:36, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> > On Mon, May 25, 2020 at 2:34 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> >>
> >> On Sun, 24 May 2020 at 04:14, Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
> >> >
> >> > > On Fri, May 22, 2020 at 08:40:17AM +1200, David Rowley wrote:
> >> > > I imagine we'll set some required UniqueKeys during
> >> > > standard_qp_callback()
> >> >
> >> > In standard_qp_callback, because pathkeys are computed at this point I
> >> > guess?
> >>
> >> Yes. In particular, we set the pathkeys for DISTINCT clauses there.
> >>
> >
> > Actually I have some issues to understand from here, then try to read index
> > skip scan patch to fully understand what is the requirement, but that doesn't
> > get it so far[1]. So what is the "UniqueKeys" in "UniqueKeys during
> > standard_qp_callback()" and what is the "pathkeys" in "pathkeys are computed
> > at this point" means? I tried to think it as root->distinct_pathkeys, however I
> > didn't fully understand where root->distinct_pathkeys is used for as well.
>
> In standard_qp_callback(), what we'll do with uniquekeys is pretty
> much what we already do with pathkeys there. Basically pathkeys are
> set there to have the planner attempt to produce a plan that satisfies
> those pathkeys. Notice at the end of standard_qp_callback() we set
> the pathkeys according to the first upper planner operation that'll
> need to make use of those pathkeys. e.g, If there's a GROUP BY and a
> DISTINCT in the query, then use the pathkeys for GROUP BY, since that
> must occur before DISTINCT.

Thanks for your explanation. I think I understand now, based on your comments.
Take root->group_pathkeys for example: similar information is also available
in root->parse->groupClause, but we use root->group_pathkeys with the
pathkeys_count_contained_in function in many places. That is mainly because
the content of the two sometimes differs, as in the case handled by
pathkey_is_redundant.
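
To make the selection rule you describe concrete, here is a minimal C sketch
of that ordering. It is a simplified model under stated assumptions, not the
real standard_qp_callback(): SketchRoot and choose_query_pathkeys are
hypothetical names, each pathkey list is reduced to a string label, and
further details of the real function are left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified, hypothetical model of the planner state.  Each field is a
 * label standing in for a pathkey list; NULL means the query has no such
 * clause.  These are NOT PostgreSQL's real data structures.
 */
typedef struct SketchRoot
{
    const char *group_pathkeys;     /* GROUP BY */
    const char *window_pathkeys;    /* window functions */
    const char *distinct_pathkeys;  /* DISTINCT */
    const char *sort_pathkeys;      /* ORDER BY */
} SketchRoot;

/*
 * Return the pathkeys of the first upper-planner step that needs an
 * ordering, checking the steps in the order GROUP BY, window functions,
 * DISTINCT, ORDER BY.  Returns NULL when no step needs one.
 */
static const char *
choose_query_pathkeys(const SketchRoot *root)
{
    assert(root != NULL);

    if (root->group_pathkeys)
        return root->group_pathkeys;
    if (root->window_pathkeys)
        return root->window_pathkeys;
    if (root->distinct_pathkeys)
        return root->distinct_pathkeys;
    return root->sort_pathkeys;
}
```

With both GROUP BY and DISTINCT present, the sketch returns the GROUP BY
pathkeys, which matches the example quoted above.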

> Likely uniquekeys will want to follow the
> same rules there for the operations that can make use of paths with
> uniquekeys, which in this case, I believe, will be the same as the
> example I just mentioned for pathkeys, except we'll only be able to
> support GROUP BY without any aggregate functions.
>
>
All the places where I want to use UniqueKey so far (DISTINCT, GROUP BY and
others) have an input relation (RelOptInfo), and the UniqueKey information
can be obtained from there. At the same time, all the pathkeys in PlannerInfo
are used by the upper planner, but UniqueKey may sometimes be used in the
current planner, as in reduce_semianti_joins/remove_useless_join, so I am not
sure whether we must maintain uniquekeys in PlannerInfo.

--
Best Regards
Andy Fan
