Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, rushabh(dot)lathia(at)gmail(dot)com
Subject: Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date: 2020-06-06 09:17:51
Message-ID: 20200606091751.y6oxesi45xlffz5k@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Fri, Jun 05, 2020 at 12:26:15PM +1200, David Rowley wrote:
> On Mon, 25 May 2020 at 19:14, Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
> >
> > > On Mon, May 25, 2020 at 06:34:30AM +1200, David Rowley wrote:
> > > The difference will be that you'd be setting some distinct_uniquekeys
> > > in standard_qp_callback() to explicitly request that some skip scan
> > > paths be created for the uniquekeys, whereas the patch here just does
> > > not bother doing DISTINCT if the upper relation already has unique
> > > keys that state that the DISTINCT is not required. The skip scans
> > > patch should check if the RelOptInfo for the uniquekeys set in
> > > standard_qp_callback() are already mentioned in the RelOptInfo's
> > > uniquekeys. If they are then there's no point in skip scanning as the
> > > rel is already unique for the distinct_uniquekeys.
> >
> > It sounds like it makes semantics of UniqueKey a bit more confusing,
> > isn't it? At the moment it says:
> >
> > Represents the unique properties held by a RelOptInfo.
> >
> > With the proposed changes it would be "unique properties, that are held"
> > and "unique properties, that are requested", which are partially
> > duplicated, but stored in some different fields. From the skip scan
> > patch perspective it's probably doesn't make any difference, seems like
> > the implementation would be almost the same, just created UniqueKeys
> > would be of different type. But I'm afraid potentiall future users of
> > UniqueKeys could be easily confused.
>
> If there's some comment that says UniqueKeys are for RelOptInfos, then
> perhaps that comment just needs to be expanded to mention the Path
> uniqueness when we add the uniquekeys field to Path.

My concerns are more about having two different sets of distinct
uniquekeys:

* one prepared in standard_qp_callback for skip scan (I guess those
should be added to PlannerInfo?)

* one in create_distinct_paths as per current implementation

with what seems to be similar content.

> I think the main point of basing skip scans on top of this uniquekeys
> patch is to ensure it's the right thing for the job. I don't think
> it's realistic to be maintaining two different sets of infrastructure
> which serve a very similar purpose. It's important we make UniqueKeys
> general purpose enough to support future useful forms of optimisation.
> Basing skip scans on it seems like a good exercise towards that. I'm
> not expecting that we need to make zero changes here to allow it to
> work well with skip scans.

Sure, no one suggests to have two ways of saying "this thing is unique".
I'm just trying to figure out how to make skip scan and uniquekeys play
together without having rough edges.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2020-06-06 14:23:32 Re: how to create index concurrently on paritioned table
Previous Message Peter Eisentraut 2020-06-06 07:58:43 Re: Vacuuming the operating system documentation