Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, rushabh(dot)lathia(at)gmail(dot)com
Subject: Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date: 2020-05-25 07:16:26
Message-ID: 20200525071626.oafqoucjmswkjn2b@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Mon, May 25, 2020 at 06:34:30AM +1200, David Rowley wrote:
>
> > For a simple distinct query those UniqueKeys would be set based on
> > distinct clause. If I understand correctly, the very same is implemented
> > right now in create_distinct_paths, just after building all index paths,
> > so wouldn't it be just a duplication?
>
> I think we need to create the skip scan paths when we create the other
> paths for base relations. We shouldn't be adjusting existing index
> paths during create_distinct_paths(). The last code I saw for the
> skip scans patch did something like if (IsA(path, IndexScanPath)) in
> create_distinct_paths()

It's not the case since the late March.

> > In general UniqueKeys in the skip scan patch were created from
> > distinctClause in build_index_paths (looks similar to what you've
> > described) and then based on them created index skip scan paths. So my
> > expectations were that the patch from this thread would work similar.
>
> The difference will be that you'd be setting some distinct_uniquekeys
> in standard_qp_callback() to explicitly request that some skip scan
> paths be created for the uniquekeys, whereas the patch here just does
> not bother doing DISTINCT if the upper relation already has unique
> keys that state that the DISTINCT is not required. The skip scans
> patch should check if the RelOptInfo for the uniquekeys set in
> standard_qp_callback() are already mentioned in the RelOptInfo's
> uniquekeys. If they are then there's no point in skip scanning as the
> rel is already unique for the distinct_uniquekeys.

It sounds like it makes semantics of UniqueKey a bit more confusing,
isn't it? At the moment it says:

Represents the unique properties held by a RelOptInfo.

With the proposed changes it would be "unique properties, that are held"
and "unique properties, that are requested", which are partially
duplicated, but stored in some different fields. From the skip scan
patch perspective it's probably doesn't make any difference, seems like
the implementation would be almost the same, just created UniqueKeys
would be of different type. But I'm afraid potentiall future users of
UniqueKeys could be easily confused.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2020-05-25 07:53:27 Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Previous Message Ian Barwick 2020-05-25 07:13:02 Re: pg13 docs: minor fix for "System views" list