Re: [PATCH] Keeps tracking the uniqueness with UniqueKey

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, rushabh(dot)lathia(at)gmail(dot)com
Subject: Re: [PATCH] Keeps tracking the uniqueness with UniqueKey
Date: 2020-05-24 18:34:30
Message-ID: CAApHDvqO543rifM8LMYBW9DgSJaiX52V2C2sPRzsc2T4qd2Ytw@mail.gmail.com
Lists: pgsql-hackers

On Sun, 24 May 2020 at 04:14, Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
>
> > On Fri, May 22, 2020 at 08:40:17AM +1200, David Rowley wrote:
> > I imagine we'll set some required UniqueKeys during
> > standard_qp_callback()
>
> In standard_qp_callback, because pathkeys are computed at this point I
> guess?

Yes. In particular, we set the pathkeys for DISTINCT clauses there.

> > and then we'll try to create some Skip Scan
> > paths (which are marked with UniqueKeys) if the base relation does not
> > already have UniqueKeys that satisfy the required UniqueKeys that were
> > set during standard_qp_callback().
>
> For a simple distinct query those UniqueKeys would be set based on
> distinct clause. If I understand correctly, the very same is implemented
> right now in create_distinct_paths, just after building all index paths,
> so wouldn't it be just a duplication?

I think we need to create the skip scan paths when we create the other
paths for base relations. We shouldn't be adjusting existing index
paths during create_distinct_paths(). The last code I saw for the
skip scans patch did something like if (IsA(path, IndexScanPath)) in
create_distinct_paths(), but that's only ever going to work when the
query is on a single relation. You'll never see IndexScanPaths in the
upper planner's paths when there are joins. You'd see join-type paths
instead. It is possible to make use of skip scans for DISTINCT when
the query has joins. We'd just need to ensure the join does not
duplicate the unique rows from the skip scanned relation.
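
To illustrate why a check on the topmost path stops working once joins
are involved, here is a toy sketch. It is not planner code: ToyPath,
ToyPathTag and toy_top_is_index_scan() are hypothetical names made up
for this example, and the real path node types are far richer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node tags; the real planner has many more path types. */
typedef enum ToyPathTag
{
    TOY_INDEX_SCAN,
    TOY_NEST_LOOP
} ToyPathTag;

/* Hypothetical, heavily simplified path node. */
typedef struct ToyPath
{
    ToyPathTag  tag;
    const struct ToyPath *outer;    /* NULL for scan nodes */
    const struct ToyPath *inner;    /* NULL for scan nodes */
} ToyPath;

/*
 * The kind of test that only inspects the topmost node.  It is true for
 * a single-relation query, but false as soon as the top node is a join,
 * even when an index scan sits underneath that join.
 */
bool
toy_top_is_index_scan(const ToyPath *path)
{
    assert(path != NULL);
    return path->tag == TOY_INDEX_SCAN;
}
```

With a nested loop over two index scans, toy_top_is_index_scan() returns
false for the join node even though both of its children are index
scans, which is why the skip scan decision has to be made when the base
relation's paths are built.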

> In general UniqueKeys in the skip scan patch were created from
> distinctClause in build_index_paths (looks similar to what you've
> described) and then based on them created index skip scan paths. So my
> expectations were that the patch from this thread would work similar.

The difference will be that you'd be setting some distinct_uniquekeys
in standard_qp_callback() to explicitly request that some skip scan
paths be created for the uniquekeys, whereas the patch here just does
not bother doing DISTINCT if the upper relation already has unique
keys that state that the DISTINCT is not required. The skip scans
patch should check whether the distinct_uniquekeys set in
standard_qp_callback() are already mentioned in the RelOptInfo's
uniquekeys. If they are, then there's no point in skip scanning, as
the rel is already unique for the distinct_uniquekeys.
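
A rough sketch of that check, under simplifying assumptions: a unique
key is modelled as a plain set of column numbers, NULL handling is
ignored, and the rel counts as already unique for the required columns
if one of its unique keys is a subset of them. ToyUniqueKey,
toy_key_is_subset() and toy_rel_satisfies_uniquekeys() are hypothetical
names for this example only, not anything from either patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a unique key: a set of column numbers. */
typedef struct ToyUniqueKey
{
    const int  *cols;
    size_t      ncols;
} ToyUniqueKey;

/* True if every column of "key" also appears in "required". */
bool
toy_key_is_subset(const ToyUniqueKey *key, const ToyUniqueKey *required)
{
    for (size_t i = 0; i < key->ncols; i++)
    {
        bool        found = false;

        for (size_t j = 0; j < required->ncols; j++)
        {
            if (key->cols[i] == required->cols[j])
            {
                found = true;
                break;
            }
        }
        if (!found)
            return false;
    }
    return true;
}

/*
 * True if the rel is already unique on the required columns, that is,
 * at least one of its unique keys is a subset of them.  In that case
 * creating skip scan paths for the required keys would be pointless.
 */
bool
toy_rel_satisfies_uniquekeys(const ToyUniqueKey *rel_keys, size_t nkeys,
                             const ToyUniqueKey *required)
{
    assert(required != NULL);

    for (size_t i = 0; i < nkeys; i++)
    {
        if (toy_key_is_subset(&rel_keys[i], required))
            return true;
    }
    return false;
}
```

For example, a rel that is unique on column 1 already satisfies a
DISTINCT over columns 1 and 2, but not a DISTINCT over column 2 alone,
so only the latter would be a candidate for a skip scan path.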

David
