Re: Index Skip Scan (new UniqueKeys)

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Floris Van Nee <florisvannee(at)optiver(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Index Skip Scan (new UniqueKeys)
Date: 2021-03-17 02:28:00
Message-ID: f9eedec7-a6d8-fd06-6934-2397a91671bd@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I took a look at the new patch series, focusing mostly on the uniquekeys
part. It'd be a bit tedious to explain all the review comments here, so
attached is a patch series with a "review" patch for some of the parts.
Most of it is fairly small (corrections to comments etc.), I'll go over
the more serious part so that we can discuss it here. I'll keep it split
per parts of the original patch series.

I suggest looking for XXX and FIXME comments in all the review patches.

0001
----

1) I wonder if the loop in set_append_rel_size should do continue
instead of break, to continue with other (not whole row) attributes?

2) A couple comments that'd deserve clarification.

0002
----

1) A bunch of comment fixes, clarifications, etc. Various missing
comments for functions added. Please review that I got all the details
right, etc.

2) Significant reworks of the README.uniquekeys - I found the original
version rather hard to understand, hopefully this is better. But I'm not
sure I got all the details right, so please review.

3) I see set_append_rel_pathlist uses IS_PARTITIONED_REL to decide
whether we need to generate unique keys. Why not to try doing the same
thing for plain append relations? Can't we handle at least the case with
just a single child relation? (Maybe not, we probably don't know if the
parent is empty in that case. Then perhaps mention that in a comment.)

4) Is there a reason why populate_joinrel_uniquekeys gets executed after
try_partitionwise_join? Does that allow generating additional unique
keys, or something like that?

5) A lot of new comments in uniquekeys.c - most of it was seriously
under-documented. Please check I got all the details right.

6) Doesn't populate_baserel_uniquekeys need to look at collations too,
not just opfamilies?

7) The code does something like this:

if (!ind->unique || !ind->immediate ||
(ind->indpred != NIL && !ind->predOK))
continue;

in a number of places. I suggest we define a nicer macro for that.

8) I'm not sure what happens in populate_baserel_uniquekeys if there are
multiple restrictinfos for the same expression.

9) I've modified a couple places to replace this:

foreach(lc, list)
{
...

if (a)
{
...
}
}

to something like this:

foreach(lc, list)
{
...

if (!a)
continue;

...
}

which I think is easier to read (it reduces the level of nesting etc.).
But I admit it's also a matter of personal taste, to some extent.

10) in populate_partitionedrel_uniquekeys I've also added a return to
the first special-case block, so that the second block does not need to
be in an else.

11) It seems weird having to build a new copy of the IndexOptInfo using
simple_copy_indexinfo_to_parent just to check that the parent/child are
compatible. Seems quite expensive, and the code does it quite often. Why
not to invent a much smaller struct just for this?

12) Shouldn't populate_grouprel_uniquekeys do the checks for one-row and
grouping sets in the opposite order? I mean, with grouping sets it seems
possible that we produce multiple rows from one-row input rel, no?

13) Doesn't populate_joinrel_uniquekeys also deal with JOIN_RIGHT? If
not, maybe explain that in a comment.

14) I find it rather strange that we use innerrel_keeps_unique for both
inner-outer and outer-inner directions. Perhaps the name is a bit
misleading?

15) Why does the first "innerrel_keeps_unique" block care about
JOIN_FULL and the second about JOIN_FULL and JOIN_LEFT?

16) If either of the innerrel_keeps_unique blocks gets executed, doesn't
that mean the _ukey_ctx lists may contain incorrect data? Consider that
initialize_uniquecontext_for_joinrel sets "useful=true" for all indexes,
and it gets updated only in the blocks. So if te blocks do not execute,
this info is wrong, no? If not, why?

17) Isn't the last part of add_combined_uniquekey wrong? It does this:

foreach(lc1, get_exprs_from_uniquekey(..., outer_ukey))
{
foreach(lc2, get_exprs_from_uniquekey(..., inner_ukey))
{
List *exprs = list_concat_copy(lfirst_node(List, lc1),
lfirst_node(List, lc2));

joinrel->uniquekeys = lappend(joinrel->uniquekeys,
makeUniqueKey(exprs, ...));
}
}

That seems to iterate over expressions in the unique keys. But consider
you have inner unique key on (a1,a2) and outer unique key (b1,b2). AFAIK
we should add (a1,a2,b1,b2) for the join, but this seems to add (a1,b1),
(a1,b2), (a2,b1), (a2,b2). Seems bogus?

18) I find it a bit annoying that there are no new regression tests.
Surely we need to test this somehow?

0003
----

Just some comments/whitespace.

0004
----

I wonder why we don't include this in explain TEXT format? Seems it
might make it harder to write regression tests for this? It's easier to
just check that we deduced the right unique key(s) than having to
construct an example where it actually changes the plan.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-Introduce-RelOptInfo-notnullattrs-attribute-20210317.patch text/x-patch 4.8 KB
0002-review-20210317.patch text/x-patch 2.5 KB
0003-Introduce-UniqueKey-attributes-on-RelOptInf-20210317.patch text/x-patch 59.3 KB
0004-review-20210317.patch text/x-patch 68.5 KB
0005-Extend-UniqueKeys-20210317.patch text/x-patch 13.0 KB
0006-review-20210317.patch text/x-patch 1.6 KB
0007-Index-skip-scan-20210317.patch text/x-patch 45.0 KB
0008-review-20210317.patch text/x-patch 1.3 KB
0009-Btree-implementation-of-skipping-20210317.patch text/x-patch 44.3 KB
0010-Index-skip-scan-documentation-20210317.patch text/x-patch 4.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2021-03-17 02:31:01 Re: Getting better results from valgrind leak tracking
Previous Message tsunakawa.takay@fujitsu.com 2021-03-17 02:09:32 RE: libpq debug log