Re: PATCH: index-only scans with partial indexes

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: tomas(dot)vondra(at)2ndquadrant(dot)com, kgrittn(at)ymail(dot)com, simon(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: index-only scans with partial indexes
Date: 2016-03-09 08:29:49
Message-ID: 20160309.172949.84135555.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, thank you for the comments. The new v8 patch is attched.

At Tue, 08 Mar 2016 18:08:55 -0500, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in <21567(dot)1457478535(at)sss(dot)pgh(dot)pa(dot)us>
> Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> writes:
> > Hello, This is a (maybe) committer-ready patch of a Tomas
> > Vondra's project.
>
> I think this needs quite a bit of work yet. A few comments:

Not a few at all.

> * If we're going to pay the price of identifying implied restriction
> conditions in check_partial_indexes(), we should at least recoup some
> of that investment by not doing it again in create_indexscan_plan().

Moved a part of the check from create_indexscan_plan into
check_partial_indexes.

I noticed that we should avoid to exlude
clauses that contain mutable functions so I added that. But I
don't understand the reason for the following condition to refuse
clause pruning.

| rel->relid == root->parse->resultRelation

> * create_indexscan_plan() has this comment about what it's doing:
> * We can also discard quals that are implied by a partial index's
> * predicate, but only in a plain SELECT; when scanning a target relation
> * of UPDATE/DELETE/SELECT FOR UPDATE, we must leave such quals in the
> * plan so that they'll be properly rechecked by EvalPlanQual testing.
> I believe that that problem applies for this optimization as well,
> and thus that you can only remove implied quals in plain SELECT.
> At least, if there's some reason why that problem does *not* apply,
> there had darn well better be a comment explaining why it's safe.

This is done in check_partial_indexes using parse_rowmark.

The problem haven't realized with the previous patch because it
(accidentially) used rel->baserestirictinfo, not
index->indrinfos for scan_clauses in create_scan_plan.

But the way how create_scan_plan gets scan_clauses seems
bad. I haven't have any clean idea to deal with it.

> * Adding indexrinfos to IndexPath seems unnecessary, since that struct
> already has the "index" pointer --- you can just get the field out of the
> IndexOptInfo when you need it. If you insist on having the extra field,
> this patch is short of the threshold for correctness on adding fields to
> paths. It missed _outIndexPath for instance.

Sorry for the stupid code. I don't insist to keep it. Removed.

> * The additional #include in costsize.c has no apparent reason.

Thank you for pointing out. Removed.

> * The changes in cost_index() really ought to come with some change
> in the associated comments.

I tried to add a comment but it doesn't seem clear.

> * Personally I'd not change the signature of
> match_restriction_clauses_to_index; that's just code churn that
> may have to get undone someday.

No problem, reverted.

> * The block comment in check_index_only needs more thought than this,
> as the phrase "The same is true" is now invalid; the "same" it refers
> to *isn't* the same anymore.

Maybe I took this "the same" wrongly. Tried to fix it but I'm not
confident on the result.

> * I'm not too thrilled with injecting the initialization of
> index->indrinfos into the initial loop in check_partial_indexes().
> If it stays there, I'd certainly expect the comment ahead of the
> loop to be changed to have something to do with reality. But can't
> we find some more-appropriate place to initialize it? Like maybe
> where the IndexOptInfo is first created? I would not really expect
> check_partial_indexes() to have side-effects on non-partial indexes.

Mmm. That is quote right in general. IndexOptInfo is created in
get_relation_info() but baserestrictinfo has not been fixed at
the point. It is fixed as late as set_append_rel_size, almost
just before set_rel_size, and just before the
check_partial_indexes. But initializing indrinfos as a
side-effect of check_partial_indexes is not good as you pointed.
But it is called in two ways, set_tablesample_rel_size and
set_plain_rel_size. So the only possible position of that other
than check_partial_indexes is set_rel_size.

> * I think the double loop in check_partial_indexes() is too cute by half.
> I'd be inclined to just build the replacement list unconditionally while
> we do the predicate_implied_by() tests. Those are expensive enough that
> saving one lappend per implication-test is a useless optimization,
> especially if it requires code as contorted and bug-prone as this.

Ok, I removed the too cute part and added comment mentioning the
reason for the unconditional replacement.

> * The comment added to IndexOptInfo is not very adequate, and not spelled
> correctly either. There's a block comment you should be adding a para to
> (probably take the text you added for struct IndexPath).

I understand that you are mentioning here.

+ List *indrinfos; /* baseristrict info which are not implied by
+ * indpred */

I rewritten to make sense, maybe.

> And again,
> there is more work to do to add a field to such a struct, eg outfuncs.c.
> Usually a good way to find all the places to touch is to grep for some of
> the existing field names in the struct.

Sorry, I just forgot of that. (In spite that I myself give such
kind of comments..) Yeah, I love find-grep on emacs.

By the way, I found this comment in copyfuncs.c but I couldn't
find the "subsidiary structs".

| * We don't support copying RelOptInfo, IndexOptInfo, or Path nodes.
| * There are some subsidiary structs that are useful to copy, though.

Finally, all I added for this was one line in _outIndexOptInfo.

> * I don't much care for the field name "indrinfos"; it's neither very
> readable nor descriptive. Don't have a better suggestion right now
> though.

I agree with you. I didn't like the name so I rethought that. I
followed the seeming rule that prefixing with 'ind' to the field
name, but it is not for index, but for the parent relation. So I
renamed it as "baserestrictinfo" in this version.

> * Not sure if new regression test cases would be appropriate. The changes
> in the existing cases seem a bit unfortunate actually; I'm afraid that
> this may be defeating the original intent of those tests.

Only aggregates.out is modifed in this patch. The comment for the
test says that,

> --
> -- Test cases that should be optimized into indexscans instead of
> -- the generic aggregate implementation.
...
> -- try it on an inheritance tree
...
> explain (costs off)
> select min(f1), max(f1) from minmaxtest;

and

> -- DISTINCT doesn't do anything useful here, but it shouldn't fail
> explain (costs off)
> select distinct min(f1), max(f1) from minmaxtest;

Utterly no problem from the point of the comment. Although this
patch removes "Index Cond"s for the index minmaxtest3i, it is
simplly caused by a index predicate on the index, which is the
very result of this patch.

> I'm setting this back to Waiting on Author.

Attached the new version v8.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
partial-index-only-scan-v8.patch text/x-patch 13.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2016-03-09 08:30:44 Re: Optimization for updating foreign tables in Postgres FDW
Previous Message Etsuro Fujita 2016-03-09 08:20:26 Re: Minor documentation tweak to GetForeignPlan documentation