Re: [WIP] Better partial index-only scans

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joshua Yanovski <pythonesque(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [WIP] Better partial index-only scans
Date: 2014-06-30 16:17:48
Lists: pgsql-hackers

Joshua Yanovski <pythonesque(at)gmail(dot)com> writes:
> Proof of concept initial patch for enabling index only scans for
> partial indices even when an attribute is not in the target list, as
> long as it is only used in restriction clauses that can be proved by
> the index predicate. This also works for index quals, though they
> still can't be used in the target list. However, this patch may be
> inefficient since it duplicates effort that is currently delayed until
> after the best plan is chosen.

I took a quick look at this. I think it's logically incorrect to exclude
Vars used only in index quals from the set that the index has to return,
since you can't know at this stage whether the index is lossy (ie, might
report xs_recheck = TRUE at runtime). While this is moot for btree
indexes, it's not moot for SP-GiST indexes, which also support index-only
scans today.
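
To make the recheck concern concrete, here is a toy model in Python. It is
not PostgreSQL code: IndexMatch, index_only_rows, and the dictionary of
returned columns are all hypothetical names invented for illustration, and
the recheck field only stands in for xs_recheck. The sketch assumes an
index-only scan can see nothing but the columns the index hands back, so a
match flagged for recheck can be re-evaluated only if the qual's column is
among them.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Iterator


@dataclass
class IndexMatch:
    """One entry reported by a (hypothetical) index scan."""
    returned: Dict[str, Any]  # column values the index hands back
    recheck: bool             # stands in for xs_recheck


def index_only_rows(
    matches: Iterable[IndexMatch],
    qual_column: str,
    qual: Callable[[Any], bool],
) -> Iterator[Dict[str, Any]]:
    """Yield rows using only what the index returned (no heap visit)."""
    for m in matches:
        if m.recheck:
            # A lossy match must be re-tested, which needs the column value.
            if qual_column not in m.returned:
                raise LookupError(
                    f"cannot recheck qual on {qual_column!r}: "
                    "column not returned by index"
                )
            if not qual(m.returned[qual_column]):
                continue  # false positive from the lossy index
        yield m.returned
```

With exact matches (recheck always false) the qual column is never consulted,
which is the btree situation. As soon as one match asks for a recheck, the
column has to be available, and the planner cannot know in advance which of
the two cases it will get.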

In principle we could extend the AM and opclass API and demand that AMs
tell us whether they might return xs_recheck = TRUE. However, I'm pretty
hesitant to change the opclass APIs for this purpose; it'd likely break
third-party code. Moreover, an advantage of confining the patch to
considering only partial-index quals is that you could skip all the added
work for non-partial indexes, which would probably largely solve the
added-planning-time problem.
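
A rough sketch of that narrower rule, again as a toy Python model rather than
planner code. Every name here is an assumption made for illustration:
columns_index_must_return, can_use_index_only_scan, the (text, columns)
clause representation, and the implied_by callback, which merely stands in
for the planner's predicate-proof machinery. Columns used in index quals
always stay in the required set, and the proof step is skipped entirely when
the index has no predicate.

```python
from typing import Callable, FrozenSet, Iterable, Optional, Set, Tuple

# A clause is modeled as (text, columns it references).
Clause = Tuple[str, FrozenSet[str]]


def columns_index_must_return(
    target_cols: Iterable[str],
    indexqual_cols: Iterable[str],
    other_clauses: Iterable[Clause],
    index_predicate: Optional[str],
    implied_by: Callable[[str, str], bool],
) -> Set[str]:
    """Columns the index has to supply for an index-only scan."""
    # Index-qual columns are kept: the index might ask for a recheck.
    needed = set(target_cols) | set(indexqual_cols)
    for text, cols in other_clauses:
        # Only partial indexes pay for the proof attempt.
        if index_predicate is not None and implied_by(index_predicate, text):
            continue  # clause is guaranteed by the predicate; drop its columns
        needed |= cols
    return needed


def can_use_index_only_scan(
    index_cols: Iterable[str],
    target_cols: Iterable[str],
    indexqual_cols: Iterable[str],
    other_clauses: Iterable[Clause],
    index_predicate: Optional[str],
    implied_by: Callable[[str, str], bool],
) -> bool:
    needed = columns_index_must_return(
        target_cols, indexqual_cols, other_clauses, index_predicate, implied_by
    )
    return needed <= set(index_cols)
```

For example, with an index on (a) whose predicate is "b > 0", a query that
selects a and filters on "b > 0" needs only a from the index. Without a
predicate, implied_by is never called, so non-partial indexes incur no extra
proof work in this model.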

> It also includes a minor
> fix in the same code in createplan.c to make sure we're explicitly
> comparing an empty list to NIL, but I can take that out if that's not
> considered in scope.

I don't think the existing code is poor style there. There are certainly
hundreds of other cases where we treat "!= NULL" or "!= NIL" as implicit
(though of course also other places where we don't).

> ... as I see it performance could improve in any combination of five
> ways:
> * Improve the performance of determining which clauses can't be
> discarded (e.g. precompute some information about equivalence classes
> for index predicates, mess around with the order in which we check the
> clauses to make it fail faster, switch to real union-find data
> structures for equivalence classes).

This is certainly possible, though rather open-ended, and it's not clear
that it really fixes the objection (ie, if you speed these things up then
you still have a performance discrepancy from adding the tests earlier).

> * Take advantage of work we do here to speed things up elsewhere (e.g.
> if this does get chosen as the best plan we don't need to recompute
> the same information in create_indexscan_plan).

That would likely be worth doing if we do this, but it will only buy
back a small part of the cost, since the whole problem here is we'd
be doing this work for all indexes and not only the eventually selected
one.

> * Delay determining whether to use an index scan or index only scan
> until after cost analysis somehow. I'm not sure exactly what this
> would entail.

That seems impossible to me, since the whole point of an index-only
scan is that it's a lot cheaper than a regular one.

regards, tom lane
