Re: Projection pushdown to index access method

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Chris Cleveland <ccleveland(at)dieselpoint(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Projection pushdown to index access method
Date: 2023-09-19 17:46:36
Message-ID: 387035.1695145596@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Sep 19, 2023 at 12:35 PM Chris Cleveland
> <ccleveland(at)dieselpoint(dot)com> wrote:
>> I'm working on an index access method. I have a function which can appear in a projection list which should be evaluated by the access method itself. Example:
>> ...
>> How do I get the system to pull the value from the index instead of trying to calculate it?

> I don't see how you can do this in general, because there's no
> guarantee that the plan will be an Index Scan or Index Only Scan
> instead of a Seq Scan or Bitmap Heap/Index Scan.

Yeah. There is some adjacent functionality for indexed expressions,
which maybe you could use, but it has a lot of shortcomings yet.
For example:

regression=# create or replace function f(x int) returns int as $$begin return x+1; end$$ language plpgsql strict immutable cost 1000;
CREATE FUNCTION
regression=# create table mytable (id int, x int);
CREATE TABLE
regression=# create index on mytable(x, f(x));
CREATE INDEX
regression=# set enable_seqscan TO 0;
SET
regression=# set enable_bitmapscan TO 0;
SET
regression=# explain verbose select f(x) from mytable;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Index Only Scan using mytable_x_f_idx on public.mytable (cost=0.15..5728.06 rows=2260 width=4)
Output: (f(x))
(2 rows)

If you examine the plan tree closely you can confirm that it is pulling
f(x) from the index rather than recomputing it. So maybe you could get
somewhere by pretending that my_special_function(body) is an indexed
expression. However, there are a couple of big gotchas, which this
example illustrates:

1. The index has to also provide x (or for you, "body") or else the
planner fails to detect that an IOS is applicable. This comes back
to the point Robert made about the planner preferring to think about
pulling individual Vars from tables: we don't believe the index is
usable in an IOS unless it provides all the Vars the query needs from
that table. This wouldn't be hard to fix exactly; the problem is to
fix it without spending exponential amounts of planning time in
check_index_only. We'd have to detect that all uses of "x" appear in
the context "f(x)" in order to realize that we don't need to be able
to fetch "x" itself.

2. Costing doesn't account for the fact that we've avoided runtime
computation of f(), thus the IOS plan may not be preferred over
other plan shapes, which is why I had to force it above. Again,
this is pretty closely tied to the fact that we don't recognize
until very late in the game that we can get f(x) from the index.

3. This only works for an index-only scan, not regular index scans.
There's some early discussion happening about unifying IOS and
regular scans a bit more, which perhaps would allow relaxing that
(and maybe even solve issue #1?). But it's a long way off yet.

If my_special_function() is supposed to always be applied to an
indexed column, then issue #1 would fortuitously not be a problem
for you. But #2 is a pain, and #3 might be a deal-breaker for you.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2023-09-19 18:05:41 Re: GenBKI emits useless open;close for catalogs without rows
Previous Message Matthias van de Meent 2023-09-19 17:33:22 Re: Disabling Heap-Only Tuples