Re: execExprInterp() questions / How to improve scalar array op expr eval?

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: execExprInterp() questions / How to improve scalar array op expr eval?
Date: 2020-04-13 14:40:52
Message-ID: CAAaqYe---vhjWyShvaJk5S=L60T2hiRiXmuZM9Ps7QKhpq+7Ug@mail.gmail.com
Lists: pgsql-hackers

I've read through all of the previous discussions related to stable
subexpression caching, and I'm planning to send a summary email with
all of those links in one place.

But I also happened to stumble upon a mention in the TODO of some email
discussion way back in 2007 where Tom suggested [1] we should really
try planning scalar array ops (particularly those with large IN lists)
as `IN (VALUES ...)`.

That would actually solve the specific case where I ran into this
problem (a seq scan filtered by an IN expression over a large constant
array). Ideally, queries of forms like:
select * from t where a in (1, 2,...)
select * from t where a in ((select i from x))
would always be isomorphic in planning. But thinking about this
overnight and scanning through things quickly this morning, I have a
feeling that'd be 1.) a pretty significant undertaking, and 2.) likely
to explode the number of plans considered.
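
As a rough illustration of why the `IN (VALUES ...)` form is attractive,
here is a minimal Python sketch. It is not PostgreSQL code, the function
names are hypothetical, and it ignores NULL semantics. It only contrasts
a per-row linear search through the constant list (roughly what
evaluating a scalar array op per tuple amounts to) with building a hash
table once and probing it per row (roughly what a hashed semi-join
against a VALUES list would allow).

```python
def filter_linear(rows, in_list):
    """Per-row linear search: about len(rows) * len(in_list) comparisons."""
    return [r for r in rows if any(r == v for v in in_list)]


def filter_hashed(rows, in_list):
    """Build a hash set once, then probe it per row."""
    lookup = set(in_list)  # one-time build cost, duplicates collapse
    return [r for r in rows if r in lookup]


# Both strategies must return the same rows; only the cost differs.
rows = list(range(20))
in_list = [3, 7, 7, 11, 42]
assert filter_linear(rows, in_list) == filter_hashed(rows, in_list)
```

With a few thousand constants and a large table the first form does
work proportional to the product of the two sizes, while the second
does work proportional to their sum.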

Also I don't know if there's a good place to slot that into planning.
Do either of you happen to have any pointers into places that do
similar kinds of rewrites I could look at? And in those cases do we
normally always rewrite or do we consider both styles independently?

I suppose _only_ handling the case where an `IN (VALUES ...)` replaces
a seq scan with a scalar array op might be somewhat easier...but feels
like it leaves a lot of holes.

I'm still at the point where I'm trying to determine if any of the
above (subexpression caching, saop optimization only on constants,
re-planning as `IN (VALUES ...)`) is something reasonable enough
relative to the amount of effort to be worth working on.

James

[1]: https://www.postgresql.org/message-id/19001.1178823208%40sss.pgh.pa.us
