| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
| Cc: | Julien Tachoires <julien(at)tachoires(dot)me>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Qual push down to table AM |
| Date: | 2025-12-09 23:08:00 |
| Message-ID: | mmvp5gcbwmjpl2bb7e3qytam3iy3wonpz26djrge5fcyyqnrui@dckh6eggdqlu |
| Lists: | pgsql-hackers |
Hi,
On 2025-12-09 16:40:17 -0500, Robert Haas wrote:
> On Fri, Aug 29, 2025 at 4:38 AM Julien Tachoires <julien(at)tachoires(dot)me> wrote:
> Potentially, there could be a performance problem
I think the big performance hazard with this is repeated deforming. The
scankey infrastructure deforms attributes one-by-one *and* it does not
"persist" the work of deforming for later accesses. So if you e.g. have
something like
SELECT sum(col_29) FROM tbl WHERE col_30 = common_value;
or
SELECT * FROM tbl WHERE col_30 = common_value;
we'll now deform col_30 in isolation for the ScanKey evaluation and then we'll
deform columns 1-29 in the slot (because we always deform all the leading
columns), during projection.
But even leaving the slot issue aside, I'd bet that you'll see overhead due to
*not* deforming multiple columns at once. If you have a ScanKey version of
something like
WHERE column_20 = common_val AND column_21 = some_val AND column_22 = another_val;
and there's a NULL or varlena value in one of the leading columns, we'll redo
a fair bit of work during the fastgetattr() for column_22.
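To put rough numbers on that, here is a toy model, not PostgreSQL code. It assumes that once offsets cannot be cached, every single-attribute fetch walks the tuple from attribute 0, and it counts the attributes stepped over. ToyTuple, toy_fetch_one() and toy_deform_upto() are hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ATTS 64

/* Hypothetical tuple: only the per-attribute byte lengths matter here. */
typedef struct ToyTuple
{
	int			natts;
	int			attlen[TOY_MAX_ATTS];
} ToyTuple;

/*
 * One-at-a-time access: walk from attribute 0 up to 'target' (0-based) to
 * find its offset. Assumes no offset can be reused between calls.
 */
static size_t
toy_fetch_one(const ToyTuple *tup, int target, long *steps)
{
	size_t		off = 0;

	assert(target >= 0 && target < tup->natts);
	for (int i = 0; i < target; i++)
	{
		off += (size_t) tup->attlen[i];
		(*steps)++;
	}
	return off;
}

/* Single pass: compute the offsets of attributes 0..upto exactly once. */
static void
toy_deform_upto(const ToyTuple *tup, int upto, size_t *offsets, long *steps)
{
	size_t		off = 0;

	assert(upto >= 0 && upto < tup->natts);
	for (int i = 0; i <= upto; i++)
	{
		offsets[i] = off;
		off += (size_t) tup->attlen[i];
		(*steps)++;
	}
}

/* Steps needed when each key attribute is fetched in isolation. */
static long
toy_steps_one_by_one(const ToyTuple *tup, const int *keys, int nkeys)
{
	long		steps = 0;

	for (int k = 0; k < nkeys; k++)
		(void) toy_fetch_one(tup, keys[k], &steps);
	return steps;
}

/* Steps needed when all key attributes come out of one deforming pass. */
static long
toy_steps_single_pass(const ToyTuple *tup, const int *keys, int nkeys)
{
	size_t		offsets[TOY_MAX_ATTS];
	long		steps = 0;
	int			max = 0;

	assert(nkeys > 0);
	for (int k = 0; k < nkeys; k++)
		if (keys[k] > max)
			max = keys[k];
	toy_deform_upto(tup, max, offsets, &steps);
	return steps;
}
```

For three keys on 0-based attributes 19, 20 and 21, the one-by-one variant walks 19 + 20 + 21 = 60 attributes in this model, while the single pass walks 22.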
I don't really see this being viable without first tackling two nontrivial
projects:
1) Make slot deforming for expressions & projections selective, i.e. don't
deform all the leading columns, but only ones that will eventually be
needed
2) Perform ScanKey evaluation in slot form, to be able to cache the deforming
and to make deforming of multiple columns sufficiently efficient.
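A minimal sketch of what "caching the deforming" in 2) could look like, loosely modeled on how a slot tracks the number of already-deformed attributes. This is an illustration under assumptions, not a proposed patch: SketchSlot and the sketch_slot_* functions are hypothetical, and the tuple is reduced to a list of attribute lengths.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_ATTS 64

/* Hypothetical slot that remembers how far deforming has progressed. */
typedef struct SketchSlot
{
	int			natts;
	const int  *attlen;			/* byte length of each attribute */
	int			nvalid;			/* attributes 0..nvalid-1 are deformed */
	size_t		next_off;		/* offset where attribute 'nvalid' starts */
	size_t		offsets[SKETCH_MAX_ATTS];
	long		steps;			/* attributes walked so far, for comparison */
} SketchSlot;

/* Point the slot at a new tuple and forget all cached deforming work. */
static void
sketch_slot_store(SketchSlot *slot, const int *attlen, int natts)
{
	assert(natts > 0 && natts <= SKETCH_MAX_ATTS);
	slot->natts = natts;
	slot->attlen = attlen;
	slot->nvalid = 0;
	slot->next_off = 0;
	slot->steps = 0;
}

/*
 * Ensure the first 'upto' attributes are deformed. Work resumes where the
 * previous call stopped, so later callers (e.g. projection after qual
 * evaluation) pay nothing for attributes that are already available.
 */
static void
sketch_slot_getsomeattrs(SketchSlot *slot, int upto)
{
	assert(upto >= 0 && upto <= slot->natts);
	while (slot->nvalid < upto)
	{
		int			i = slot->nvalid;

		slot->offsets[i] = slot->next_off;
		slot->next_off += (size_t) slot->attlen[i];
		slot->nvalid++;
		slot->steps++;
	}
}
```

In this model, evaluating a qual on the 30th column deforms 30 attributes once, and a later projection of the first 29 columns adds no further steps. Making the deforming selective, as in 1), would be a separate change on top of this.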
> So, somewhat to my surprise, I think that v4-0001 might be basically
> fine. I wonder if anyone else sees a problem that I'm missing?
I doubt this would be safe as-is: ISTM that if you release the page lock
between tuples, things like the number of items on the page can change. But we
keep copies of such values in registers / on the stack, and those copies go
stale if the page changes while the lock is not held.
We could refetch the number of items on the page for every loop iteration, but
that'd probably not be free. OTOH, it's probably nothing compared to the cost
of relocking the page...
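A toy illustration of the stale-count hazard and of the refetch variant. This is not heapam code and uses no real locking: ToyPage, the toy_* functions and the hook that changes the item count while the "lock" is released are all hypothetical stand-ins for concurrent activity.

```c
#include <assert.h>

/* Hypothetical page: just an item count and a lock flag. */
typedef struct ToyPage
{
	int			nitems;
	int			locked;
} ToyPage;

/* Called while the page is unlocked; stands in for a concurrent backend. */
typedef void (*toy_unlocked_hook) (ToyPage *page, int pos);

static void
toy_lock(ToyPage *page)
{
	assert(!page->locked);
	page->locked = 1;
}

static void
toy_unlock(ToyPage *page)
{
	assert(page->locked);
	page->locked = 0;
}

/* Example hook: one item is added while the first tuple is being processed. */
static void
toy_grow_hook(ToyPage *page, int pos)
{
	if (pos == 0)
		page->nitems++;
}

/* Item count read once under the first lock: goes stale after a relock. */
static int
toy_scan_stale(ToyPage *page, toy_unlocked_hook hook)
{
	int			visited = 0;
	int			nitems;

	toy_lock(page);
	nitems = page->nitems;		/* cached copy, never refreshed */
	for (int pos = 0; pos < nitems; pos++)
	{
		visited++;
		toy_unlock(page);
		hook(page, pos);
		toy_lock(page);
	}
	toy_unlock(page);
	return visited;
}

/* Item count re-read from the page on every iteration, after relocking. */
static int
toy_scan_refetch(ToyPage *page, toy_unlocked_hook hook)
{
	int			visited = 0;

	toy_lock(page);
	for (int pos = 0; pos < page->nitems; pos++)
	{
		visited++;
		toy_unlock(page);
		hook(page, pos);
		toy_lock(page);
	}
	toy_unlock(page);
	return visited;
}
```

Starting from four items, the stale variant visits four and never notices the change, while the refetch variant visits five. The sketch only covers the item count; any other page state copied to locals before the lock was released would need the same treatment.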
Greetings,
Andres Freund