| From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
|---|---|
| To: | Julien Tachoires <julien(at)tachoires(dot)me> |
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Qual push down to table AM |
| Date: | 2025-12-09 21:40:17 |
| Message-ID: | CA+TgmoZhzRxZ9RWsTpUQiw+zNGUtNAFsFQDqr+W1dn2BJom7TQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Fri, Aug 29, 2025 at 4:38 AM Julien Tachoires <julien(at)tachoires(dot)me> wrote:
> Thank you for this quick feedback.
>
> One potential approach to solve this in heapgettup() would be:
> 1. hold the buffer lock
> 2. get the tuple from the buffer
> 3. if the tuple is not visible, move to the next tuple, back to 2.
> 4. release the buffer lock
> 5. if the tuple does not satisfy the scan keys, take the buffer lock,
> move to the next tuple, back to 2.
> 6. return the tuple
>
> Do you see something fundamentally wrong here?
I spent a bit of time this afternoon looking at v4-0001. I noticed a
few spelling mistakes (abritrary x2, statisfied x1). As far as the
basic approach is concerned, I don't see how there can be a safety
problem here. If it's safe to release the buffer lock when we find a
tuple that matches the quals, for the purposes of returning that tuple
to the caller, then it seems like it must also be safe to release it
to evaluate a proposed qual.
Potentially, there could be a performance problem. Imagine that we
have some code right now that uses this code path and it's safe
because the qual that we're evaluating is something super-simple like
the integer less-than operator, so calling it under the buffer lock
doesn't create a stability hazard. Well, with the patch, we'd
potentially take and release the buffer lock a lot more times than we
do right now. Imagine that there are lots of tuples on each page but
only 1 or very few of them satisfy the qual: then we lock and unlock
the buffer a whole bunch of times instead of just once.
However, I don't think this really happens in practice. I believe it's
possible to take this code path if you set ignore_system_indexes=on,
because that turns index scans --- which, not surprisingly, have
scankeys --- into sequential scans which then end up also having
scankeys. Many of those scans use catalog snapshots so there's no
issue, but a little bit of debugging code seems to show that
systable_beginscan() can also be called with snapshot->snapshot_type
set to SNAPSHOT_ANY or SNAPSHOT_DIRTY. For example, see
GetNewOidWithIndex(). However, even if ignore_system_indexes=on gets a
little slower as a result of this or some other patch, I don't think
we really care, and without that setting, this code doesn't seem to
get exercised at all.
So, somewhat to my surprise, I think that v4-0001 might be basically
fine. I wonder if anyone else sees a problem that I'm missing?
--
Robert Haas
EDB: http://www.enterprisedb.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Melanie Plageman | 2025-12-09 21:42:21 | Re: Trying out read streams in pgvector (an extension) |
| Previous Message | Andres Freund | 2025-12-09 21:26:42 | Re: [PATCH] VACUUM: avoid pre-creation transactions holding back cleanup of newly created relations |