Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)lists(dot)postgresql(dot)org, Richard Guo <guofenglinux(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
Date: 2024-01-10 18:00:01
Message-ID: 74335d44-819d-60e0-859a-efe74d9c84c2@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

08.01.2024 22:46, Robert Haas wrote:
> On Mon, Jan 8, 2024 at 3:00 AM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
>>> But that makes me wonder ... how exactly do oldslot
>>> and newslot end up sharing the same buffer without holding two pins on
>>> it? tts_heap_copyslot() looks like it basically forces the tuple to be
>>> materialized first and then copies that, so you'd end up with, I
>>> guess, no buffer pins at all, rather than 1 or 2.
>>>
>>> I'm obviously missing something important here...
>> As my analysis [2] showed, epqslot_clean is always equal to newslot, so
>> ExecCopySlot() isn't called at all.
>>
>> [1] https://www.postgresql.org/message-id/17798-0907404928dcf0dd%40postgresql.org
>> [2] https://www.postgresql.org/message-id/e989408c-1838-61cd-c968-1fcf47c6fbba%40gmail.com
> Sorry, I still don't get it. I can't really follow the analysis in
> [2]. Your analysis there relies on analyzing how
> ExecBRUpdateTriggers() gets called, but it seems to me that what
> matters is what happens inside that function, specifically what
> GetTupleForTrigger does. And I think that function is going to either
> produce an EPQ tuple or not depending on whether table_tuple_lock
> returns TM_Ok and whether it sets tmfd.traversed, independently of how
> ExecBRUpdateTriggers() is called.

Yes, GetTupleForTrigger() returns an EPQ tuple in our case (because of
concurrent update).

> Also, even if you're right that epqslot_clean always ends up equal to
> newslot, my question was about how oldslot and newslot end up sharing
> the same buffer without holding two pins on it, and I don't quite see
> what epqslot_clean has to do with that. Apologies if I'm being dense
> here; this code seems dramatically under-commented to me. But FWICT
> the design here is that a slot only holds a pin until somebody copies
> it or materializes it. So I don't understand what conceptually could
> happen that would end up with two slots holding the same pin, unless
> it was just that you had two variables holding the same pointer value.
> But if that's what is happening, then materializing one slot would
> also materialize the other.

As far as I can see, oldslot and newslot hold different pointer values;
e. g., with debug logging added (see attached (please forgive me dirty
hacks inside)), I get:
LOG:  ExecBRUpdateTriggers after ExecGetUpdateNewTuple()  1; oldslot: 0x758e3c8, slot->tts_tupleDescriptor: 0x7362fc8,
slot->buffer: 1810, slot->buffer->refcount: 1, slot->tts_values: 0x758e438, i: 1, slot->tts_values[ii]: 0x9982c74
STATEMENT:  UPDATE bruttest SET cnt = cnt + 1;
LOG:  ExecBRUpdateTriggers after ExecGetUpdateNewTuple()  1; newslot: 0x747fb48, slot->tts_tupleDescriptor: 0x7362fc8,
slot->buffer: 0, slot->buffer->refcount: 0, slot->tts_values: 0x747fbb8, i: 1, slot->tts_values[ii]: 0x9982c74
STATEMENT:  UPDATE bruttest SET cnt = cnt + 1;
==00:00:00:05.052 3934530== Invalid read of size 1
==00:00:00:05.052 3934530==    at 0x1EECAC: heap_compute_data_size (heaptuple.c:244)
...
==00:00:00:05.052 3934530==  Address 0x9982c74 is in a rw- anonymous segment
==00:00:00:05.052 3934530==
(when executing `make check` for the isolation test [1] under Valgrind)

But as we can see, both slots use the same buffer (have pointers to the
same attribute values) as a result of the following operations:
                 EEOP_SCAN_FETCHSOME:
                     slot_getsomeattrs(scanslot, op->d.fetch.last_var);
                 ...
                 EEOP_ASSIGN_SCAN_VAR:
                     resultslot->tts_values[resultnum] = scanslot->tts_values[attnum]
(More details upthread in [2])

In the meantime, newslot doesn't hold a pin at all.

[1] https://www.postgresql.org/message-id/950f4f1a-fb71-3e45-bb65-6a57da9f9b9e%40gmail.com
[2] https://www.postgresql.org/message-id/9f23591c-610a-60d4-2b29-26a860f7c3a8%40gmail.com

Best regards,
Alexander

Attachment Content-Type Size
dirty-debugging-17798.patch text/x-patch 2.8 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Peter Geoghegan 2024-01-10 19:06:42 Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
Previous Message Tom Lane 2024-01-10 16:45:29 Re: BUG #18281: Superuser can rename the schema with the prefix "pg_" (Applies to all versions of postgresql)