From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | anisimow(dot)d(at)gmail(dot)com |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17462: Invalid memory access in heapam_tuple_lock |
Date: | 2022-04-11 15:55:19 |
Message-ID: | 286748.1649692519@sss.pgh.pa.us |
Lists: | pgsql-bugs |

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> When running parallel queries using pgbench with valgrind-enabled server:
> pgbench -i -s 1
> pgbench -t 1000 -c 10 -j 10
> I get:
> ==00:00:03:09.642 456530== Invalid read of size 2

Reproduced here. It's surprising that nobody noticed this before,
because AFAICS the bug is pretty old: it dates to somebody foolishly
deciding that heap_fetch didn't need its keep_buf argument, which
evidently happened in v12 (didn't track down the exact commit yet).
As you say, valgrind would not have caught this problem before
1e0dfd166, but that's not so new anymore either.

In principle, this is showing an actual bug, because once we drop
the buffer pin somebody could replace the page before we get done
examining the tuple. I'm not sure what the odds are of that happening
in the field, but they're probably mighty low because a just-accessed
buffer should not be high priority for replacement.

My inclination for a fix is to revert the removal of the keep_buf argument
and go back to having heapam_tuple_lock and other callers release the
buffer after they are done. However, that's problematic in released
branches, because it seems likely that there are outside callers of
heap_fetch. Can we get away with only fixing this in HEAD?
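
To make the hazard and the proposed calling convention concrete, here is a
toy model in C. It is only a sketch and not PostgreSQL code: ToyBuffer,
toy_read_buffer, toy_release_buffer, toy_fetch and toy_tuple_survives are
hypothetical names, and the "evict the first unpinned slot" policy is an
assumption chosen to make the race deterministic (a real buffer manager
would be much less eager to recycle a just-accessed buffer).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NBUFFERS    2
#define INVALID_BUF (-1)

/* Hypothetical stand-in for a shared buffer: one page, one pin count. */
typedef struct
{
    int  page_no;    /* page currently loaded, -1 if none */
    int  pin_count;  /* slot may be recycled only when this is 0 */
    char data[16];   /* "tuple" contents */
} ToyBuffer;

static ToyBuffer pool[NBUFFERS];

static void
toy_pool_init(void)
{
    for (int i = 0; i < NBUFFERS; i++)
    {
        pool[i].page_no = -1;
        pool[i].pin_count = 0;
        pool[i].data[0] = '\0';
    }
}

/* Pin the buffer holding page_no, loading it into an unpinned slot if needed. */
static int
toy_read_buffer(int page_no)
{
    for (int i = 0; i < NBUFFERS; i++)
        if (pool[i].page_no == page_no)
        {
            pool[i].pin_count++;
            return i;
        }
    for (int i = 0; i < NBUFFERS; i++)
        if (pool[i].pin_count == 0)
        {
            /* Assumed policy: recycle the first unpinned slot. */
            pool[i].page_no = page_no;
            pool[i].pin_count = 1;
            snprintf(pool[i].data, sizeof(pool[i].data), "page %d", page_no);
            return i;
        }
    return INVALID_BUF;     /* every slot is pinned */
}

static void
toy_release_buffer(int buf)
{
    assert(pool[buf].pin_count > 0);
    pool[buf].pin_count--;
}

/*
 * Return a pointer into the buffer.  With keep_buf the pin is handed to the
 * caller via *userbuf and the caller must release it; without keep_buf the
 * pin is dropped here, so the returned pointer is no longer protected.
 */
static const char *
toy_fetch(int page_no, bool keep_buf, int *userbuf)
{
    int buf = toy_read_buffer(page_no);

    *userbuf = INVALID_BUF;
    if (buf == INVALID_BUF)
        return NULL;
    if (keep_buf)
        *userbuf = buf;
    else
        toy_release_buffer(buf);
    return pool[buf].data;
}

/* Does the fetched tuple still read "page 1" after other pages are loaded? */
static bool
toy_tuple_survives(bool keep_buf)
{
    int buf;

    toy_pool_init();
    const char *tuple = toy_fetch(1, keep_buf, &buf);

    /* Simulated concurrent activity wanting two other pages. */
    int b2 = toy_read_buffer(2);
    int b3 = toy_read_buffer(3);

    bool ok = (tuple != NULL && strcmp(tuple, "page 1") == 0);

    /* Caller-side cleanup: release whatever pins we still hold. */
    if (buf != INVALID_BUF)
        toy_release_buffer(buf);
    if (b2 != INVALID_BUF)
        toy_release_buffer(b2);
    if (b3 != INVALID_BUF)
        toy_release_buffer(b3);
    return ok;
}
```

In this model toy_tuple_survives(true) returns true, because the caller's pin
keeps the slot from being recycled until the caller releases it, while
toy_tuple_survives(false) returns false, because the slot is reused for page 2
while the caller is still looking at the old pointer.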
regards, tom lane