Re: Using Valgrind to detect faulty buffer accesses (no pin or buffer content lock held)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Using Valgrind to detect faulty buffer accesses (no pin or buffer content lock held)
Date: 2020-04-26 00:17:17
Message-ID: CAH2-WznfQiprP=-+Geuuy0zueZs86Gazgz+-65g-t6hsxN9kBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 24, 2020 at 6:37 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> The final two patches fix the two newly discovered bugs -- I don't
> have a fix for the first bug yet, since that one is more complicated
> (and probably more serious).

I pushed both of the two fixes that I posted yesterday -- fixes for
the benign "no buffer lock held" issues in nbtree page deletion. We
still need a fix for the more serious issue, though. And that will
need to be backpatched down to v11. I'll try to get to that next week.

Attached is v2 of the patch series. This version has a centralized
description of what we require from nbtree code above _bt_getbuf(),
alongside existing, similar commentary. Here's the general idea: If we
lock a buffer, that has to go through one of the wrapper functions
that knows about Valgrind in all cases. It's presented as a stricter
version of what happens in bufmgr.c for all buffers from all access
methods.

I also added something about how the nbtree Valgrind client requests
(those added by the second patch in the series) assume that the
bufmgr.c client requests (from the first patch) also take place. We
need to be able to rely on bufmgr.c to "clean up" in the event of an
error, so the second patch has a strict dependency on the first. If
the transaction aborts, we can rely on bufmgr.c marking the buffer's
page as defined when the backend acquires its first buffer pin on the
buffer in the next transaction (doesn't matter whether or not the same
block/page is in the same buffer as before). This is why we can't use
client requests for local buffers (not that we'd want to; the existing
aset.c Valgrind client requests take care of them automatically).

--
Peter Geoghegan

Attachment Content-Type Size
v2-0001-Add-Valgrind-buffer-access-instrumentation.patch application/octet-stream 2.3 KB
v2-0002-Add-nbtree-Valgrind-buffer-lock-checks.patch application/octet-stream 14.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-04-26 00:31:37 Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Previous Message James Coleman 2020-04-25 22:47:41 Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays