Re: Replace buffer I/O locks with condition variables (reviving an old patch)

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Replace buffer I/O locks with condition variables (reviving an old patch)
Date: 2021-03-09 05:25:15
Message-ID: 20210309052515.rupkmtydkzknhwk6@nol
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 08, 2021 at 06:10:36PM +1300, Thomas Munro wrote:
> On Fri, Mar 5, 2021 at 12:12 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > On Fri, Feb 26, 2021 at 7:08 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > > Back in 2016, Robert Haas proposed to replace I/O locks with condition
> > > variables[1]. Condition variables went in and have found lots of
> > > uses, but this patch to replace a bunch of LWLocks and some busy
> > > looping did not. Since then, it has been tested quite a lot as part
> > > of the AIO project[2], which currently depends on it. That's why I'm
> > > interested in following up now. I asked Robert if he planned to
> > > re-propose it and he said I should go for it, so... here I go.
> >
> > I removed a redundant (Size) cast, fixed the wait event name and
> > category (WAIT_EVENT_BUFFILE_XXX is for buffile.c stuff, not bufmgr.c
> > stuff, and this is really an IPC wait, not an IO wait despite the
> > name), updated documentation and pgindented.
>
> More review and some proposed changes:
>
> The old I/O lock array was the only user of struct
> LWLockMinimallyPadded, added in commit 6150a1b08a9, and it seems kinda
> strange to leave it in the tree with no user. Of course it's remotely
> possible there are extensions using it (know of any?). In the
> attached, I've ripped that + associated commentary out, because it's
> fun to delete dead code. Objections?

None from me. I don't know of any extension relying on it, and neither does
codesearch.debian.net. I would be surprised to see any extension actually
relying on that anyway.

> Since the whole reason for that out-of-line array in the first place
> was to keep BufferDesc inside one cache line, and since it is in fact
> possible to put a new condition variable into BufferDesc without
> exceeding 64 bytes on a 64 bit x86 box, perhaps we should just do that
> instead? I haven't yet considered other architectures or potential
> member orders.

+1 for adding the cv into BufferDesc. That brings the struct size to exactly
64 bytes on x86 64 bits architecture. This won't add any extra overhead to
LOCK_DEBUG cases, as it was already exceeding the 64B threshold, if that even
was a concern.

> I wonder if we should try to preserve user experience a little harder,
> for the benefit of people who have monitoring queries that look for
> this condition. Instead of inventing a new wait_event value, let's
> just keep showing "BufferIO" in that column. In other words, the
> change is that wait_event_type changes from "LWLock" to "IPC", which
> is a pretty good summary of this patch. Done in the attached. Does
> this make sense?

I think it does make sense, and it's good to preserve this value.

Looking at the patch itself, I don't have much to add it all looks sensible and
I agree with the arguments in the first mail. All regression tests pass and
documentation builds.

I'm marking this patch as RFC.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-03-09 05:31:30 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message Peter Geoghegan 2021-03-09 05:21:54 Re: New IndexAM API controlling index vacuum strategies