Re: Reducing WaitEventSet syscall churn

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: thomas(dot)munro(at)gmail(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Reducing WaitEventSet syscall churn
Date: 2020-03-09 23:19:24
Message-ID: 20200310.081924.1757484278318516911.horikyota.ntt@gmail.com
Lists: pgsql-hackers

Hello.

I took a look at this.

At Thu, 27 Feb 2020 12:17:45 +1300, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote in
> On Sat, Feb 8, 2020 at 10:15 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > > > Here are some patches to get rid of frequent system calls.
>
> Here's a new version of this patch set. It gets rid of all temporary
> WaitEventSets except a couple I mentioned in another thread[1].
> WaitLatch() uses CommonWaitSet, and calls to WaitLatchOrSocket() are
> replaced by either the existing FeBeWaitSet (walsender, gssapi/openssl
> auth are also candidates) or a special purpose long lived WaitEventSet
> (replication, postgres_fdw, pgstats). It passes make check-world with
> WAIT_USE_POLL, WAIT_USE_KQUEUE, WAIT_USE_EPOLL, all with and without
> -DEXEC_BACKEND, and make check with WAIT_USE_WIN32 (Appveyor).
>
> 0001: "Don't use EV_CLEAR for kqueue events."
>
> This fixes a problem in the kqueue implementation that only shows up
> once you switch to long lived WaitEventSets. It needs to be
> level-triggered like the other implementations, for example because
> there's a place in the recovery tests where we wait twice in a row
> without trying to do I/O in between. (This is a bit like commit
> 3b790d256f8 that fixed a similar problem on Windows.)

This looks fine in light of the kqueue documentation.
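
For reference, with EV_CLEAR dropped the registration becomes a plain
level-triggered EV_ADD, roughly like this (only a sketch; fd, udata
and kqueue_fd stand in for whatever the real code uses):

    #include <sys/event.h>

    struct kevent kev;

    /* Level-triggered read readiness: EV_ADD alone, no EV_CLEAR in flags. */
    EV_SET(&kev, fd, EVFILT_READ, EV_ADD, 0, 0, udata);
    (void) kevent(kqueue_fd, &kev, 1, NULL, 0, NULL);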

> 0002: "Use a long lived WaitEventSet for WaitLatch()."
>
> In the last version, I had a new function WaitMyLatch(), but that
> didn't help with recoveryWakeupLatch. Switching between latches
> doesn't require a syscall, so I didn't want to have a separate WES and
> function just for that. So I went back to using plain old
> WaitLatch(), and made it "modify" the latch every time before waiting
> on CommonWaitSet. An alternative would be to get rid of the concept
> of latches other than MyLatch, and change the function to
> WaitMyLatch(). A similar thing happens for exit_on_postmaster_death,
> for which I didn't want to have a separate WES, so I just set that
> flag every time. Thoughts?

It is surely an improvement over creating a full-fledged WES every
time. The name CommonWaitSet gives the impression that it is used for
a variety of wait event sets, but it is used only by WaitLatch(), so I
would name it LatchWaitSet. With that name, I wouldn't be surprised
that the function addresses the WL_LATCH_SET entry by index 0, without
further explanation, when calling ModifyWaitEvent().
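
To make the suggestion concrete, here is roughly the WaitLatch() shape
I have in mind (just a sketch; "LatchWaitSet" and the return-value
handling are only for illustration, and exit_on_postmaster_death
handling is omitted):

    int
    WaitLatch(Latch *latch, int wakeEvents, long timeout,
              uint32 wait_event_info)
    {
        WaitEvent   event;

        /* Re-point slot 0 of the long-lived set at the caller's latch. */
        ModifyWaitEvent(LatchWaitSet, 0, WL_LATCH_SET, latch);

        if (WaitEventSetWait(LatchWaitSet,
                             (wakeEvents & WL_TIMEOUT) ? timeout : -1,
                             &event, 1, wait_event_info) == 0)
            return WL_TIMEOUT;

        return event.events;
    }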

@@ -700,7 +739,11 @@ FreeWaitEventSet(WaitEventSet *set)
ReleaseExternalFD();
#elif defined(WAIT_USE_KQUEUE)
close(set->kqueue_fd);
- ReleaseExternalFD();
+ if (set->kqueue_fd >= 0)
+ {
+ close(set->kqueue_fd);
+ ReleaseExternalFD();
+ }

Did you forget to remove the close() outside the if block? And don't
we need the same treatment for epoll_fd as for kqueue_fd?
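
In other words, I would expect both branches to end up like this
(sketch, assuming the fds are -1 when they are not open):

    #if defined(WAIT_USE_EPOLL)
        if (set->epoll_fd >= 0)
        {
            close(set->epoll_fd);
            ReleaseExternalFD();
        }
    #elif defined(WAIT_USE_KQUEUE)
        if (set->kqueue_fd >= 0)
        {
            close(set->kqueue_fd);
            ReleaseExternalFD();
        }
    #endif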

WaitLatch() is documented as "If the latch is already set (and
WL_LATCH_SET is given), the function returns immediately.", but now
the function reacts to the latch even if WL_LATCH_SET is not given. I
think it is in fact always given, so the Assert and the function
comment should be adjusted to follow this change.
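
Something like this at the top of WaitLatch() (sketch; the comment
wording is of course up to you):

    /*
     * The long-lived wait set always contains the latch, so callers
     * are now required to ask for WL_LATCH_SET.
     */
    Assert(wakeEvents & WL_LATCH_SET);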

> 0003: "Use regular WaitLatch() for condition variables."
>
> That mechanism doesn't need its own WES anymore.

Looks fine.
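
(I take it the sleep loop in ConditionVariableTimedSleep() boils down
to roughly the following; ret, cur_timeout and wait_event_info are
just the surrounding local variables and parameters:)

    ret = WaitLatch(MyLatch,
                    WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                    cur_timeout, wait_event_info);
    ResetLatch(MyLatch);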

> 0004: "Introduce RemoveWaitEvent()."
>
> We'll need that to be able to handle connections that are closed and
> reopened under the covers by libpq (replication, postgres_fdw). We
> also wanted this on a couple of other threads for multiplexing FDWs,
> to be able to adjust the wait set dynamically for a proposed async
> Append feature.
>
> The implementation is a little naive, and leaves holes in the
> "pollfds" and "handles" arrays (for poll() and win32 implementations).
> That could be improved with a bit more footwork in a later patch.
>
> XXX The Windows version is based on reading documentation. I'd be
> very interested to know if check-world passes (especially
> contrib/postgres_fdw and src/test/recovery). Unfortunately my
> appveyor.yml fu is not yet strong enough.

I couldn't find the documentation about passing INVALID_HANDLE_VALUE
in lpHandles. Could you give me the URL for that?

I didn't run recoverycheck because I couldn't install IPC::Run for
ActivePerl, but contribcheck succeeded.

+ for (int i = 0; i < nevents; ++i)
+ set->handles[i + 1] = INVALID_HANDLE_VALUE;

The last iteration writes set->handles[nevents], which overruns the
array.

+ /* Set up the free list. */
+ for (int i = 0; i < nevents; ++i)
+ set->events[i].next_free = i + 1;
+ set->events[nevents - 1].next_free = -1;

It sets the last element twice (harmless, but useless).
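
For example (assuming nevents is always at least 1 here):

    /* Set up the free list; the last slot terminates it. */
    for (int i = 0; i < nevents - 1; ++i)
        set->events[i].next_free = i + 1;
    set->events[nevents - 1].next_free = -1;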

set->handles = (HANDLE) data;
data += MAXALIGN(sizeof(HANDLE) * nevents);

This is not an issue with this patch, but does handles need
nevents + 1 elements?

WaitEventSetSize() does not check its parameter's range.

I'll continue reviewing in a later mail.

> 0005: "libpq: Add PQsocketChangeCount to advertise socket changes."
....

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
