Re: pgsql: Add kqueue(2) support to the WaitEventSet API.

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>, Rémi Zara <remi_zara(at)mac(dot)com>
Subject: Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date: 2020-03-28 22:25:12
Message-ID: CA+hUKGLzaR5cV0EmZWoVXJDO_XwZpmpQX_sYwCBRE1qLBEcGPQ@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Sun, Mar 29, 2020 at 7:43 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > Pushed.
>
> prairiedog just turned up a different issue in this area [1].
> I wondered why it hadn't reported in for awhile, and upon
> investigation I found that the test run was stuck in the
> final pg_dump step of the pg_upgrade test. pg_dump itself
> was waiting for a query result, while the connected backend
> was sitting here:
>
> (gdb) bt
> #0 0x9002ec88 in kevent ()
> #1 0x0039cff8 in WaitEventSetWait (set=0x20c502c, timeout=-1, occurred_events=0xbfffdd4c, nevents=1, wait_event_info=100663296) at latch.c:1443
> #2 0x00261d98 in secure_read (port=0x2401ba0, ptr=0x713558, len=8192) at be-secure.c:184
> #3 0x00269d34 in pq_recvbuf () at pqcomm.c:950
> #4 0x00269e24 in pq_getbyte () at pqcomm.c:993
> #5 0x003cec2c in PostgresMain (argc=1, argv=0x38060ac, dbname=0x20c5154 "regression", username=0x20c5138 "buildfarm") at postgres.c:337
> #6 0x0032de0c in BackendStartup (port=0x2401ba0) at postmaster.c:4510
> #7 0x0032fcf8 in PostmasterMain (argc=1585338749, argv=0x5e7e59b9) at postmaster.c:1727
> #8 0x0026f32c in main (argc=6, argv=0x24009b0) at main.c:210
>
> It'd appear that we dropped an input-is-available condition.
>
> Now prairiedog is running a museum-grade macOS release, so
> it's hardly impossible that this is a kernel bug not a
> Postgres bug. But we shouldn't jump to that conclusion,
> either, given that our kevent support is so new.

My first thought was that it might have been due to the EV_CLEAR flag
problem discussed elsewhere, but the failing build has commit 9b8aa092
so that's not it.

About the kernel bug hypothesis: I see that the libevent project
doesn't use kqueue on early macOS versions due to some bug that it
tests for that apparently fails on 10.4/kernel 8.11 (what you have
there). Kqueue was added to macOS 10.3 (which pulled a bunch of code
from FreeBSD 5 including this), so in 10.4 I suppose it was still
somewhat new. I also found a few other vague complaints about bugs
from that era including some claims of missing events, but without
conclusions. The kernel source is mirrored on GitHub with change
history[1], but without commit log messages or a public bug tracker
it's practically impossible for a drive-by reader to figure out what
was broken and fixed. That seems like a bit of a wild dino-goose
chase.
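
For illustration, a probe of that general kind could be sketched as
follows. This is not libevent's actual test, and the function name
kqueue_reports_readable is made up here. It only checks the simplest
case: register the read end of a pipe with kqueue, write one byte, and
see whether a read event is reported. It uses Python's select.kqueue
wrapper and returns None on platforms that have no kqueue.

```python
import os
import select


def kqueue_reports_readable(timeout=1.0):
    """Hypothetical sanity probe, not taken from libevent or PostgreSQL.

    Returns True if kqueue reports the pipe as readable, False if no
    event arrives within `timeout` seconds, and None if this platform
    does not provide kqueue at all.
    """
    if not hasattr(select, "kqueue"):
        return None  # e.g. Linux: nothing to probe

    read_fd, write_fd = os.pipe()
    kq = select.kqueue()
    try:
        # Register interest in readability of the pipe's read end.
        change = select.kevent(read_fd,
                               filter=select.KQ_FILTER_READ,
                               flags=select.KQ_EV_ADD)
        kq.control([change], 0, 0)

        # Make input available, then wait for at most one event.
        os.write(write_fd, b"x")
        events = kq.control(None, 1, timeout)
        return any(ev.ident == read_fd and
                   ev.filter == select.KQ_FILTER_READ
                   for ev in events)
    finally:
        kq.close()
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(kqueue_reports_readable())
```

A passing result from something this simple would not rule out an
intermittent lost wakeup like the one in the backtrace; it would only
show that the basic read filter works.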

Hmm, I see that Remi also runs an ancient PowerPC Mac on macOS
10.5/Darwin 9.8. His buildfarm animal "locust" hasn't reported in for
22 days. Remi, is that animal down for other reasons, or could it be
stuck like this?

Further evidence for a version-specific problem is that there are
surely many in our hacker community working on modern Macs, and I
haven't heard of any problems so far. Of course that doesn't rule
anything out.

[1] https://github.com/apple/darwin-xnu/blob/master/bsd/kern/kern_event.c
