Re: Windows buildfarm members vs. new async-notify isolation test

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Mark Dilger <hornschnorter(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Windows buildfarm members vs. new async-notify isolation test
Date: 2019-12-06 23:31:45
Message-ID: 20102.1575675105@sss.pgh.pa.us

Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com> writes:
> On 12/5/19 4:37 AM, Amit Kapila wrote:
>> IIUC, this means that commit (step l2commit) is finishing before the
>> notify signal has reached that session. If so, can we at least confirm
>> that by adding something like select pg_sleep(1) in that step? So,
>> l2commit will be: step "l2commit" { SELECT pg_sleep(1); COMMIT; }. I
>> think we can try by increasing sleep time as well to confirm the
>> behavior if required.

> Yeah, with the sleep in there the NOTIFY is seen.

Well, that is *really* interesting, because I was fairly sure that
everything was adequately interlocked. The signal must have been
sent before step notify1 finishes, and then we do several other
things, so how could the listener2 process not have gotten it by
the time we run the l2commit step? I still think this is showing
us some sort of deficiency in our Windows signal mechanism.

A possible theory as to what's happening is that the kernel scheduler
is discriminating against listener2's signal management thread(s)
and not running them until everything else goes idle for a moment.
(If true, even a very short sleep ought to be enough to fix the test.)
If that's what's happening, though, I think we ought to look into
whether we can raise the priority of the signal threads compared to
the main thread. I don't think we want this much variation between
the way signals work on Windows and the way they work elsewhere.
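
To make the theory above concrete, here is a toy model of it in Python. It is only a sketch under stated assumptions, not a description of how the Windows signal emulation actually behaves: the function name simulate, the "idle tick" notion, and the two flags are all hypothetical, and the only rule modeled is the conjecture itself, namely that listener2's signal thread gets to run only when everything else goes idle for a moment.

```python
def simulate(l2commit_sleep_ticks):
    """Toy model of the starvation theory (all names here are hypothetical).

    Returns True if the listener sees the NOTIFY by the time l2commit ends.
    """
    signal_queued = False     # signal sent to listener2, signal thread not yet run
    signal_delivered = False  # signal thread has run and flagged the main thread

    def idle_tick():
        # Assumption under test: the signal thread is scheduled only
        # when everything else is idle.
        nonlocal signal_queued, signal_delivered
        if signal_queued:
            signal_queued = False
            signal_delivered = True

    # Step notify1: the signal is sent before this step finishes.
    signal_queued = True

    # Intervening steps keep the system busy, so no idle tick occurs here.

    # Step l2commit: an optional sleep yields idle ticks before COMMIT.
    for _ in range(l2commit_sleep_ticks):
        idle_tick()

    # At COMMIT the main thread reports whatever has been delivered so far.
    return signal_delivered


if __name__ == "__main__":
    print("no sleep:  ", simulate(0))
    print("with sleep:", simulate(1))
```

In this model a single idle tick is enough for the notification to be seen, which matches the expectation that even a very short sleep would fix the test if the theory holds. It says nothing about whether the theory is true.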

regards, tom lane
