Re: Assertion failing in master, predicate.c

From: Mark Dilger <hornschnorter(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ashwin Agrawal <aagrawal(at)pivotal(dot)io>, Martijn van Oosterhout <kleptog(at)gmail(dot)com>
Subject: Re: Assertion failing in master, predicate.c
Date: 2019-11-22 18:57:22
Message-ID: c166215f-ef57-0890-2501-8ab374d2c5a9@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/21/19 8:03 PM, Tom Lane wrote:
> Mark Dilger <hornschnorter(at)gmail(dot)com> writes:
>> I have winnowed down the test a bit further. The attached
>> smaller patch still triggers the same assertion as the prior
>> patch did.
>
> FWIW, I can reproduce the assertion failure with your first test,
> but not with this simplified one.

Thanks for checking!

> I also confirm that it only happens in HEAD, not v12. I've not
> actually bisected, but a look at the git history for predicate.c
> sure makes it look like db2687d1f ("Optimize PredicateLockTuple")
> must be to blame.

`git bisect` shows the problem occurs earlier than that, and by
chance the first bad commit was one of yours. I'm not surprised
that your commit was regarding LISTEN/NOTIFY, as the error is
always triggered with a LISTEN statement. (I've now hit this
many times in many tests of multiple SQL statements, and the
last statement before the error is always a LISTEN.)

I'm cc'ing Martijn because he is mentioned in that commit.

51004c7172b5c35afac050f4d5849839a06e8d3b is the first bad commit
commit 51004c7172b5c35afac050f4d5849839a06e8d3b
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: Sun Sep 22 11:46:29 2019 -0400

Make some efficiency improvements in LISTEN/NOTIFY.

Move the responsibility for advancing the NOTIFY queue tail pointer
from the listener(s) to the notification sender, and only have the
sender do it once every few queue pages, rather than after every batch
of notifications as at present. This reduces the number of times we
execute asyncQueueAdvanceTail, and reduces contention when there are
multiple listeners (since that function requires exclusive lock).
This change relies on the observation that we don't really need the
tail
pointer to be exactly up-to-date. It's certainly not necessary to
attempt to release disk space more often than once per SLRU segment.
The only other usage of the tail pointer is that an incoming listener,
if it's the only listener in its database, will need to scan the queue
forward from the tail; but that's surely a less performance-critical
path than routine sending and receiving of notifies. We compromise by
advancing the tail pointer after every 4 pages of output, so that it
shouldn't get more than a few pages behind.

Also, when sending signals to other backends after adding notify
message(s) to the queue, recognize that only backends in our own
database are going to care about those messages, so only such
backends really need to be awakened promptly. Backends in other
databases should get kicked if they're well behind on reading the
queue, else they'll hold back the global tail pointer; but wakening
them for every single message is pointless. This change can
substantially reduce signal traffic if listeners are spread among
many databases. It won't help for the common case of only a single
active database, but the extra check costs very little.

Martijn van Oosterhout, with some adjustments by me

Discussion:
https://postgr.es/m/CADWG95vtRBFDdrx1JdT1_9nhOFw48KaeTev6F_LtDQAFVpSPhA@mail.gmail.com
Discussion:
https://postgr.es/m/CADWG95uFj8rLM52Er80JnhRsTbb_AqPP1ANHS8XQRGbqLrU+jA@mail.gmail.com

--
Mark Dilger

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-11-22 19:01:44 Re: backup manifests
Previous Message Robert Haas 2019-11-22 18:24:16 Re: backup manifests