Re: [PATCH] Improve performance of NOTIFY over many databases (v2)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Improve performance of NOTIFY over many databases (v2)
Date: 2019-09-15 22:14:24
Message-ID: 22742.1568585664@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Martijn van Oosterhout <kleptog(at)gmail(dot)com> writes:
> On Sat, 14 Sep 2019 at 17:08, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> None of this seems to respond to my point: it looks to me like it would
>> work fine if you simply dropped the patch's additions in PreCommit_Notify
>> and ProcessCompletedNotifies, because there is already enough logic to
>> decide when to call asyncQueueAdvanceTail.

> ...
> However, I guess you're thinking of asyncQueueReadAllNotifications()
> triggering if the queue as a whole was too long. This could in
> principle work but it does mean that at some point all backends
> sending NOTIFY are going to start calling asyncQueueAdvanceTail()
> every time, until the tail gets advanced, and if there are many idle
> listening backends behind this could take a while. The slowest backend
> might receive more signals while it is processing and so end up
> running asyncQueueAdvanceTail() twice. The fact that signals coalesce
> stops the process getting completely out of hand but it does feel a
> little uncontrolled.
> The whole point of this patch is to ensure that at any time only one
> backend is being woken up and calling asyncQueueAdvanceTail() at a
> time.

I spent some more time thinking about this, and I'm still not too
satisfied with this patch's approach. It seems to me the key insights
we're trying to make use of are:

1. We don't really need to keep the global tail pointer exactly
up to date. It's bad if it falls way behind, but a few pages back
is fine.

2. When sending notifies, only listening backends connected to our
own database need be awakened immediately. Backends connected to
other DBs will need to advance their queue pointer sometime, but
again it doesn't need to be right away.

3. It's bad for multiple processes to all be trying to do
asyncQueueAdvanceTail concurrently: they'll contend for exclusive
access to the AsyncQueueLock. Therefore, having the listeners
do it is really the wrong thing, and instead we should do it on
the sending side.

However, the patch as presented doesn't go all the way on point 3,
instead having listeners maybe-or-maybe-not do asyncQueueAdvanceTail
in asyncQueueReadAllNotifications. I propose that we should go all
the way and just define tail-advancing as something that happens on
the sending side, and only once every few pages. I also think we
can simplify the handling of other-database listeners by including
them in the set signaled by SignalBackends, but only if they're
several pages behind. So that leads me to the attached patch;
what do you think?

BTW, in my hands it seems like point 2 (skip wakening other-database
listeners) is the only really significant win here, and of course
that only wins when the notify traffic is spread across a fair number
of databases. Which I fear is not the typical use-case. In single-DB
use-cases, point 2 helps not at all. I had a really hard time measuring
any benefit from point 3 --- I eventually saw a noticeable savings
when I tried having one notifier and 100 listen-only backends, but
again that doesn't seem like a typical use-case. I could not replicate
your report of lots of time spent in asyncQueueAdvanceTail's lock
acquisition. I wonder whether you're using a very large max_connections
setting and we already fixed most of the problem with that in bca6e6435.
Still, this patch doesn't seem to make any cases worse, so I don't mind
if it's just improving unusual use-cases.

regards, tom lane

Attachment Content-Type Size
Improve-performance-of-async-notifications-v4.patch text/x-diff 13.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message James Coleman 2019-09-16 01:33:33 Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Previous Message Tomas Vondra 2019-09-15 20:02:51 Re: (Re)building index using itself or another index of the same table