Re: [PATCH] Improve performance of NOTIFY over many databases (issue blocking on AccessExclusiveLock on object 0 of class 1262 of database 0)

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Martijn van Oosterhout <kleptog(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: [PATCH] Improve performance of NOTIFY over many databases (issue blocking on AccessExclusiveLock on object 0 of class 1262 of database 0)
Date:	2019-07-23 17:21:14
Message-ID:	6483.1563902474@sss.pgh.pa.us
Lists:	pgsql-hackers

Martijn van Oosterhout <kleptog(at)gmail(dot)com> writes:
> There are a number of possible improvements here:

> 1. Do what sinval does and separate the reader and writer locks so
> they can't block each other. This is the ultimate solution, but it's a
> significant refactor and it's not clear that's actually worthwhile
> here. This would almost be adopting the sinvaladt structure wholesale.

I agree that that's probably more ambitious than is warranted.

> 2. Add a field to AsyncQueueEntry which points to the next listening
> backend. This would allow the loops over all listening backends to
> complete much faster, especially in the normal case where there are
> not many listeners relative to the number of backends. The downside is
> this requires an exclusive lock to remove listeners, but that doesn't
> seem a big problem.

I don't understand how that would work? The sending backend doesn't
know what the "next listening backend" is. Having to scan the whole
queue when a listener unlistens seems pretty awful too, especially
if you need exclusive lock while doing so.

> 3. The other idea from sinval where you only wake up one worker at a
> time is a good one as you point out. This seems quite doable, however,
> it seems wasteful to try and wake everyone up the moment we switch to
> a new page. The longer you delay the lower the chance you need to wake
> anyone at all because they'll have caught up by
> themselves. A single SLRU page can hold hundreds, or even thousands of
> messages.

Not entirely following your comment here either. The point of the change
is exactly that we'd wake up only one backend at a time (and only the
furthest-behind one, so that anyone who catches up of their own accord
stops being a factor). Also, "hundreds or thousands" seems
over-optimistic given that the minimum size of AsyncQueueEntry is 20
bytes --- in practice it'll be more because people don't use empty
strings as notify channel names. I think a few hundred messages per
page is the upper limit, and it could be a lot less.
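
As a rough check on the per-page estimate above, the following sketch computes how many entries fit on one page. It is only an illustration under stated assumptions: an 8192-byte page (the default block size), a 16-byte fixed header, NUL-terminated channel and payload strings, and 4-byte alignment. Those layout figures are assumptions chosen because they reproduce the 20-byte minimum mentioned above; the names `entry_size` and `entries_per_page` are hypothetical.

```python
PAGE_SIZE = 8192     # assumed page size in bytes (default block size)
HEADER_BYTES = 16    # assumed fixed header size of one queue entry
ALIGNMENT = 4        # assumed alignment of entries within a page


def entry_size(channel: str, payload: str = "") -> int:
    """Approximate size in bytes of one queue entry (hypothetical helper)."""
    # Header, plus channel name and payload, each with a trailing NUL byte.
    raw = HEADER_BYTES + len(channel.encode()) + 1 + len(payload.encode()) + 1
    # Round up to the next multiple of ALIGNMENT.
    return -(-raw // ALIGNMENT) * ALIGNMENT


def entries_per_page(channel: str, payload: str = "") -> int:
    """Whole entries that fit on one page, assuming entries never span pages."""
    return PAGE_SIZE // entry_size(channel, payload)


print(entry_size(""))                            # 20, the minimum entry size
print(entries_per_page(""))                      # 409, only with empty names
print(entries_per_page("my_channel", "x" * 30))  # 136 with a modest payload
```

Under these assumptions the best case is about 400 entries per page, and a realistic channel name with a short payload drops that to well under 200, which is consistent with "a few hundred messages per page" as the upper limit.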

> Do 2 & 3 seem like a good direction to go? I can probably work something up.

I'm on board with 3, obviously. Not following what you have in mind
for 2.
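
The wake-up policy in idea 3 could be sketched roughly as follows. This is a minimal illustration of "wake only the furthest-behind backend", not the actual async.c logic: the `Listener` type, the `pick_backend_to_wake` function, and the `max_lag_pages` threshold are all hypothetical, and page-number wraparound is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Listener:
    pid: int   # process ID of the listening backend
    page: int  # queue page this backend has read up to (hypothetical field)


def pick_backend_to_wake(
    listeners: List[Listener], head_page: int, max_lag_pages: int = 4
) -> Optional[int]:
    """Return the PID of the single backend to signal, or None.

    Only the furthest-behind listener is considered, so any backend that
    catches up on its own stops being a candidate.
    """
    if not listeners:
        return None
    laggard = min(listeners, key=lambda listener: listener.page)
    if head_page - laggard.page < max_lag_pages:
        return None  # nobody is far enough behind to be worth waking
    return laggard.pid


listeners = [Listener(pid=101, page=9), Listener(pid=102, page=3)]
print(pick_backend_to_wake(listeners, head_page=10))  # 102
```

Once the signalled backend advances its read position, the next call would naturally select whichever listener is then furthest behind, so at most one backend is woken per decision.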

regards, tom lane

In response to

Re: [PATCH] Improve performance of NOTIFY over many databases (issue blocking on AccessExclusiveLock on object 0 of class 1262 of database 0) at 2019-07-23 14:46:37 from Martijn van Oosterhout

Responses

Re: [PATCH] Improve performance of NOTIFY over many databases (issue blocking on AccessExclusiveLock on object 0 of class 1262 of database 0) at 2019-07-23 19:48:14 from Martijn van Oosterhout
