Re: proposal: make NOTIFY list de-duplication optional

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Filip Rembiałkowski <filip(dot)rembialkowski(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Catalin Iacob <iacobcatalin(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>
Subject: Re: proposal: make NOTIFY list de-duplication optional
Date: 2019-08-14 18:04:27
Message-ID: 6767.1565805867@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> I think that we'd probably be better off fixing the root performance issue
> than adding semantic complexity to bypass it. ...
> Accordingly, I looked into making a hash table when there are more than
> a small number of notifications pending, and attached is a lightly-tested
> version of that. This seems to be more or less similar speed to the
> existing code for up to 100 or so distinct notifies, but it soon pulls
> away above that.

I noticed that the cfbot was unhappy with this, because it (intentionally)
changes the results of the async-notify isolation tests I added awhile
ago. So here's an updated version that adjusts that test, and also
changes the NOTIFY documentation to remove the old weasel wording about
whether we de-dup or not.

> I also noticed that as things stand, it costs us two or three pallocs to
> construct a Notification event. It wouldn't be terribly hard to reduce
> that to one palloc, and maybe it'd be worthwhile if we're thinking that
> transactions with many many notifies are a case worth optimizing.
> But I didn't do that here; it seems like a separable patch.

I also did that, attached as the second patch below. This way ends up
requiring us to palloc the Notification event and then pfree it again,
if it turns out to be a dup. Despite that, it's faster than the first
patch alone, and also faster than HEAD in every case I tried. Not
much faster, if there's not a lot of dups, but as far as I can find
there isn't any case where it loses compared to HEAD. Even with
subtransactions, where in principle the time to merge subtransaction
event lists into the parent transaction ought to cost us. You can't
get that to matter unless the subtransaction had a lot of distinct
events, and then HEAD hits its O(N^2) behavior inside the subxact.

So I can't really see any reason not to commit these.

That leaves the question of whether we want to continue pursuing
the proposed feature for user control of de-duping. I'd tend
to vote against, because it seems like semantic complexity we
don't need. While the idea sounds straightforward, I think it
isn't so much when you start to think hard about how notifies
issued with and without "collapse" ought to interact.

regards, tom lane

Attachment Content-Type Size
0001-detect-dup-notifies-by-hashing-2.patch text/x-diff 14.4 KB
0002-better-notification-structure-1.patch text/x-diff 7.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-08-14 18:17:09 Re: "long" type is not appropriate for counting tuples
Previous Message Andres Freund 2019-08-14 18:01:43 Re: Feature: Use DNS SRV records for connecting