Re: Optimize LISTEN/NOTIFY

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Chao Li" <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Matheus Alcantara" <matheusssilv97(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize LISTEN/NOTIFY
Date: 2025-10-08 14:53:33
Message-ID: 2790345f-03c3-4cae-8f14-886ed9079319@app.fastmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 8, 2025, at 05:43, Chao Li wrote:
> After several rounds of reviewing, the code is already very good. I
> just got a few small comments:

Thanks for feedback!

The below changes have been incorporated into the v12 version
sent in my previous email.

>> On Oct 8, 2025, at 03:26, Joel Jacobson <joel(at)compiler(dot)org> wrote:
>>
>>
>> /Joel<optimize_listen_notify-v11.patch>
>
>
> 1
> ```
> + channels = GetPendingNotifyChannels();
> +
> LWLockAcquire(NotifyQueueLock, LW_EXCLUSIVE);
> - for (ProcNumber i = QUEUE_FIRST_LISTENER; i != INVALID_PROC_NUMBER; i
> = QUEUE_NEXT_LISTENER(i))
> + foreach(lc, channels)
> ```
>
> I don’t see where “channels” is freed. GetPendingNotifyChannels()
> creates a list of Nodes, both the list and Nodes the list points to
> should be freed.

Per suggestion from Tom Lane I reverted back GetPendingNotifyChannels(),
so this comment is not applicable any longer.

> 2
> ```
> + foreach(lc, channels)
> {
> - int32 pid = QUEUE_BACKEND_PID(i);
> - QueuePosition pos;
> + char *channel = strVal(lfirst(lc));
> + ChannelEntry *entry;
> + ProcNumber *listeners;
> + ChannelHashKey key;
>
> - Assert(pid != InvalidPid);
> - pos = QUEUE_BACKEND_POS(i);
> - if (QUEUE_BACKEND_DBOID(i) == MyDatabaseId)
> + if (channel_hash == NULL)
> + entry = NULL;
> + else
> ```
>
> I wonder whether or not “channel_hash” can be NULL here? Maybe possible
> if a channel is un-listened while the event is pending?

Yes, I think channelHash can be NULL here if doing a NOTIFY
when there hasn't been a LISTEN yet.

> So, maybe add a comment here to explain the logic.

Not sure I think that's necessary.
What do you suggest that comment would say?

> 3
> The same piece of code as 2.
>
> I think the code can be optimized a little bit. First, we can
> initialize entry to NULL, then we don’t the if-else. Second, “key” is
> only used for dshash_find(), so it can defined where it is used.
>
> foreach(lc, channels)
> {
> char *channel = strVal(lfirst(lc));
> ChannelEntry *entry = NULL;
> ProcNumber *listeners;
> //ChannelHashKey key;
>
> if (channel_hash != NULL)
> {
> ChannelHashKey key;
> ChannelHashPrepareKey(&key, MyDatabaseId, channel);
> entry = dshash_find(channel_hash, &key, false);
> }
>
> if (entry == NULL)
> continue; /* No listeners registered for this channel */

Nice, I agree that's more readable, I changed it like that.

> 4
> ```
> + if (signaled[i] || QUEUE_BACKEND_WAKEUP_PENDING(i))
> + continue;
> ```
>
> I wonder if “signaled[i]” is a duplicate flag of
> "QUEUE_BACKEND_WAKEUP_PENDING(i)”?
>
> I understand signaled is local, and QUEUE_BACKEND_WAKEUP_PENDING is in
> shared memory and may be set by other processes, but in local, when
> signaled[I] is set, QUEUE_BACKEND_WAKEUP_PENDING(i) is also set. And
> because of NotifyQueueLock, other process should not be able to cleanup
> the flag.
>
> But if “signals” is really needed, maybe we can use Bitmapset
> (src/backend/nodes/bitmapset.c), that would use 1/8 of memories
> comparing to the bool array.

I agree, since we're holding an exclusive lock, the signaled array is reundant.
I've removed it, so that we rely only on the wakeupPending flag.

> 5
> ```
> /*
> @@ -1865,6 +2087,7 @@ asyncQueueReadAllNotifications(void)
> LWLockAcquire(NotifyQueueLock, LW_SHARED);
> /* Assert checks that we have a valid state entry */
> Assert(MyProcPid == QUEUE_BACKEND_PID(MyProcNumber));
> + QUEUE_BACKEND_WAKEUP_PENDING(MyProcNumber) = false;
> ```
>
> This piece of code originally only read the shared memory, so it can
> use LW_SHARED lock mode, but now it writes to the shared memory, do we
> need to change the lock mode to “exclusive”?

No, LW_SHARED is sufficient here, since the backend only modifies its own state,
and no other backend could do that, without holding an exclusive lock.

> 6
> ```
> +static inline void
> +ChannelHashPrepareKey(ChannelHashKey *key, Oid dboid, const char *channel)
> +{
> + memset(key, 0, sizeof(ChannelHashKey));
> + key->dboid = dboid;
> + strlcpy(key->channel, channel, NAMEDATALEN);
> +}
> ```
>
> Do we really need the memset()? If “channel” is of length NAMEDATALEN,
> then it still results in a non-0 terminated key->channel; if channel is
> shorter than NAMEDATALEN, strlcpy will auto add a tailing ‘\0’. I think
> previous code should have ensured length of channel should be less than
> NAMEDATALEN.

Yes, I think we need memset, since I fear that when the hash table keys
are compared, every byte of the struct might be inspected, so without
zero-initializing it, there could be unused bytes after the null
terminator, that could then cause logically identical keys to be wrongly
considered different.

I haven't checked the implementation though, but my gut feeling says
it's better to be a bit paranoid here.

> 7
> ```
> *
> * Resist the temptation to make this really large. While that would save
> * work in some places, it would add cost in others. In particular, this
> @@ -246,6 +280,7 @@ typedef struct QueueBackendStatus
> Oid dboid; /* backend's database OID, or InvalidOid */
> ProcNumber nextListener; /* id of next listener, or INVALID_PROC_NUMBER */
> QueuePosition pos; /* backend has read queue up to here */
> + bool wakeup_pending; /* signal sent but not yet processed */
> } QueueBackendStatus;
> ```
>
> In the same structure, rest of fields are all in camel case, I think
> it’s better to rename the new field to “wakeupPending”.
>
> 8
> ```
> @@ -288,11 +323,91 @@ typedef struct AsyncQueueControl
> ProcNumber firstListener; /* id of first listener, or
> * INVALID_PROC_NUMBER */
> TimestampTz lastQueueFillWarn; /* time of last queue-full msg */
> + dsa_handle channel_hash_dsa;
> + dshash_table_handle channel_hash_dsh;
> QueueBackendStatus backend[FLEXIBLE_ARRAY_MEMBER];
> ```
>
> Same as 7, but in this case, type names are not camel case, maybe okay
> for field names. I don’t have a strong opinion here.

I've did a major renaming of all new code, to better match the casing style.
It seems like helper functions and fields areNamedLikeThis, while
API-functions AreNamedLikeThis.

If we don't like this naming, I'm happy to change it again, please advise.

/Joel

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ants Aasma 2025-10-08 15:14:20 Re: Should we update the random_page_cost default value?
Previous Message Robert Haas 2025-10-08 14:45:12 Re: Eager aggregation, take 3