Re: Support to define custom wait events for extensions

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Tristan Partin <tristan(at)neon(dot)tech>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Support to define custom wait events for extensions
Date: 2023-08-09 23:45:43
Message-ID: ZNQlJ8pQIIWjv/he@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 09, 2023 at 08:10:42PM +0900, Masahiro Ikeda wrote:
> * create second hash table to find a event id from a name to
> identify uniquness. It enable extensions which don't use share
> memory for their use to define custom wait events because
> WaitEventExtensionNew() will not allocate duplicate wait events.

Okay, a second hash table to check if events are registered works for
me.

> * create PoC patch to show that extensions, which don't use shared
> memory for their use, can define custom wait events.
> (v2-0002-poc-custom-wait-event-for-dblink.patch)
>
> I'm worrying about
> * Is 512(wee_hash_max_size) the maximum number of the custom wait
> events sufficient?

Thanks for sending a patch!

I'm OK to start with that. This could always be revisited later, but
even for a server loaded with a bunch of extensions that looks more
than enough to me.

> * Is there any way to not force extensions that don't use shared
> memory for their use like dblink to acquire AddinShmemInitLock?;

Yes, they don't need it at all as the dynahashes are protected with
their own LWLocks.

+++ b/src/backend/storage/lmgr/lwlocknames.txt
@@ -53,3 +53,4 @@ XactTruncationLock 44
# 45 was XactTruncationLock until removal of BackendRandomLock
WrapLimitsVacuumLock 46
NotifyQueueTailLock 47
+WaitEventExtensionLock 48

This new LWLock needs to be added to wait_event_names.txt, or it won't
be reported to pg_stat_activity and it would not be documented when
the sgml docs are generated from the txt data.

-extern uint32 WaitEventExtensionNew(void);
-extern void WaitEventExtensionRegisterName(uint32 wait_event_info,
- const char *wait_event_name);
+extern uint32 WaitEventExtensionNew(const char *wait_event_name);
Looks about right, and the docs are refreshed.

+static const int wee_hash_init_size = 128;
+static const int wee_hash_max_size = 512;
I would use a few #defines with upper-case characters here instead as
these are constants for us.

Now that it is possible to rely on LWLocks for the hash tables, more
cleanup is possible in worker_spi, with the removal of
worker_spi_state, the shmem hooks and their routines. The only thing
that should be needed is something like that at the start of
worker_spi_main() (same position as worker_spi_shmem_init now):
+static uint32 wait_event = 0;
[...]
+ if (wait_event == 0)
+ wait_event = WaitEventExtensionNew("worker_spi_main");

The updates in 001_worker_spi.pl look OK.

+ * The entry must be stored because it's registered in
+ * WaitEventExtensionNew().
*/
- eventId -= NUM_BUILTIN_WAIT_EVENT_EXTENSION;
+ if (!entry)
+ ereport(ERROR,
+ errmsg("could not find the name for custom wait event ID %u", eventId));
Yeah, I think that's better than just falling back to "extension". An
ID reported in pg_stat_activity should always have an entry, or we
have race conditions. This should be an elog(ERROR), as in
this-error-shall-never-happen. No need to translate the error string,
as well (the docs have been updated with this change. thanks!).

Additionally, LWLockHeldByMeInMode(AddinShmemInitLock) in
WaitEventExtensionNew() should not be needed, thanks to
WaitEventExtensionLock.

+ * WaitEventExtensionNameHash is used to find the name from a event id.
+ * It enables all backends look up them without additional processing
+ * per backend like LWLockRegisterTranche().

It does not seem necessary to mention LWLockRegisterTranche().

+ * WaitEventExtensionIdHash is used to find the event id from a name.
+ * Since it can identify uniquness by the names, extensions that do not
+ * use shared memory also be able to define custom wait events without
+ * defining duplicate wait events.

Perhaps this could just say that this table is necessary to ensure
that we don't have duplicated entries when registering new strings
with their IDs? s/uniquness/uniqueness/. The second part of the
sentence about shared memory does not seem necessary now.

+ sz = add_size(sz, hash_estimate_size(wee_hash_init_size,
+ sizeof(WaitEventExtensionNameEntry)));
+ sz = add_size(sz, hash_estimate_size(wee_hash_init_size,
+ sizeof(WaitEventExtensionIdEntry)));

Err, this should use the max size, and not the init size for the size
estimation, no?

+ if (strlen(wait_event_name) >= NAMEDATALEN)
+ ereport(ERROR,
+ errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("wait event name is too long"));
This could just be an elog(ERROR), I assume, as that could only be
reached by developers. The string needs to be rewritten, like "cannot
use custom wait event string longer than %u characters", or something
like that.

+ if (wait_event_info == NULL)
+ {
+ wait_event_info = (uint32 *) MemoryContextAlloc(TopMemoryContext, sizeof(uint32));
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+ *wait_event_info = WaitEventExtensionNew("dblink_get_con");
+ LWLockRelease(AddinShmemInitLock);
+ }
+ conn = libpqsrv_connect(connstr, *wait_event_info)

In 0002. Caching the value statically in the backend is what you
should do, but a pointer, an allocation to the TopMemoryContext and a
dependency to AddinShmemInitLock should not be necessary when dealing
with a uint32. You could use an initial value of 0, for example, or
just PG_WAIT_EXTENSION but the latter is not really necessary and
would bypass the sanity checks.

+ /* Register the new custom wait event in the shared hash table */
+ LWLockAcquire(WaitEventExtensionLock, LW_EXCLUSIVE);
+
+ name_entry = (WaitEventExtensionNameEntry *)
+ hash_search(WaitEventExtensionNameHash, &eventId, HASH_ENTER, &found);
+ Assert(!found);
+ strlcpy(name_entry->wait_event_name, wait_event_name, sizeof(name_entry->wait_event_name));
+
+ id_entry = (WaitEventExtensionIdEntry *)
+ hash_search(WaitEventExtensionIdHash, &wait_event_name, HASH_ENTER, &found);
+ Assert(!found);
+ id_entry->event_id = eventId;
+
+ LWLockRelease(WaitEventExtensionLock);

The logic added to WaitEventExtensionNew() is a bit racy, where it
would be possible with the same entry to be added multiple times.
Imagine for example the following:
- Process 1 does WaitEventExtensionNew("foo1"), does not find the
entry by name in hash_search, gets an eventId of 1, releases the
spinlock.
- Process 2 calls as well WaitEventExtensionNew("foo1"), does not find
the entry by name because it has not been added by process 1 yet,
allocates an eventId of 2
- Process 2 takes first WaitEventExtensionLock in LW_EXCLUSIVE to add
entry "foo1", there is no entry by name, so one is added for the ID.
WaitEventExtensionLock is released
- Process 1, that was waiting on WaitEventExtensionLock, can now take
it in exclusive mode. It finds an entry by name for "foo1", fails the
assertion because an entry is found.

I think that the ordering of WaitEventExtensionNew() should be
reworked a bit. This order should be safer.
- Take WaitEventExtensionLock in shared mode, look if there's an entry
by name, release the lock. The patch does that.
- If an entry is found, return, we're OK. The patch does that.
- Take again WaitEventExtensionLock in exclusive mode.
- Look again at the hash table with the name given, in case somebody
has inserted an equivalent entry in the short window where the lock
was not held.
-- If an entry is found, release the lock and leave, we're OK.
-- If an entry is not found, keep the lock.
- Acquire the spinlock, and get a new event ID. Release spinlock.
- Add the new entries to both tables, both assertions on found are OK
to have.
- Release LWLock and leave.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jimmy Yih 2023-08-10 00:00:44 Should the archiver process always make sure that the timeline history files exist in the archive?
Previous Message Jacob Champion 2023-08-09 23:10:39 Re: pg_dump needs SELECT privileges on irrelevant extension table