Re: Refactoring backend fork+exec code

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: reid(dot)thompson(at)crunchydata(dot)com, Tristan Partin <tristan(at)neon(dot)tech>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Refactoring backend fork+exec code
Date: 2024-02-08 11:19:53
Message-ID: 2136fb4f-1630-43e9-9cf6-b32772b4b0a0@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 07/02/2024 20:25, Andres Freund wrote:
> On 2024-01-30 02:08:36 +0200, Heikki Linnakangas wrote:
>> From 54f22231bb2540fc5957c14005956161e6fc9dac Mon Sep 17 00:00:00 2001
>> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
>> Date: Wed, 24 Jan 2024 23:15:55 +0200
>> Subject: [PATCH v8 1/5] Remove superfluous 'pgprocno' field from PGPROC
>>
>> It was always just the index of the PGPROC entry from the beginning of
>> the proc array. Introduce a macro to compute it from the pointer
>> instead.
>
> Hm. The pointer math here is bit more expensive than in some other cases, as
> the struct is fairly large and sizeof(PGPROC) isn't a power of two. Adding
> more math into loops like in TransactionGroupUpdateXidStatus() might end up
> showing up.

I added a MyProcNumber global variable that is set to
GetNumberFromPGProc(MyProc). I'm not really concerned about the extra
math, but with MyProcNumber it should definitely not be an issue. The
few GetNumberFromPGProc() invocations that remain are in less
performance-critical paths.

(In later patch, I switch backend ids to 0-based indexing, which
replaces MyProcNumber references with MyBackendId)

> Is this really related to the rest of the series?

It's not strictly necessary, but it felt prudent to remove it now, since
I'm removing the backendID field too.

>> You can now convert a backend ID into an index into the PGPROC array
>> simply by subtracting 1. We still use 0-based "pgprocnos" in various
>> places, for indexes into the PGPROC array, but the only difference now
>> is that backend IDs start at 1 while pgprocnos start at 0.
>
> Why aren't we using 0-based indexing for both? InvalidBackendId is -1, so
> there'd not be a conflict, right?

Correct. I was being conservative and didn't dare to change the old
convention. The backend ids are visible in a few places like "pg_temp_0"
schema names, and pg_stat_get_*() functions.

One alternative would be to reserve and waste allProcs[0]. Then pgprocno
and backend ID could both be direct indexes to the array, but 0 would
not be used.

If we switch to 0-based indexing, it begs the question: why don't we
merge the concepts of "pgprocno" and "BackendId" completely and call it
the same thing everywhere? It probably would be best in the long run. It
feels like a lot of churn though.

Anyway, I switched to 0-based indexing in the attached new version, to
see what it looks like.

>> @@ -457,7 +442,7 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid, const char *gid,
>> Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
>>
>> Assert(gxact != NULL);
>> - proc = &ProcGlobal->allProcs[gxact->pgprocno];
>> + proc = GetPGProcByNumber(gxact->pgprocno);
>>
>> /* Initialize the PGPROC entry */
>> MemSet(proc, 0, sizeof(PGPROC));
>
> This set of changes is independent of this commit, isn't it?

Yes. It's just for symmetry, now that we use GetNumberFromPGProc() to
get the pgprocno.

>> diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
>> index ab86e802f21..39171fea06b 100644
>> --- a/src/backend/postmaster/auxprocess.c
>> +++ b/src/backend/postmaster/auxprocess.c
>> @@ -107,17 +107,7 @@ AuxiliaryProcessMain(AuxProcType auxtype)
>>
>> BaseInit();
>>
>> - /*
>> - * Assign the ProcSignalSlot for an auxiliary process. Since it doesn't
>> - * have a BackendId, the slot is statically allocated based on the
>> - * auxiliary process type (MyAuxProcType). Backends use slots indexed in
>> - * the range from 1 to MaxBackends (inclusive), so we use MaxBackends +
>> - * AuxProcType + 1 as the index of the slot for an auxiliary process.
>> - *
>> - * This will need rethinking if we ever want more than one of a particular
>> - * auxiliary process type.
>> - */
>> - ProcSignalInit(MaxBackends + MyAuxProcType + 1);
>> + ProcSignalInit();
>
> Now that we don't need the offset here, we could move ProcSignalInit() into
> BsaeInit() I think?

Hmm, doesn't feel right to me. BaseInit() is mostly concerned with
setting up backend-private structures, and it's also called for a
standalone backend.

I feel the process initialization codepaths could use some cleanup in
general. Not sure what exactly.

>> +/*
>> + * BackendIdGetProc -- get a backend's PGPROC given its backend ID
>> + *
>> + * The result may be out of date arbitrarily quickly, so the caller
>> + * must be careful about how this information is used. NULL is
>> + * returned if the backend is not active.
>> + */
>> +PGPROC *
>> +BackendIdGetProc(int backendID)
>> +{
>> + PGPROC *result;
>> +
>> + if (backendID < 1 || backendID > ProcGlobal->allProcCount)
>> + return NULL;
>
> Hm, doesn't calling BackendIdGetProc() with these values a bug? That's not
> about being out of date or such.

Perhaps. I just followed the example of the old implementation, which
also returns NULL on bogus inputs.

>> +/*
>> + * BackendIdGetTransactionIds -- get a backend's transaction status
>> + *
>> + * Get the xid, xmin, nsubxid and overflow status of the backend. The
>> + * result may be out of date arbitrarily quickly, so the caller must be
>> + * careful about how this information is used.
>> + */
>> +void
>> +BackendIdGetTransactionIds(int backendID, TransactionId *xid,
>> + TransactionId *xmin, int *nsubxid, bool *overflowed)
>> +{
>> + PGPROC *proc;
>> +
>> + *xid = InvalidTransactionId;
>> + *xmin = InvalidTransactionId;
>> + *nsubxid = 0;
>> + *overflowed = false;
>> +
>> + if (backendID < 1 || backendID > ProcGlobal->allProcCount)
>> + return;
>> + proc = GetPGProcByBackendId(backendID);
>> +
>> + /* Need to lock out additions/removals of backends */
>> + LWLockAcquire(ProcArrayLock, LW_SHARED);
>> +
>> + if (proc->pid != 0)
>> + {
>> + *xid = proc->xid;
>> + *xmin = proc->xmin;
>> + *nsubxid = proc->subxidStatus.count;
>> + *overflowed = proc->subxidStatus.overflowed;
>> + }
>> +
>> + LWLockRelease(ProcArrayLock);
>> +}
>
> Hm, I'm not sure about the locking here. For one, previously we weren't
> holding ProcArrayLock. For another, holding ProcArrayLock guarantees that the
> backend doesn't end its transaction, but it can still assign xids etc. And,
> for that matter, the backendid could have been recycled between the caller
> acquiring the backendId and calling BackendIdGetTransactionIds().

Yeah, the returned values could be out-of-date and even inconsistent
with each other. I just faithfully copied the old implementation.

Perhaps this should just skip the ProcArrayLock altogether.

>> --- a/src/backend/utils/error/elog.c
>> +++ b/src/backend/utils/error/elog.c
>> @@ -3074,18 +3074,18 @@ log_status_format(StringInfo buf, const char *format, ErrorData *edata)
>> break;
>> case 'v':
>> /* keep VXID format in sync with lockfuncs.c */
>> - if (MyProc != NULL && MyProc->backendId != InvalidBackendId)
>> + if (MyProc != NULL)
>
> Doesn't this mean we'll include a vxid in more cases now, particularly
> including aux processes? That might be ok, but I also suspect that it'll never
> have meaningful values...

Fixed. (I thought I changed that back already in the last patch version,
but apparently I only did it in jsonlog.c)

>> From 94fd46c9ef30ba5e8ac1a8873fce577a4be425f4 Mon Sep 17 00:00:00 2001
>> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
>> Date: Mon, 29 Jan 2024 22:57:49 +0200
>> Subject: [PATCH v8 3/5] Replace 'lastBackend' with an array of in-use slots
>>
>> Now that the procState array is indexed by pgprocno, the 'lastBackend'
>> optimization is useless, because aux processes are assigned PGPROC
>> slots and hence have numbers higher than max_connection. So
>> 'lastBackend' was always set to almost the end of the array.
>>
>> To replace that optimization, mantain a dense array of in-use
>> indexes. This's redundant with ProgGlobal->procarray, but I was afraid
>> of adding any more contention to ProcArrayLock, and this keeps the
>> code isolated to sinvaladt.c too.
>
> I think it'd be good to include that explanation and justification in the code
> as well.

Added a comment.

Attached is a new version of these BackendId changes. I kept it as three
separate patches to highlight the changes from switching to 0-based
indexing, but I think they should be squashed together before pushing.

I think the last remaining question here is about the 0- vs 1-based
indexing of BackendIds. Is it a good idea to switch to 0-based indexing?
And if we do it, should we reserve PGPROC 0. I'm on the fence on this one.

And if we switch to 0-based indexing, should we do a more comprehensive
search & replace of "pgprocno" to "backendId", or something like that.
My vote is no, the code churn doesn't feel worth it. And it can also be
done separately later if we want to.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
v10-0001-Remove-superfluous-pgprocno-field-from-PGPROC.patch text/x-patch 16.5 KB
v10-0002-Redefine-backend-ID-to-be-an-index-into-the-proc.patch text/x-patch 54.7 KB
v10-0003-Use-0-based-indexing-for-BackendIds.patch text/x-patch 25.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2024-02-08 11:26:42 Re: Thoughts about NUM_BUFFER_PARTITIONS
Previous Message Peter Eisentraut 2024-02-08 11:18:17 Re: pg_get_expr locking