Re: ResourceOwner refactoring

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ResourceOwner refactoring
Date: 2022-11-01 10:39:39
Message-ID: 9a3245db-ca03-4a47-5706-73404cfe5108@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01/11/2022 02:15, Andres Freund wrote:
> On 2022-10-31 10:51:36 +0100, Heikki Linnakangas wrote:
>> These are functions where quite a lot of things happen between the
>> ResourceOwnerEnlarge and ResourceOwnerRemember calls. It's important that
>> there are no unrelated ResourceOwnerRemember() calls in the code in
>> between, otherwise the entry reserved by the ResourceOwnerEnlarge() call
>> might be used up by the intervening ResourceOwnerRemember() and not be
>> available at the intended ResourceOwnerRemember() call anymore. The longer
>> the code path between them is, the harder it is to verify that.
>
> This seems to work towards a future where only one kind of resource can be
> reserved ahead of time. That doesn't strike me as great.

True. It is enough for all the current callers AFAICS, though. I don't
remember wanting to reserve multiple resources at the same time.
Usually, the distance between the Enlarge and Remember calls is very
short. If it's not, it's error-prone anyway, so we should try to keep
the distance short.

While we're at it, who says that it's enough to reserve space for only
one resource of a kind either? For example, what if you want to pin two
buffers?

If we really need to support that, I propose something like this:

/*
* Reserve a slot in the resource owner.
*
* This is separate from actually inserting an entry because if we run out
* of memory, it's critical to do so *before* acquiring the resource.
*/
ResOwnerReservation *
ResourceOwnerReserve(ResourceOwner owner)

/*
* Remember that an object is owner by a ResourceOwner
*
* 'reservation' is a slot in the resource owner that was pre-reserved
* by ResOwnerReservation.
*/
void
ResOwnerRemember(ResOwnerReservaton *reservation, Datum value,
ResourceOwnerFuncs *kind)

>> Instead of having a separate array/hash for each resource kind, use a
>> single array and hash to hold all kinds of resources. This makes it
>> possible to introduce new resource "kinds" without having to modify the
>> ResourceOwnerData struct. In particular, this makes it possible for
>> extensions to register custom resource kinds.
>
> As a goal I like this.
>
> However, I'm not quite sold on the implementation. Two main worries:
>
> 1) As far as I can tell, the way ResourceOwnerReleaseAll() now works seems to
> assume that within a phase the reset order does not matter. I don't think
> that's a good assumption. I e.g. have a patch to replace InProgressBuf with
> resowner handling, and in-progress IO errors should be processed before
> before pins are released.

Hmm. Currently, you're not supposed to hold any resources at commit. You
get warnings about resource leaks if a resource owner is not empty on
ResourceOwnerReleaseAll(). On abort, does the order matter? I'm not
familiar with your InProgressBuf patch, but I guess you could handle the
in-progress IO errors in ReleaseBuffer().

If we do need to worry about release order, perhaps add a "priority" or
"phase" to each resource kind, and release them in priority order. We
already have before- and after-locks as phases, but we could generalize
that.

However, I feel that trying to enforce a particular order moves the
goalposts. If we need that, let's add it as a separate patch later.

> 2) There's quite a few resource types where we actually don't need an entry in
> an array, because we can instead add a dlist_node to the resource -
> avoiding memory overhead and making removal cheap. I have a few pending
> patches that use that approach, and this doesn't really provide a path for
> that anymore.

Is that materially better than using the array? The fast path with an
array is very fast. If it is better, perhaps we should bite the bullet
and require a dlist node and use that mechanism for all resource types?

> I did try out the benchmark from
> https://postgr.es/m/20221029200025.w7bvlgvamjfo6z44%40awork3.anarazel.de and
> the patches performed well, slightly better than my approach of allocating
> some initial memory for each resarray.

Thank you, glad to hear that!

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-11-01 10:39:40 Re: real/float example for testlibpq3
Previous Message Anton A. Melnikov 2022-11-01 10:36:15 Re: [PATCH] Backport perl tests for pg_upgrade from 322becb60