Re: advance local xmin more aggressively

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: advance local xmin more aggressively
Date: 2014-12-10 12:34:16
Message-ID: 54883DC8.7070508@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/09/2014 10:35 PM, Robert Haas wrote:
> On Mon, Dec 8, 2014 at 9:31 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Mon, Dec 8, 2014 at 4:56 AM, Heikki Linnakangas
>> <hlinnakangas(at)vmware(dot)com> wrote:
>>> I don't immediately see the problem either, but I have to say that
>>> grovelling through all the resource owners seems ugly anyway. Resource
>>> owners are not meant to be traversed like that. And there could be a lot of
>>> them, and you have to visit every one of them. That could get expensive if
>>> there are a lot of resource owners.
>>
>> 1. I don't really see why resource owners shouldn't be traversed like
>> that. They are clearly intended to form a hierarchy, and there's
>> existing code that recurses through the hierarchy from a given level
>> downward. What's ugly about that?

I can't exactly point my finger on it, but it just feels wrong from a
modularity point of view. Their purpose is to make sure that we don't
leak resources on abort, by allowing easy an "release everything"
operation. It's not designed for finding objects based on some other
criteria.

There is similar double bookkeeping of many other things that are
tracked by resource owners. Heavy-weight locks are tracked by LOCALLOCK
structs, buffer pins in PrivateRefCount array etc. Those things existed
before resource owners were invented, but if we were starting from
scratch, that design would still make sense, as different objects have
different access criteria. fd.c needs to be able to find the
least-recently-used open file, for example, and you need to find the
snapshot with the lowest xmin.

> Upthread, I suggested keeping a tally of the number of snapshots with
> the advertised xmin and recomputing the xmin to advertise only when it
> reaches 0. This patch doesn't implementation that optimization, but
> it does have code that aborts the traversal of the resource owner
> hierarchy as soon as we see an xmin that will preclude advancing our
> advertised xmin. Releasing N resource owners could therefore cost
> O(N^2) in the worst case, but note that releasing N resource owners is
> *already* an O(N^2) operation in the worst case, because the list of
> children of a particular parent resource owner is singly linked, and
> thus deleting a resource owner is O(N). It's been that way for an
> awfully long time without anyone complaining, probably because (a)
> it's not particularly common to have large numbers of cursors open
> simultaneously and (b) even if you do have that, the constant factor
> is pretty low.

I think you're confusing the N and the N above. It's true that deleting
a resource owner is O(M), where the M is the number of children of that
resource owner. It does not follow that releasing N resource owners is
O(N^2), where N is the number of resource owners released. Calling
ResourceOwnerDelete(x) will only visit each resource owner in that tree
once, so it's O(N), where N is the total number of resource owners in
the tree.

I did some testing of the worst case scenario. The attached script first
creates a lot of cursors, then a lot of savepoints, and finally closes
the cursors in FIFO order. When the number of savepoints is high enough,
this actually segfaults with your patch, because you run out of stack
space when recursing the subxact resource owners. That's hardly this
patch's fault, I'm actually surprised it doesn't crash without it,
because we recurse into all resource owners in ResourceOwnerRelease too.
Apparently the subxacts are closed in LIFO order at commit, but there
might be are other cases where you could trigger that. In any case, a
stack-depth check would be nice.

- Heikki

Attachment Content-Type Size
cursors.sh application/x-sh 319 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-12-10 12:50:31 Re: logical column ordering
Previous Message Stephen Frost 2014-12-10 12:26:18 Re: GSSAPI, SSPI - include_realm default