Re: [PATCH] Infinite loop while acquiring new TOAST Oid

From: Andres Freund <andres(at)anarazel(dot)de>
To: Nikita Malakhov <hukutoc(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Infinite loop while acquiring new TOAST Oid
Date: 2022-11-28 20:36:55
Message-ID: 20221128203655.xhbfxkgz5td5acrs@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-11-28 18:34:20 +0300, Nikita Malakhov wrote:
> While working on Pluggable TOAST we've detected a defective behavior
> on tables with large amounts of TOASTed data - queries freeze and DB stalls.
> Further investigation led us to the loop with GetNewOidWithIndex function
> call - when all available Oid already exist in the related TOAST table this
> loop continues infinitely. Data type used for value ID is the UINT32, which
> is
> unsigned int and has a maximum value of *4294967295* which allows
> maximum 4294967295 records in the TOAST table. It is not a very big amount
> for modern databases and is the major problem for productive systems.

I don't think the absolute number is the main issue - by default external
toasting will happen only for bigger datums. 4 billion external datums
typically use a lot of space.

If you hit this easily with your patch, then you likely broke the conditions
under which external toasting happens.

IMO the big issue is the global oid counter making it much easier to hit oid
wraparound. Due to that we end up assigning oids that conflict with existing
toast oids much sooner than 4 billion toasted datums.

I think the first step to improve the situation is to not use a global oid
counter for toasted values. One way to do that would be to use the sequence
code to do oid assignment, but we likely can find a more efficient
representation.

Eventually we should do the obvious thing and make toast ids 64bit wide - to
combat the space usage we likely should switch to representing the ids as
variable width integers or such, otherwise the space increase would likely be
prohibitive.

> Quick fix for this problem is limiting GetNewOidWithIndex loops to some
> reasonable amount defined by related macro and returning error if there is
> still no available Oid. Please check attached patch, any feedback is
> appreciated.

This feels like the wrong spot to tackle the issue. For one, most of the
looping will be in GetNewOidWithIndex(), so limiting looping in
toast_save_datum() won't help much. For another, if the limiting were in the
right place, it'd break currently working cases. Due to oid wraparound it's
pretty easy to hit "ranges" of allocated oids, without even getting close to
2^32 toasted datums.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2022-11-28 20:46:28 Re: fixing CREATEROLE
Previous Message Mark Dilger 2022-11-28 20:33:47 Re: fixing CREATEROLE