Re: reorderbuffer: memory overconsumption with medium-size subxacts

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Petr Jelinek <petr(at)2ndquadrant(dot)com>
Subject: Re: reorderbuffer: memory overconsumption with medium-size subxacts
Date: 2018-12-16 18:32:22
Message-ID: 8f0ed23c-b297-1ae7-48d0-d9663eadf604@2ndquadrant.com
Lists: pgsql-hackers

On 12/16/18 4:06 PM, Alvaro Herrera wrote:
> Hello
>
> Found this on Postgres 9.6, but I think it affects back to 9.4.
>
> I've seen a case where reorderbuffer keeps very large amounts of memory
> in use, without spilling to disk, if the main transaction does little or
> no changes and many subtransactions execute changes just below the
> threshold to spill to disk.
>
> The particular case we've seen is the main transaction does one UPDATE,
> then a subtransaction does something between 300 and 4000 changes.
> Since all these are below max_changes_in_memory, nothing gets spilled to
> disk. (To make matters worse: even if there are some subxacts that do
> more than max_changes_in_memory, only that subxact is spilled, not the
> whole transaction.) This was causing a 16GB-machine to die, unable to
> process the long transaction; had to add additional 16 GB of physical
> RAM for the machine to be able to process the transaction.
>

Yeah. We do the check for each xact separately, so it's vulnerable to
this scenario (many subxacts, each just below max_changes_in_memory).
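
To make the failure mode concrete, here is a minimal standalone sketch of
a per-xact check. It is not the reorderbuffer.c code: the Sketch* names
are made up for illustration, and the 4096 limit is an assumption standing
in for max_changes_in_memory.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for the per-xact limit (max_changes_in_memory). */
#define SKETCH_MAX_CHANGES_IN_MEMORY 4096

/* Hypothetical, simplified (sub)xact: only the counters matter here. */
typedef struct SketchTxn
{
    size_t nentries_mem;     /* changes currently held in memory */
    size_t nentries_spilled; /* changes already written to disk */
} SketchTxn;

/* Queue one change; spill only this xact once it alone hits the limit. */
void
sketch_queue_change(SketchTxn *txn)
{
    assert(txn != NULL);
    txn->nentries_mem++;
    if (txn->nentries_mem >= SKETCH_MAX_CHANGES_IN_MEMORY)
    {
        txn->nentries_spilled += txn->nentries_mem;
        txn->nentries_mem = 0;
    }
}

/* Changes left in memory after nsubxacts subxacts queue changes_each each. */
size_t
sketch_total_in_memory(size_t nsubxacts, size_t changes_each)
{
    size_t total = 0;

    for (size_t i = 0; i < nsubxacts; i++)
    {
        SketchTxn sub = {0, 0};

        for (size_t c = 0; c < changes_each; c++)
            sketch_queue_change(&sub);
        total += sub.nentries_mem;
    }
    return total;
}
```

With 1000 subxacts of 4000 changes each, no single subxact reaches the
limit, so all 4,000,000 changes stay in memory in this model.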

> I think there's a one-line fix, attached: just add the number of changes
> in a subxact to nentries_mem when the transaction is assigned to the
> parent. Since a WAL ASSIGNMENT record happens once every 32 subxacts,
> this accumulates just that number of subxact changes in memory before
> spilling, which is much more reasonable. (Hmm, I wonder why this
> happens every 32 subxacts, if the code seems to be using
> PGPROC_MAX_CACHED_SUBXIDS which is 64.)
>

Not sure, for a couple of reasons ...

Doesn't that essentially mean we'd always evict toplevel xact(s),
including all subxacts, no matter how tiny those subxacts are? That
seems like it might easily cause regressions for the common case with
many small subxacts and one huge subxact (which is the only one we'd
currently spill, I think). That seems annoying.

But even if we decide it's the right approach, isn't the proposed patch
a couple of bricks shy? It redefines the nentries_mem field from "per
(sub)xact" to "total" for the toplevel xact, but then updates it only
when assigning the child. But the subxacts may receive changes after
that, so ReorderBufferQueueChange() probably needs to update the value
for the toplevel xact too, I guess. And ReorderBufferCheckSerializeTXN()
should probably check the toplevel xact too ...
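
Roughly what I have in mind for the accounting, as a standalone sketch.
None of these names exist in reorderbuffer.c; SketchXact and the
functions below are hypothetical, and the real code would have to do this
in ReorderBufferQueueChange() and the assignment path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified xact with a "total" counter on the toplevel. */
typedef struct SketchXact
{
    struct SketchXact *toplevel; /* NULL for toplevel or not-yet-assigned */
    size_t own_changes;          /* changes queued in this (sub)xact */
    size_t total_changes;        /* own changes plus assigned subxacts */
} SketchXact;

/* Assignment: fold what the subxact has queued so far into the toplevel. */
void
sketch_assign_child(SketchXact *top, SketchXact *sub)
{
    assert(sub->toplevel == NULL);
    sub->toplevel = top;
    top->total_changes += sub->own_changes;
}

/* Queueing: changes arriving after the assignment must be counted too. */
void
sketch_queue(SketchXact *xact)
{
    xact->own_changes++;
    if (xact->toplevel != NULL)
        xact->toplevel->total_changes++;
    else
        xact->total_changes++;
}

/* One toplevel change, then a subxact queues before/after its assignment. */
size_t
sketch_total_after(size_t before, size_t after)
{
    SketchXact top = {NULL, 0, 0};
    SketchXact sub = {NULL, 0, 0};

    sketch_queue(&top);
    for (size_t i = 0; i < before; i++)
        sketch_queue(&sub);
    sketch_assign_child(&top, &sub);
    for (size_t i = 0; i < after; i++)
        sketch_queue(&sub);
    return top.total_changes;
}
```

If the queueing path did not bump the toplevel counter, the changes
queued after the assignment would never show up in the total, and a check
against the toplevel xact would undercount.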

Maybe a simpler solution would be to track the total number of changes
in memory (from all xacts), and then evict the largest xact. But
I doubt that would be backpatchable - it's pretty much what the
logical_work_mem patch does.
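
A toy version of that idea, just to illustrate the behavior. This is not
the logical_work_mem patch: the names are hypothetical, the limit counts
changes rather than bytes, and the linear scan for the largest xact is
only there to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical global cap on in-memory changes, across all xacts. */
#define SKETCH_TOTAL_LIMIT 4096

typedef struct SketchBuffer
{
    size_t *nentries_mem;  /* in-memory changes per (sub)xact */
    size_t  ntxns;
    size_t  total_mem;     /* in-memory changes across all xacts */
    size_t  total_spilled; /* changes evicted so far */
} SketchBuffer;

/* Queue one change, then evict the largest xact if the global cap is hit. */
void
sketch_add_change(SketchBuffer *rb, size_t txn)
{
    size_t largest = 0;

    assert(txn < rb->ntxns);
    rb->nentries_mem[txn]++;
    rb->total_mem++;

    if (rb->total_mem < SKETCH_TOTAL_LIMIT)
        return;

    /* Find the xact holding the most changes and spill all of them. */
    for (size_t i = 1; i < rb->ntxns; i++)
        if (rb->nentries_mem[i] > rb->nentries_mem[largest])
            largest = i;

    rb->total_spilled += rb->nentries_mem[largest];
    rb->total_mem -= rb->nentries_mem[largest];
    rb->nentries_mem[largest] = 0;
}

/* Peak in-memory total (measured after each change has been processed). */
size_t
sketch_peak_in_memory(size_t nsubxacts, size_t changes_each)
{
    SketchBuffer rb = {NULL, nsubxacts, 0, 0};
    size_t peak = 0;

    assert(nsubxacts > 0);
    rb.nentries_mem = calloc(nsubxacts, sizeof(size_t));
    assert(rb.nentries_mem != NULL);

    for (size_t t = 0; t < nsubxacts; t++)
        for (size_t c = 0; c < changes_each; c++)
        {
            sketch_add_change(&rb, t);
            if (rb.total_mem > peak)
                peak = rb.total_mem;
        }

    free(rb.nentries_mem);
    return peak;
}
```

In this model the in-memory total stays below the cap no matter how the
changes are split across subxacts, and small subxacts are only evicted
when nothing larger is left.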

And of course, addressing this on pg11 is a bit more complicated due to
the behavior of the Generation context (see the logical_work_mem thread
for details).

> Hmm, while writing this I wonder whether this affects cases with many
> levels of subtransactions. Not sure how nested subxacts are handled by
> reorderbuffer.c, but reading the code I think it is okay.
>

That should be OK. The assignments don't care about the intermediate
subxacts, they only care about the toplevel xact and current subxact.

> Of course, there's Tomas's logical_work_mem too, but that's too invasive
> to backpatch.
>

Yep.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
