Re: Too many files in pg_replslot folder

From: Dmitriy Sarafannikov <d(dot)sarafannikov(at)bk(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Too many files in pg_replslot folder
Date: 2016-03-18 08:12:44
Message-ID: 56EBB87C.3000606@bk.ru
Lists: pgsql-bugs


On 03/18/2016 02:21 AM, Andres Freund wrote:
> Hm, interesting. I wonder, could you try this with a source checkout
> of 9.4? There's some fixes in the upcoming minor release that might be
> related. If not, I'll look into it. - Andres
Thank you for your attention!

It is reproducible on REL9_4_STABLE
(17a250b189a94a470e37ce14d0ebf72390c86d4d).

I looked at the code a little and reproduced this case under a
debugger with 5000 rows.

There was an xact with 5000 subxacts and ->nentries = 5000.
Because 5000 > max_changes_in_memory (4096), the xact was partially
spilled to disk.
Then, in ReorderBufferCommit, the xact was spilled to disk together with
all of its subxacts by ReorderBufferSerializeTXN, which left it with
->nentries = 5000, ->nentries_mem = 0.

    if (txn->nentries_mem != txn->nentries)
        ReorderBufferSerializeTXN(rb, txn);

Each subxact had ->nentries = 1 and ->nentries_mem = 1 before it was
spilled and ->nentries = 1, ->nentries_mem = 0 after.

Then the xact and all of its subxacts were restored in
ReorderBufferIterTXNInit. Afterwards each subxact had ->nentries = 1 and
->nentries_mem = 1, and the main xact had ->nentries = 5000,
->nentries_mem = 4096.

As I understand it, the intent is that all spill files are deleted in
ReorderBufferCleanupTXN, which is called recursively for each subxact.
But for the subxacts ReorderBufferRestoreCleanup was never called, because
they had ->nentries == ->nentries_mem (both 1).

    /* remove entries spilled to disk */
    if (txn->nentries != txn->nentries_mem)
        ReorderBufferRestoreCleanup(rb, txn);

I have a suggestion, but I'm not sure it is fully correct, because I
don't yet understand this logic completely:
add a condition for subxacts to the "if":

    /* remove entries spilled to disk */
    if (txn->nentries != txn->nentries_mem || txn->is_known_as_subxact)
        ReorderBufferRestoreCleanup(rb, txn);
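
The counter bookkeeping described above can be sketched as a toy model in C. This is not PostgreSQL code: ToyTxn, has_spill_file, and the toy_* functions are invented for illustration, and only the comparisons on ->nentries / ->nentries_mem and the 4096 limit are taken from this message. The model assumes that restoring reloads at most 4096 changes and leaves the spill file on disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CHANGES_IN_MEMORY 4096  /* limit quoted in this message */

/* Hypothetical stand-in for ReorderBufferTXN; not the real struct. */
typedef struct ToyTxn
{
    size_t nentries;            /* total number of changes */
    size_t nentries_mem;        /* changes currently held in memory */
    bool   is_known_as_subxact; /* true for a subtransaction */
    bool   has_spill_file;      /* invented flag: a spill file exists on disk */
} ToyTxn;

/* Models the effect of serializing: in-memory changes move to disk. */
void toy_serialize(ToyTxn *txn)
{
    if (txn->nentries_mem != 0)
    {
        txn->has_spill_file = true;
        txn->nentries_mem = 0;
    }
}

/* Models the restore: the counter goes back up, the file stays on disk. */
void toy_restore(ToyTxn *txn)
{
    txn->nentries_mem = txn->nentries < TOY_MAX_CHANGES_IN_MEMORY
        ? txn->nentries
        : TOY_MAX_CHANGES_IN_MEMORY;
}

/*
 * Models the check made during cleanup. Returns true if a spill file is
 * left behind. with_suggested_condition switches on the extra subxact test.
 */
bool toy_cleanup_leaks_file(ToyTxn *txn, bool with_suggested_condition)
{
    assert(txn->nentries_mem <= txn->nentries);

    if (txn->nentries != txn->nentries_mem ||
        (with_suggested_condition && txn->is_known_as_subxact))
        txn->has_spill_file = false;    /* stands in for the file removal */

    return txn->has_spill_file;
}
```

In this model, a subxact with one change that goes through toy_serialize and then toy_restore ends with nentries == nentries_mem, so the unmodified check leaves has_spill_file set. With the extra subxact condition the flag is cleared.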

Also, I may have found a typo in ReorderBufferIterTXNInit:

    /* add subtransactions if they contain changes */
    dlist_foreach(cur_txn_i, &txn->subtxns)
    {
        ReorderBufferTXN *cur_txn;

        cur_txn = dlist_container(ReorderBufferTXN, node, cur_txn_i.cur);

        if (cur_txn->nentries > 0)
        {
            ReorderBufferChange *cur_change;

            if (txn->nentries != txn->nentries_mem)
                ReorderBufferRestoreChanges(rb, cur_txn,
                                            &state->entries[off].fd,
                                            &state->entries[off].segno);

            cur_change = dlist_head_element(ReorderBufferChange, node,
                                            &cur_txn->changes);

            state->entries[off].lsn = cur_change->lsn;
            state->entries[off].change = cur_change;
            state->entries[off].txn = cur_txn;

            binaryheap_add_unordered(state->heap, Int32GetDatum(off++));
        }
    }

Maybe it was meant to be

    if (cur_txn->nentries != cur_txn->nentries_mem)
        ReorderBufferRestoreChanges(rb, cur_txn,
                                    &state->entries[off].fd,
                                    &state->entries[off].segno);

instead of

    if (txn->nentries != txn->nentries_mem)
        ReorderBufferRestoreChanges(rb, cur_txn,
                                    &state->entries[off].fd,
                                    &state->entries[off].segno);

Or am I wrong?
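
To show how the two conditions differ, here is a small sketch in C that isolates just the comparison. ToyCounters and both function names are hypothetical, not part of PostgreSQL. Whether counter values on which the two checks disagree can actually occur at this point in ReorderBufferIterTXNInit is an open question this sketch does not answer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pair of counters; not the real ReorderBufferTXN. */
typedef struct ToyCounters
{
    size_t nentries;     /* total number of changes */
    size_t nentries_mem; /* changes currently held in memory */
} ToyCounters;

/* Condition as currently written: it looks at the parent (txn). */
bool restore_check_parent(const ToyCounters *txn, const ToyCounters *cur_txn)
{
    (void) cur_txn;      /* the subxact's own counters are ignored */
    return txn->nentries != txn->nentries_mem;
}

/* Condition as possibly intended: it looks at the subxact (cur_txn). */
bool restore_check_subxact(const ToyCounters *txn, const ToyCounters *cur_txn)
{
    (void) txn;          /* the parent's counters are ignored */
    return cur_txn->nentries != cur_txn->nentries_mem;
}
```

For made-up values such as a parent with {5000, 0} and a subxact with {1, 1}, the first check is true and the second is false, so the two versions would decide differently about calling ReorderBufferRestoreChanges for that subxact.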

Just in case, a patch for REL9_4_STABLE is attached.

--
Best Regards,
Dmitriy Sarafannikov

Attachment: cleanup_subxacts_94.patch (text/x-patch, 920 bytes)
