Re: Failed to delete old ReorderBuffer spilled files

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: sawada(dot)mshk(at)gmail(dot)com
Cc: torikoshi_atsushi_z2(at)lab(dot)ntt(dot)co(dot)jp, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Failed to delete old ReorderBuffer spilled files
Date: 2017-11-22 02:48:14
Message-ID: 20171122.114814.59091351.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hi,

At Wed, 22 Nov 2017 10:10:27 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoDkPbCNX-d_VqKrW4rDt5W5Y3=LQr7zYbbxF=uVDayt-A(at)mail(dot)gmail(dot)com>
> >> Using last changing LSN might work but I'm afraid that that fails
> >> to remove the last snap file if the crash happens at the very
> >> start of a segment.
>
> I think it works even in that case because we can compute the
> largest WAL segment number that we need to remove by using the lsn of
> the last change in old transaction. Am I missing something?

I was concerned about the window in ReorderBufferSerializeTXN that
can leave an empty file behind. But I had a closer look and found
that the snap files for old transactions are not left over from the
previous run; they are rebuilt from WAL records after restart. So
there cannot be an empty file there.

I'm now convinced that using the LSN of the last change is the
proper way to deal with this problem.
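
To make the segment arithmetic concrete, here is a minimal sketch
(not the actual reorderbuffer.c code). It assumes the default 16MB
WAL segment size and uses made-up helper names; it only shows that
the LSN of the last change is enough to bound the range of segments
that can hold spill files for the transaction.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16MB WAL segments (the default); the real size is configurable. */
#define SKETCH_WAL_SEGMENT_SIZE ((uint64_t) 16 * 1024 * 1024)

/* Hypothetical helper: number of the WAL segment that contains lsn. */
static uint64_t
sketch_segno(uint64_t lsn)
{
    return lsn / SKETCH_WAL_SEGMENT_SIZE;
}

/*
 * Hypothetical helper: how many segments may hold spill files of a
 * transaction whose changes span [first_lsn, last_change_lsn].
 */
static uint64_t
sketch_spill_segment_count(uint64_t first_lsn, uint64_t last_change_lsn)
{
    assert(first_lsn <= last_change_lsn);
    return sketch_segno(last_change_lsn) - sketch_segno(first_lsn) + 1;
}
```

A change located exactly at the start of a segment still maps to
that segment, so the last file falls inside the computed range.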

> >> Anyway all files of the transaction are no longer useful at
> >> that time, but it seems that the last_lsn is required to avoid
> >> directory scanning at every transaction end.
> >>
> >> Letting ReorderBufferAbortOld scan the directory and determine
> >> the first and last LSN then set to the txn would work but it
> >> might be an overkill. Using the beginning LSN of the next segment
> >> of the last_change->lsn could surely work... really?
> >> (ReorderBufferRestoreCleanup doesn't complain on ENOENT.)
> >
> > Somehow I deleted excessively while editing. One more possible
> > solution is making ReorderBufferAbortOld take final LSN and
> > DecodeStandbyOp passes the LSN of XLOG_RUNNING_XACTS record to
> > it.
> >
>
> Setting final_lsn in ReorderBufferAbortOld seems good to me but I'm
> not sure we can use the lsn of XLOG_RUNNING_XACTS record. Doesn't
> ReorderBufferRestoreCleanup() raise an error due to ENOENT if the wal

It no longer matters but the function does *not* raise an error
on ENOENT.

> segment having XLOG_RUNNING_XACTS records doesn't have any changes of
> the old transaction?

Since the transaction never reaches an abort record, any LSN larger
than that of its last change can work as final_lsn, and the LSN of
the XLOG_RUNNING_XACTS record is guaranteed to be so. But anyway I
agree that last_change->lsn is more proper.
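
As an illustration of why a too-large final_lsn is harmless when
ENOENT is ignored, here is a rough sketch of such a cleanup loop.
This is not the real ReorderBufferRestoreCleanup; the function name,
the file naming scheme and the 16MB segment size are assumptions made
only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Assumption: 16MB WAL segments (the default); the real size is configurable. */
#define SKETCH_WAL_SEGMENT_SIZE ((uint64_t) 16 * 1024 * 1024)

/*
 * Hypothetical cleanup: try to remove one spill file per WAL segment in
 * [first_lsn, final_lsn]. A missing file (ENOENT) is silently skipped.
 * Returns the number of files removed, or -1 on any other unlink() error.
 */
static int
sketch_cleanup(const char *dir, unsigned xid,
               uint64_t first_lsn, uint64_t final_lsn)
{
    uint64_t first = first_lsn / SKETCH_WAL_SEGMENT_SIZE;
    uint64_t last = final_lsn / SKETCH_WAL_SEGMENT_SIZE;
    int      removed = 0;

    assert(first_lsn <= final_lsn);

    for (uint64_t seg = first; seg <= last; seg++)
    {
        char path[1024];

        /* Made-up naming scheme, for illustration only. */
        snprintf(path, sizeof(path), "%s/xid-%u-seg-%llu.snap",
                 dir, xid, (unsigned long long) seg);

        if (unlink(path) == 0)
            removed++;
        else if (errno != ENOENT)
            return -1;          /* a real error, not just a missing file */
    }
    return removed;
}
```

With this shape, a final_lsn that points past the last change only
costs a few extra unlink() attempts on segments that have no file.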

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
