Re: 011_crash_recovery.pl intermittently fails

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pg(at)bowt(dot)ie, pgsql-hackers(at)lists(dot)postgresql(dot)org, craig(dot)ringer(at)enterprisedb(dot)com, robertmhaas(at)gmail(dot)com
Subject: Re: 011_crash_recovery.pl intermittently fails
Date: 2023-01-24 23:40:02
Message-ID: CA+hUKGJ9p2JPPMA4eYAKq=r9d_4_8vziet_tS1LEBbiny5-ypA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 8, 2021 at 9:32 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
> At Sun, 07 Mar 2021 20:09:33 -0500, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in
> > Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > > Thanks! I'm afraid I wouldn't get around to it for a few weeks, so if
> > > you have time, please do. (I'm not sure if it's strictly necessary to
> > > log *this* xid, if a higher xid has already been logged, considering
> > > that the goal is just to avoid getting confused about an xid that is
> > > recycled after crash recovery, but coordinating that might be more
> > > complicated; I don't know.)
> >
> > Yeah, ideally the patch wouldn't add any unnecessary WAL flush,
> > if there's some cheap way to determine that our XID must already
> > have been written out. But I'm not sure that it's worth adding
> > any great amount of complexity to avoid that. For sure I would
> > not advocate adding any new bookkeeping overhead in the mainline
> > code paths to support it.
>
> We need to *write* an additional record if the current transaction
> hasn't yet written one (EnsureTopTransactionIdLogged()). One
> annoyance is what is possibly the most common usage, calling
> pg_current_xact_id() at the beginning of a transaction, which leads to
> an additional 8-byte XLOG_XACT_ASSIGNMENT record. We could also
> avoid that by detecting that a larger xid has already been flushed out.

Yeah, that would be very expensive for users doing that.

> I haven't found a simple and clean way to track the maximum
> flushed-out XID. The new cooperation between xlog.c and xact.c
> on XID and LSN, which happens through a shared variable, makes things
> complex...
>
> So the attached doesn't contain the max-flushed-xid tracking feature.

I guess that would be just as expensive if the user does that
sequentially with small transactions (i.e., allocating xids one by one).

I remembered this thread after seeing the failure of Michael's new
buildfarm animal "tanager". I think we need to solve this somehow.
According to our documentation, "Applications might use this function,
for example, to determine whether their transaction committed or
aborted after the application and database server become disconnected
while a COMMIT is in progress." As things stand, though, the function
is useless or even dangerous for that purpose.
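
To make the hazard concrete, here is a toy model in Python. It is only a
sketch under simplifying assumptions and is not PostgreSQL code: the class
ToyServer, its fields, and the function lost_commit_scenario are all
hypothetical names invented for illustration. It assumes that recovery
restarts xid allocation from the last durably recorded value, so an xid
handed out without a WAL flush can be reassigned to a different
transaction after a crash.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Toy model of xid allocation and crash recovery (not PostgreSQL code)."""

    next_xid: int = 100
    # Value that next_xid is reset to by crash recovery in this model.
    durable_next_xid: int = 100
    # Xids whose commit is durable.
    committed: set = field(default_factory=set)

    def current_xact_id(self, flush: bool) -> int:
        """Assign an xid; if flush is True, make the assignment durable."""
        xid = self.next_xid
        self.next_xid += 1
        if flush:
            self.durable_next_xid = self.next_xid
        return xid

    def commit(self, xid: int) -> None:
        """Durably commit xid; recovery will then know about it."""
        self.committed.add(xid)
        self.durable_next_xid = max(self.durable_next_xid, xid + 1)

    def crash_and_recover(self) -> None:
        """Forget every xid assignment that was not durable."""
        self.next_xid = self.durable_next_xid

    def xact_status(self, xid: int) -> str:
        """Report the status of an xid as this toy server sees it."""
        if xid >= self.next_xid:
            return "in the future"
        return "committed" if xid in self.committed else "aborted"


def lost_commit_scenario(flush: bool) -> str:
    """Client notes its xid, the server crashes before COMMIT is durable."""
    server = ToyServer()
    xid = server.current_xact_id(flush=flush)  # client remembers this xid
    server.crash_and_recover()
    # An unrelated transaction starts and commits after recovery.
    other = server.current_xact_id(flush=False)
    server.commit(other)
    return server.xact_status(xid)
```

In this model, lost_commit_scenario(flush=False) reports the client's
never-committed transaction as "committed", because its xid was recycled by
the unrelated transaction, while lost_commit_scenario(flush=True) reports
"aborted". The real trade-off discussed above is the cost of that extra
flush, which the toy model does not capture.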
