Re: PANIC in pg_commit_ts slru after crashes

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PANIC in pg_commit_ts slru after crashes
Date: 2017-04-15 20:30:12
Message-ID: CAMkU=1xqfE3=O8v7AexGk+L17+A9dwRyhW8QJ=k-cuC7Gi=vWg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 14, 2017 at 9:33 PM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
wrote:

> Since all those offsets fall on a page boundary, my guess is that we're
> somehow failing to handle a new page correctly.
>
> Looking at the patch itself, my feeling is that the following code
> in src/backend/access/transam/twophase.c might be causing the problem.
>
>     /* update nextXid if needed */
>     if (TransactionIdFollowsOrEquals(maxsubxid, ShmemVariableCache->nextXid))
>     {
>         LWLockAcquire(XidGenLock, LW_EXCLUSIVE);
>         ShmemVariableCache->nextXid = maxsubxid;
>         TransactionIdAdvance(ShmemVariableCache->nextXid);
>         LWLockRelease(XidGenLock);
>     }
>
> The function PrescanPreparedTransactions() is called at the start of
> redo recovery, and this specific block is exercised whether or not there
> are any prepared transactions. What I find particularly wrong here is
> that we initialise maxsubxid to the current value of
> ShmemVariableCache->nextXid when the function is entered, so this block
> then increments ShmemVariableCache->nextXid again even when there are no
> prepared transactions in the system.
>
> I wonder if we should do as in attached patch.
>
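
To make the quoted analysis concrete, here is a minimal standalone sketch of the behaviour described above. It is not PostgreSQL source and it is not the attached patch: xid_precedes(), xid_follows_or_equals(), xid_advance() and prescan_sketch() are hypothetical stand-ins for the real macros and functions, the comparison is simplified to normal XIDs only, and the locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* XIDs 0..2 are reserved; 3 is the first normal XID. */
#define FIRST_NORMAL_XID ((TransactionId) 3)

/* Simplified modulo-2^32 comparisons, valid for normal XIDs only. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

static bool
xid_follows_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) >= 0;
}

/* Advance an XID by one, skipping the reserved values on wraparound. */
static TransactionId
xid_advance(TransactionId xid)
{
    xid++;
    if (xid < FIRST_NORMAL_XID)
        xid = FIRST_NORMAL_XID;
    return xid;
}

/*
 * Hypothetical stand-in for the prescan logic under discussion:
 * takes the current nextXid plus the subxids found in prepared
 * transactions, and returns the nextXid left behind afterwards.
 */
static TransactionId
prescan_sketch(TransactionId nextXid, const TransactionId *subxids, size_t n)
{
    TransactionId maxsubxid = nextXid;  /* initialised to current nextXid */
    size_t      i;

    assert(n == 0 || subxids != NULL);

    for (i = 0; i < n; i++)
    {
        if (xid_precedes(maxsubxid, subxids[i]))
            maxsubxid = subxids[i];
    }

    /* Always true when n == 0, because maxsubxid == nextXid. */
    if (xid_follows_or_equals(maxsubxid, nextXid))
        nextXid = xid_advance(maxsubxid);

    return nextXid;
}
```

In this sketch, calling prescan_sketch(1000, NULL, 0) returns 1001: with no prepared transactions at all, the "follows or equals" test still passes and nextXid moves forward by one, which is the effect the quoted paragraph points at.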

That solves it for me.

Thanks,

Jeff
