On Thu, 2010-12-02 at 12:41 +0200, Heikki Linnakangas wrote:
> On 02.12.2010 11:02, Simon Riggs wrote:
> > The cause of the issue is that replay starts at one LSN and there is a
> > delay until the RunningXacts WAL record occurs. If there was no delay,
> > there would be no issue at all. In CreateCheckpoint() we start by
> > grabbing the WAInsertLock and later recording that pointer as part of
> > the checkpoint record. My proposal is to replace the "grab the lock"
> > code with the insert of the RunningXacts WAL record (when wal_level
> > set), so that recovery always starts with that record type.
> Oh, interesting idea. But AFAICS closing the gap between acquiring the
> running-xacts snapshot and writing it to the log is sufficient, I don't
> see what moving the running-xacts record buys us. Does it allow some
> further simplifications somewhere?
Your patch is quite long and you do a lot more than just alter the
locking. I don't think we need those changes at all and especially would
not wish to backpatch that.
Earlier on this thread, we discussed:
On Wed, 2010-11-24 at 15:19 +0000, Simon Riggs wrote:
> On Wed, 2010-11-24 at 12:48 +0200, Heikki Linnakangas wrote:
> > When recovery starts, we fetch the oldestActiveXid from the checkpoint
> > record. Let's say that it's 100. We then start replaying WAL records
> > from the Redo pointer, and the first record (heap insert in your case)
> > contains an Xid that's much larger than 100, say 10000. We call
> > RecordKnownAssignedXids() to make note that all xids between that
> > range are in-progress, but there isn't enough room in the array for
> > that.
The current code fails because of the gap between the redo pointer and
the XLOG_RUNNING_XACTS WAL record. If there is no gap, there is no
So my preferred solution would:
* Log XLOG_RUNNING_XACTS while holding XidGenLock, as you suggest
* Move logging to occur at the Redo pointer
That is a much smaller patch with a smaller footprint.
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services
In response to
pgsql-hackers by date
|Next:||From: Heikki Linnakangas||Date: 2010-12-02 11:31:42|
|Subject: Re: Hot Standby: too many KnownAssignedXids|
|Previous:||From: Heikki Linnakangas||Date: 2010-12-02 11:19:20|
|Subject: Re: WIP patch for parallel pg_dump|