Re: Bug #613: Sequence values fall back to previously checkpointed

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: bgrimm(at)zaeon(dot)com, pgsql-bugs(at)postgresql(dot)org, Vadim Mikheev <vmikheev(at)sectorbase(dot)com>
Subject: Re: Bug #613: Sequence values fall back to previously checkpointed
Date: 2002-03-12 05:17:20
Message-ID: 28000.1015910240@sss.pgh.pa.us
Lists: pgsql-bugs

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Yikes! I have reproduced this bug.

I believe I see the problem: MyLastRecPtr is being used in incompatible
ways.

The issue is that sequence operations are logged as "outside transaction
control", which I believe is intended to mark XLOG records that should
be redone whether or not the generating transaction commits. (Or if we
ever do xlog UNDO, records that should not be undone at xact abort.)
This classification is clearly right as far as it goes. Now
MyLastRecPtr is used to chain together the XLOG records that are
*within* xact control, so it doesn't get updated when an
outside-the-xact record is written. (At each record insert,
MyLastRecPtr is used to fill the previous-record-of-xact backlink.)
This is also fine.

The trouble is that at xact commit, we test to see if the current xact
made any loggable changes by checking MyLastRecPtr != 0. Therefore,
if we do an xact consisting ONLY of "select nextval()", this test will
mistakenly think that no xlog records were written. It will not
generate a commit record --- which is no big problem --- and will not
write or flush the xlog --- which is a big problem. An immediately
following crash will leave the sequence un-advanced.

The "no commit record" part of the logic seems okay to me, but we need
an independent test to decide whether to write/flush XLog. If we have
reported a nextval() value to the client then it seems to me we'd better
be certain that its XLOG record is flushed to disk before we report commit
to the client.

This is certainly fixable. However, here's the kicker: essentially what
this means is that we are not treating *reporting a nextval() value to
the client* as a commit-worthy event. I do not think this bug explains
the past reports that claim a nextval() value *inserted into the
database* has been rolled back. Seems to me that a subsequent tuple
insertion would create a normal XLog record which we'd flush before
commit, and thereby also flush the sequence-update XLog record.
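
To make the reasoning above concrete, here is a toy model in C. It is
only a sketch and not the actual xlog.c/xact.c code: apart from
MyLastRecPtr (modelled by the my_last_rec_ptr field), every name in it
(ToyXLog, ToyXact, toy_insert, toy_commit, toy_survives_crash, and the
wrote_any_record flag) is hypothetical and invented for illustration.
The log is reduced to a counter, and "flushed" just means the record's
position is at or below flushed_upto.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for an XLOG record position; 0 means "no record". */
typedef uint64_t RecPtr;

/* Toy log: positions 1..log_end exist, 1..flushed_upto are durable. */
typedef struct {
    RecPtr log_end;
    RecPtr flushed_upto;
} ToyXLog;

/* Toy per-transaction state. */
typedef struct {
    RecPtr my_last_rec_ptr;   /* models MyLastRecPtr: last in-xact record */
    bool   wrote_any_record;  /* hypothetical independent "need flush" flag */
} ToyXact;

/*
 * Insert one record. Records written "outside transaction control"
 * (such as sequence updates) do not advance my_last_rec_ptr.
 */
static RecPtr toy_insert(ToyXLog *log, ToyXact *xact, bool outside_xact_control)
{
    RecPtr ptr = ++log->log_end;

    xact->wrote_any_record = true;
    if (!outside_xact_control)
        xact->my_last_rec_ptr = ptr;
    return ptr;
}

/*
 * Commit. With use_separate_flush_test == false, the flush decision is
 * tied to my_last_rec_ptr != 0, as described in the message. With true,
 * the flush decision uses the independent flag instead.
 * Returns true if a commit record was written.
 */
static bool toy_commit(ToyXLog *log, ToyXact *xact, bool use_separate_flush_test)
{
    bool need_commit_record = (xact->my_last_rec_ptr != 0);
    bool need_flush = use_separate_flush_test ? xact->wrote_any_record
                                              : need_commit_record;

    if (need_commit_record)
        ++log->log_end;                     /* the commit record itself */
    if (need_flush)
        log->flushed_upto = log->log_end;   /* flush everything so far */

    xact->my_last_rec_ptr = 0;              /* reset for the next xact */
    xact->wrote_any_record = false;
    return need_commit_record;
}

/* Would the record at ptr still be there after a crash right now? */
static bool toy_survives_crash(const ToyXLog *log, RecPtr ptr)
{
    return ptr != 0 && ptr <= log->flushed_upto;
}
```

In this model, a transaction that only calls toy_insert(..., true) and
then commits with the old test leaves its record unflushed, while the
same transaction with the separate test flushes it, still without
writing a commit record. A transaction that also inserts an in-xact
record gets flushed under the old test too, which matches the argument
that a subsequent tuple insertion would carry the sequence record to
disk along with it.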

Can anyone see a way that this mechanism explains the prior reports?

regards, tom lane
