Re: BUG #6748: sequence value may be conflict in some cases

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: meixiangming(at)huawei(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: BUG #6748: sequence value may be conflict in some cases
Date: 2012-07-23 18:43:34
Message-ID: 4360.1343069014@sss.pgh.pa.us
Lists: pgsql-bugs pgsql-hackers

meixiangming(at)huawei(dot)com writes:
> [ freshly-created sequence has wrong state after crash ]

I didn't believe this at first, but sure enough, it fails just as
described if you force a crash between the first and second nextval
calls for the sequence. This used to work ...

The change that broke it turns out to be the ALTER SEQUENCE OWNED BY
call that we added to serial-column creation circa 8.2; although on
closer inspection I think any ALTER SEQUENCE before the first nextval
call would be problematic. The real issue is the ancient kluge in
sequence creation that writes something different into WAL than
what it leaves behind in shared buffers:

    /* We do not log first nextval call, so "advance" sequence here */
    /* Note we are scribbling on local tuple, not the disk buffer */
    newseq->is_called = true;
    newseq->log_cnt = 0;

The tuple in buffers has log_cnt = 1, is_called = false, but the initial
XLOG_SEQ_LOG record shows log_cnt = 0, is_called = true. So if we crash
at this point, after recovery it looks like one nextval() has already
been done. However, AlterSequence generates another XLOG_SEQ_LOG record
based on what's in shared buffers, so after replay of that, we're back
to the "original" state where it does not appear that any nextval() has
been done.

I'm of the opinion that this kluge needs to be removed; it's just insane
that we're not logging the same state we leave in our buffers. To do
that, we need to fix nextval() so that the first nextval call generates
an xlog entry; that is, if we are changing is_called to true we ought to
consider that as a reason to force an xlog entry. I think way back when
we thought it was a good idea to avoid making two xlog entries during
creation and immediate use of a sequence, but considering all the other
xlog entries involved in creation of a sequence object, this is a pretty
silly "optimization". (Besides, it merely postpones the first
nextval-driven xlog entry from the first to the second nextval call.)
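
To make the failure sequence concrete, here is a toy model in C. It is
only a sketch under stated assumptions, not the code in sequence.c:
SeqState, Wal, and every function name below are hypothetical, recovery
is reduced to "the last record replayed wins", the increment is assumed
to be 1, and only the first nextval call is modelled. The
log_first_call flag stands in for the proposed change of forcing an
xlog entry whenever is_called flips to true.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the sequence tuple: only the fields discussed above. */
typedef struct {
    long last_value;
    long log_cnt;
    bool is_called;
} SeqState;

/* Toy WAL: an append-only array of full-state records (hypothetical). */
#define MAX_WAL 8
typedef struct {
    SeqState rec[MAX_WAL];
    int n;
} Wal;

static void wal_append(Wal *w, SeqState s)
{
    assert(w->n < MAX_WAL);
    w->rec[w->n++] = s;
}

/* Creation: the buffer keeps the pristine state, but the WAL record is
 * "advanced" as if the first nextval had already happened. */
static SeqState create_sequence(Wal *w, long start)
{
    SeqState buf = { start, 1, false };
    SeqState logged = buf;
    logged.is_called = true;
    logged.log_cnt = 0;
    wal_append(w, logged);
    return buf;
}

/* ALTER SEQUENCE: logs whatever is currently in the buffer. */
static void alter_sequence(Wal *w, const SeqState *buf)
{
    wal_append(w, *buf);
}

/* First nextval only (is_called is still false): hand out last_value.
 * With log_first_call set, flipping is_called also forces a WAL record. */
static long first_nextval(Wal *w, SeqState *buf, bool log_first_call)
{
    assert(!buf->is_called);
    buf->is_called = true;
    if (log_first_call)
        wal_append(w, *buf);
    return buf->last_value;
}

/* Crash recovery in this model: the last record replayed wins. */
static SeqState replay(const Wal *w)
{
    assert(w->n > 0);
    return w->rec[w->n - 1];
}

/* Value the next nextval would return from state s (increment 1 assumed). */
static long peek_nextval(const SeqState *s)
{
    return s->is_called ? s->last_value + 1 : s->last_value;
}

/* Create, optionally ALTER, call nextval once, crash, recover; report
 * what nextval hands out after recovery. The first call returned start. */
long value_after_crash(long start, bool with_alter, bool log_first_call)
{
    Wal wal = { .n = 0 };
    SeqState buf = create_sequence(&wal, start);
    if (with_alter)
        alter_sequence(&wal, &buf);
    long first = first_nextval(&wal, &buf, log_first_call);
    assert(first == start);
    SeqState recovered = replay(&wal);
    return peek_nextval(&recovered);
}
```

In this model value_after_crash(1, false, false) returns 2, which is
the old behaviour without any ALTER SEQUENCE. value_after_crash(1,
true, false) returns 1, the same value the pre-crash nextval already
handed out, which is the reported conflict. With the first call logged,
value_after_crash(1, true, true) returns 2 again.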

regards, tom lane
