Re: Bug #613: Sequence values fall back to previously checkpointed

From: Ben Grimm <bgrimm(at)zaeon(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-bugs(at)postgresql(dot)org, Vadim Mikheev <vmikheev(at)sectorbase(dot)com>
Subject: Re: Bug #613: Sequence values fall back to previously checkpointed
Date: 2002-03-13 22:32:28
Message-ID: 20020313163228.A2112@zaeon.com
Lists: pgsql-bugs

On Wed, 13 Mar 2002, Tom Lane wrote:
>
> I don't think that can work. AFAICT what your patch does is to ensure
> a WAL record is written by the first nextval() in any given backend
> session.

That's exactly what it does, yes. It forces the WAL record to be
written at least once. I think the reason this works is because the
WAL record that's written seems to be one behind what should be on
disk. So doing it once gets it ahead of the game. I'm sure it's
a very naive approach, but before yesterday I had never looked at
the PostgreSQL source. All I can say for my patch is that if it
does not actually fix the problem, it masks it well enough that I
can't reproduce it.

> But what we need is to ensure a WAL record from the first
> nextval() after a checkpoint.
>
> The problem in the scenario Bruce exhibits is that the CHECKPOINT
> forces out both the latest sequence WAL record and the current state
> of the sequence relation itself. The subsequent nextval()'s advance
> the sequence relation in-memory but generate no disk writes and no
> WAL records. On restart, you lose: the sequence relation is back
> to where it was checkpointed, and the latest WAL record for the
> sequence is before the checkpoint *so it won't get rescanned*.
> Thus, the sequence doesn't get pushed forward like it's supposed to.

This isn't quite true. Until you select enough values to get to
log < fetch, it won't even have inserted a WAL record for the
CHECKPOINT to pick up, so it falls back to the unmodified state.
That means the 'last_value' on disk never moves forward. In theory
the value on disk should *always* be equal to or greater than (by up
to 32) the value being returned to the client, and when you load it
off disk it isn't.
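
To make the log < fetch condition concrete, here is a rough toy model
in Python. It is a sketch of the bookkeeping as described in this
thread, not the actual sequence.c code: the class, its fields and the
list standing in for WAL are all made up for illustration, and it
assumes one value is fetched per nextval().

```python
SEQ_LOG_VALS = 32  # values logged ahead per WAL record (the "32" above)


class ToySequence:
    """Toy model of the in-memory sequence state; not PostgreSQL code."""

    def __init__(self, last_value=0, log_cnt=0):
        self.last_value = last_value
        self.log_cnt = log_cnt   # values still covered by the last WAL record
        self.wal = []            # hypothetical WAL: the last_value each record logs

    def nextval(self, fetch=1):
        # A WAL record is written only when the logged allowance runs out.
        if self.log_cnt < fetch:
            self.wal.append(self.last_value + fetch + SEQ_LOG_VALS)
            self.log_cnt = fetch + SEQ_LOG_VALS
        self.last_value += fetch
        self.log_cnt -= fetch
        return self.last_value
```

In this model a sequence that starts with log_cnt > 0 can hand out that
many values without a single WAL record being written, and whenever a
record does exist, the value it logs is never behind the value handed
to the client.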

More (possibly redundant) examples:
select * from your_seq and note the values. Then select nextval a few
times, kill -9 your backend, and reconnect. select * from your_seq
again and you should see that it's identical to the previous values.

Now, try the same thing again, but force a checkpoint before killing
your backend, then select again... same values as initially.

Now, select nextval the number of times needed to get log_cnt to
loop past 0 and back up to 32, then select * from your_seq again,
note the values and checkpoint. Crash the backend, reconnect and
select again... it saved the values this time because it got through
enough code to do the XLogInsert.
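
For comparison, the failure mode described in the quoted paragraph
(the checkpoint flushes both the WAL record and the sequence page, and
later nextval() calls write nothing) can be played through in a toy
Python model. This is only a sketch under simplifying assumptions: WAL
is treated as always durable, recovery replays only records written
after the last checkpoint, and every name here is hypothetical. It
does not try to reproduce the three runs above, which depend on
details the model leaves out.

```python
SEQ_LOG_VALS = 32  # logged-ahead allowance, matching the "32" discussed above


class ToyCluster:
    """Toy model of one sequence plus WAL; names are illustrative only."""

    def __init__(self, last_value=0, log_cnt=0):
        self.disk = {"last_value": last_value, "log_cnt": log_cnt}
        self.mem = dict(self.disk)   # in-memory copy of the sequence page
        self.wal = []                # logged last_value entries (assumed durable)
        self.redo = 0                # recovery replays wal[redo:] only

    def nextval(self):
        if self.mem["log_cnt"] < 1:  # allowance used up: log ahead
            self.wal.append(self.mem["last_value"] + 1 + SEQ_LOG_VALS)
            self.mem["log_cnt"] = 1 + SEQ_LOG_VALS
        self.mem["last_value"] += 1
        self.mem["log_cnt"] -= 1
        return self.mem["last_value"]

    def checkpoint(self):
        self.disk = dict(self.mem)   # flush the sequence page
        self.redo = len(self.wal)    # earlier WAL is no longer rescanned

    def crash_and_recover(self):
        self.mem = dict(self.disk)   # in-memory changes are lost
        for logged in self.wal[self.redo:]:
            self.mem = {"last_value": logged, "log_cnt": 0}
        return self.mem["last_value"]


def run(checkpoint_in_between):
    """Return (last value handed out, last_value after crash recovery)."""
    c = ToyCluster()
    for _ in range(3):
        c.nextval()
    if checkpoint_in_between:
        c.checkpoint()
    handed_out = [c.nextval() for _ in range(5)][-1]
    return handed_out, c.crash_and_recover()
```

In this model run(True) hands out 8 but recovers to 3, so the sequence
falls back to the checkpointed value, while run(False) recovers to the
logged value 33, which is safely ahead of anything handed out.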

> The failure cases for your patch would
> involve backends that have been running for longer than one checkpoint
> cycle ...

I haven't been able to reproduce that, even when checkpointing
multiple times with several open backends. But I also found and fixed
a couple of mistakes in my patch, which makes it a little better. I
can forward the new patch if you'd like to see it.

-- Ben
