BUG #7710: Xid epoch is not updated properly during checkpoint

From: tarvip(at)gmail(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #7710: Xid epoch is not updated properly during checkpoint
Date: 2012-11-27 19:52:09
Message-ID: E1TdRCT-00008s-Tc@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 7710
Logged by: Tarvi Pillessaar
Email address: tarvip(at)gmail(dot)com
PostgreSQL version: 9.1.6
Operating system: linux
Description:

This happens only if wal_level=hot_standby.

Here are the steps to reproduce this issue.

We have the following db cluster:

postgres@sbox /usr/local/pgsql $ pg_controldata data|grep NextXID
Latest checkpoint's NextXID: 0/4294966303
postgres@sbox /usr/local/pgsql $

Basically, we have fewer than 1000 XIDs left before the epoch boundary.
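
For readers who want to check the distance to the boundary, here is a minimal sketch in Python. It assumes the NextXID field has the form epoch/xid and that the XID counter is 32 bits wide; the helper names parse_next_xid and xids_until_wrap are made up for illustration and are not part of PostgreSQL.

```python
XID_SPACE = 2 ** 32  # assumption: the XID counter is a 32-bit value


def parse_next_xid(field):
    """Split an 'epoch/xid' string (as printed by pg_controldata) into two ints."""
    epoch_text, xid_text = field.strip().split("/")
    return int(epoch_text), int(xid_text)


def xids_until_wrap(xid):
    """Number of XIDs left before the 32-bit counter reaches the epoch boundary."""
    return XID_SPACE - xid


epoch, xid = parse_next_xid("0/4294966303")
print(epoch, xid, xids_until_wrap(xid))  # 0 4294966303 993
```

The result, 993, matches the -993 that txid_current()-2^32 reports in the first psql call of this report.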

Modify the following parameters in postgresql.conf:
checkpoint_segments = 16
checkpoint_completion_target = 0.9
checkpoint_timeout = 2min
log_checkpoints = on
log_line_prefix = '%t %r %p %d %u '
wal_level = hot_standby

Let's start up the cluster:
postgres@sbox /usr/local/pgsql $ postgres -D /usr/local/pgsql/data
2012-11-27 20:44:43 EET 26353 LOG: database system was shut down at 2012-11-27 18:43:12 EET
2012-11-27 20:44:43 EET 26352 LOG: database system is ready to accept connections
...

In another session:
postgres@sbox /usr/local/pgsql $ psql -c "select now(),txid_current(), txid_current()-2^32"
              now              | txid_current | ?column?
-------------------------------+--------------+----------
 2012-11-27 20:45:01.394324+02 |   4294966303 |     -993

Now let's consume some XIDs; otherwise we have nothing to checkpoint.

postgres@sbox /usr/local/pgsql $ pgbench -c 1 -t 700
...
postgres@sbox /usr/local/pgsql $ psql -c "select now(),txid_current(), txid_current()-2^32"
              now              | txid_current | ?column?
-------------------------------+--------------+----------
 2012-11-27 20:45:27.256096+02 |   4294967005 |     -291

After a while, a checkpoint starts:
2012-11-27 20:46:43 EET 26354 LOG: checkpoint starting: time

Now let's cross the epoch boundary:

postgres@sbox /usr/local/pgsql $ pgbench -c 1 -t 700
...

postgres@sbox /usr/local/pgsql $ psql -c "select now(),txid_current(), txid_current()-2^32"
              now              | txid_current | ?column?
-------------------------------+--------------+----------
 2012-11-27 20:46:51.205384+02 |   4294967713 |      417

It seems that we have successfully crossed the boundary.
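
To make the crossing explicit, the values returned by txid_current() so far can be split into an epoch and a 32-bit XID. This is only a sketch in Python; it assumes the 64-bit value is epoch * 2^32 + xid, and split_txid is a hypothetical helper name.

```python
XID_SPACE = 2 ** 32  # assumption: txid = epoch * 2**32 + 32-bit xid


def split_txid(txid):
    """Return (epoch, xid) for a 64-bit value as reported by txid_current()."""
    return divmod(txid, XID_SPACE)


# The three values observed so far in this report.
for txid in (4294966303, 4294967005, 4294967713):
    epoch, xid = split_txid(txid)
    print(txid, "-> epoch", epoch, "xid", xid)
```

Under that assumption the first two values are still in epoch 0, while 4294967713 is epoch 1 with XID 417, which agrees with the 417 in the last column.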

When the checkpoint completes:
2012-11-27 20:47:32 EET 26354 LOG: checkpoint complete: wrote 779 buffers (25.4%); 0 transaction log file(s) added, 0 removed, 9 recycled; write=49.441 s, sync=0.170 s, total=49.636 s; sync files=16, longest=0.131 s, average=0.010 s

postgres@sbox /usr/local/pgsql $ psql -c "select now(),txid_current(), txid_current()-2^32"
              now              | txid_current |  ?column?
-------------------------------+--------------+-------------
 2012-11-27 20:47:42.031007+02 |          421 | -4294966875

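It seems that the epoch bump was rolled back: txid_current() dropped from 4294967713 (epoch 1, XID 417) to 421 (epoch 0, XID 421) once the checkpoint completed.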
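
The following Python sketch is a simplified model, not the server code, and the cause it illustrates is only a guess. It assumes the 64-bit value is derived from the (epoch, XID) pair saved by the last checkpoint, with the epoch bumped by one when the live 32-bit counter is numerically below the checkpointed XID. The function name txid_from_checkpoint and the checkpointed XID 418 are hypothetical; the report does not show pg_controldata output after the checkpoint.

```python
XID_SPACE = 2 ** 32


def txid_from_checkpoint(ckpt_epoch, ckpt_xid, next_xid):
    """Widen a 32-bit XID to 64 bits relative to the last checkpoint.

    If the live counter is numerically below the checkpointed XID, assume
    it has wrapped once since that checkpoint and add one to the epoch.
    """
    epoch = ckpt_epoch + 1 if next_xid < ckpt_xid else ckpt_epoch
    return epoch * XID_SPACE + next_xid


# Reference point is the shutdown checkpoint 0/4294966303; the counter
# has wrapped to 417, so the model reports epoch 1.
print(txid_from_checkpoint(0, 4294966303, 417))  # 4294967713

# Hypothetical inconsistent checkpoint: post-wrap XID stored with the
# pre-wrap epoch 0. The wrap is no longer detected.
print(txid_from_checkpoint(0, 418, 421))  # 421

# Hypothetical consistent checkpoint: same XID stored with epoch 1.
print(txid_from_checkpoint(1, 418, 421))  # 4294967717
```

In this model, the observed 421 appears exactly when the checkpoint stores an XID from after the wraparound together with an epoch from before it. Whether that is what actually happens with wal_level=hot_standby is not established by this report.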
