| From: | Daisuke Higuchi <higuchi(dot)daisuke11(at)gmail(dot)com> |
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | [PATCH] Fix sequence value may rollback after CREATE DATABASE TEMPLATE with WAL_LOG strategy |
| Date: | 2026-06-16 04:01:48 |
| Message-ID: | CAEVT6c9mOHUR9N4UPryq=j997Az00SFO3zjgD5-a4R500tDCkA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi
I found that "CREATE DATABASE ... TEMPLATE" with WAL_LOG strategy can
cause sequence values to rollback after a streaming replication failover.
When creating a new database with "CREATE DATABASE ... TEMPLATE" from
a template database that has sequences, calling nextval() on those
sequences in the new database causes the primary to have a higher
last_value than the standby temporarily.
When a failover occurs and the standby is promoted in this state, the
sequence value appears to have rollbacked on the new primary.
I found this issue in PostgreSQL version 15, but I've confirmed that it also
happens on the master branch.
I show the example as follows.
After creating a database with "CREATE DATABASE ... TEMPLATE",
the sequence initially has last_value = 5:
-- Initial state on both Primary and Standby:
=====
test_db=# SELECT last_value, log_cnt, is_called FROM test_seq;
last_value | log_cnt | is_called
------------+---------+-----------
5 | 28 | t
(1 row)
=====
When I run nextval() 5 times, the values of primary and standby are
as follows. The values in the standby are not changed.
-- Primary:
=====
test_db=# SELECT last_value, log_cnt, is_called FROM test_seq;
last_value | log_cnt | is_called
------------+---------+-----------
10 | 23 | t
(1 row)
=====
-- Standby:
=====
test_db=# SELECT last_value, log_cnt, is_called FROM test_seq;
last_value | log_cnt | is_called
------------+---------+-----------
5 | 28 | t
(1 row)
=====
When I run nextval() in this state, it returns 6 even though 6 - 10
were already returned on the previous primary. It means that the
sequence rollback after a failover.
I understand that sequences cannot be used to obtain "gapless", but I
think it is an unintended behavior that sequence values rollback.
I attached the script to reproduce this issue and the patch to fix.
In the attached patch, during "CREATE DATABASE ... TEMPLATE" with
WAL_LOG strategy, the sequence's log_cnt is reset to 0 so that
SEQ_LOG_VALS (32) values are fetched in advance on the next nextval()
call. This ensures that the standby's last_value is greater than or
equal to the primary's last_value.
And, this patch opens sequence page using readBufferWithoutRelcache()
without relying on relcache entries and resets log_cnt. This follows
the same approach as ScanSourceDatabasePGClass().
Note that this issue does not occur with FILE_COPY strategy because
this performs a checkpoint internally and a WAL record to fetch
SEQ_LOG_VALS in advance is always emitted on the next nextval() call.
However, I could not apply the same approach because I understand the
WAL_LOG strategy is designed to avoid checkpoints.
Regards,
Daisuke, Higuchi
| Attachment | Content-Type | Size |
|---|---|---|
| reproduce_script_sequence_rollback_issue.sh | application/x-sh | 1.8 KB |
| v1_0002_fix_sequence_rollback_by_WAL_LOG_test.patch | application/x-patch | 3.7 KB |
| v1_0001_fix_sequence_rollback_by_WAL_LOG.patch | application/x-patch | 3.7 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tender Wang | 2026-06-16 04:06:28 | Re: assertion failure with unique index + partitioning + join |
| Previous Message | vignesh C | 2026-06-16 03:57:19 | Re: Proposal: Conflict log history table for Logical Replication |