Re: pg_rewind WAL segments deletion pitfall

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: cyberdemn(at)gmail(dot)com
Cc: bungina(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_rewind WAL segments deletion pitfall
Date: 2022-08-26 08:04:32
Message-ID: 20220826.170432.669257260859851674.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

(Moved to -hackers)

At Thu, 25 Aug 2022 10:34:40 +0200, Alexander Kukushkin <cyberdemn(at)gmail(dot)com> wrote in
> > # killall -9 postgres
> > # rm -r oldprim newprim oldarch newarch oldprim.log newprim.log
> > mkdir newarch oldarch
> > initdb -k -D oldprim
> > echo "archive_mode = 'always'">> oldprim/postgresql.conf
> >
>
> With archive_mode = always you can't reproduce it.
> It is very rarely people set it to always in production due to the overhead.
...
> The archive_mode has to be set to on and the archive_command should be
> failing when you do pg_ctl -D oldprim stop

Ah, I see.

What I don't still understand is why pg_rewind doesn't work for the
old primary in that case. When archive_mode=on, the old primary has
the complete set of WAL files counting both pg_wal and its archive. So
as the same to the privious repro, pg_rewind -c ought to work (but it
uses its own archive this time). In that sense the proposed solution
is still not needed in this case.

A bit harder situation comes after the server successfully rewound; if
the new primary goes so far that the old primary cannot connect. Even
in that case, you can copy-in the requried WAL files or configure
restore_command of the old pimary so that it finds required WAL files
there.

As the result the system in total doesn't lose a WAL file.

So.. I might still be missing something..

###############################
# killall -9 postgres
# rm -r oldprim newprim oldarch newarch oldprim.log newprim.log
mkdir newarch oldarch
initdb -k -D oldprim
echo "archive_mode = 'on'">> oldprim/postgresql.conf
echo "archive_command = 'cp %p `pwd`/oldarch/%f'">> oldprim/postgresql.conf
pg_ctl -D oldprim -o '-p 5432' -l oldprim.log start
psql -p 5432 -c 'create table t(a int)'
pg_basebackup -D newprim -p 5432
echo "primary_conninfo='host=/tmp port=5432'">> newprim/postgresql.conf
echo "archive_command = 'cp %p `pwd`/newarch/%f'">> newprim/postgresql.conf
touch newprim/standby.signal
pg_ctl -D newprim -o '-p 5433' -l newprim.log start

# the last common checkpoint
psql -p 5432 -c 'checkpoint'

# record approx. diverging WAL segment
start_wal=`psql -p 5433 -Atc 'select pg_walfile_name(pg_last_wal_replay_lsn() - (select setting from pg_settings where name = 'wal_segment_size')::int);
`
psql -p 5432 -c 'insert into t values(0); select pg_switch_wal();'
pg_ctl -D newprim promote
psql -p 5433 -c 'checkpoint'

# old rprimary loses diverging WAL segment
for i in $(seq 1 4); do psql -p 5432 -c 'insert into t values(0); select pg_switch_wal();'; done

# old primary cannot archive any more
echo "archive_command = 'false'">> oldprim/postgresql.conf
pg_ctl -D oldprim reload
pg_ctl -D oldprim stop

# rewind the old primary, using its own archive
# pg_rewind -D oldprim --source-server='port=5433' # should fail
echo "restore_command = 'cp `pwd`/oldarch/%f %p'">> oldprim/postgresql.conf
pg_rewind -D oldprim --source-server='port=5433' -c

# advance WAL on the old primary; new primary loses the launching WAL seg
for i in $(seq 1 4); do psql -p 5433 -c 'insert into t values(0); select pg_switch_wal();'; done
psql -p 5433 -c 'checkpoint'
echo "primary_conninfo='host=/tmp port=5433'">> oldprim/postgresql.conf
touch oldprim/standby.signal

postgres -D oldprim # fails with "WAL file has been removed"

# The alternative of copying-in
# echo "restore_command = 'cp `pwd`/newarch/%f %p'">> oldprim/postgresql.conf

# copy-in WAL files from new primary's archive to old primary
(cd newarch;
for f in `ls`; do
if [[ "$f" > "$start_wal" ]]; then echo copy $f; cp $f ../oldprim/pg_wal; fi
done)

postgres -D oldprim

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Alexander Kukushkin 2022-08-26 08:57:25 Re: pg_rewind WAL segments deletion pitfall
Previous Message Maxim Boguk 2022-08-26 08:04:12 Re: BUG #17594: conditional hash indexes size (hash index ignore WHERE condition during CREATE INDEX?)

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema 2022-08-26 08:05:02 Re: [PATCH] Fix alter subscription concurrency errors
Previous Message John Naylor 2022-08-26 08:03:48 Re: [PATCH] Optimize json_lex_string by batching character copying