Re: BUG #17393: Delete database after recovery with point-in-time is still missing datafiles

From: Pierre Forstmann <pierre(dot)forstmann(at)gmail(dot)com>
To: "Lorenz, Christopher" <Christopher(dot)Lorenz(at)zit-bb(dot)brandenburg(dot)de>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17393: Delete database after recovery with point-in-time is still missing datafiles
Date: 2022-03-30 19:07:40
Message-ID: CAM-sOH8NgXknOEAEp09zyx73Ve-VgH83uxc=BeY4VrTmrSg-yA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hello,

I can reproduce your issue with your base_recovery.sh with PG 13.6 on Alma
Linux 8.5.

If I modify your script to display transaction numbers before and after
DROP DATABASE with:

echo -e $rt `date` "drop the database test ..." $wt
# Drop DB
psql -c 'select txid_current();'
psql -c 'DROP DATABASE test'
psql -c 'select txid_current();'

I get:

Wed Mar 30 20:56:27 CEST 2022 drop the database test ...
txid_current
--------------
486
(1 row)

DROP DATABASE
txid_current
--------------
488
(1 row)

and database test is successfully restored:

You are now connected to database "test" as user "postgres".

Wed Mar 30 20:56:40 CEST 2022 Check datafiles of test ... - there are
missing datafiles
total 52
drwx------. 6 postgres postgres 54 Mar 30 20:56 .
drwx------. 20 postgres postgres 4096 Mar 30 20:56 ..
drwx------. 2 postgres postgres 8192 Mar 30 20:56 1
drwx------. 2 postgres postgres 8192 Mar 30 20:56 13433
drwx------. 2 postgres postgres 8192 Mar 30 20:56 13434
drwx------. 2 postgres postgres 8192 Mar 30 20:56 16384

Wed Mar 30 20:56:40 CEST 2022 Check the logs what's happened ...
2022-03-30 20:56:14.542 CEST [15788] LOG: starting PostgreSQL 13.6 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat
8.5.0-4), 64-bit
2022-03-30 20:56:14.542 CEST [15788] LOG: listening on IPv6 address "::1",
port 5555
2022-03-30 20:56:14.542 CEST [15788] LOG: listening on IPv4 address
"127.0.0.1", port 5555
2022-03-30 20:56:14.548 CEST [15788] LOG: listening on Unix socket
"/var/lib/pgsql/pg_test/run/.s.PGSQL.5555"
2022-03-30 20:56:14.555 CEST [15790] LOG: database system was shut down at
2022-03-30 20:56:12 CEST
2022-03-30 20:56:14.561 CEST [15788] LOG: database system is ready to
accept connections
2022-03-30 20:56:33.243 CEST [15853] LOG: starting PostgreSQL 13.6 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat
8.5.0-4), 64-bit
2022-03-30 20:56:33.244 CEST [15853] LOG: listening on IPv6 address "::1",
port 5555
2022-03-30 20:56:33.244 CEST [15853] LOG: listening on IPv4 address
"127.0.0.1", port 5555
2022-03-30 20:56:33.249 CEST [15853] LOG: listening on Unix socket
"/var/lib/pgsql/pg_test/run/.s.PGSQL.5555"
2022-03-30 20:56:33.256 CEST [15855] LOG: database system was interrupted;
last known up at 2022-03-30 20:56:15 CEST
cp: cannot stat '/var/lib/pgsql/pg_test/wal_backup/00000002.history': No
such file or directory
2022-03-30 20:56:35.554 CEST [15855] LOG: starting point-in-time recovery
to 2022-03-30 20:56:22+02
2022-03-30 20:56:35.561 CEST [15855] LOG: restored log file
"000000010000000000000002" from archive
2022-03-30 20:56:35.581 CEST [15855] LOG: redo starts at 0/2000028
2022-03-30 20:56:35.584 CEST [15855] LOG: consistent recovery state
reached at 0/2000100
2022-03-30 20:56:35.585 CEST [15853] LOG: database system is ready to
accept read only connections
2022-03-30 20:56:35.592 CEST [15855] LOG: restored log file
"000000010000000000000003" from archive
2022-03-30 20:56:35.604 CEST [15855] LOG: recovery stopping before commit
of transaction 486, time 2022-03-30 20:56:27.81682+02
2022-03-30 20:56:35.604 CEST [15855] LOG: pausing at the end of recovery
2022-03-30 20:56:35.604 CEST [15855] HINT: Execute pg_wal_replay_resume()
to promote.

Wed Mar 30 20:56:40 CEST 2022 Postgresql knows the Database test but don't
have the datafiles anymore.

I cannot really explain what is really going on when the script fails.
It looks like that in some cases recovery goes too far maybe because there
are not enough transactions in the WAL files ?

Le mer. 30 mars 2022 à 13:50, Lorenz, Christopher <
Christopher(dot)Lorenz(at)zit-bb(dot)brandenburg(dot)de> a écrit :

> Hello,
>
> I created a script which do the complete procedure of create database
> backup and reproduce the bug while recovery.
> I created this used by rpm from official postgresql yum repo. I used a
> RedHat 7 OS. I also add the output of the script.
> You need to set your path to binary in the script: Line 9 pg_bin, as
> described before I also can reproduce this behavior in Versions 12 to 14
>
> The line with the wrong stop in bug report seams to be a copy&paste error
> and was done correctly.
>
> The WAL recovery is done by recovery_command which restores all previous
> saved files. In the log you can see that the WAL files are used for
> recovery:
> 2022-03-29 10:30:19.686 CEST [15169] LOG: restored log file
> "000000010000000000000003" from archive
> 2022-03-29 10:30:19.727 CEST [15169] LOG: recovery stopping before commit
> of transaction 486, time 2022-03-29 10:30:08.739965+02
> 2022-03-29 10:30:19.727 CEST [15169] LOG: pausing at the end of recovery
>
> Regards,
>
> Christopher Lorenz
>
>
> -----Ursprüngliche Nachricht-----
> Von: Julien Rouhaud <rjuju123(at)gmail(dot)com>
> Gesendet: Sonntag, 6. März 2022 08:42
> An: Lorenz, Christopher <Christopher(dot)Lorenz(at)ZIT-BB(dot)Brandenburg(dot)de>;
> pgsql-bugs(at)lists(dot)postgresql(dot)org
> Betreff: Re: BUG #17393: Delete database after recovery with point-in-time
> is still missing datafiles
>
> ...
>
> I can't reproduce the problem. Now, given what you seem to be using, this
> is
> likely an operator error:
>
> ...
>
> > pg_ctl -D $PGDATAstop
>
> this won't stop the instance. If that's what you're really doing it's
> clearly
> going to be broken.
> ...
>
> no WAL recovery?
>
> I recommend reading
> https://www.postgresql.org/docs/current/continuous-archiving.html to see
> what
> are the correct steps to restore a PITR backup, and if you still have a
> problem
> please show a script that is self contained (ie. provide all parameter
> initialization), safe, and actually works.
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Julien Rouhaud 2022-03-31 04:59:08 Re: BUG #17448: In Windows 10, version 1703 and later, huge_pages doesn't work.
Previous Message Lorenz, Christopher 2022-03-30 11:49:49 AW: BUG #17393: Delete database after recovery with point-in-time is still missing datafiles