Re: 9.3 pg_archivecleanup broken?

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: 9.3 pg_archivecleanup broken?
Date: 2012-11-19 00:17:49
Message-ID: CAHGQGwH+M630UnYOXiicXikL_gyFxv=bR=o_BSS_6xLMsv4DKw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 19, 2012 at 12:43 AM, Erik Rijkers <er(at)xs4all(dot)nl> wrote:
> (In a test setup) I can't get pg_archivecleanup to remove WALfiles in 9.3devel. (A very similar
> setup in 9.2 works fine).
>
> In 9.3 pg_archivecleanup just keeps repeating lines like:
>
> pg_archivecleanup: keep WAL file "/home/aardvark/pg_stuff/archive_dir93/000000000000000000000000"
> and later
>
> (and does not delete any files.)
>
> Configuration:
>
> # master pgsql.93_1/data/postgresql.conf:
> data_directory = '/home/aardvark/pg_stuff/pg_installations/pgsql.93_1/data'
> listen_addresses = '*'
> max_connections = 100
> shared_buffers = 128MB
> wal_level = hot_standby
> synchronous_commit = on
> checkpoint_segments = 3
> archive_mode = on
> archive_command = 'cp %p /home/aardvark/pg_stuff/archive_dir93/%f < /dev/null'
> max_wal_senders = 3
> synchronous_standby_names = '*'
>
> # slave pgsql.93_2/data/postgresql.conf:
> data_directory = '/home/aardvark/pg_stuff/pg_installations/pgsql.93_2/data'
> listen_addresses = '*'
> port = 6665
> max_connections = 100
> shared_buffers = 128MB
> wal_level = hot_standby
> synchronous_commit = on
> checkpoint_segments = 3
> max_wal_senders = 3
> synchronous_standby_names = ''
> hot_standby = on
> wal_receiver_status_interval = 59
>
> # pgsql.93_2/data/recovery.conf
> primary_conninfo = 'host=127.0.0.1 port=6664 user=aardvark password=sekr1t
> application_name=wal_receiver_01'
> standby_mode = 'on'
> restore_command = 'cp /home/aardvark/pg_stuff/archive_dir93/%f %p < /dev/null'
> archive_cleanup_command = 'pg_archivecleanup -d /home/aardvark/pg_stuff/archive_dir93 %r'
>
>
> Seeing that the same setup in 9.2 has pg_archivecleanup deleting files, it would seem that some
> bug exists but I haven't followed changes regarding WAL too closely.

Thanks for the report! I was able to reproduce this problem.

What's broken is not pg_archivecleanup itself but %r in archive_cleanup_command,
which is supposed to be replaced by the name of the file containing the last
valid restart point.
In 9.3dev, %r is always wrongly replaced by an invalid WAL filename
(i.e., 0000....0000).
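
To illustrate why an all-zero filename makes pg_archivecleanup keep every
file, here is a minimal Python sketch of the keep/remove decision. It is an
approximation, not the actual C code: it assumes that segment names are 24 hex
digits, that a segment is removed only if its name sorts before the restart
file name, and that the leading 8-digit timeline part is ignored in that
comparison. The function name files_to_remove and the sample segment names
are made up for illustration.

```python
def files_to_remove(archived, restart_file):
    """Return the archived WAL segment names that sort before restart_file.

    Assumption: names are 24 hex digits, and the first 8 digits (the
    timeline ID) are skipped when comparing.
    """
    cutoff = restart_file[8:]
    return sorted(
        name for name in archived
        if len(name) == 24 and name[8:] < cutoff
    )


# Hypothetical archive contents.
archive = [
    "000000010000000000000003",
    "000000010000000000000004",
    "000000010000000000000005",
]

# With a valid restart file, the older segments are candidates for removal.
valid = files_to_remove(archive, "000000010000000000000005")

# With the all-zero name, no segment sorts before it, so nothing is removed.
broken = files_to_remove(archive, "0" * 24)

print(valid)
print(broken)
```

Under these assumptions the all-zero name can never be newer than any real
segment, which matches the "keep WAL file ... and later" messages in the
report.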

This bug was introduced by commit d5497b95f3ca2fc50c6eef46d3394ab6e6855956.
That commit changed ExecuteRecoveryCommand() so that it calculates the last
valid restart file by using GetOldestRestartPoint(), even though
GetOldestRestartPoint() only works in the startup process and only while
WAL replay is in progress (i.e., InRedo = true).
For archive_cleanup_command, ExecuteRecoveryCommand() is executed by the
checkpointer process, which is why the problem occurs.
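
As a rough sketch of how an all-zero filename can come out of this, the
Python below builds a WAL segment name from a timeline ID and a WAL position.
It assumes the default 16MB segment size and the usual three-field
"%08X%08X%08X" naming; wal_file_name is a hypothetical helper, not a
PostgreSQL function. What GetOldestRestartPoint() actually hands back outside
the startup process is not shown here. The sketch only shows that the observed
name is what the format yields when both inputs are zero.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024                   # assumed default: 16MB
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE  # 256 with 16MB segments


def wal_file_name(timeline_id, lsn):
    """Build a 24-hex-digit WAL segment name (hypothetical helper)."""
    segno = lsn // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (
        timeline_id,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )


# A restart point on timeline 1 inside the sixth 16MB segment.
good_name = wal_file_name(1, 0x5000000)

# Timeline 0 and position 0 produce the invalid name seen in the report.
zero_name = wal_file_name(0, 0)

print(good_name)
print(zero_name)
```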

I found that recovery_end_command has the same bug, because it calls
ExecuteRecoveryCommand() after WAL replay has completed.

Regards,

--
Fujii Masao
