Re: thinko in basic_archive.c

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: thinko in basic_archive.c
Date: 2022-10-15 04:49:05
Message-ID: CALj2ACVMf0kv=MnYq3ctB-EfOuhpO+X_cN2CM2LYimUDigq79g@mail.gmail.com
Lists: pgsql-hackers

On Sat, Oct 15, 2022 at 12:03 AM Nathan Bossart
<nathandbossart(at)gmail(dot)com> wrote:
>
> On Fri, Oct 14, 2022 at 02:15:19PM +0530, Bharath Rupireddy wrote:
> > Given that temp file name includes WAL file name, epoch to
> > milliseconds scale and MyProcPid, can there be name collisions after a
> > server crash or even when multiple servers with different pids are
> > archiving/copying the same WAL file to the same directory?
>
> While unlikely, I think it's theoretically possible.

Can you please help me understand how name collisions can happen when
the temp file name includes the WAL file name, a timestamp at
millisecond scale, and the PID? Isn't the timestamp enough to keep the
temp file name unique even when PID wraparound occurs? Am I missing
something here?
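
To make the question concrete, here is a minimal standalone sketch of how
such a name could be composed. The helper names (current_epoch_ms,
make_temp_name) and the "archtemp" layout used here are assumptions for
illustration and may not match basic_archive.c exactly. The point it shows
is that two writers get the same name only if the WAL file name, the PID,
and the millisecond timestamp all coincide.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical helper: milliseconds elapsed since the Unix epoch. */
uint64_t
current_epoch_ms(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (uint64_t) tv.tv_sec * 1000 + (uint64_t) (tv.tv_usec / 1000);
}

/*
 * Hypothetical helper: build "<dir>/archtemp.<wal>.<pid>.<epoch_ms>".
 * The layout is an assumption made for this sketch.
 * Returns snprintf()'s result, so the caller can detect truncation.
 */
int
make_temp_name(char *buf, size_t len, const char *dir,
               const char *wal, int pid, uint64_t epoch_ms)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s/archtemp.%s.%d.%" PRIu64,
                    dir, wal, pid, epoch_ms);
}
```

A caller would pass getpid() and current_epoch_ms() for the last two
arguments. Two servers copying the same WAL file into the same directory
collide only if they also share a PID and hit the same millisecond, which is
unlikely but not impossible.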

> > What happens to the left-over temp files after a server crash? Will
> > they be lying around in the archive directory? I understand that we
> > can't remove such files because we can't distinguish left-over files
> > from a crash and the temp files that another server is in the process
> > of copying.
>
> The temporary files are not automatically removed after a crash. The
> documentation for basic archive has a note about this [0].

Hm, we cannot remove the temp file for all sorts of crashes, but
having an on_shmem_exit(), before_shmem_exit(), atexit(), or similar
callback remove it would at least cover the exits that go through
proc_exit() or exit(). I think the basic_archive module currently
leaves temp files around even when the server is restarted
legitimately while copying to or renaming the temp file, no?

I can quickly find these exit callbacks deleting the files:

    atexit(cleanup_directories_atexit);
    atexit(remove_temp);
    before_shmem_exit(ReplicationSlotShmemExit, 0);
    before_shmem_exit(logicalrep_worker_onexit, (Datum) 0);
    before_shmem_exit(BeforeShmemExit_Files, 0);
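
As a rough illustration of the idea, the following standalone sketch uses
plain atexit() to remove an in-progress temp file on a normal exit. All
names here (pending_temp, track_temp_file, forget_temp_file,
remove_pending_temp) are hypothetical and not taken from basic_archive.c. A
real patch would presumably register the callback through the backend's
own exit-callback machinery instead.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical state: path of the temp file currently being written. */
static char pending_temp[1024] = "";
static int  cleanup_registered = 0;

/* Exit callback: remove the temp file if one is still pending. */
void
remove_pending_temp(void)
{
    if (pending_temp[0] != '\0')
    {
        (void) unlink(pending_temp);
        pending_temp[0] = '\0';     /* makes a second call a no-op */
    }
}

/* Remember the temp file and register the callback on first use. */
int
track_temp_file(const char *path)
{
    assert(path != NULL);
    if (strlen(path) >= sizeof(pending_temp))
        return -1;
    if (!cleanup_registered)
    {
        if (atexit(remove_pending_temp) != 0)
            return -1;
        cleanup_registered = 1;
    }
    strcpy(pending_temp, path);
    return 0;
}

/* Call after the rename to the final name succeeds. */
void
forget_temp_file(void)
{
    pending_temp[0] = '\0';
}
```

The intended use is to call track_temp_file() right after creating the temp
file and forget_temp_file() once it has been renamed into place. This only
covers exits that reach exit(); it does nothing for _exit(), abort(),
SIGKILL, or a power loss, so left-over files remain possible.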

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
