Re: File descriptors inherited by restore_command

From: David Steele <david(at)pgmasters(dot)net>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: File descriptors inherited by restore_command
Date: 2019-06-21 14:09:19
Message-ID: 28fb5e2f-2ed1-ffb4-1206-0f3d2a60cecd@pgmasters.net
Lists: pgsql-hackers

On 6/21/19 9:45 AM, Tom Lane wrote:
> David Steele <david(at)pgmasters(dot)net> writes:
>> While investigating "Too many open files" errors reported in our
>> parallel restore_command I noticed that the restore_command can inherit
>> quite a lot of fds from the recovery process. This limits the number of
>> fds available in the restore_command depending on the setting of system
>> nofile and Postgres max_files_per_process.
>
> Hm. Presumably you could hit the same issue with things like COPY FROM
> PROGRAM. And the only reason the archiver doesn't hit it is it never
> opens many files to begin with.

Yes. The archiver process is fine because it has only ~8 fds open.

>> I was wondering if we should consider closing these fds before calling
>> restore_command? It seems like we could do this by forking first or by
>> setting FD_CLOEXEC using fcntl() or O_CLOEXEC on open() where available.
>
> +1 for using O_CLOEXEC on machines that have it. I don't think I want to
> jump through hoops for machines that don't have it --- POSIX has required
> it for some time, so there should be few machines in that category.

Another possible issue is that a child process that inherits all these
fds might accidentally write to them and damage the underlying files.
I know the child process can go and maliciously open and trash files if
it wants, but it doesn't seem like we should allow that to happen
unintentionally.
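
For illustration, here is a minimal sketch of the two mechanisms mentioned
above: setting FD_CLOEXEC with fcntl() on an fd that is already open, and
passing O_CLOEXEC to open() where the platform defines it. The helper names
(set_cloexec, is_cloexec, open_cloexec) are hypothetical and are not taken
from the PostgreSQL source; this only shows the system calls involved, not a
proposed patch.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: mark an already-open fd as close-on-exec so that a
 * program started via exec() (e.g. by system()) does not inherit it.
 * Returns 0 on success, -1 on error.
 */
int
set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;
    /* Preserve any other descriptor flags that are already set. */
    return (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0) ? -1 : 0;
}

/*
 * Hypothetical helper: report whether FD_CLOEXEC is set on fd.
 * Returns 1 if set, 0 if not set, -1 on error.
 */
int
is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}

/*
 * Hypothetical helper: open an existing file with close-on-exec set.
 * Uses O_CLOEXEC where it is defined, so the flag is set atomically by
 * open(); otherwise falls back to a separate fcntl() call.  This sketch
 * does not handle O_CREAT, which would need a mode argument.
 */
int
open_cloexec(const char *path, int flags)
{
#ifdef O_CLOEXEC
    return open(path, flags | O_CLOEXEC);
#else
    int fd = open(path, flags);

    if (fd >= 0 && set_cloexec(fd) < 0)
    {
        close(fd);
        return -1;
    }
    return fd;
#endif
}
```

With something like this, an fd obtained from open_cloexec() (or passed
through set_cloexec()) would be closed automatically in the child at exec
time, so it would neither count against the command's fd limit nor be
writable by it.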

Regards,
--
-David
david(at)pgmasters(dot)net
