Plugging fd leaks (was Re: Switching timeline over streaming replication)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, 'PostgreSQL-development' <pgsql-hackers(at)postgresql(dot)org>
Subject: Plugging fd leaks (was Re: Switching timeline over streaming replication)
Date: 2012-11-23 13:33:07
Message-ID: 50AF7B13.9050404@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 15.11.2012 17:16, Heikki Linnakangas wrote:
> On 15.11.2012 16:55, Tom Lane wrote:
>> Heikki Linnakangas<hlinnakangas(at)vmware(dot)com> writes:
>>> This is a fairly general issue, actually. Looking around, I can see at
>>> least two similar cases in existing code, with BasicOpenFile, where we
>>> will leak file descriptors on error:
>>
>> Um, don't we automatically clean those up during transaction abort?
>
> Not the ones allocated with PathNameOpenFile or BasicOpenFile. Files
> allocated with AllocateFile() and OpenTemporaryFile() are cleaned up at
> abort.
>
>> If we don't, we ought to think about that, not about cluttering calling
>> code with certain-to-be-inadequate cleanup in error cases.
>
> Agreed. Cleaning up at end-of-xact won't help walsender or other
> non-backend processes, though, because they don't do transactions. But a
> top-level ResourceOwner that's reset in the sigsetjmp() cleanup routine
> would work.

This is what I came up with. It adds a new function, OpenFile, that
returns a raw file descriptor like BasicOpenFile, but the file
descriptor is associated with the current subtransaction and
automatically closed at end-of-xact, in the same way that AllocateFile
and AllocateDir work. In other words, OpenFile is to open() what
AllocateFile is to fopen(). BasicOpenFile is unchanged, it returns a raw
fd and it's solely the caller's responsibility to close it, but many of
the places that used to call BasicOpenFile now use the safer OpenFile
function instead.

This patch plugs three existing fd (or virtual fd) leaks:

1. copy_file() - fixed by by using OpenFile instead of BasicOpenFile
2. XLogFileLinit() - fixed by adding close() calls to the error cases.
Can't use OpenFile here because the fd is supposed to persist over
transaction boundaries.
3. lo_import/lo_export - fixed by using OpenFile instead of
PathNameOpenFile.

In addition, this replaces many BasicOpenFile() calls with OpenFile()
that were not leaking, because the code meticulously closed the file on
error. That wasn't strictly necessary, but IMHO it's good for robustness.

One thing I'm not too fond of is the naming. I'm calling the new
functions OpenFile and CloseFile. There's some danger of confusion
there, as the function to close a virtual file opened with
PathNameOpenFile is called FileClose. OpenFile is really the same kind
of operation as AllocateFile and AllocateDir, but returns an unbuffered
fd. So it would be nice if it was called AllocateSomething, too. But
AllocateFile is already taken. And I don't much like the Allocate*
naming for these anyway, you really would expect the name to contain "open".

Do we want to backpatch this? We've had zero complaints, but this seems
fairly safe to backpatch, and at least the leak in copy_file() can be
quite annoying. If you run out of disk space in CREATE DATABASE, the
target file is kept open even though it's deleted, so the space isn't
reclaimed until you disconnect.

- Heikki

Attachment Content-Type Size
fd-automatic-cleanup-1.patch text/x-diff 26.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Meskes 2012-11-23 13:42:48 Re: fix ecpg core dump when there's a very long struct variable name in .pgc file
Previous Message Amit Kapila 2012-11-23 13:14:19 Re: [WIP PATCH] for Performance Improvement in Buffer Management