Re: BUG #17828: postgres_fdw leaks file descriptors on error and aborts aborted transaction in lack of fds

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17828: postgres_fdw leaks file descriptors on error and aborts aborted transaction in lack of fds
Date: 2024-02-08 12:06:38
Message-ID: CAPmGK15DF6EE7O6hTLbe5-fHvPDwEx9vm-BOCN3dsKOjZCo7bw@mail.gmail.com
Lists: pgsql-bugs

On Mon, Nov 27, 2023 at 12:05 PM Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> wrote:
> On Fri, Nov 24, 2023 at 1:00 PM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
> > Now that the leakage eliminated by 50c67c201/481d7d1c0 we still can observe
> > the assert-triggering half of the bug with something like that:
>
> Will look into this.

I finally had time to look into this.

IIUC, the assertion failure was caused by an
error-during-error-recovery loop triggered by the "epoll_create1
failed: Too many open files" error raised in WaitLatchOrSocket, which
is called from pgfdw_get_cleanup_result, which in turn is called
during abort cleanup. I think a simple fix to avoid such a loop is to
modify the PG_CATCH block in pgfdw_get_cleanup_result so that it
ignores the caught error instead of re-throwing it, and restores
InterruptHoldoffCount and the memory context, as in the attached
patch. In the patch I also modified the callers of
pgfdw_get_cleanup_result to issue a warning when ignoring the error.
I might be missing something, though.
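
To illustrate the idea outside the backend, here is a minimal standalone
sketch. It is not the attached patch and does not use the real
PG_TRY/PG_CATCH machinery: setjmp/longjmp stand in for it, and every
name below (interrupt_holdoff_count, current_context, exception_stack,
get_cleanup_result_sketch, abort_cleanup_sketch, the wait_* helpers) is
a hypothetical stand-in chosen for this example.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for backend state; not PostgreSQL's definitions. */
enum context_id { CALLER_CONTEXT, ERROR_CONTEXT };

static int interrupt_holdoff_count = 0;
static enum context_id current_context = CALLER_CONTEXT;
static jmp_buf *exception_stack = NULL;

/* Stand-in for a wait that raises an error, e.g. when fds run out. */
static void
wait_that_fails(void)
{
    interrupt_holdoff_count++;          /* state disturbed before the error */
    current_context = ERROR_CONTEXT;
    longjmp(*exception_stack, 1);       /* mimics raising an error */
}

/* Stand-in for a wait that completes normally. */
static void
wait_that_succeeds(void)
{
}

/*
 * Returns true on failure.  An error raised by wait_fn is swallowed rather
 * than re-thrown; the saved holdoff count and context are restored instead.
 */
static bool
get_cleanup_result_sketch(void (*wait_fn) (void))
{
    volatile bool failed = false;
    int         saved_holdoff = interrupt_holdoff_count;
    enum context_id saved_context = current_context;
    jmp_buf    *saved_stack = exception_stack;
    jmp_buf     local_buf;

    if (setjmp(local_buf) == 0)
    {
        /* "try" part: errors raised from here jump to local_buf */
        exception_stack = &local_buf;
        wait_fn();
    }
    else
    {
        /* "catch" part: do not re-throw, just put saved state back */
        interrupt_holdoff_count = saved_holdoff;
        current_context = saved_context;
        failed = true;
    }

    exception_stack = saved_stack;
    return failed;
}

/* Caller during abort cleanup: warn about the ignored error and carry on. */
static bool
abort_cleanup_sketch(void (*wait_fn) (void))
{
    if (get_cleanup_result_sketch(wait_fn))
    {
        fprintf(stderr, "WARNING: error ignored during abort cleanup\n");
        return false;           /* cleanup of this connection failed */
    }
    return true;
}
```

Calling abort_cleanup_sketch(wait_that_fails) prints the warning and
returns false with the holdoff count and context back at their saved
values, so the error never propagates out of the cleanup path.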

Best regards,
Etsuro Fujita

Attachment: Avoid-error-during-error-recovery-loops-1.patch (application/octet-stream, 3.7 KB)
