Re: pgsql: Add kqueue(2) support to the WaitEventSet API.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date: 2020-02-23 21:49:12
Message-ID: 6230.1582494552@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

I wrote:
>> Clearly, we gotta do something about that too. Maybe bumping up
>> NUM_RESERVED_FDS would be a good idea, but I feel like more-honest
>> accounting for permanently-eaten FDs would be a better idea. And
>> in any case we can't suppose that there's a fixed upper limit on
>> the number of postgres_fdw connections. I'm liking the idea I floated
>> earlier of letting postgres_fdw forcibly close the oldest LRU entry.

> A late-night glimmer of an idea: suppose we make fd.c keep count,
> not just of the open FDs under its control, but also of open FDs
> not under its control.

Here's a draft patch that does it like that. There are undoubtedly
more places that need to be taught to report their FD consumption;
one obvious candidate that I didn't touch is dblink. But this is
enough to account for all the long-lived "extra" FDs that are currently
open in a default build, and it passes check-world even with ulimit -n
set to the minimum that set_max_safe_fds will allow.

One thing I noticed is that if you open enough postgres_fdw connections
to cause a failure, that tends to come out as an unintelligible-to-
the-layman "epoll_create1 failed: Too many open files" error. That's
because after postgres_fdw has consumed the last available "external
FD" slot, it tries to use CreateWaitEventSet to wait for input from
the remote server, and now that needs to get another external FD.
I left that alone for the moment, because if you do rejigger the
WaitEventSet code to avoid dynamically creating/destroying epoll sockets,
it will stop being a problem. If that doesn't happen, another
possibility is to reclassify CreateWaitEventSet as a caller that should
ignore "failure" from ReserveExternalFD --- but that would open us up
to problems if a lot of WaitEventSets get created, so it's not a great
answer. It'd be okay perhaps if we added a distinction between
long-lived and short-lived WaitEventSets (ignoring "failure" only for
the latter). But I didn't want to go there in this patch.

Thoughts?

regards, tom lane

Attachment Content-Type Size
account-honestly-for-external-FD-usage-1.patch text/x-diff 21.4 KB

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2020-02-23 23:24:07 Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Previous Message Peter Eisentraut 2020-02-22 12:29:57 pgsql: pg_resetwal: Rename function to avoid potential conflict

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-02-23 22:00:50 Re: Ought binary literals to allow spaces?
Previous Message Thomas Munro 2020-02-23 20:58:18 Re: Yet another fast GiST build