Re: pgsql: Add kqueue(2) support to the WaitEventSet API.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Rémi Zara <remi_zara(at)mac(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Add kqueue(2) support to the WaitEventSet API.
Date: 2020-02-20 22:05:14
Message-ID: 26174.1582236314@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> One thing I've been planning to do for 13 is to get rid of all the
> temporary create/destroy WaitEventSets from the main backend loops.
> My goal was cutting down on stupid system calls, but this puts a new
> spin on it. I have a patch set to do a bunch of that[1], but now I'm
> thinking that perhaps I need to be even more aggressive about it and
> set up the 'common' long lived WES up front at backend startup, rather
> than doing it on demand, so that there is no chance of failure due to
> lack of fds once you've started up.

+1

> That wouldn't mean that the postgres_fdw.sql can't fail on a ulimit -n
> = 128 system, though, it might just mean that it's postgres_fdw's
> socket() call that hits EMFILE rather than WES's kqueue() call while
> running that test.

Good point.

> I suppose there are two kinds of system: those
> where ulimit -n is higher than max_files_per_process (defaults, on
> Linux: 1024 vs 1000) so you have more allowance for sockets and the
> like, and those where it isn't, like coypu, where NUM_RESERVED_FDS is
> the only thing ensuring we have some spare fds. I don't know the
> history but it looks like NUM_RESERVED_FDS was deliberately or
> accidentally tuned to be just enough to be able to run the regression
> tests (the interesting ones being the ones that use sockets, like
> postgres_fdw.sql), but with a new long lived kqueue() fd in the
> picture, it might have to be increased to cover it, no?

No. NUM_RESERVED_FDS was set decades ago, long before any of those tests
existed, and it has never been changed AFAIK. It is a bit striking that
we just started seeing it be insufficient with this patch. Maybe that's
just happenstance, but I wonder whether there is a plain old FD leak
involved in addition to the design issue? I'll take a closer look at
exactly what's open when we hit the error.

The point about possibly hitting EMFILE in libpq's socket() call is
an interesting one. libpq of course can't do anything to recover
from that (and even if it could, there are lower levels such as a
possible DNS lookup that we're not going to be able to modify).
I'm speculating about having postgres_fdw ask fd.c to forcibly
free one LRU file before it attempts to open a new libpq connection.
That would prevent EMFILE (process-level exhaustion) and it would
also provide some small protection against ENFILE (system-wide
exhaustion), though of course there's no guarantee that someone
else doesn't snap up the FD you so graciously relinquished.
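
As a rough illustration of that idea (a standalone sketch, not PostgreSQL
code), the C fragment below lowers the soft RLIMIT_NOFILE, fills the
descriptor table, shows socket() failing with EMFILE, and then retries after
closing one pooled descriptor. The pool, release_lru_fd(),
socket_after_release(), and the limit of 32 are all hypothetical stand-ins
invented for this example; fd.c's real interface may look quite different.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <unistd.h>

#define DEMO_FD_LIMIT 32	/* assumed small limit, like a low "ulimit -n" */
#define MAX_HELD 64			/* larger than the limit, so EMFILE comes first */

/* Hypothetical stand-in for an fd pool; index 0 is least recently used. */
static int	held[MAX_HELD];
static int	nheld = 0;

/* Open /dev/null repeatedly until open() fails or the pool array is full. */
static void
exhaust_fds(void)
{
	while (nheld < MAX_HELD)
	{
		int			fd = open("/dev/null", O_RDONLY);

		if (fd < 0)
			break;
		held[nheld++] = fd;
	}
}

/* Close the least recently used pooled fd; returns 1 if one was freed. */
static int
release_lru_fd(void)
{
	if (nheld == 0)
		return 0;
	close(held[0]);
	nheld--;
	memmove(held, held + 1, nheld * sizeof(int));
	return 1;
}

struct demo_result
{
	int			first_errno;	/* errno from socket() with the table full */
	int			second_ok;		/* 1 if socket() worked after freeing one fd */
};

struct demo_result
socket_after_release(void)
{
	struct demo_result res = {0, 0};
	struct rlimit saved;
	struct rlimit lowered;
	int			sock;

	/* Lower only the soft limit, so it can be restored afterwards. */
	getrlimit(RLIMIT_NOFILE, &saved);
	lowered = saved;
	if (lowered.rlim_cur > DEMO_FD_LIMIT)
		lowered.rlim_cur = DEMO_FD_LIMIT;
	setrlimit(RLIMIT_NOFILE, &lowered);

	exhaust_fds();

	/* With no free descriptor left, this attempt should fail. */
	sock = socket(AF_UNIX, SOCK_STREAM, 0);
	if (sock < 0)
		res.first_errno = errno;
	else
		close(sock);

	/* Give one descriptor back, then try again. */
	if (release_lru_fd())
	{
		sock = socket(AF_UNIX, SOCK_STREAM, 0);
		if (sock >= 0)
		{
			res.second_ok = 1;
			close(sock);
		}
	}

	/* Clean up: close the rest of the pool and restore the old limit. */
	while (release_lru_fd())
		;
	setrlimit(RLIMIT_NOFILE, &saved);
	return res;
}
```

A caller would inspect first_errno and second_ok in the returned struct.
Because this demo is single-threaded, nothing else can grab the freed
descriptor between close() and socket(); that is exactly the guarantee a
real system under ENFILE pressure would not have.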

regards, tom lane
