Re: WaitLatchOrSocket optimization

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WaitLatchOrSocket optimization
Date: 2018-03-16 08:03:03
Message-ID: c1517e80-bb8f-4d86-cc71-35984a1593f4@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 15.03.2018 20:25, Andres Freund wrote:
> Hi,
>
> On 2018-03-15 19:01:40 +0300, Konstantin Knizhnik wrote:
>> Right now function WaitLatchOrSocket is implemented in very inefficient way:
>> for each invocation it creates epoll instance, registers events and then
>> closes this instance.
> Right, everything performance critical should be migrated to using a
> persistent wait even set.
>
>
>> But there are still lot of places in Postgres where WaitLatchOrSocket or
>> WaitLatch are used.
> Most don't matter performancewise though.
Yes, I didn't see significant difference in performance on workloads not
including postgres_fdw.
But greeping for all occurrences of WaitLatch I found some places which
are used to be called quite frequently, for example
waiting for latch in ProcSleep , waiting for more data in shared memory
message queue, waiting for parallel workers to attach,
waiting for synchronous replica,...
There are about hundred of usages of WaitLatch and I am not sure that
all of them do not have impact on performance.

>> There are two possible ways of fixing this issue:
>> 1. Patch postgres_fdw to store WaitEventSet in connection.
>> 2. Patch WaitLatchOrSocket to cache created wait event sets.
> I'm strongly opposed to 2). The lifetime of sockets will make this an
> absolute mess, it'll potentially explode the number of open file
> handles, and some OSs don't handle a large number of sets at the same
> time.
I understand your concern.
But you are not completely right here.
First of all life time of socket doesn't actually matter: if socket is
closed, then it is automatically removed from event set.
Cite from epoll manual:

      Q6  Will closing a file descriptor cause it to be removed from
all epoll sets automatically?

       A6  Yes,  but  be  aware  of the following point.  A file
descriptor is a reference to an open file description (see open(2)). 
Whenever a file
           descriptor is duplicated via dup(2), dup2(2), fcntl(2)
F_DUPFD, or fork(2), a new file descriptor referring to the same open
file  descrip-
           tion is created.  An open file description continues to
exist until all file descriptors referring to it have been closed. A
file descrip-
           tor is removed from an epoll set only after all the file
descriptors referring to the underlying open file description have been
closed (or
           before  if  the file descriptor is explicitly removed using
epoll_ctl(2) EPOLL_CTL_DEL).  This means that even after a file
descriptor that
           is part of an epoll set has been closed, events may be
reported for that file descriptor if other file descriptors referring 
to  the  same
           underlying file description remain open.

So, only file descriptors created by epoll_create affect number of
descriptors opened by a process.
But this number is limited by size of the cache. Right now I hardcoded
cache size 113, but in most cases even much smaller cache will be enough.
I do not see that extra let's say 13 open file descriptors can cause
open file descriptor limit exhaustion.

From my point of view we should either rewrite WaitLatchOrSocket to not
use epoll at all (use poll instead), either cache created event set
descriptors.
Result of pgbench with poll instead epoll is 140k TPS, which is
certainly much better than 38k TPS with current master, but much worser
than with cached epoll - 225k TPS.

> Greetings,
>
> Andres Freund

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2018-03-16 08:16:54 Re: pg_get_functiondef forgets about most GUC_LIST_INPUT GUCs
Previous Message Pavel Stehule 2018-03-16 07:43:06 Re: missing support of named convention for procedures