Re: PATCH: Keep one postmaster monitoring pipe per process

From: Marco Pfatschbacher <Marco_Pfatschbacher(at)genua(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PATCH: Keep one postmaster monitoring pipe per process
Date: 2016-09-16 07:44:13
Message-ID: 20160916074413.GA15576@genua.de
Lists: pgsql-hackers

On Thu, Sep 15, 2016 at 04:24:07PM -0400, Tom Lane wrote:
> Marco Pfatschbacher <Marco_Pfatschbacher(at)genua(dot)de> writes:
> > the current implementation of PostmasterIsAlive() uses a pipe to
> > monitor the existence of the postmaster process.
> > One end of the pipe is held open in the postmaster, while the other end is
> > inherited to all the auxiliary and background processes when they fork.
> > This leads to multiple processes calling select(2), poll(2) and read(2)
> > on the same end of the pipe.
> > While this is technically perfectly ok, it has the unfortunate side
> > effect that it triggers an inefficient behaviour[0] in the select/poll
> > implementation on some operating systems[1]:
> > The kernel can only keep track of one pid per select address and
> > thus has no other choice than to wakeup(9) every process that
> > is waiting on select/poll.
>
> That seems like a rather bad kernel bug.

It's a limitation that has been there since at least BSD4.4.
But yeah, I wish things were better.
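
For readers unfamiliar with the mechanism, here is a minimal sketch of the pipe-based liveness check described in the quoted text. It is an illustration under stated assumptions, not the actual PostgreSQL code: parent_is_alive() and set_nonblocking() are hypothetical names, and the sketch assumes the parent keeps the write end open without ever writing to it, while each child holds the read end in non-blocking mode.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: put the monitoring descriptor in non-blocking mode. */
static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical stand-in for a PostmasterIsAlive()-style check.
 *
 * Assumption: the parent holds the write end of the pipe and never writes
 * to it. While any write end is open, a non-blocking read() on the read
 * end fails with EAGAIN. Once the last write end is closed (for example
 * because the parent exited), read() returns 0 (EOF).
 */
static bool
parent_is_alive(int read_fd)
{
    char    c;
    ssize_t rc;

    assert(read_fd >= 0);

    /* Retry if a signal interrupts the read. */
    do
    {
        rc = read(read_fd, &c, 1);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return true;        /* no data, but a writer still exists */

    /* EOF, unexpected data, or a real error: report the parent as gone. */
    return false;
}
```

When every child inherits the same read end, every child's select(2)/poll(2) ends up waiting on the same kernel object, which is where the contention described above comes from.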

> > In our case the system had to wakeup ~3000 idle ssh processes
> > every time postgresql did call PostmasterIsAlive.
>
> Uh, these are processes not even connected to the pipe in question?
> That's *really* a kernel bug.

The kernel does not know whether they were selecting on that pipe,
because the only slot for keeping that mapping has already been taken.
It's not that bad of a performance hit if the system doesn't have
many processes.

> > Attached patch avoids the select contention by using a
> > separate pipe for each auxiliary and background process.
>
> I think this would likely move the performance problems somewhere else.
> In particular, it would mean that every postmaster child would inherit
> pipes leading to all the older children.

I kept them to a minimum (see ClosePostmasterPorts).

> We could close 'em again
> I guess, but we have previously found that having to do things that
> way is a rather serious performance drag --- see the problems we had

I think closing a few files doesn't hurt too much, but I see your point.

> with POSIX named semaphores, here for instance:
> https://www.postgresql.org/message-id/flat/3830CBEB-F8CE-4EBC-BE16-A415E78A4CBC%40apple.com
> I really don't want the postmaster to be holding any per-child kernel
> resources.
>
> It'd certainly be nice if we could find another solution besides
> the pipe-based one, but I don't think "more pipes" is the answer.
> There was some discussion of using Linux's prctl(PR_SET_PDEATHSIG)
> when available; do the BSDen have anything like that?

Not that I know of.
But since the WalReceiver process seemed to be the one calling
PostmasterIsAlive way more often than the rest, maybe we could limit
the performance hit by not calling it on every received WAL chunk?
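
To make that suggestion concrete, here is a rough sketch of one way to throttle the check so that the real test only runs once every N calls. This is not taken from the patch or from the WalReceiver code; ThrottledCheck, throttled_check_init(), throttled_check() and counting_check() are all hypothetical names, and the interval is an arbitrary choice.

```c
#include <stdbool.h>

/* Signature of the real (expensive) liveness check. */
typedef bool (*liveness_check_fn) (void *arg);

/* Hypothetical wrapper state: run the real check once per `interval` calls. */
typedef struct ThrottledCheck
{
    liveness_check_fn check;    /* the real check */
    void       *arg;            /* opaque argument passed to the check */
    unsigned    interval;       /* calls between two real checks */
    unsigned    calls_since_check;
    bool        last_result;    /* cached result of the last real check */
} ThrottledCheck;

static void
throttled_check_init(ThrottledCheck *t, liveness_check_fn check,
                     void *arg, unsigned interval)
{
    t->check = check;
    t->arg = arg;
    t->interval = interval > 0 ? interval : 1;
    /* Force a real check on the very first call. */
    t->calls_since_check = t->interval;
    t->last_result = true;
}

/* Returns the cached result unless `interval` calls have passed. */
static bool
throttled_check(ThrottledCheck *t)
{
    if (t->calls_since_check >= t->interval)
    {
        t->last_result = t->check(t->arg);
        t->calls_since_check = 0;
    }
    t->calls_since_check++;
    return t->last_result;
}

/*
 * Stand-in for the real check so the sketch compiles by itself: it only
 * counts how often it was invoked and always reports "alive".
 */
static bool
counting_check(void *arg)
{
    ++*(unsigned *) arg;
    return true;
}
```

The trade-off is that the death of the parent can go unnoticed for up to `interval` chunks; a time-based variant (check at most once per some number of milliseconds) would bound that delay differently.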

Cheers,
Marco
