Re: Our poll() based WaitLatch implementation is broken

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Our poll() based WaitLatch implementation is broken
Date: 2012-01-15 20:23:07
Message-ID: 4F1335AB.3020808@enterprisedb.com
Lists: pgsql-hackers

On 15.01.2012 09:26, Peter Geoghegan wrote:
> Build Postgres master, on Linux or another platform that will use the
> poll() implementation rather than the older select(). Send the
> Postmaster SIGKILL. Observe that the WAL Writer lives on, representing
> a denial of service as it stays attached to shared memory, busy
> waiting (evident from the fact that it quickly leaks memory).

The poll()-based implementation checked for POLLIN on the
postmaster-alive-pipe, just like we check for the fd to become readable
in the select() implementation. But when the other end of the pipe is
closed, poll() reports that with a separate POLLHUP event; it sets
POLLHUP on the fd, not POLLIN.

Fixed, to check for POLLHUP. I still kept the check for POLLIN; the pipe
should never become readable, so if it does, something is badly wrong. I
also threw in a check for POLLNVAL, which means "Invalid request: fd not
open". That should definitely not happen, but if it does, it seems good
to treat it as postmaster death too. Even if the postmaster isn't dead
yet, we could no longer detect when it does die.
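
As a rough illustration of the check described above (a minimal sketch,
not the actual latch code): the helper name pipe_writer_is_gone is
hypothetical, and it polls a single fd with a zero timeout instead of
being part of a larger wait loop.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.  Returns true if the read
 * end of a pipe reports any event that should be treated as "the process
 * holding the write end is gone".
 */
static bool
pipe_writer_is_gone(int read_fd)
{
    struct pollfd pfd;
    int         rc;

    assert(read_fd >= 0);

    pfd.fd = read_fd;
    pfd.events = POLLIN;    /* POLLHUP and POLLNVAL are reported even if
                             * not requested */
    pfd.revents = 0;

    rc = poll(&pfd, 1, 0);  /* timeout 0: check once, never block */
    if (rc <= 0)
        return false;       /* no events reported, or poll() failed */

    /*
     * POLLHUP:  write end closed (the expected signal).
     * POLLIN:   pipe became readable, which should never happen here.
     * POLLNVAL: fd is not open, so death could no longer be detected.
     */
    return (pfd.revents & (POLLHUP | POLLIN | POLLNVAL)) != 0;
}
```

With a pipe created by pipe(), this returns false while the write end is
open and true once the write end has been closed.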

> The rationale for introducing the poll()-based implementation where
> available was that it performed better than a select()-based one. I
> wonder, how compelling a win is that expected to be?

Ganesh Venkitachalam did some micro-benchmarking after the latch patch
was committed:
http://archives.postgresql.org/pgsql-hackers/2010-09/msg01609.php. I
don't think it makes any meaningful difference in real applications, but
poll() also doesn't have an arbitrary limit on the range of fd numbers
that can be used.
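
To make the fd-range point concrete: an fd_set can only hold descriptors
below FD_SETSIZE, so a select()-based caller has to guard against larger
fd numbers, while struct pollfd carries the fd as a plain int. A small
sketch, where add_fd_for_select is a hypothetical helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/select.h>

/*
 * Hypothetical helper, for illustration only.  Adds fd to the set if it
 * is within the range select() can handle; returns false otherwise.
 * Calling FD_SET with fd >= FD_SETSIZE is undefined behavior.
 */
static bool
add_fd_for_select(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return false;

    FD_SET(fd, set);
    assert(FD_ISSET(fd, set));
    return true;
}
```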

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
