Re: PATCH: pgbench - option to build using ppoll() for larger connection counts

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, "Rady, Doug" <radydoug(at)amazon(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: PATCH: pgbench - option to build using ppoll() for larger connection counts
Date: 2018-09-23 00:11:34
Message-ID: 18803.1537661494@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> I'm strongly tempted to just remove the POLL_UNWANTED business
> altogether, as it seems both pointless and unportable on its face.
> Almost by definition, we can't know what "other" bits a given
> implementation might set.
> I'm not entirely following the point of including POLLRDHUP in
> POLL_EVENTS, either. What's wrong with the traditional solution
> of detecting EOF?

So after studying that a bit longer, I think it's just wrong.
It's not the business of this code to be checking for connection
errors at all; that is libpq's province. The libpq API specifies
that callers should wait for read-ready on the socket, and nothing
else. So the only bit we need concern ourselves with is POLLIN.

I also seriously disliked both the details of the abstraction API
and its lack of documentation. (Other people complained about that
upthread, too.) So attached is a rewrite attempt. There's still a
couple of grotty things about it; in particular the ppoll variant of
socket_has_input() knows more than one could wish about how it's being
used. But I couldn't see a way to make it cleaner without significant
changes to the logic in threadRun, and that didn't seem better.

I think that Andres' concern upthread about iterating over a whole
lot of sockets is somewhat misplaced. We aren't going to be iterating
over the entire set of client connections, only those being run by a
particular pgbench thread. So assuming you're using a reasonable ratio
of threads to clients, there won't be very many to look at in any one
thread. In any case, I'm dubious that we could get much of a win from
some other abstraction for waiting: both of these code paths do work
pretty much proportional to the number of connections the current
thread is responsible for, and it's hard to see how to avoid that.

I've tested this on both Linux and FreeBSD, and it seems to work fine.

I'm reasonably happy with this version of the patch, and would be
ready to commit it, but I thought I'd throw it out for another round
of review if anyone wants to.

regards, tom lane

Attachment Content-Type Size
pgbench-ppoll-17.patch text/x-diff 14.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-09-23 00:17:21 Re: testing pg_dump against very old versions
Previous Message Chapman Flack 2018-09-22 21:56:38 Re: vary read_only in SPI calls? or poke at the on-entry snapshot?