Re: pgbench bug / limitation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: "Jawarilal, Manish" <Manish(dot)Jawarilal(at)dell(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: pgbench bug / limitation
Date: 2020-06-02 03:19:10
Message-ID: 518253.1591067950@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> Looks like I didn't dig deep enough. The struct for the fs_set is defined as:
> typedef struct fd_set {
> u_int fd_count; /* how many are SET? */
> SOCKET fd_array[FD_SETSIZE]; /* an array of SOCKETs */
> } fd_set;

Oh, how interesting. So that's a completely different implementation
than what Unix systems use. On my Linux box, for example, <sys/select.h>
has (slightly edited for clarity):

#define __NFDBITS (8 * (int) sizeof (__fd_mask))
/* fd_set for select and pselect. */
typedef struct
{
__fd_mask __fds_bits[__FD_SETSIZE / __NFDBITS];
} fd_set;

so that FD_SETSIZE is constraining the largest FD *value* that can be
represented in the bit-array. Windows' implementation constrains the
number of FDs, but not their numerical values.

It looks like we have to have a different test for fd_set overrun
on Windows than on other platforms.

> The implementation of FD_SET that I see in WinSock2.h is:

> #define FD_SET(fd, set) do { \
> u_int __i; \
> for (__i = 0; __i < ((fd_set FAR *)(set))->fd_count; __i++) { \
> if (((fd_set FAR *)(set))->fd_array[__i] == (fd)) { \
> break; \
> } \
> } \
> if (__i == ((fd_set FAR *)(set))->fd_count) { \
> if (((fd_set FAR *)(set))->fd_count < FD_SETSIZE) { \
> ((fd_set FAR *)(set))->fd_array[__i] = (fd); \
> ((fd_set FAR *)(set))->fd_count++; \
> } \
> } \
> } while(0, 0)

> Which is not so great.

That's putting it mildly :-(. This is O(N^2) in the number of values
that get put into the set --- and since add_socket_to_set is part of
the event loop in pgbench, that means the performance problem is
going to be easily visible with lots of sockets.

I had been wondering earlier whether we should add another,
Windows-specific implementation of add_socket_to_set and friends.
It's starting to look like we should, because there's nothing that's
failing to suck about how Microsoft has done this. Non standards
compliant *and* abysmally performant is a bad combination.

BTW, I see that POSIX says (in <sys/select.h>)

FD_SETSIZE
Maximum number of file descriptors in an fd_set structure.

but the select(2) page says

The behavior of these macros is undefined if the fd argument is less
than 0 or greater than or equal to FD_SETSIZE, or if fd is not a valid
file descriptor, or if any of the arguments are expressions with
side-effects.

So a language-lawyer could make a case that Microsoft's implementation
is within the letter of the spec. Still, we have to keep the range
check as it stands for everyone else, so that doesn't help us.

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-06-02 06:20:56 Re: BUG #16473: Marked as broken because of SQLSTATE(08006),ErrorCode(0)
Previous Message David Rowley 2020-06-02 02:47:48 Re: pgbench bug / limitation