Re: pgbench stopped supporting large number of client connections on Windows

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgbench stopped supporting large number of client connections on Windows
Date: 2021-02-01 13:18:08
Message-ID: CAD21AoDiuC2AB2GoqqMoNedWxcQ7Fv-97A31e=xh_14HzmBmdw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Thu, Jan 7, 2021 at 10:48 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> Hi Marina,
>
> On Sun, Nov 8, 2020 at 11:59 PM Marina Polyakova
> <m(dot)polyakova(at)postgrespro(dot)ru> wrote:
> >
> > On 2020-11-07 01:01, Fabien COELHO wrote:
> > > Hello Marina,
> >
> > Hello, Fabien!
> >
> > Thank you for your comments!
> >
> > >> While trying to test a patch that adds a synchronization barrier in
> > >> pgbench [1] on Windows,
> > >
> > > Thanks for trying that, I do not have a windows setup for testing, and
> > > the sync code I wrote for Windows is basically blind coding:-(
> >
> > FYI:
> >
> > 1) It looks like pgbench will no longer support Windows XP due to the
> > function DeleteSynchronizationBarrier. From
> > https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-deletesynchronizationbarrier
> > :
> >
> > Minimum supported client: Windows 8 [desktop apps only]
> > Minimum supported server: Windows Server 2012 [desktop apps only]
> >
> > On Windows Server 2008 R2 (MSVC 2013) the 6-th version of the patch [1]
> > has compiled without (new) warnings, but when running pgbench I got the
> > following error:
> >
> > The procedure entry point DeleteSynchronizationBarrier could not be
> > located in the dynamic link library KERNEL32.dll.
> >
> > 2) On Windows Server 2019 (MSVC 2019) the 6-th version of the patch [1]
> > with fix_max_client_conn_on_Windows.patch has compiled without (new)
> > warnings. I made a few runs (-M prepared -c 100 -j 10 -T 10 -P1 -S) with
> > and without your patches. On Linux (-M prepared -c 1000 -j 500 -T 10 -P1
> > -S) your patches fix problems with progress reports as in [2], but on
> > Windows I did not notice such changes, see attached
> > pgbench_runs_linux_vs_windows.zip.
> >
> > >> The almost same thing happens with reindexdb and vacuumdb (build on
> > >> commit [3]):
> > >
> > > Windows fd implementation is somehow buggy because it does not return
> > > the smallest number available, and then with the assumption that
> > > select uses a dense array indexed with them (true on linux, less so on
> > > Windows which probably uses a sparse array), so that the number gets
> > > over the limit, even if less are actually used, hence the catch, as
> > > you noted.
> >
> > I agree with you. It looks like the structure fd_set just contains used
> > sockets by this application on Windows, and the macro FD_SETSIZE is used
> > only here.
> >
> > From
> > https://docs.microsoft.com/en-us/windows/win32/api/winsock/ns-winsock-fd_set
> > :
> >
> > typedef struct fd_set {
> > u_int fd_count;
> > SOCKET fd_array[FD_SETSIZE];
> > } fd_set, FD_SET, *PFD_SET, *LPFD_SET;
> >
> > From
> > https://docs.microsoft.com/en-us/windows/win32/winsock/maximum-number-of-sockets-supported-2
> > :
> >
> > The maximum number of sockets that a Windows Sockets application can use
> > is not affected by the manifest constant FD_SETSIZE. This value defined
> > in the Winsock2.h header file is used in constructing the FD_SET
> > structures used with select function.
> >
> > >> IIUC the checks below are not correct on Windows, since on this system
> > >> sockets can have values equal to or greater than FD_SETSIZE (see
> > >> Windows documentation [4] and pgbench debug output in attached
> > >> pgbench_debug.txt).
> > >
> > > Okay.
> > >
> > > But then, how may one detect that there are too many fds in the set?
> > >
> > > I think that an earlier version of the code needed to make assumptions
> > > about the internal implementation of windows (there is a counter
> > > somewhere in windows fd_set struct), which was rejected because if was
> > > breaking the interface. Now your patch is basically resurrecting that.
> >
> > I tried to keep the behaviour "we check if the socket value can be used
> > in select() at runtime", but now I will also read that thread...
> >
> > > Why not if there is no other solution, but this is quite depressing,
> > > and because it breaks the interface it would be broken if windows
> > > changed its internals for some reason:-(
> >
> > It looks like if the internals of the structure fd_set are changed, we
> > will also have problems with the function pgwin32_select from
> > src/backend/port/win32/socket.c, because it uses fd_set.fd_count too?..
> >
> > (I'm writing responses to the rest of your comments but it takes
> > time...)
> >
>
> This patch on Commitfest has been "Waiting on Author" for almost 2
> months. Could you share the current status? Are you updating the
> patch?

Status update for a commitfest entry.

Since this is a bug fix, I've moved this patch to the next commitfest.
I think if this patch is still inactive until the feature freeze we
can remove this entry by setting it to "Returned with Feedback".

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2021-02-01 13:23:16 Re: Fix typo about generate_gather_paths
Previous Message Yugo NAGATA 2021-02-01 13:13:15 Re: [PATCH] Add extra statistics to explain for Nested Loop