Re: pgbench stopped supporting large number of client connections on Windows

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Marina Polyakova <m(dot)polyakova(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pgbench stopped supporting large number of client connections on Windows
Date: 2020-11-06 22:01:55
Message-ID: alpine.DEB.2.22.394.2011062159530.1605435@pseudo
Lists: pgsql-hackers


Hello Marina,

> While trying to test a patch that adds a synchronization barrier in pgbench
> [1] on Windows,

Thanks for trying that. I do not have a Windows setup for testing, and the
sync code I wrote for Windows is basically blind coding :-(

> I found that since the commit "Use ppoll(2), if available, to
> wait for input in pgbench." [2] I cannot use a large number of client
> connections in pgbench on my Windows virtual machines (Windows Server 2008 R2
> and Windows 2019), for example:
>
>> bin\pgbench.exe -c 90 -S -T 3 postgres
> starting vacuum...end.

ISTM that 1 thread with 90 clients is a bad idea, see below.

> The almost same thing happens with reindexdb and vacuumdb (build on
> commit [3]):

The Windows fd implementation is somewhat buggy because it does not return
the smallest number available. The check then assumes that select uses a
dense array indexed by fd value (true on Linux, less so on Windows, which
probably uses a sparse array), so the number can get over the limit even
when fewer fds are actually in use, hence the catch, as you noted.
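
To make the assumption concrete, here is a minimal sketch of the POSIX-style
check, where the fd value itself must stay below FD_SETSIZE because the set
is indexed by that value. The helper name add_fd_checked is hypothetical and
is not taken from pgbench:

```c
#include <stdbool.h>
#include <sys/select.h>

/*
 * Hypothetical helper (not pgbench code): add fd to a select() set only if
 * its numeric value fits. On POSIX systems FD_SET indexes the set by the
 * descriptor value, so the value must be in [0, FD_SETSIZE).
 */
static bool
add_fd_checked(fd_set *set, int fd)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return false;           /* value out of range for this fd_set */
    FD_SET(fd, set);
    return true;
}
```

This test is about the descriptor's value, not about how many descriptors
are in the set, which is why it misfires when socket values are large but
few.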

Another point is that Windows has a hardcoded number of objects one thread
can really wait for, typically 64, so that waiting for more requires actually
forking threads to do the waiting. But if you are ready to fork threads
just to wait, then probably you could have started pgbench with more
threads in the first place. Now it would probably not make the problem go
away because fd numbers would be per process, not per thread, but it
really suggests that one should not load a thread with more than 64 clients.

> IIUC the checks below are not correct on Windows, since on this system
> sockets can have values equal to or greater than FD_SETSIZE (see Windows
> documentation [4] and pgbench debug output in attached pgbench_debug.txt).

Okay.

But then, how may one detect that there are too many fds in the set?

I think that an earlier version of the code needed to make assumptions
about the internal implementation of Windows (there is a counter somewhere
in the Windows fd_set struct), which was rejected because it was breaking
the interface. Now your patch is basically resurrecting that. Why not, if
there is no other solution, but this is quite depressing, and because it
breaks the interface it would be broken if Windows changed its internals
for some reason :-(
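
For reference, a portable sketch of what a count-based check looks like. The
mock_fd_set type below only imitates the Winsock layout (a count followed by
an array of sockets); all names, and the constant 64 chosen as the set
capacity, are assumptions for illustration. It also skips the duplicate
check that a real implementation would need:

```c
#include <stdbool.h>

#define MOCK_FD_SETSIZE 64      /* assumed capacity, for illustration only */

typedef unsigned long long mock_socket; /* stands in for a socket handle */

/* Imitation of a count-plus-array fd_set; not the real Winsock type. */
typedef struct
{
    unsigned int fd_count;                  /* number of sockets stored */
    mock_socket  fd_array[MOCK_FD_SETSIZE]; /* the sockets themselves */
} mock_fd_set;

/*
 * Add a socket if there is room. The limit applies to how many sockets are
 * stored, so the socket's numeric value is irrelevant.
 */
static bool
mock_fd_set_add(mock_fd_set *set, mock_socket s)
{
    if (set->fd_count >= MOCK_FD_SETSIZE)
        return false;           /* set is full */
    set->fd_array[set->fd_count++] = s;
    return true;
}
```

Reading the counter directly is exactly the dependence on internals
discussed above: it works only as long as the struct keeps this shape.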

Doesn't Windows have "ppoll"? Should we implement the stuff on top of
Windows polling capabilities and coldly skip its failed POSIX portability
attempts? This raises again the issue that you should not have more than
64 clients per thread anyway, because it is an intrinsic limit on Windows.

I think that at one point it was suggested to error or warn if
nclients/nthreads is too great, but that was not kept in the end.
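
Such a warning could look roughly like the sketch below. The function names
and the constant are hypothetical, the limit of 64 is the figure mentioned
in this message, and the code assumes clients are spread as evenly as
possible over threads, so the busiest thread gets the ceiling of
nclients/nthreads:

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_CLIENTS_PER_THREAD 64   /* assumed limit from this discussion */

/* Clients handled by the busiest thread, assuming an even spread. */
static int
max_clients_per_thread(int nclients, int nthreads)
{
    return (nclients + nthreads - 1) / nthreads;    /* ceiling division */
}

/* Print a warning and return true if some thread would exceed the limit. */
static bool
warn_if_overloaded(int nclients, int nthreads)
{
    int per_thread = max_clients_per_thread(nclients, nthreads);

    if (per_thread > MAX_CLIENTS_PER_THREAD)
    {
        fprintf(stderr,
                "warning: up to %d clients per thread (limit %d), "
                "consider more threads\n",
                per_thread, MAX_CLIENTS_PER_THREAD);
        return true;
    }
    return false;
}
```

With the command quoted earlier (90 clients, 1 thread) this would warn;
with 2 threads it would not.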

> I tried to fix this, see attached fix_max_client_conn_on_Windows.patch (based
> on commit [3]). I checked it for reindexdb and vacuumdb, and it works for
> simple databases (1025 jobs are not allowed and 1024 jobs is ok).
> Unfortunately, pgbench was getting connection errors when it tried to use
> 1000 jobs on my virtual machines, although there were no errors for fewer
> jobs (500) and the same number of clients (1000)...

It seems that the max number of threads you can start depends on available
memory, because each thread is given its own stack, so it would depend on
your vm settings?

> Any suggestions are welcome!

Use ppoll, and start more threads but not too many?

--
Fabien.
