Re: Buildfarm failures on woodlouse (in ecpg-check)

From: Christian Ullrich <chris(at)chrullrich(dot)net>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Buildfarm failures on woodlouse (in ecpg-check)
Date: 2017-06-14 22:22:43
Message-ID: ca4b5af0-c81d-e1b7-5ea6-7d8e53bed560@chrullrich.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

* I wrote:

> If there is interest in fixing this issue that is apparently limited
> to VS 2013, I will try and produce a proper patch.

Patch.

Perhaps surprisingly, the bug is in the failing test cases themselves,
not ecpg. The CRT has two modes for its locale implementation: When
called in a "global" mode thread, setlocale() affects all global mode
threads; when called from a per-thread mode thread, it affects that
thread only. [1]

In the threaded test cases, many global mode threads call setlocale()
simultaneously, getting in each other's way. The fix is to switch the
worker threads to per-thread mode first.

Without this patch, I see crashes in approximately 25 percent of runs
(four crashes in 16 cycles of vcregress ecpgcheck, ~10 in 50 runs of
thread.exe alone). With the patch, I have no crashes in 100 ecpgcheck runs.

On the other hand, this affects only the buildfarm VM I mentioned
earlier. I have another VM where it does not ever crash even without the
patch -- such are the joys of multithreading. Both of them should have
plenty of cores, both physical and virtual.

There are some alternatives to fixing it this way, but I think this is
the best approach:

- Selecting per-thread mode in ecpglib takes the choice away from the
developer who might want shared locale.

- Adding locking around setlocale() is difficult because ecpglib already
uses a wrapper around the CRT function, provided by libpgport.

These test cases are the only consumers of the wrapper that have any
concept of multithreading. Supporting it in libpgport for the sole
benefit of threaded ECPG applications on Windows does not seem to
be a good idea, and re-wrapping the wrapper in ecpglib is not only
beyond my abilities to write, but is also going to be unmaintainable.

- Adding locking around every setlocale() call in ecpglib is just ugly.

While working on this, I also noticed that there seem to be two separate
partial implementations of a pthread synchronization emulation for
Win32. One is in ecpglib, uses mutexes and provides
PTHREAD_MUTEX_INITIALIZER and pthread_once(), the other has the header
in src/port and the implementation in libpq, uses critical sections and
does not cover either feature.

Should the two be merged at some point?

[1] https://msdn.microsoft.com/en-us/library/ms235302(v=vs.120).aspx

--
Christian

Attachment Content-Type Size
0001-Make-setlocale-aware-of-multithreading-to-avoid-cras.patch text/plain 21.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2017-06-14 22:28:40 Re: memory fields from getrusage()
Previous Message Andres Freund 2017-06-14 22:20:19 Re: subscription worker signalling wal writer too much