|From:||Armin Schöffmann <armin(dot)schoeffmann(at)aegaeon(dot)de>|
|Subject:||PATCH: pg_restore parallel-execution-deadlock issue|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
I propose the below patches to parallels.c and pg_backup_utils.c fixing deadlocks in pg_restore (windows only) if running more than 2 parallel jobs.
This problem was reported by me earlier this year.
- Winsock's "recv(...)" called in piperead() is a blocking read by default, therefor, signalizing termEvent as used in ShutdownWorkersHard() is not enough to make worker-threads go away.
We need a preceding shutdown(pipeWrite, SD_BOTH), first, to abort blocking IO in this case.
Otherwise, the main-thread will wait forever, if more than one additional worker is active (e.g. option -j3) and a premature EOF occurs in the input-file.
Findings in pg_backup_utils.c/ parallels.c, which could impact other tools, too:
- threads created with _beginthreadex need to be exited by either a "return exitcode" or "_endthreadex(exitcode)". It might be obsolete in fire-and-forget-scenarios, but it matters in other cases.
As of current, pg_backup_utils uses EndThread to retire additional worker-threads., which are spawned by _beginthreadex in parallel.c. The corresponding call for ExitThread would be CreateThread,
nevertheless, _beginthreadex is the correct choice here, as we do call-out into CRT and need to retain the thread-handle for after-death synchronization with the main-thread.
The thread-handle needs to be closed explicitly.
If this is not the correct place to discuss patches, I'd be glad if somebody can notify the tool's maintainer, to take a look into it.
Aegaeon technologies GmbH
@@ -350,7 +350,8 @@ static void
@@ -367,4 +368,7 @@ ShutdownWorkersHard(ParallelState *pstate)
/* The workers monitor this event via checkAborting(). */
+ for (i = 0; i < pstate->numWorkers; i++)
+ shutdown(pstate->parallelSlot[i].pipeWrite, SD_BOTH);
@@ -408,5 +412,8 @@ WaitForTerminatingWorkers(ParallelState *pstate)
for (j = 0; j < pstate->numWorkers; j++)
if (pstate->parallelSlot[j].hThread == hThread)
slot = &(pstate->parallelSlot[j]);
@@ -120,5 +120,5 @@ exit_nicely(int code)
if (parallel_init_done && GetCurrentThreadId() != mainThreadId)
|Next Message||Michael Paquier||2016-04-05 01:00:41||Re: Yet another small patch - reorderbuffer.c:1099|
|Previous Message||Michael Paquier||2016-04-05 00:18:28||Re: pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server|