Re: Parallel pg_dump's error reporting doesn't work worth squat

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Parallel pg_dump's error reporting doesn't work worth squat
Date: 2016-05-31 09:02:04
Message-ID: 20160531.180204.85313767.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

At Fri, 27 May 2016 13:20:20 -0400, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in <14603(dot)1464369620(at)sss(dot)pgh(dot)pa(dot)us>
> Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> writes:
> > By the way, the reason of the "invalid snapshot identifier" is
> > that some worker threads try to use it after the connection on
> > the first worker closed.
>
> ... BTW, I don't quite see what the issue is there. The snapshot is
> exported from the master session, so errors in worker sessions should not
> cause such failures in other workers. And I don't see any such failure
> when setting up a scenario that will cause a worker to fail on Linux.
> The "invalid snapshot identifier" bleats would make sense if you had
> gotten a server-side error (and transaction abort) in the master session,
> but I don't see any evidence that that happened in that example. Might be
> worth seeing if that's reproducible.

The master session died because libz was missing and
compressLevel was not propagated, a failure that has already been
fixed. Any child that started its transaction after the master's
death gets the error, since the exported snapshot is gone by
then.

Similarly, a sudden close of the master child's session very
early in its transaction could cause the same symptom, but that
seems unlikely as long as the master reliably reaches the
command-waiting, or "safe", state.

If we want to prevent it completely, one solution would be for
the non-master children to explicitly wait for the master to
reach the "safe" state before starting their transactions. But I
suppose that is not needed here.
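
The waiting scheme could be sketched roughly as below. This is
only an illustration in Python, not pg_dump code: the names
run_workers, settled and state are hypothetical, and a plain
flag stands in for the master reaching (or failing to reach) the
"safe" state.

```python
import threading


def run_workers(num_workers, master_ok):
    """Hypothetical sketch: workers start only after the master settles."""
    settled = threading.Event()      # set once the master's fate is known
    state = {"safe": False}          # True if the master reached the "safe" state
    outcomes = [None] * num_workers  # what each worker ended up doing

    def master():
        # Assumption: a real master would export its snapshot here and then
        # enter the command-waiting state, or die before getting that far.
        state["safe"] = master_ok
        settled.set()

    def worker(idx):
        # Block until the master is either "safe" or known to have failed.
        settled.wait()
        if state["safe"]:
            # Only now would the worker begin its transaction and import
            # the master's snapshot.
            outcomes[idx] = "started"
        else:
            # The master is gone, so the snapshot cannot be used.
            outcomes[idx] = "skipped"

    workers = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in workers:
        t.start()
    m = threading.Thread(target=master)
    m.start()
    m.join()
    for t in workers:
        t.join()
    return outcomes
```

With this ordering no worker starts a transaction against a
snapshot whose owner has already died; the cost is that every
worker blocks until the master has settled.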

Does this make sense?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
