Re: pg_restore crash when there is a failure before all child process is created

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_restore crash when there is a failure before all child process is created
Date: 2020-01-31 11:13:07
Message-ID: CALDaNm3-EDA2QzesP2Ltiu8=WVRf5hjoNMBAJZttywoWv-Aw5w@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 31, 2020 at 1:09 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> vignesh C <vignesh21(at)gmail(dot)com> writes:
> > On Wed, Jan 29, 2020 at 6:54 PM Ahsan Hadi <ahsan(dot)hadi(at)gmail(dot)com> wrote:
> >> Can you share a test case or steps that you are using to reproduce this issue? Are you reproducing this using a debugger?
>
> > I could reproduce it with the following steps:
> > Set up a cluster.
> > Create a few tables.
> > Take a dump in directory format using pg_dump.
> > Restore the dump generated above using pg_restore with a very high
> > value for the --jobs option, around 600.
>
> I agree this is quite broken. Another way to observe the crash is
> to make the fork() call randomly fail, as per booby-trap-fork.patch
> below (not intended for commit, obviously).
>
> I don't especially like the proposed patch, though, as it introduces
> a great deal of confusion into what ParallelState.numWorkers means.
> I think it's better to leave that as being the allocated array size,
> and instead clean up all the fuzzy thinking about whether workers
> are actually running or not. Like 0001-fix-worker-status.patch below.
>

The patch looks fine to me. With it applied, the test case above no longer crashes.
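
To illustrate the approach described above (numWorkers stays the allocated array size, and each slot records whether its worker was actually started), here is a minimal standalone sketch. It is not the code from 0001-fix-worker-status.patch or from pg_restore; WorkerPool, WorkerSlot, SlotStatus, the pool_* functions and launch_fail_after_three are all hypothetical names, and the launcher callback merely stands in for a fork() call that may fail partway through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-slot state; not pg_restore's actual definitions. */
typedef enum
{
    SLOT_UNUSED,                /* allocated, but no worker was ever started */
    SLOT_RUNNING,               /* worker was started successfully */
    SLOT_STOPPED                /* worker was started and later shut down */
} SlotStatus;

typedef struct
{
    SlotStatus  status;
    int         id;             /* stand-in for a pid or thread handle */
} WorkerSlot;

typedef struct
{
    int         numWorkers;     /* always the allocated array size */
    WorkerSlot *slots;
} WorkerPool;

/* Stand-in for fork(): returns true if the worker could be started. */
typedef bool (*launch_fn) (int slot_index);

/* Allocate the pool; every slot starts out as SLOT_UNUSED. */
WorkerPool *
pool_create(int numWorkers)
{
    WorkerPool *pool;

    if (numWorkers <= 0)
        return NULL;
    pool = malloc(sizeof(WorkerPool));
    if (pool == NULL)
        return NULL;
    pool->slots = calloc((size_t) numWorkers, sizeof(WorkerSlot));
    if (pool->slots == NULL)
    {
        free(pool);
        return NULL;
    }
    pool->numWorkers = numWorkers;
    for (int i = 0; i < numWorkers; i++)
        pool->slots[i].status = SLOT_UNUSED;
    return pool;
}

/* Start workers in order, stopping at the first launch failure. */
int
pool_start(WorkerPool *pool, launch_fn launch)
{
    int         started = 0;

    assert(pool != NULL);
    for (int i = 0; i < pool->numWorkers; i++)
    {
        if (!launch(i))
            break;              /* later slots remain SLOT_UNUSED */
        pool->slots[i].status = SLOT_RUNNING;
        pool->slots[i].id = i;
        started++;
    }
    return started;
}

/* Shut down only the workers that were really started. */
int
pool_shutdown(WorkerPool *pool)
{
    int         stopped = 0;

    assert(pool != NULL);
    for (int i = 0; i < pool->numWorkers; i++)
    {
        if (pool->slots[i].status != SLOT_RUNNING)
            continue;           /* never touch unused or stopped slots */
        pool->slots[i].status = SLOT_STOPPED;
        stopped++;
    }
    return stopped;
}

void
pool_free(WorkerPool *pool)
{
    if (pool == NULL)
        return;
    free(pool->slots);
    free(pool);
}

/* Example launcher: the first three launches succeed, the rest fail. */
bool
launch_fail_after_three(int slot_index)
{
    return slot_index < 3;
}
```

With a pool of 8 slots and this launcher, pool_start() starts 3 workers and leaves the other 5 slots marked SLOT_UNUSED, so pool_shutdown() only acts on the 3 that exist. The size of the array never changes meaning; only the per-slot status says whether there is a worker to clean up.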

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
