Re: Parallel copy

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel copy
Date: 2020-10-09 10:57:19
Message-ID: CAA4eK1JtDCUfi_VZ7Hb-JKLLn98Y-y2cEkgmqAZgXGBxj6SyUg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 9, 2020 at 3:50 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Fri, Oct 9, 2020 at 3:26 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Fri, Oct 9, 2020 at 2:52 PM Bharath Rupireddy
> > <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Sep 29, 2020 at 6:30 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > From the testing perspective,
> > > > 1. Test by having something like force_parallel_mode = regress, which
> > > > means that all existing Copy tests in the regression suite will be
> > > > executed via the new worker code. You can have this as a test-only patch
> > > > for now and make sure all existing tests pass with it.
> > > >
> > >
> > > I don't think all the existing copy test cases (except the new test cases added in the parallel copy patch set) would run inside the parallel worker if force_parallel_mode is on. This is because parallelism is picked up for copy only if the parallel option is specified, unlike parallelism for select queries.
> > >
> >
> > Sure, you need to change the code such that when force_parallel_mode =
> > 'regress' is specified, it always uses one worker. This is
> > primarily for testing purposes and will help during the development of
> > this patch, as it will make all existing Copy tests use quite a good
> > portion of the parallel infrastructure.
> >
>
> IIUC, firstly, I will set force_parallel_mode = FORCE_PARALLEL_REGRESS
> as default value in guc.c,
>

No need to set this as the default value. You can change it in
postgresql.conf before running tests.
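
For illustration, the relevant line in postgresql.conf would look like the fragment below. The setting name and the 'regress' value are the ones discussed in this thread; where the file lives depends on your test setup.

```
# postgresql.conf used only for the regression run
force_parallel_mode = regress
```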

> and then adjust the parallelism related
> code in copy.c such that it always picks 1 worker and spawns it. This
> way, all the existing copy test cases would be run in parallel worker.
> Please let me know if this is okay.
>

Yeah, this sounds fine.
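
To make the intended behaviour concrete, here is a minimal standalone sketch of the worker-count decision, not code from the patch. The enum mirrors the values of the force_parallel_mode GUC; copy_choose_workers() and its requested_workers parameter are hypothetical names used only for illustration.

```c
/* Mirrors the three values of the force_parallel_mode GUC. */
typedef enum
{
    FORCE_PARALLEL_OFF,
    FORCE_PARALLEL_ON,
    FORCE_PARALLEL_REGRESS
} ForceParallelMode;

/*
 * Hypothetical helper (an assumption, not part of the patch set):
 * decide how many workers a COPY should launch.
 *
 * requested_workers is the value given with the PARALLEL option,
 * or 0 when the option was not specified.
 */
static int
copy_choose_workers(ForceParallelMode mode, int requested_workers)
{
    /* Test-only path: always run the copy through exactly one worker. */
    if (mode == FORCE_PARALLEL_REGRESS)
        return 1;

    /* Normal path: parallelism is used only when explicitly requested. */
    return (requested_workers > 0) ? requested_workers : 0;
}
```

With this shape, a plain COPY without the PARALLEL option gets one worker under 'regress' and none otherwise, which is what lets the existing tests exercise the worker code.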

> If yes, I will do this and update
> here.
>

Okay, thanks, but please verify the difference in test execution before
and after your change. After your change, all the 'copy' tests should
invoke the worker to perform the copy.

> >
> > > All the above tests are performed on the latest v6 patch set (attached here in this thread) with custom postgresql.conf[1]. The results are of the triplet form (exec time in sec, number of workers, gain)
> > >
> >
> > Okay, so I am assuming the performance is the same as we have seen
> > with the earlier versions of patches.
> >
>
> Yes. Most recent run on v5 patch set [1]
>

Okay, good to know that.

--
With Regards,
Amit Kapila.
