Re: Parallel copy

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel copy
Date: 2020-10-09 09:56:55
Message-ID: CAA4eK1Kn7cfqKg79DfCVSrxTNJnouQ3QwkLuzquf4trQDfwocg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 9, 2020 at 2:52 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Tue, Sep 29, 2020 at 6:30 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > From the testing perspective,
> > 1. Test by having something force_parallel_mode = regress which means
> > that all existing Copy tests in the regression will be executed via
> > new worker code. You can have this as a test-only patch for now and
> > make sure all existing tests passed with this.
> >
>
> I don't think all the existing copy test cases(except the new test cases added in the parallel copy patch set) would run inside the parallel worker if force_parallel_mode is on. This is because, the parallelism will be picked up for parallel copy only if parallel option is specified unlike parallelism for select queries.
>

Sure, you need to change the code such that when force_parallel_mode =
'regress' is specified then it always uses one worker. This is
primarily for testing purposes and will help during the development of
this patch as it will make all exiting Copy tests to use quite a good
portion of the parallel infrastructure.

>
> All the above tests are performed on the latest v6 patch set (attached here in this thread) with custom postgresql.conf[1]. The results are of the triplet form (exec time in sec, number of workers, gain)
>

Okay, so I am assuming the performance is the same as we have seen
with the earlier versions of patches.

> Overall, we have below test cases to cover the code and for performance measurements. We plan to run these tests whenever a new set of patches is posted.
>
> 1. csv
> 2. binary

Don't we need the tests for plain text files as well?

> 3. force parallel mode = regress
> 4. toast data csv and binary
> 5. foreign key check, before row, after row, before statement, after statement, instead of triggers
> 6. partition case
> 7. foreign partitions and partitions having trigger cases
> 8. where clause having parallel unsafe and safe expression, default parallel unsafe and safe expression
> 9. temp, global, local, unlogged, inherited tables cases, foreign tables
>

Sounds like good coverage. So, are you doing all this testing
manually? How are you maintaining these tests?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2020-10-09 10:01:17 Re: powerpc pg_atomic_compare_exchange_u32_impl: error: comparison of integer expressions of different signedness (Re: pgsql: For all ppc compilers, implement compare_exchange and) fetch_add
Previous Message Amit Kapila 2020-10-09 09:42:05 Re: Parallel INSERT (INTO ... SELECT ...)