Re: Parallel copy

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel copy
Date: 2022-03-07 07:56:46
Message-ID: CALj2ACWUhk8wCWwCzP4pHa3352e8XNyvOi_FXUx+J5wVnvPMhQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 28, 2020 at 3:14 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> Attached is a patch that was used for the same. The patch is written
> on top of the parallel copy patch.
> The design Amit, Andres & myself voted for that is the leader
> identifying the line bound design and sharing it in shared memory is
> performing better.

Hi Hackers, I see following are some of the problem with parallel copy feature:

1) Leader identifying the line/tuple boundaries from the file, letting
the workers pick, insert parallelly vs leader reading the file and
letting workers identify line/tuple boundaries, insert
2) Determining parallel safety of partitioned tables
3) Bulk extension of relation while inserting i.e. adding more than
one extra blocks to the relation in RelationAddExtraBlocks

Please let me know if I'm missing anything.

For (1) - from Vignesh's experiments above, it shows that the " leader
identifying the line/tuple boundaries from the file, letting the
workers pick, insert parallelly" fares better.
For (2) - while it's being discussed in another thread (I'm not sure
what's the status of that thread), how about we take this feature
without the support for partitioned tables i.e. parallel copy is
disabled for partitioned tables? Once the other discussion gets to a
logical end, we can come back and enable parallel copy for partitioned
tables.
For (3) - we need a way to extend or add new blocks fastly - fallocate
might help here, not sure who's working on it, others can comment
better here.

Can we take the "parallel copy" feature forward of course with some
restrictions in place?

Thoughts?

Regards,
Bharath Rupireddy.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2022-03-07 08:14:39 Re: Handle infinite recursion in logical replication setup
Previous Message Kyotaro Horiguchi 2022-03-07 07:42:26 Re: pg_tablespace_location() failure with allow_in_place_tablespaces