Re: faster ETL / bulk data load for heap tables

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Luc Vlaming <luc(at)swarm64(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: faster ETL / bulk data load for heap tables
Date: 2021-01-02 07:36:51
Message-ID: CAA4eK1J3jVgYUjCz+4_Om-6ZJsc3DYHsVWWHYwncKs-r_BTVew@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 1, 2021 at 7:37 PM Luc Vlaming <luc(at)swarm64(dot)com> wrote:
>
> Hi,
>
> In an effort to speed up bulk data loading/transforming I noticed that
> considerable time is spent in the relation extension lock.
>

We already extend the relation in bulk, via RelationAddExtraBlocks, when
there is contention on the relation extension lock. I wonder why that
is not able to address this kind of workload. From a quick look at your
patch, it seems you always extend the relation by 128 blocks for a copy
operation after acquiring the lock, whereas the current mechanism has
some smarts: it decides how much to extend based on the number of
waiters. Now, it is possible that we should extend by a larger number
of blocks than we do now, but using an ad-hoc number might lead to
wasted space. Have you tried to fiddle with the current bulk-extension
scheme to see if that achieves some of the gains you are seeing? I see
that you have made quite a few other changes that might be helping
here, but it would still be better to see how much of a bottleneck the
relation extension lock is, and whether that can be addressed with the
current mechanism rather than by changing things in a different way.
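
To make the comparison concrete, here is a minimal standalone sketch of
the two sizing policies. It is not the actual hio.c code: the function
names are hypothetical, and the constants for the waiter-based policy
(20 blocks per waiter, capped at 512) are my recollection of what
RelationAddExtraBlocks does, so treat them as assumptions and check the
source. The fixed value of 128 is the one from the patch as I read it.

```c
#include <assert.h>

/*
 * Assumed constants: recalled from RelationAddExtraBlocks in hio.c,
 * not verified against a specific PostgreSQL version.
 */
#define BLOCKS_PER_WAITER    20
#define MAX_EXTRA_BLOCKS     512

/* Fixed extension size used for copy in the patch under discussion. */
#define FIXED_COPY_EXTENSION 128

/*
 * Hypothetical helper: number of extra blocks to add, scaled by how
 * many backends are waiting on the relation extension lock.
 * With no waiters, nothing extra is added, so no space is wasted.
 */
int
waiter_based_extra_blocks(int lock_waiters)
{
    int extra;

    if (lock_waiters <= 0)
        return 0;

    extra = lock_waiters * BLOCKS_PER_WAITER;
    return (extra > MAX_EXTRA_BLOCKS) ? MAX_EXTRA_BLOCKS : extra;
}

/*
 * Hypothetical helper: the fixed policy ignores contention and always
 * extends by the same amount.
 */
int
fixed_extra_blocks(int lock_waiters)
{
    (void) lock_waiters;        /* unused: the policy is constant */
    return FIXED_COPY_EXTENSION;
}
```

Under these assumptions, a single uncontended loader gets no extra
blocks from the waiter-based policy but 128 from the fixed one, which
is where the space-wastage concern comes from. Raising the per-waiter
factor or the cap would be the way to experiment within the current
scheme.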

--
With Regards,
Amit Kapila.
