Re: Parallel INSERT (INTO ... SELECT ...)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc: "houzj(dot)fnst(at)cn(dot)fujitsu(dot)com" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, "tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-02-10 08:50:19
Message-ID: CAA4eK1LVdyXfA=9UTDVtv2zrt5sNJj7LG11QqeqGFrfApo84Xg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 10, 2021 at 11:13 AM tsunakawa(dot)takay(at)fujitsu(dot)com
<tsunakawa(dot)takay(at)fujitsu(dot)com> wrote:
>
>
> The loss is probably due to 1) more index page splits, 2) more buffer writes (table and index), and 3) internal locks for things such as relation extension and page content protection. To investigate 3), we should want something like [1], which tells us the wait event statistics (wait count and time for each wait event) per session or across the instance like Oracke, MySQL and EDB provides. I want to continue this in the near future.
>
>

I think we might mitigate such losses with a different patch where we
can do a bulk insert for Insert .. Select something like we are
discussing in the other thread [1]. I wonder if these performance
characteristics are due to the reason of underlying bitmap heap scan.
What are the results if disable the bitmap heap scan(Set
enable_bitmapscan = off)? If that happens to be true, then we might
also want to consider if in some way we can teach parallel insert to
cost more in such cases. Another thing we can try is to integrate a
parallel-insert patch with the patch on another thread [1] and see if
that makes any difference but not sure if we want to go there at this
stage unless it is simple to try that out?

[1] - https://www.postgresql.org/message-id/20200508072545.GA9701%40telsasoft.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2021-02-10 08:50:59 Re: Parallel INSERT (INTO ... SELECT ...)
Previous Message Thomas Munro 2021-02-10 08:26:49 Re: WIP: WAL prefetch (another approach)