Re: Determine parallel-safety of partition relations for Inserts

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Determine parallel-safety of partition relations for Inserts
Date: 2021-01-18 03:17:06
Message-ID: CAA4eK1Kwa6euoqRU1_Tfp5Eb3jMngEHtOfH3i1qOWRm8qnG1Fw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 18, 2021 at 6:08 AM tsunakawa(dot)takay(at)fujitsu(dot)com
<tsunakawa(dot)takay(at)fujitsu(dot)com> wrote:
>
> From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > I think it would be good if the parallelism works by default when
> > required but I guess if we want to use something on these lines then
> > we can always check if the parallel_workers option is non-zero for a
> > relation (with RelationGetParallelWorkers). So users can always say
> > Alter Table <tbl_name> Set (parallel_workers = 0) if they don't want
> > to enable write parallelism for tbl and if someone is bothered that
> > this might impact Selects as well because the same option is used to
> > compute the number of workers for it then we can invent a second
> > option parallel_dml_workers or something like that.
>
> Yes, if we have to require some specification to enable parallel DML, I agree that parallel query and parallel DML can be enabled separately. That said, I'm not sure whether users, and PG developers, want to allow specifying the degree of parallelism for DML.
>
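
A minimal sketch of the reloption approach described above. Note that parallel_workers is an existing table storage parameter, while parallel_dml_workers is only a hypothetical name floated in this thread, not an implemented option:

```sql
-- Disable write parallelism for a table by zeroing its worker hint
-- (this also affects the worker count chosen for parallel scans).
ALTER TABLE tbl SET (parallel_workers = 0);

-- Re-enable by resetting the option to its default (planner-chosen) behavior.
ALTER TABLE tbl RESET (parallel_workers);
```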

We already allow users to specify the degree of parallelism for all
the parallel operations: via the GUCs max_parallel_maintenance_workers
and max_parallel_workers_per_gather, via the reloption
parallel_workers, and via the VACUUM command's PARALLEL option, where
users can specify the number of workers to be used. The planner treats
these as hints but decides the actual parallelism based on other
factors, such as whether that many workers are available. Why would
users expect parallel DML to behave differently?
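
For reference, the existing knobs mentioned above can be exercised with standard PostgreSQL syntax like this:

```sql
-- Session-level caps on parallel worker counts.
SET max_parallel_workers_per_gather = 4;
SET max_parallel_maintenance_workers = 2;

-- Per-relation hint consulted when planning parallel scans of this table.
ALTER TABLE tbl SET (parallel_workers = 8);

-- Explicit worker count for a parallel vacuum.
VACUUM (PARALLEL 3) tbl;
```

In each case the number is an upper bound or hint; the server still falls back to fewer workers (or none) when the max_parallel_workers pool is exhausted.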

>
> > > As an aside, (1) and (2) has a potential problem with memory consumption.
> > >
> >
> > I can see the memory consumption argument for (2) because we might end
> > up generating parallel paths (partial paths) for reading the table but
> > don't see how it applies to (1)?
>
> I assumed that we would still open all partitions for parallel safety check in (1) and (2). In (1), parallel safety check is done only when parallel DML is explicitly enabled by the user. Just opening partitions keeps CacheMemoryContext bloated even after they are closed.
>

Which memory specific to partitions are you referring to here and does
that apply to the patch being discussed?

--
With Regards,
Amit Kapila.
