Re: Cost model for parallel CREATE INDEX

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cost model for parallel CREATE INDEX
Date: 2017-03-09 01:45:33
Message-ID: CAH2-Wzk0wkHLTrzp-f30Q1rBEB2pspCk+emTcXEd-t_DfniTEg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 8, 2017 at 5:33 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> pg_restore will avoid parallelism (that will happen by setting
>> "max_parallel_workers_maintenance = 0" when it runs), not because it
>> cannot trust the cost model, but because it prefers to parallelize
>> things its own way (with multiple restore jobs), and because execution
>> speed may not be the top priority for pg_restore, unlike a live
>> production system.
>
> This part I'm not sure about. I think people care quite a lot about
> pg_restore speed, because they are often down when they're running it.
> And they may have oodles more CPUs that parallel restore can use
> without help from parallel query. I would be inclined to leave
> pg_restore alone and let the chips fall where they may.

I thought that we might want to err on the side of preserving the
existing behavior, but arguably that's actually what I failed to do.
That is, since we don't currently have a pg_restore flag that controls
the maintenance_work_mem used by pg_restore, "let the chips fall where
they may" is arguably the standard that I didn't uphold.

It might still make sense to take a leaf out of the parallel query
book on this question. That is, add an open item along the lines of
"review behavior of pg_restore with parallel CREATE INDEX" that we
plan to deal with close to the release of Postgres 10.0, when feedback
from beta testing is in. There are a number of options, none of which
are difficult to write code for. The hard part is determining what
makes most sense for users on balance.

--
Peter Geoghegan
