Re: Cost model for parallel CREATE INDEX

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cost model for parallel CREATE INDEX
Date: 2017-03-04 08:58:57
Message-ID: CAH2-Wzn3O=1NFP3epKsuuLXGuChmzcLVBSJeBDvgYZZRmHmm8A@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 4, 2017 at 12:50 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> If you think parallelism isn't worthwhile unless the sort was going to
> be external anyway,

I don't -- that's just when it starts to look like a safe bet that
parallelism is worthwhile. There are quite a few cases where an
external sort is faster than an internal sort these days, actually.

> then it seems like the obvious thing to do is
> divide the projected size of the sort by maintenance_work_mem, round
> down, and cap the number of workers to the result.

I'm sorry, I don't follow.

> If the result of
> compute_parallel_workers() based on min_parallel_table_scan_size is
> smaller, then use that value instead. I must be confused, because I
> actually though that was the exact algorithm you were describing, and
> it sounded good to me.

It is, but I was using that with index size, not table size. I can
change it to be table size, based on what you said. But the
workMem-related cap, which probably won't end up being applied all
that often in practice, *should* still be based on projected index
size, since that's what we're actually sorting, and it can be very
different from the table size (e.g., with partial indexes).

--
Peter Geoghegan
