Re: maintenance_work_mem and CREATE INDEX time

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: maintenance_work_mem and CREATE INDEX time
Date: 2013-07-23 18:20:49
Message-ID: CAMkU=1yCH=K4a9frVODK=QqU73TmqqbjtBA0s1Wh7H8jm6wevg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Mon, Jul 22, 2013 at 9:11 PM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> Hello,
>
> While understanding the effect of maintenance_work_mem on time taken
> by CREATE INDEX, I observed that for the values of
> maintenance_work_mem less than the value for which an internal sort is
> performed, the time taken by CREATE INDEX increases as
> maintenance_work_increases.
>
> My guess is that for all those values an external sort is chosen at
> some point and larger the value of maintenance_work_mem, later the
> switch to external sort would be made causing CREATE INDEX to take
> longer. That is a smaller value of maintenance_work_mem would be
> preferred for when external sort is performed anyway. Does that make
> sense?

The heap structure used in external sorts is cache-unfriendly. The
bigger the heap used, the more this unfriendliness becomes apparent.
And the bigger maintenance_work_mem, the bigger the heap used.

The bigger heap also means you have fewer "runs" to merge in the
external sort. However, as long as the number of runs still fits in
the same number of merge passes, this is generally not a meaningful
difference.

Ideally the planner (or something) would figure out how much memory
would be needed to complete an external sort in just one external
pass, and then lower the effective maintenance_work_mem to that
amount. But that would be hard to do with complete precision, and the
consequences of getting it wrong are asymmetric.

(More thoroughly, it would figure out the number of passes needed for
the given maintenance_work_mem, and then lower the effective
maintenance_work_mem to the smallest value that gives the same number
of passes. But for nearly all practical situations, I think the number
of passes for an index build is going to be 0 or 1.)

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kevin Goess 2013-07-23 19:54:20 process deadlocking on its own transactionid?
Previous Message pg noob 2013-07-23 15:08:50 Fwd: odd locking behaviour

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2013-07-23 18:24:32 Re: getting rid of SnapshotNow
Previous Message Alvaro Herrera 2013-07-23 18:16:20 Re: new "row-level lock" error messages