Re: maintenance_work_mem and CREATE INDEX time

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: maintenance_work_mem and CREATE INDEX time
Date: 2013-07-24 02:30:30
Message-ID: CA+HiwqE_iksCRKZ=udU-0R1MiDpb_S2bgM-R9tz+d8rjy=FHVA@mail.gmail.com
Lists: pgsql-general pgsql-hackers

On Wed, Jul 24, 2013 at 6:02 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Tue, Jul 23, 2013 at 1:23 AM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>> On Tue, Jul 23, 2013 at 1:11 PM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>>> Hello,
>>>
>>> While studying the effect of maintenance_work_mem on the time taken
>>> by CREATE INDEX, I observed that for values of
>>> maintenance_work_mem below the value at which an internal sort is
>>> performed, the time taken by CREATE INDEX increases as
>>> maintenance_work_mem increases.
>>>
>>> My guess is that for all those values an external sort is chosen at
>>> some point, and the larger the value of maintenance_work_mem, the
>>> later the switch to external sort is made, causing CREATE INDEX to
>>> take longer. That is, a smaller value of maintenance_work_mem would
>>> be preferred when an external sort is performed anyway. Does that
>>> make sense?
>>>
>>
>> Upon further investigation, it is found that the delay to switch to
>> external sort caused by a larger value of maintenance_work_mem is
>> small compared to the total time of CREATE INDEX.
>
> If you are using trace_sort to report that, it reports the switch as
> happening as soon as it runs out of memory.
>
> At that point, all we have been doing is reading tuples into memory. The
> time it takes to do that will depend on maintenance_work_mem, because
> that affects how many tuples fit in memory. But all the rest of the
> tuples need to be read sooner or later anyway, so pushing more of them
> to later doesn't improve things overall; it just shifts timing around.
>
> After it reports the switch, it still needs to heapify the existing
> in-memory tuples before the tapesort proper can begin. This is where
> the true lost opportunities start to arise, as the large heap starts
> driving cache misses which would not happen at all under different
> settings.
>
> Once the existing tuples are heapified, it then continues to use the
> heap to pop tuples from it to write out to "tape", and to push newly
> read tuples onto it. This also suffers lost opportunities.
>
> Once all the tuples are written out and it starts merging, then the
> large maintenance_work_mem is no longer a penalty as the new heap is
> limited by the number of tapes, which is almost always much smaller.
> In fact this stage will actually be faster, but not by enough to make
> up for the earlier slow down.
>
> So it is not surprising that the time before the switch is reported is
> a small part of the overall time difference.
>

So, is it the actual sorting (before merging) that suffers with a larger
maintenance_work_mem? I am sorry, but I cannot yet grasp the complexity
of the external sort code, so all I can say is that during an external
sort a smaller value of maintenance_work_mem is beneficial (based on my
observations in tests). How that follows from what is going on in the
implementation of external sort is still something I am working on
understanding.
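
To make the phases Jeff describes concrete, here is a toy sketch of
heap-based run building followed by a merge. It is only an illustration
under stated assumptions, not PostgreSQL's tuplesort code: the names
build_runs, merge_runs and mem_tuples are hypothetical, plain integers
stand in for index tuples, and Python lists stand in for tapes. The
point it tries to show is that the run-building heap grows with the
memory budget, while the merge heap holds only one entry per run.

```python
import heapq
from itertools import islice

_DONE = object()  # sentinel marking the end of the input


def build_runs(tuples, mem_tuples):
    """Build sorted runs with a heap of at most mem_tuples entries.

    mem_tuples is a hypothetical stand-in for how many tuples fit in
    maintenance_work_mem.
    """
    it = iter(tuples)
    # Read tuples until memory is full, then heapify what was read.
    heap = [(0, t) for t in islice(it, mem_tuples)]
    heapq.heapify(heap)
    runs, out, current_run = [], [], 0
    while heap:
        # Pop the smallest tuple and write it to the current run ("tape").
        run, value = heapq.heappop(heap)
        if run != current_run:
            runs.append(out)
            out, current_run = [], run
        out.append(value)
        # Push a newly read tuple; if it is smaller than the tuple just
        # written, it cannot join this run and is tagged for the next one.
        nxt = next(it, _DONE)
        if nxt is not _DONE:
            heapq.heappush(heap, (run if nxt >= value else run + 1, nxt))
    if out:
        runs.append(out)
    return runs


def merge_runs(runs):
    """Merge sorted runs; the heap holds one entry per run, not per tuple."""
    heap = [(run[0], i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    result = []
    while heap:
        value, i, pos = heapq.heappop(heap)
        result.append(value)
        if pos + 1 < len(runs[i]):
            heapq.heappush(heap, (runs[i][pos + 1], i, pos + 1))
    return result
```

In this sketch every pop and push during run building touches a heap of
up to mem_tuples entries, which is where a very large heap could plausibly
hurt cache behaviour, whereas merge_runs only ever works on a heap of
len(runs) entries.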

--
Amit Langote
