Re: Vacuum: allow usage of more than 1GB of work mem

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Date: 2016-11-21 17:15:20
Message-ID: CAD21AoC2eoTijmuVCjb=235Bn9T5L5SqRnMdTb4XkEjm31=BEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 18, 2016 at 6:54 AM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> On Thu, Nov 17, 2016 at 6:34 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Thu, Nov 17, 2016 at 1:42 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>>> Attached is patch 0002 with pgindent applied over it
>>>
>>> I don't think there's any other formatting issue, but feel free to
>>> point a finger to it if I missed any
>>
>> Hmm, I had imagined making all of the segments the same size rather
>> than having the size grow exponentially. The whole point of this is
>> to save memory, and even in the worst case you don't end up with that
>> many segments as long as you pick a reasonable base size (e.g. 64MB).
>
> Wastage is bound by a fraction of the total required RAM, that is,
> it's proportional to the amount of required RAM, not the amount
> allocated. So it should still be fine, and the exponential strategy
> should improve lookup performance considerably.

I'm concerned that it could use repalloc for large memory area when
vacrelstats->dead_tuples.dead_tuple is bloated. It would be overhead
and slow. What about using semi-fixed memroy space without repalloc;
Allocate the array of ItemPointerData array, and each ItemPointerData
array stores the dead tuple locations. The size of ItemPointerData
array starts with small size (e.g. 16MB or 32MB). After we used an
array up, we then allocate next segment with twice size as previous
segment. But to prevent over allocating memory, it would be better to
set a limit of allocating size of ItemPointerData array to 512MB or
1GB. We could expand array of array using repalloc if needed, but it
would not be happend much. Other design is similar to current your
patch; the offset of the array of array and the offset of a
ItemPointerData array control current location, which are cleared
after finished reclaiming garbage on heap and index. And allocated
array is re-used by subsequent scanning heap.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-11-21 17:17:14 Re: delta relations in AFTER triggers
Previous Message Robert Haas 2016-11-21 17:14:21 Re: [sqlsmith] Parallel worker crash on seqscan