Re: Vacuum: allow usage of more than 1GB of work mem

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Date: 2016-09-15 18:48:09
Message-ID: 0a26f387-ec66-00e6-5468-e3b7afe4202a@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/15/2016 06:40 PM, Robert Haas wrote:
> On Thu, Sep 15, 2016 at 12:22 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>>> On 09/14/2016 07:57 PM, Tom Lane wrote:
>>>> People who are vacuuming because they are out of disk space will be very
>>>> very unhappy with that solution.
>>
>>> The people are usually running out of space for data, while these files
>>> would be temporary files placed wherever temp_tablespaces points to. I'd
>>> argue if this is a source of problems, the people are already in deep
>>> trouble due to sorts, CREATE INDEX, ... as those commands may also
>>> generate a lot of temporary files.
>>
>> Except that if you are trying to recover disk space, VACUUM is what you
>> are doing, not CREATE INDEX. Requiring extra disk space to perform a
>> vacuum successfully is exactly the wrong direction to be going in.
>> See for example this current commitfest entry:
>> https://commitfest.postgresql.org/10/649/
>> Regardless of what you think of the merits of that patch, it's trying
>> to solve a real-world problem. And as Robert has already pointed out,
>> making this aspect of VACUUM more complicated is not solving any
>> pressing problem. "But we made it faster" is going to be a poor answer
>> for the next person who finds themselves up against the wall with no
>> recourse.
>
> I very much agree.
>

How does VACUUM alone help with recovering disk space? AFAIK it only
makes the space available for new data, it does not reclaim the disk
space at all. Sure, we truncate empty pages at the end of the last
segment, but how likely is that in practice? What I do see people doing
is usually either VACUUM FULL (which is however doomed for obvious
reasons) or VACUUM + reindexing to get rid of index bloat (which however
leads to CREATE INDEX using temporary files).

I'm not sure I agree with your claim there's no pressing problem. We do
see quite a few people having to do VACUUM with multiple index scans
(because the TIDs don't fit into m_w_m), which certainly has significant
impact on production systems - both in terms of performance and it also
slows down reclaiming the space. Sure, being able to set m_w_m above 1GB
is an improvement, but perhaps using a more efficient TID storage would
improve the situation further. Writing the TIDs to a temporary file may
not the right approach, but I don't see why that would make the original
problem less severe?

For example, we always allocate the TID array as large as we can fit
into m_w_m, but maybe we don't need to wait with switching to the bitmap
until filling the whole array - we could wait as long as the bitmap fits
into the remaining part of the array, build it there and then copy it to
the beginning (and use the bitmap from that point).

regards
Tomas

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Claudio Freire 2016-09-15 18:54:41 Re: Vacuum: allow usage of more than 1GB of work mem
Previous Message Robert Haas 2016-09-15 18:33:31 Re: RLS related docs