Re: Berserk Autovacuum (let's save next Mandrill)

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Date: 2019-03-28 15:42:26
Message-ID: CAD21AoDTX9JoUVpy1gGSQPDM6ne-epJtp_VZAtHhzw34X1oWPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 28, 2019 at 8:58 PM Darafei "Komяpa" Praliaskouski
<me(at)komzpa(dot)net> wrote:
>
> Hi,
>
> > > Why not select a table that has inserts, updates and deletes for autovacuum just like we do for autoanalyze, not only deletes and updates like we do now?
>>
>> >
>> > Sounds like a good idea, although I do agree with Alvaro when he
>> > mentions that it would be good to only invoke a worker that was only
>> > going to freeze tuples and not look at the indexes.
>>
>> The invoking autovacuum on table based on inserts, not only deletes
>> and updates, seems good idea to me. But in this case, I think that we
>> can not only freeze tuples but also update visibility map even when
>> setting all-visible. Roughly speaking I think vacuum does the
>> following operations.
>>
>> 1. heap vacuum
>>
>> 2. HOT pruning
>
> Is it worth skipping it if we're writing a page anyway for the sake of hint bits and new xids? This will all be no-op anyway on append-only tables and happen only when we actually need something?
>

Yeah, these operations are required only when the table has actual
garbage. IOW, append-only tables never require them.

>>
>> 3. freezing tuples
>> 4. updating visibility map (all-visible and all-frozen)
>
> These two are needed, and current autovacuum launch process does not take into account that this is also needed for non-dead tuples.
>
>>
>> 5. index vacuum/cleanup
>
> There is a separate patch for that. But, since https://commitfest.postgresql.org/16/952/ for almost a year already Postgres skips index cleanup on tables without new dead tuples, so this case is taken care of already?

I think that's not enough. The feature "GUC for cleanup index
threshold" allows us to skip only index cleanup when there are less
insertion than the fraction of the total number of heap tuples since
last index cleanup. Therefore it helps only append-only tables (and
supporting only btree index for now). We still have to do index
vacuuming even if the table has just a few dead tuple. The proposed
patch[1] helps this situation; vacuum can run while skipping index
vacuuming and index cleanup.

>
>>
>> 6. truncation
>
> This shouldn't be a heavy operation?
>

I don't think so. This could take AccessExclusiveLock on the table and
take a long time with large shared buffer as per reported on that
thread[2].

>>
>>
>> With the proposed patch[1] we can control to do 5 or not. In addition
>> to that, another proposed patch[2] allows us to control 6.
>>
>> For append-only tables (and similar tables), what we periodically want
>> to do would be 3 and 4 (possibly we can do 2 as well). So maybe we
>> need to have both an option of (auto)vacuum to control whether to do 1
>> and something like a new autovacuum threshold (or an option) to invoke
>> the vacuum that disables 1, 5 and 6. The vacuum that does only 2, 3
>> and 4 would be much cheaper than today's vacuum and anti-wraparound
>> vacuum would be able to skip almost pages.
>
>
> Why will we want to get rid of 1? It's a noop from write perspective and saves a scan to do it if it's not noop.
>

Because that's for tables that have many inserts but have some
updates/deletes. I think that this strategy would help not only
append-only tables but also such tables.

> Why make it faster in emergency situations when situation can be made non-emergency from the very beginning instead?
>

I don't understand the meaning of "situation can be made non-emergency
from the very beginning". Could you please elaborate on that?

>> [1] https://commitfest.postgresql.org/22/1817/
>> [2] https://commitfest.postgresql.org/22/1981/

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2019-03-28 16:07:21 Re: monitoring CREATE INDEX [CONCURRENTLY]
Previous Message Alvaro Herrera 2019-03-28 15:39:18 Re: monitoring CREATE INDEX [CONCURRENTLY]