Re: autovacuum: change priority of the vacuumed tables

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Grigory Smolkin <g(dot)smolkin(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: autovacuum: change priority of the vacuumed tables
Date: 2018-02-16 08:42:34
Message-ID: CAD21AoC9fwtneY00pRTBasFbyDPA=gcv-31pZhTqkPHrg9VA6w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 15, 2018 at 10:16 PM, Grigory Smolkin
<g(dot)smolkin(at)postgrespro(dot)ru> wrote:
> On 02/15/2018 09:28 AM, Masahiko Sawada wrote:
>
>> Hi,
>>
>> On Thu, Feb 8, 2018 at 11:01 PM, Ildus Kurbangaliev
>> <i(dot)kurbangaliev(at)postgrespro(dot)ru> wrote:
>>>
>>> Hi,
>>>
>>> The attached patch adds 'autovacuum_table_priority' to the current list
>>> of automatic vacuuming settings. It is used to sort the tables to be
>>> vacuumed in an autovacuum worker before the actual vacuum.
>>>
>>> The idea is to let users prioritize their tables in the autovacuum
>>> process.
>>>
>> Hmm, I couldn't understand the benefit of this patch. Would you
>> elaborate on it a little more?
>>
>> Multiple autovacuum workers can work on one database. So even if a
>> table that you want to vacuum first is at the back of the list,
>> another worker would pick it up. If vacuuming the table gets delayed
>> because some big tables are in front of it, I think you can deal
>> with it by increasing the number of autovacuum workers.
>>
>> Regards,
>>
>> --
>> Masahiko Sawada
>> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>> NTT Open Source Software Center
>>
>
> A database can contain thousands of tables, and updates/deletes often
> concentrate in only a handful of them.
> Going through thousands of less bloated tables can take ages.
> Currently autovacuum knows nothing about prioritizing its work with respect
> to the user's understanding of their data and application.

Understood. I have a question; please imagine the following case.

Suppose that there are 1000 tables in a database, and one of them
(table-A) has the highest priority while the other 999 tables share
the same priority. Most of the tables (say 800), including table-A,
need to be vacuumed at some point, so with your patch an AV worker
builds a list of 800 tables with table-A at the head. Table-A gets
vacuumed first, but this AV worker then has to vacuum the other 799
tables even if table-A requires vacuuming again in the meantime.
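
To make the scenario concrete, here is a minimal Python sketch of one
worker walking a snapshot list sorted by priority. It is a toy model, not
PostgreSQL code: the names build_worklist and run_worker, the priority
values, and the assumption that table-A needs vacuuming again right after
its first pass are all made up for illustration.

```python
def build_worklist(tables):
    """Return names of tables needing vacuum, highest priority first.

    `tables` maps name -> {"priority": int, "needs_vacuum": bool}.
    The list is a snapshot: it is not refreshed while the worker runs.
    """
    pending = [name for name, t in tables.items() if t["needs_vacuum"]]
    # sorted() is stable, so equal-priority tables keep insertion order.
    return sorted(pending, key=lambda name: -tables[name]["priority"])


def run_worker(tables, dirty_again_after=None):
    """Vacuum every table in one snapshot list and return the order.

    If `dirty_again_after` names a table, that table is marked as
    needing vacuum again right after it is processed (hypothetical
    workload), but the worker does not revisit it within this list.
    """
    order = []
    for name in build_worklist(tables):
        tables[name]["needs_vacuum"] = False
        order.append(name)
        if name == dirty_again_after:
            tables[name]["needs_vacuum"] = True
    return order


# 1000 tables; table-A has the highest priority, 800 need vacuuming.
tables = {"table-A": {"priority": 10, "needs_vacuum": True}}
for i in range(999):
    tables[f"t{i}"] = {"priority": 0, "needs_vacuum": i < 799}

order = run_worker(tables, dirty_again_after="table-A")
waited = len(order) - 1 - order.index("table-A")
print(order[0], waited, tables["table-A"]["needs_vacuum"])
```

Under these assumptions table-A is first in the list, yet the worker
processes 799 other tables before it can build a new list and notice
that table-A needs vacuuming again.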

If another AV worker launches while table-A is being vacuumed, the new
AV worker would include table-A in its list but would not process it
because a concurrent AV worker is processing it. So it would vacuum
other tables instead. Similarly, this AV worker cannot get a new table
list until it finishes vacuuming all the other tables. (Note that it
might skip some tables if they have already been vacuumed by another
AV worker.) On the other hand, if a new AV worker launches after
table-A has been vacuumed and requires vacuuming again, the new AV
worker puts table-A at the head of its list. It processes table-A
first but, again, it has to vacuum the other tables before it gets a
new table list that might include table-A.
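
The two-worker case can be sketched the same way. This is again a toy,
sequential model under my own assumptions (the helper names worklist and
process, the in_progress set, and the five-table database are
hypothetical); it only illustrates the skipping behavior described above.

```python
def worklist(tables):
    """Snapshot of tables needing vacuum, highest priority first."""
    names = [n for n, t in tables.items() if t["needs_vacuum"]]
    return sorted(names, key=lambda n: -tables[n]["priority"])


def process(worker_list, tables, in_progress, done_log, worker):
    """Walk a snapshot list, skipping tables that another worker is
    processing or that were vacuumed after the snapshot was taken."""
    for name in worker_list:
        if name in in_progress or not tables[name]["needs_vacuum"]:
            continue
        tables[name]["needs_vacuum"] = False
        done_log.append((worker, name))


tables = {"table-A": {"priority": 10, "needs_vacuum": True}}
for i in range(4):
    tables[f"t{i}"] = {"priority": 0, "needs_vacuum": True}

log = []
list1 = worklist(tables)       # worker 1 takes its snapshot
in_progress = {"table-A"}      # worker 1 is busy with table-A
list2 = worklist(tables)       # worker 2 launches in the meantime
process(list2, tables, in_progress, log, "w2")

# Worker 1 finishes table-A, then walks the rest of its own list.
tables["table-A"]["needs_vacuum"] = False
in_progress.clear()
log.append(("w1", "table-A"))
process(list1[1:], tables, in_progress, log, "w1")
print(list2[0], log)
```

Worker 2 has table-A at the head of its list but never touches it, and
worker 1 finds nothing left to do after table-A because worker 2 has
already vacuumed the remaining tables.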

Is this the expected behavior? I'd rather expect postgres to vacuum
the highest-priority table before other lower-priority tables whenever
it requires vacuuming, but that is not what would happen.

> Also, it would be great to sort tables according to dead/live tuple ratio
> and relfrozenxid.

Yeah, for anti-wraparound vacuum on the database, it would be a good
idea to sort the list by relfrozenxid, as discussed on another
thread[1].

[1] https://www.postgresql.org/message-id/CA%2BTgmobT3m%3D%2BdU5HF3VGVqiZ2O%2Bv6P5wN1Gj%2BPrq%2Bhj7dAm9AQ%40mail.gmail.com
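
As a rough illustration of sorting by relfrozenxid, the sketch below
orders tables by XID age, oldest first. It assumes only that XIDs are
32-bit counters that wrap around; the function names and sample values
are hypothetical, and the age calculation is simplified (it ignores the
reserved special XIDs).

```python
XID_MODULUS = 2**32  # transaction IDs are 32-bit counters that wrap


def xid_age(current_xid, relfrozenxid):
    """Approximate XID age as a modular distance (simplified)."""
    return (current_xid - relfrozenxid) % XID_MODULUS


def sort_by_frozenxid_age(tables, current_xid):
    """Order (name, relfrozenxid) pairs so the oldest comes first."""
    return sorted(
        tables,
        key=lambda t: xid_age(current_xid, t[1]),
        reverse=True,
    )


# Hypothetical values: the counter has wrapped, so "old_table" has a
# numerically large relfrozenxid but is the furthest behind.
current_xid = 1_000
tables = [
    ("recent", 900),
    ("old_table", XID_MODULUS - 5_000),
    ("middle", 100),
]
print(sort_by_frozenxid_age(tables, current_xid))
```

Comparing raw relfrozenxid values would put "old_table" last here, which
is why the sketch sorts on the modular age instead.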

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
