Re: Per-tablespace autovacuum settings

From: Oleksii Kliukin <alexk(at)hintbits(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Per-tablespace autovacuum settings
Date: 2019-02-14 18:14:49
Message-ID: ADE5BB1E-5370-4A13-BC1F-20A462BC8D73@hintbits.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2019-02-14 17:56:17 +0100, Oleksii Kliukin wrote:
>> Is there any interest in making autovacuum parameters available on a
>> tablespace level in order to apply those to all vacuumable objects in the
>> tablespace?
>>
>> We have a set of tables running on ZFS, where autovacuum does almost no good
>> to us (except for preventing anti-wraparound) due to the nature of ZFS (FS
>> fragmentation caused by copy-on-write leads to sequential scans doing random
>> access) and the fact that our tables there are append-only. Initially, the
>> team in charge of the application just disabled autovacuum globally, but
>> that lead to a huge system catalog bloat.
>>
>> At present, we have to re-enable autovacuum globally and then disable it
>> per-table using table storage parameters, but that is inelegant and requires
>> doing it once for existing tables and modifying the script that periodically
>> creates new ones (the whole system is a Postgres-based replacement of an
>> ElasticSearch cluster and we have to create new partitions regularly).
>
> Won't that a) lead to periodic massive anti-wraparound sessions? b)
> prevent any use of index only scans?

The wraparound is hardly an issue there, as the data is transient and only
exist for 14 days (I think the entire date-based partition is dropped,
that’s how we ended up with pg_class catalog bloat). The index-only scan can
be an issue, although, IIRC, there is some manual vacuum that runs from time
to time, perhaps following your advice below.

> ISTM you'd be better off running vacuum rarely, with large
> thresholds. That way it'd do most of the writes in one pass, hopefully
> leading to less fragementation, and it'd set the visibilitymap bits to
> prevent further need to touch those. By doing it only rarely, vacuum
> should process pages sequentially, reducing the fragmentation.
>
>
>> Grouping tables by tablespaces for the purpose of autovacuum configuration
>> seems natural, as tablespaces are often placed on another filesystems/device
>> that may require changing how often does autovacuum run, make it less/more
>> aggressive depending on the I/O performance or require disabling it
>> altogether as in my example above. Furthermore, given that we allow
>> cost-based options per-tablespace the infrastructure is already there and
>> the task is mostly to teach autovacuum to look at tablespaces in addition to
>> the relation storage options (in case of a conflict, relation options should
>> always take priority).
>
> While I don't buy the reasoning above, I think this'd be useful for
> other cases.

Even if we don’t want to disable autovacuum completely, we might want to
make it much less frequent by increasing the thresholds or costs/delays to
reduce the I/O strain for a particular tablespace.

Regards,
Oleksii Kliukin

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-02-14 18:16:16 Re: libpq debug log
Previous Message Andres Freund 2019-02-14 18:07:58 Re: libpq host/hostaddr/conninfo inconsistencies