Re: [HACKERS] [PATCH] Vacuum: Update FSM more frequently

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [PATCH] Vacuum: Update FSM more frequently
Date: 2018-02-06 07:56:53
Message-ID: CAD21AoDJqvyC-NcV7-efgA2Jp4XFiwynduyA0fQv2tvzcmF+Ng@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 6, 2018 at 2:55 AM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> On Mon, Feb 5, 2018 at 1:53 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> On Fri, Feb 2, 2018 at 11:13 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>>> After autovacuum gets cancelled, the next time it wakes up it will
>>> retry vacuuming the cancelled relation. That's because a cancelled
>>> autovacuum doesn't update the last-vacuumed stats.
>>>
>>> So the timing between an autovacuum work item and the next retry for
>>> that relation is more or less an autovacuum nap time, except perhaps
>>> in the case where many vacuums get cancelled, and they have to be
>>> queued.
>>
>> I think that's not true if there are multiple databases.
>
> I'd have to re-check.
>
> The loop is basically, IIRC:
>
> while(1) { vacuum db ; work items ; nap }
>
> Now, if that's vacuum one db, not all, and if the decision on the
> second run doesn't pick the same db because that big table failed to
> be vacuumed, then you're right.
>
> In that case we could add the FSM vacuum as a work item *in addition*
> to what this patch does. If the work item queue is full and the FSM
> vacuum doesn't happen, it'd be no worse than with the patch as-is.
>
> Is that what you suggest?

That's one of my suggestions. I should have considered this issue for
each case separately, so to be clear let me summarize each case.

For a table with indexes, the FSM of the heap is vacuumed after
lazy_vacuum_heap(). Therefore I think the case where the table got
vacuumed but its FSM didn't cannot happen unless autovacuum gets
cancelled before or while vacuuming the FSM, so this solves the
problem in most cases. Also, we could use the vacuum progress
information to check whether we should vacuum the table's FSM after a
cancellation. For index FSMs, we might have to consider vacuuming
them after lazy_vacuum_index().
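
To illustrate the timing, here is a rough sketch of the indexed-table
path being discussed (simplified control flow only; the real code in
vacuumlazy.c and in the patch differs in details):

    /*
     * Simplified sketch of the indexed-table path in lazy_scan_heap();
     * not the actual patch, only the control flow under discussion.
     */
    for (blkno = 0; blkno < nblocks; blkno++)
    {
        /* ... scan heap page, collect dead tuples ... */

        if (max_dead_tuples_reached)
        {
            /* for each index: lazy_vacuum_index(...) */
            lazy_vacuum_heap(onerel, vacrelstats);

            /*
             * The FSM vacuum happens here, after the freed space has
             * been recorded.  A cancellation arriving before or during
             * this call is the only window where the heap got vacuumed
             * but its FSM did not.
             */
            FreeSpaceMapVacuum(onerel);
        }
    }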

For a table with no indexes it is more complicated; as with an indexed
table we can vacuum the table's FSM more frequently, but table bloat
can still happen. To make sure the table's FSM gets vacuumed, there
are several approaches:
(1) using an autovacuum work item, (2) vacuuming the table's FSM in
PG_CATCH, (3) remembering which tables got cancelled and vacuuming
them after finishing the loop over table_oids.

For (1), the issue is that the work-item queue will fill up when many
tables get cancelled, and it's not a good idea to queue many redundant
work items. For (2), vacuuming the table's FSM immediately after a
cancellation is probably not a good way, because right after the
cancellation the table is still likely to be locked by others. For
(3), it might work fine, but it can happen that another autovacuum
worker vacuums the table's FSM before we process the list. Even so, it
would be better than always vacuuming them at the beginning of a
vacuum, so I'm in favor of (3). Even when processing the tables in the
list, we should take the lock on each table conditionally so that
autovacuum doesn't block any foreground work. However, it's quite
possible that I'm not seeing the whole picture here.
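
For what it's worth, a minimal sketch of (3) inside do_autovacuum()
could look like the following (cancelled_relids and its handling are
hypothetical names for illustration only, not actual patch code):

    /* Hypothetical sketch of (3); not actual patch code. */
    List       *cancelled_relids = NIL;
    ListCell   *cell;

    foreach(cell, table_oids)
    {
        Oid     relid = lfirst_oid(cell);

        PG_TRY();
        {
            /* existing per-table vacuum; may be cancelled */
            autovacuum_do_vac_analyze(tab, bstrategy);
        }
        PG_CATCH();
        {
            /* remember the table so we can retry its FSM vacuum */
            cancelled_relids = lappend_oid(cancelled_relids, relid);
            /* ... existing error cleanup ... */
        }
        PG_END_TRY();
    }

    /*
     * After the loop over table_oids, vacuum the FSM of each cancelled
     * table, taking the lock conditionally so we never block any
     * foreground work.
     */
    foreach(cell, cancelled_relids)
    {
        Oid     relid = lfirst_oid(cell);

        if (ConditionalLockRelationOid(relid, ShareUpdateExclusiveLock))
        {
            Relation    rel = relation_open(relid, NoLock);

            FreeSpaceMapVacuum(rel);
            relation_close(rel, ShareUpdateExclusiveLock);
        }
    }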

>
> With that in mind, I'm noticing WorkItems have a avw_database that
> isn't checked by do_autovacuum. Is that right? Shouldn't only work
> items that belong to the database being autovacuumed be processed?
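
If they aren't filtered, I imagine a guard along these lines would be
needed in the work-item loop (hypothetical sketch; I haven't checked
what the current code does):

    /* Hypothetical guard in do_autovacuum()'s work-item loop */
    if (workitem->avw_database != MyDatabaseId)
        continue;    /* skip items belonging to another database */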
>>> That's why an initial FSM vacuum makes sense. It has a similar timing
>>> to the autovacuum work item, it has the benefit that it can be
>>> triggered manually (manual vacuum), and it's cheap and efficient.
>>>
>>>> Also, the patch always vacuums the FSM at the beginning of the vacuum
>>>> with a threshold, but that is useless if the table has been properly
>>>> vacuumed. I don't think it's a good idea to add an additional step that
>>>> "might be" efficient, because vacuum is already heavy.
>>>
>>> FSMs are several orders of magnitude smaller than the tables
>>> themselves. A TB-sized table I have here has a 95M FSM. If you add
>>> threshold skipping, that initial FSM vacuum *will* be efficient.
>>>
>>> By definition, the FSM will be less than 1/4000th of the table, so
>>> that initial FSM pass takes less than 1/4000th of the whole vacuum.
>>> Considerably less considering the simplicity of the task.
>>
>> I agree the FSM is much smaller than the heap and vacuuming the FSM will
>> not be comparatively heavy, but I'm afraid that vacuum will get heavier
>> in the future if we pile up improvements like this that are small but
>> might not be efficient. For example, a feature for reporting the last
>> vacuum status has been proposed[1]. I wonder if we can use it to
>> determine whether we do the FSM vacuum at the beginning of vacuum.
>
> Yes, such a feature would allow skipping that initial FSM vacuum. That
> can be improved in a future patch if that proposed feature gets
> merged. This patch can be treated independently from that in the
> meanwhile, don't you think?

IMO I'd rather have this after we have such an optimization; otherwise
it adds an extra step to every vacuum until we get the optimization.
But, as you said, we can also consider it not a problem because the
FSM vacuum of the table at the beginning of the vacuum has the
threshold. I'd like to hear more opinions about this.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
