Quick Links

Re: autovac issue with large number of tables

From:	Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To:	Kasahara Tatsuhito <kasahara(dot)tatsuhito(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <nasbyj(at)amazon(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: autovac issue with large number of tables
Date:	2020-11-26 16:43:25
Message-ID:	dfcb587d-3f46-28d9-7b19-1ea33303155e@oss.nttdata.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>
>> On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
>> <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
>>>
>>> Hi,
>>>
>>> On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>>>
>>>> On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
>>>> <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
>>>>> <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
>>>>>>> I wonder if we could have table_recheck_autovac do two probes of the stats
>>>>>>> data. First probe the existing stats data, and if it shows the table to
>>>>>>> be already vacuumed, return immediately. If not, *then* force a stats
>>>>>>> re-read, and check a second time.
>>>>>> Does the above mean that the second and subsequent table_recheck_autovac()
>>>>>> will be improved to first check using the previous refreshed statistics?
>>>>>> I think that certainly works.
>>>>>>
>>>>>> If that's correct, I'll try to create a patch for the PoC
>>>>>
>>>>> I still don't know how to reproduce Jim's troubles, but I was able to reproduce
>>>>> what was probably a very similar problem.
>>>>>
>>>>> This problem seems to be more likely to occur in cases where you have
>>>>> a large number of tables,
>>>>> i.e., a large amount of stats, and many small tables need VACUUM at
>>>>> the same time.
>>>>>
>>>>> So I followed Tom's advice and created a patch for the PoC.
>>>>> This patch will enable a flag in the table_recheck_autovac function to use
>>>>> the existing stats next time if VACUUM (or ANALYZE) has already been done
>>>>> by another worker on the check after the stats have been updated.
>>>>> If the tables continue to require VACUUM after the refresh, then a refresh
>>>>> will be required instead of using the existing statistics.
>>>>>
>>>>> I did simple test with HEAD and HEAD + this PoC patch.
>>>>> The tests were conducted in two cases.
>>>>> (I changed few configurations. see attached scripts)
>>>>>
>>>>> 1. Normal VACUUM case
>>>>> - SET autovacuum = off
>>>>> - CREATE tables with 100 rows
>>>>> - DELETE 90 rows for each tables
>>>>> - SET autovacuum = on and restart PostgreSQL
>>>>> - Measure the time it takes for all tables to be VACUUMed
>>>>>
>>>>> 2. Anti wrap round VACUUM case
>>>>> - CREATE brank tables
>>>>> - SELECT all of these tables (for generate stats)
>>>>> - SET autovacuum_freeze_max_age to low values and restart PostgreSQL
>>>>> - Consumes a lot of XIDs by using txid_curent()
>>>>> - Measure the time it takes for all tables to be VACUUMed
>>>>>
>>>>> For each test case, the following results were obtained by changing
>>>>> autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
>>>>> Also changing num of tables to 1000, 5000, 10000 and 20000.
>>>>>
>>>>> Due to the poor VM environment (2 VCPU/4 GB), the results are a little unstable,
>>>>> but I think it's enough to ask for a trend.
>>>>>
>>>>> ===========================================================================
>>>>> [1.Normal VACUUM case]
>>>>> tables:1000
>>>>> autovacuum_max_workers 1: (HEAD) 20 sec VS (with patch) 20 sec
>>>>> autovacuum_max_workers 2: (HEAD) 18 sec VS (with patch) 16 sec
>>>>> autovacuum_max_workers 3: (HEAD) 18 sec VS (with patch) 16 sec
>>>>> autovacuum_max_workers 5: (HEAD) 19 sec VS (with patch) 17 sec
>>>>> autovacuum_max_workers 10: (HEAD) 19 sec VS (with patch) 17 sec
>>>>>
>>>>> tables:5000
>>>>> autovacuum_max_workers 1: (HEAD) 77 sec VS (with patch) 78 sec
>>>>> autovacuum_max_workers 2: (HEAD) 61 sec VS (with patch) 43 sec
>>>>> autovacuum_max_workers 3: (HEAD) 38 sec VS (with patch) 38 sec
>>>>> autovacuum_max_workers 5: (HEAD) 45 sec VS (with patch) 37 sec
>>>>> autovacuum_max_workers 10: (HEAD) 43 sec VS (with patch) 35 sec
>>>>>
>>>>> tables:10000
>>>>> autovacuum_max_workers 1: (HEAD) 152 sec VS (with patch) 153 sec
>>>>> autovacuum_max_workers 2: (HEAD) 119 sec VS (with patch) 98 sec
>>>>> autovacuum_max_workers 3: (HEAD) 87 sec VS (with patch) 78 sec
>>>>> autovacuum_max_workers 5: (HEAD) 100 sec VS (with patch) 66 sec
>>>>> autovacuum_max_workers 10: (HEAD) 97 sec VS (with patch) 56 sec
>>>>>
>>>>> tables:20000
>>>>> autovacuum_max_workers 1: (HEAD) 338 sec VS (with patch) 339 sec
>>>>> autovacuum_max_workers 2: (HEAD) 231 sec VS (with patch) 229 sec
>>>>> autovacuum_max_workers 3: (HEAD) 220 sec VS (with patch) 191 sec
>>>>> autovacuum_max_workers 5: (HEAD) 234 sec VS (with patch) 147 sec
>>>>> autovacuum_max_workers 10: (HEAD) 320 sec VS (with patch) 113 sec
>>>>>
>>>>> [2.Anti wrap round VACUUM case]
>>>>> tables:1000
>>>>> autovacuum_max_workers 1: (HEAD) 19 sec VS (with patch) 18 sec
>>>>> autovacuum_max_workers 2: (HEAD) 14 sec VS (with patch) 15 sec
>>>>> autovacuum_max_workers 3: (HEAD) 14 sec VS (with patch) 14 sec
>>>>> autovacuum_max_workers 5: (HEAD) 14 sec VS (with patch) 16 sec
>>>>> autovacuum_max_workers 10: (HEAD) 16 sec VS (with patch) 14 sec
>>>>>
>>>>> tables:5000
>>>>> autovacuum_max_workers 1: (HEAD) 69 sec VS (with patch) 69 sec
>>>>> autovacuum_max_workers 2: (HEAD) 66 sec VS (with patch) 47 sec
>>>>> autovacuum_max_workers 3: (HEAD) 59 sec VS (with patch) 37 sec
>>>>> autovacuum_max_workers 5: (HEAD) 39 sec VS (with patch) 28 sec
>>>>> autovacuum_max_workers 10: (HEAD) 39 sec VS (with patch) 29 sec
>>>>>
>>>>> tables:10000
>>>>> autovacuum_max_workers 1: (HEAD) 139 sec VS (with patch) 138 sec
>>>>> autovacuum_max_workers 2: (HEAD) 130 sec VS (with patch) 86 sec
>>>>> autovacuum_max_workers 3: (HEAD) 120 sec VS (with patch) 68 sec
>>>>> autovacuum_max_workers 5: (HEAD) 96 sec VS (with patch) 41 sec
>>>>> autovacuum_max_workers 10: (HEAD) 90 sec VS (with patch) 39 sec
>>>>>
>>>>> tables:20000
>>>>> autovacuum_max_workers 1: (HEAD) 313 sec VS (with patch) 331 sec
>>>>> autovacuum_max_workers 2: (HEAD) 209 sec VS (with patch) 201 sec
>>>>> autovacuum_max_workers 3: (HEAD) 227 sec VS (with patch) 141 sec
>>>>> autovacuum_max_workers 5: (HEAD) 236 sec VS (with patch) 88 sec
>>>>> autovacuum_max_workers 10: (HEAD) 309 sec VS (with patch) 74 sec
>>>>> ===========================================================================
>>>>>
>>>>> The cases without patch, the scalability of the worker has decreased
>>>>> as the number of tables has increased.
>>>>> In fact, the more workers there are, the longer it takes to complete
>>>>> VACUUM to all tables.
>>>>> The cases with patch, it shows good scalability with respect to the
>>>>> number of workers.
>>>>
>>>> It seems a good performance improvement even without the patch of
>>>> shared memory based stats collector.

Sounds great!

>>>>
>>>>>
>>>>> Note that perf top results showed that hash_search_with_hash_value,
>>>>> hash_seq_search and
>>>>> pgstat_read_statsfiles are dominant during VACUUM in all patterns,
>>>>> with or without the patch.
>>>>>
>>>>> Therefore, there is still a need to find ways to optimize the reading
>>>>> of large amounts of stats.
>>>>> However, this patch is effective in its own right, and since there are
>>>>> only a few parts to modify,
>>>>> I think it should be able to be applied to current (preferably
>>>>> pre-v13) PostgreSQL.
>>>>
>>>> +1
>>>>
>>>> +
>>>> + /* We might be better to refresh stats */
>>>> + use_existing_stats = false;
>>>> }
>>>> + else
>>>> + {
>>>>
>>>> - heap_freetuple(classTup);
>>>> + heap_freetuple(classTup);
>>>> + /* The relid has already vacuumed, so we might be better to
>>>> use exiting stats */
>>>> + use_existing_stats = true;
>>>> + }
>>>>
>>>> With that patch, the autovacuum process refreshes the stats in the
>>>> next check if it finds out that the table still needs to be vacuumed.
>>>> But I guess it's not necessarily true because the next table might be
>>>> vacuumed already. So I think we might want to always use the existing
>>>> for the first check. What do you think?
>>> Thanks for your comment.
>>>
>>> If we assume the case where some workers vacuum on large tables
>>> and a single worker vacuum on small tables, the processing
>>> performance of the single worker will be slightly lower if the
>>> existing statistics are checked every time.
>>>
>>> In fact, at first I tried to check the existing stats every time,
>>> but the performance was slightly worse in cases with a small number of workers.

Do you have this benchmark result?

>>> (Checking the existing stats is lightweight , but at high frequency,
>>> it affects processing performance.)
>>> Therefore, at after refresh statistics, determine whether autovac
>>> should use the existing statistics.
>>
>> Yeah, since the test you used uses a lot of small tables, if there are
>> a few workers, checking the existing stats is unlikely to return true
>> (no need to vacuum). So the cost of existing stats check ends up being
>> overhead. Not sure how slow always checking the existing stats was,
>> but given that the shared memory based stats collector patch could
>> improve the performance of refreshing stats, it might be better not to
>> check the existing stats frequently like the patch does. Anyway, I
>> think it’s better to evaluate the performance improvement with other
>> cases too.
> Yeah, I would like to see how much the performance changes in other cases.
> In addition, if the shared-based-stats patch is applied, we won't need to reload
> a huge stats file, so we will just have to check the stats on
> shared-mem every time.
> Perhaps the logic of table_recheck_autovac could be simpler.
>
>>> BTW, I found some typos in comments, so attache a fixed version.

The patch adds some duplicated codes into table_recheck_autovac().
It's better to make the common function performing them and make
table_recheck_autovac() call that common function, to simplify the code.

+ /*
+ * Get the applicable reloptions. If it is a TOAST table, try to get the
+ * main table reloptions if the toast table itself doesn't have.
+ */
+ avopts = extract_autovac_opts(classTup, pg_class_desc);
+ if (classForm->relkind == RELKIND_TOASTVALUE &&
+ avopts == NULL && table_toast_map != NULL)
+ {
+ av_relation *hentry;
+ bool found;
+
+ hentry = hash_search(table_toast_map, &relid, HASH_FIND, &found);
+ if (found && hentry->ar_hasrelopts)
+ avopts = &hentry->ar_reloptions;
+ }

The above is performed both when using the existing stats and
also when the stats are refreshed. But it's actually required
only at once?

- heap_freetuple(classTup);
+ heap_freetuple(classTup);

With the patch, heap_freetuple() is not called when either doanalyze
or dovacuum is true. But it should be called even in that case,
like it is originally?

>>
>> Thank you for updating the patch! I'll also run the performance test
>> you shared with the latest version patch.

> Thank you!
> It's very helpful.

Agreed.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Re: autovac issue with large number of tables at 2020-11-26 01:41:03 from Kasahara Tatsuhito

Responses

Re: autovac issue with large number of tables at 2020-11-27 08:21:39 from Masahiko Sawada
Re: autovac issue with large number of tables at 2020-11-27 09:38:45 from Kasahara Tatsuhito

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Matthias van de Meent	2020-11-26 16:45:21	Re: [patch] CLUSTER blocks scanned progress reporting
Previous Message	Paul Förster	2020-11-26 16:32:23	Re: configure and DocBook XML