Re: autovac issue with large number of tables

From: Kasahara Tatsuhito <kasahara(dot)tatsuhito(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <nasbyj(at)amazon(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovac issue with large number of tables
Date: 2020-11-29 13:34:16
Message-ID: CAP0=ZVLzskPO3h5L03kAwCC4Mfwd5d+HMR8H+CFeCLnwNZzysQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, Thanks for you comments.

On Fri, Nov 27, 2020 at 9:51 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
>
>
> On 2020/11/27 18:38, Kasahara Tatsuhito wrote:
> > Hi,
> >
> > On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> >>
> >>
> >>
> >> On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> >>> On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >>>>
> >>>> On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
> >>>> <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >>>>>>
> >>>>>> On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
> >>>>>> <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> >>>>>>> <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
> >>>>>>>>> I wonder if we could have table_recheck_autovac do two probes of the stats
> >>>>>>>>> data. First probe the existing stats data, and if it shows the table to
> >>>>>>>>> be already vacuumed, return immediately. If not, *then* force a stats
> >>>>>>>>> re-read, and check a second time.
> >>>>>>>> Does the above mean that the second and subsequent table_recheck_autovac()
> >>>>>>>> will be improved to first check using the previous refreshed statistics?
> >>>>>>>> I think that certainly works.
> >>>>>>>>
> >>>>>>>> If that's correct, I'll try to create a patch for the PoC
> >>>>>>>
> >>>>>>> I still don't know how to reproduce Jim's troubles, but I was able to reproduce
> >>>>>>> what was probably a very similar problem.
> >>>>>>>
> >>>>>>> This problem seems to be more likely to occur in cases where you have
> >>>>>>> a large number of tables,
> >>>>>>> i.e., a large amount of stats, and many small tables need VACUUM at
> >>>>>>> the same time.
> >>>>>>>
> >>>>>>> So I followed Tom's advice and created a patch for the PoC.
> >>>>>>> This patch will enable a flag in the table_recheck_autovac function to use
> >>>>>>> the existing stats next time if VACUUM (or ANALYZE) has already been done
> >>>>>>> by another worker on the check after the stats have been updated.
> >>>>>>> If the tables continue to require VACUUM after the refresh, then a refresh
> >>>>>>> will be required instead of using the existing statistics.
> >>>>>>>
> >>>>>>> I did simple test with HEAD and HEAD + this PoC patch.
> >>>>>>> The tests were conducted in two cases.
> >>>>>>> (I changed few configurations. see attached scripts)
> >>>>>>>
> >>>>>>> 1. Normal VACUUM case
> >>>>>>> - SET autovacuum = off
> >>>>>>> - CREATE tables with 100 rows
> >>>>>>> - DELETE 90 rows for each tables
> >>>>>>> - SET autovacuum = on and restart PostgreSQL
> >>>>>>> - Measure the time it takes for all tables to be VACUUMed
> >>>>>>>
> >>>>>>> 2. Anti wrap round VACUUM case
> >>>>>>> - CREATE brank tables
> >>>>>>> - SELECT all of these tables (for generate stats)
> >>>>>>> - SET autovacuum_freeze_max_age to low values and restart PostgreSQL
> >>>>>>> - Consumes a lot of XIDs by using txid_curent()
> >>>>>>> - Measure the time it takes for all tables to be VACUUMed
> >>>>>>>
> >>>>>>> For each test case, the following results were obtained by changing
> >>>>>>> autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> >>>>>>> Also changing num of tables to 1000, 5000, 10000 and 20000.
> >>>>>>>
> >>>>>>> Due to the poor VM environment (2 VCPU/4 GB), the results are a little unstable,
> >>>>>>> but I think it's enough to ask for a trend.
> >>>>>>>
> >>>>>>> ===========================================================================
> >>>>>>> [1.Normal VACUUM case]
> >>>>>>> tables:1000
> >>>>>>> autovacuum_max_workers 1: (HEAD) 20 sec VS (with patch) 20 sec
> >>>>>>> autovacuum_max_workers 2: (HEAD) 18 sec VS (with patch) 16 sec
> >>>>>>> autovacuum_max_workers 3: (HEAD) 18 sec VS (with patch) 16 sec
> >>>>>>> autovacuum_max_workers 5: (HEAD) 19 sec VS (with patch) 17 sec
> >>>>>>> autovacuum_max_workers 10: (HEAD) 19 sec VS (with patch) 17 sec
> >>>>>>>
> >>>>>>> tables:5000
> >>>>>>> autovacuum_max_workers 1: (HEAD) 77 sec VS (with patch) 78 sec
> >>>>>>> autovacuum_max_workers 2: (HEAD) 61 sec VS (with patch) 43 sec
> >>>>>>> autovacuum_max_workers 3: (HEAD) 38 sec VS (with patch) 38 sec
> >>>>>>> autovacuum_max_workers 5: (HEAD) 45 sec VS (with patch) 37 sec
> >>>>>>> autovacuum_max_workers 10: (HEAD) 43 sec VS (with patch) 35 sec
> >>>>>>>
> >>>>>>> tables:10000
> >>>>>>> autovacuum_max_workers 1: (HEAD) 152 sec VS (with patch) 153 sec
> >>>>>>> autovacuum_max_workers 2: (HEAD) 119 sec VS (with patch) 98 sec
> >>>>>>> autovacuum_max_workers 3: (HEAD) 87 sec VS (with patch) 78 sec
> >>>>>>> autovacuum_max_workers 5: (HEAD) 100 sec VS (with patch) 66 sec
> >>>>>>> autovacuum_max_workers 10: (HEAD) 97 sec VS (with patch) 56 sec
> >>>>>>>
> >>>>>>> tables:20000
> >>>>>>> autovacuum_max_workers 1: (HEAD) 338 sec VS (with patch) 339 sec
> >>>>>>> autovacuum_max_workers 2: (HEAD) 231 sec VS (with patch) 229 sec
> >>>>>>> autovacuum_max_workers 3: (HEAD) 220 sec VS (with patch) 191 sec
> >>>>>>> autovacuum_max_workers 5: (HEAD) 234 sec VS (with patch) 147 sec
> >>>>>>> autovacuum_max_workers 10: (HEAD) 320 sec VS (with patch) 113 sec
> >>>>>>>
> >>>>>>> [2.Anti wrap round VACUUM case]
> >>>>>>> tables:1000
> >>>>>>> autovacuum_max_workers 1: (HEAD) 19 sec VS (with patch) 18 sec
> >>>>>>> autovacuum_max_workers 2: (HEAD) 14 sec VS (with patch) 15 sec
> >>>>>>> autovacuum_max_workers 3: (HEAD) 14 sec VS (with patch) 14 sec
> >>>>>>> autovacuum_max_workers 5: (HEAD) 14 sec VS (with patch) 16 sec
> >>>>>>> autovacuum_max_workers 10: (HEAD) 16 sec VS (with patch) 14 sec
> >>>>>>>
> >>>>>>> tables:5000
> >>>>>>> autovacuum_max_workers 1: (HEAD) 69 sec VS (with patch) 69 sec
> >>>>>>> autovacuum_max_workers 2: (HEAD) 66 sec VS (with patch) 47 sec
> >>>>>>> autovacuum_max_workers 3: (HEAD) 59 sec VS (with patch) 37 sec
> >>>>>>> autovacuum_max_workers 5: (HEAD) 39 sec VS (with patch) 28 sec
> >>>>>>> autovacuum_max_workers 10: (HEAD) 39 sec VS (with patch) 29 sec
> >>>>>>>
> >>>>>>> tables:10000
> >>>>>>> autovacuum_max_workers 1: (HEAD) 139 sec VS (with patch) 138 sec
> >>>>>>> autovacuum_max_workers 2: (HEAD) 130 sec VS (with patch) 86 sec
> >>>>>>> autovacuum_max_workers 3: (HEAD) 120 sec VS (with patch) 68 sec
> >>>>>>> autovacuum_max_workers 5: (HEAD) 96 sec VS (with patch) 41 sec
> >>>>>>> autovacuum_max_workers 10: (HEAD) 90 sec VS (with patch) 39 sec
> >>>>>>>
> >>>>>>> tables:20000
> >>>>>>> autovacuum_max_workers 1: (HEAD) 313 sec VS (with patch) 331 sec
> >>>>>>> autovacuum_max_workers 2: (HEAD) 209 sec VS (with patch) 201 sec
> >>>>>>> autovacuum_max_workers 3: (HEAD) 227 sec VS (with patch) 141 sec
> >>>>>>> autovacuum_max_workers 5: (HEAD) 236 sec VS (with patch) 88 sec
> >>>>>>> autovacuum_max_workers 10: (HEAD) 309 sec VS (with patch) 74 sec
> >>>>>>> ===========================================================================
> >>>>>>>
> >>>>>>> The cases without patch, the scalability of the worker has decreased
> >>>>>>> as the number of tables has increased.
> >>>>>>> In fact, the more workers there are, the longer it takes to complete
> >>>>>>> VACUUM to all tables.
> >>>>>>> The cases with patch, it shows good scalability with respect to the
> >>>>>>> number of workers.
> >>>>>>
> >>>>>> It seems a good performance improvement even without the patch of
> >>>>>> shared memory based stats collector.
> >>
> >> Sounds great!
> >>
> >>
> >>>>>>
> >>>>>>>
> >>>>>>> Note that perf top results showed that hash_search_with_hash_value,
> >>>>>>> hash_seq_search and
> >>>>>>> pgstat_read_statsfiles are dominant during VACUUM in all patterns,
> >>>>>>> with or without the patch.
> >>>>>>>
> >>>>>>> Therefore, there is still a need to find ways to optimize the reading
> >>>>>>> of large amounts of stats.
> >>>>>>> However, this patch is effective in its own right, and since there are
> >>>>>>> only a few parts to modify,
> >>>>>>> I think it should be able to be applied to current (preferably
> >>>>>>> pre-v13) PostgreSQL.
> >>>>>>
> >>>>>> +1
> >>>>>>
> >>>>>> +
> >>>>>> + /* We might be better to refresh stats */
> >>>>>> + use_existing_stats = false;
> >>>>>> }
> >>>>>> + else
> >>>>>> + {
> >>>>>>
> >>>>>> - heap_freetuple(classTup);
> >>>>>> + heap_freetuple(classTup);
> >>>>>> + /* The relid has already vacuumed, so we might be better to
> >>>>>> use exiting stats */
> >>>>>> + use_existing_stats = true;
> >>>>>> + }
> >>>>>>
> >>>>>> With that patch, the autovacuum process refreshes the stats in the
> >>>>>> next check if it finds out that the table still needs to be vacuumed.
> >>>>>> But I guess it's not necessarily true because the next table might be
> >>>>>> vacuumed already. So I think we might want to always use the existing
> >>>>>> for the first check. What do you think?
> >>>>> Thanks for your comment.
> >>>>>
> >>>>> If we assume the case where some workers vacuum on large tables
> >>>>> and a single worker vacuum on small tables, the processing
> >>>>> performance of the single worker will be slightly lower if the
> >>>>> existing statistics are checked every time.
> >>>>>
> >>>>> In fact, at first I tried to check the existing stats every time,
> >>>>> but the performance was slightly worse in cases with a small number of workers.
> >>
> >> Do you have this benchmark result?
> >>
> >>
> >>>>> (Checking the existing stats is lightweight , but at high frequency,
> >>>>> it affects processing performance.)
> >>>>> Therefore, at after refresh statistics, determine whether autovac
> >>>>> should use the existing statistics.
> >>>>
> >>>> Yeah, since the test you used uses a lot of small tables, if there are
> >>>> a few workers, checking the existing stats is unlikely to return true
> >>>> (no need to vacuum). So the cost of existing stats check ends up being
> >>>> overhead. Not sure how slow always checking the existing stats was,
> >>>> but given that the shared memory based stats collector patch could
> >>>> improve the performance of refreshing stats, it might be better not to
> >>>> check the existing stats frequently like the patch does. Anyway, I
> >>>> think it’s better to evaluate the performance improvement with other
> >>>> cases too.
> >>> Yeah, I would like to see how much the performance changes in other cases.
> >>> In addition, if the shared-based-stats patch is applied, we won't need to reload
> >>> a huge stats file, so we will just have to check the stats on
> >>> shared-mem every time.
> >>> Perhaps the logic of table_recheck_autovac could be simpler.
> >>>
> >>>>> BTW, I found some typos in comments, so attache a fixed version.
> >>
> >> The patch adds some duplicated codes into table_recheck_autovac().
> >> It's better to make the common function performing them and make
> >> table_recheck_autovac() call that common function, to simplify the code.
> > Thanks for your comment.
> > Hmm.. I've cut out the duplicate part.
> > Attach the patch.
> > Could you confirm that it fits your expecting?
>
> Yes, thanks for updataing the patch! Here are another review comments.
>
> + shared = pgstat_fetch_stat_dbentry(InvalidOid);
> + dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
>
> When using the existing stats, ISTM that these are not necessary and
> we can reuse "shared" and "dbentry" obtained before. Right?
Yeah, but unless autovac_refresh_stats() is called, these functions
read the information from the
local hash table without re-read the stats file, so the process is very light.
Therefore, I think, it is better to keep the current logic to keep the
code simple.

>
> + /* We might be better to refresh stats */
> + use_existing_stats = false;
>
> I think that we should add more comments about why it's better to
> refresh the stats in this case.
>
> + /* The relid has already vacuumed, so we might be better to use existing stats */
> + use_existing_stats = true;
>
> I think that we should add more comments about why it's better to
> reuse the stats in this case.
I added comments.

Attache the patch.

Best regards,

>
> Regards,
>
>
> --
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION

--
Tatsuhito Kasahara
kasahara.tatsuhito _at_ gmail.com

Attachment Content-Type Size
v4_mod_table_recheck_autovac.patch application/octet-stream 5.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2020-11-29 14:09:48 Re: proposal: possibility to read dumped table's name from file
Previous Message Michael Paquier 2020-11-29 07:52:59 Re: Printing LSN made easy