Re: Berserk Autovacuum (let's save next Mandrill)

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Andres Freund <andres(at)anarazel(dot)de>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Date: 2020-03-29 02:29:51
Message-ID: CAApHDvrX=Zs5sR9V2u390gyc=3CCmVJGLO8+An6NA5Fs8CtOSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 29 Mar 2020 at 10:30, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Sun, 29 Mar 2020 at 06:26, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > > On Sat, 28 Mar 2020 at 17:12, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
> > >> In the light of that, I have no objections.
> >
> > > Thank you. Pushed.
> >
> > It seems like this commit has resulted in some buildfarm instability:
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lousyjack&dt=2020-03-28%2006%3A33%3A02
> >
> > apparent change of plan
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2020-03-28%2009%3A20%3A05
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2020-03-28%2013%3A20%3A05
> >
> > unstable results in stats_ext test
>
> Yeah, thanks for pointing that out. I'm just doing some tests locally
> to see if I can recreate those results after vacuuming the mcv_list
> table, so far I'm unable to.

I'm considering pushing the attached to try to get some confirmation
that additional autovacuums are the issue. However, I'm not too sure
it's a wise idea to as I can trigger an additional auto-vacuum and
have these new tests fail with make installcheck after setting
autovacuum_naptime to 1s, but I'm not getting the other diffs
experienced by lousyjack and petalura. The patch may just cause more
failures without proving much, especially so with slower machines.

The other idea I had was just to change the
autovacuum_vacuum_insert_threshold relopt to -1 for the problem tables
and see if that stabilises things.

Yet another option would be to see if reltuples varies between runs
and ditch the autovacuum_count column from the attached. There does
not appear to be any part of the tests which would cause any dead
tuples in any of the affected relations, so I'm unsure why reltuples
would vary between what ANALYZE and VACUUM would set it to.

I'm still thinking. Input welcome.

David

Attachment Content-Type Size
temp_telemetry_checks_for_unstable_tests.patch application/octet-stream 5.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message James Coleman 2020-03-29 02:47:49 Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Previous Message Bruce Momjian 2020-03-29 01:44:02 Re: Improving connection scalability: GetSnapshotData()