Re: Berserk Autovacuum (let's save next Mandrill)

From: Andres Freund <andres(at)anarazel(dot)de>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Date: 2020-03-18 17:08:47
Message-ID: 20200318170847.kxcxdascusttvtvt@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-03-17 21:58:53 -0400, James Coleman wrote:
> On Tue, Mar 17, 2020 at 9:03 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > Hi,
> >
> > On 2020-03-17 20:42:07 +0100, Laurenz Albe wrote:
> > > > I think Andres was thinking this would maybe be an optimization independent of
> > > > is_insert_only (?)
> > >
> > > I wasn't sure.
> >
> > I'm not sure myself - but I'm doubtful that using a 0 min age by default
> > will be ok.
> >
> > I was trying to say (in a later email) that I think it might be a good
> > compromise to opportunistically freeze if we're dirtying the page
> > anyway, but not optimize WAL emission etc. That's a pretty simple
> > change, and it'd address a lot of the potential performance regressions,
> > while still freezing for the "first" vacuum in insert only workloads.
>
> If we have truly insert-only tables, then doesn't vacuuming with
> freezing every tuple actually decrease total vacuum cost (perhaps
> significantly) since otherwise every vacuum keeps having to scan the
> heap for dead tuples on pages where we know there are none? Those
> pages could conceptually be frozen and ignored, but are not frozen
> because of the default behavior, correct?

Yes.

> If that's all true, it seems to me that removing that part of the
> patch significantly lowers its value.

Well, perfect sometimes is the enemy of the good. We gotta get something
in, and having some automated vacuuming for insert-mostly/only tables is
a huge step forward. And avoiding regressions is an important part of
doing so.

I outlined the steps we could take to allow for more aggressive
vacuuming upthread.

> If we opportunistically freeze only if we're already dirtying a page,
> would that help a truly insert-only workload?

Yes.

> E.g., are there hint bits on the page that would need to change the
> first time we vacuum a full page with no dead tuples?

Yes. HEAP_XMIN_COMMITTED.

> I would have assumed the answer was "no" (since if so I think it would
> follow that _all_ pages need to be updated the first time they're
> vacuumed?).

That is the case. Although they might already be set when the tuples are
accessed for other reasons.
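
[Editor's illustration, not part of the original message.] The exchange above can be read as a simple rule: the first vacuum of a freshly inserted page has to set HEAP_XMIN_COMMITTED hint bits, which dirties the page, so freezing the tuples in the same pass adds no extra page write. The C sketch below is a toy model of that rule under stated assumptions. Only the three flag values are taken from PostgreSQL's htup_details.h; ToyTuple, ToyPage, and every toy_* function are hypothetical names invented here, and the real vacuum code handles many cases (aborted inserters, xmax, WAL logging, freeze cutoffs) that this ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined in PostgreSQL's htup_details.h. */
#define HEAP_XMIN_COMMITTED 0x0100
#define HEAP_XMIN_INVALID   0x0200
#define HEAP_XMIN_FROZEN    (HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID)

/* Everything below is a hypothetical toy model, not PostgreSQL code. */
#define TOY_TUPLES_PER_PAGE 4

typedef struct ToyTuple
{
    uint16_t infomask;             /* hint/freeze bits for xmin */
    bool     xmin_known_committed; /* stands in for a commit-log lookup */
} ToyTuple;

typedef struct ToyPage
{
    ToyTuple tuples[TOY_TUPLES_PER_PAGE];
    bool     dirty;                /* page must be written back */
} ToyPage;

/* Set HEAP_XMIN_COMMITTED where it is missing; any change dirties the page. */
static int
toy_set_hint_bits(ToyPage *page)
{
    int changed = 0;

    assert(page != NULL);
    for (size_t i = 0; i < TOY_TUPLES_PER_PAGE; i++)
    {
        ToyTuple *tup = &page->tuples[i];

        if (tup->xmin_known_committed &&
            (tup->infomask & HEAP_XMIN_COMMITTED) == 0)
        {
            tup->infomask |= HEAP_XMIN_COMMITTED;
            changed++;
        }
    }
    if (changed > 0)
        page->dirty = true;
    return changed;
}

/* Freeze committed tuples, but only if the page is already dirty. */
static int
toy_opportunistic_freeze(ToyPage *page)
{
    int frozen = 0;

    assert(page != NULL);
    if (!page->dirty)
        return 0;               /* do not dirty a clean page just to freeze */

    for (size_t i = 0; i < TOY_TUPLES_PER_PAGE; i++)
    {
        ToyTuple *tup = &page->tuples[i];

        /* committed but not yet frozen */
        if ((tup->infomask & HEAP_XMIN_FROZEN) == HEAP_XMIN_COMMITTED)
        {
            tup->infomask |= HEAP_XMIN_FROZEN;
            frozen++;
        }
    }
    return frozen;
}

/* One toy "vacuum" pass over a page; returns the number of tuples frozen. */
static int
toy_vacuum_page(ToyPage *page)
{
    toy_set_hint_bits(page);
    return toy_opportunistic_freeze(page);
}
```

In this model, a page whose tuples have never had hint bits set gets all of its tuples frozen by toy_vacuum_page(), while a page whose hint bits were already set by earlier reads stays clean and unfrozen. That second case corresponds to the caveat above about hint bits being set when tuples are accessed for other reasons.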

Greetings,

Andres Freund
