Re: Berserk Autovacuum (let's save next Mandrill)

From: James Coleman <jtc331(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Date: 2020-03-18 01:58:53
Message-ID: CAAaqYe855Zva7N4SNtDZUuwPBBgoiwRUEm93pF9JkrjPrhQt6g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 17, 2020 at 9:03 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2020-03-17 20:42:07 +0100, Laurenz Albe wrote:
> > > I think Andres was thinking this would maybe be an optimization independent of
> > > is_insert_only (?)
> >
> > I wasn't sure.
>
> I'm not sure myself - but I'm doubtful that using a 0 min age by default
> will be ok.
>
> I was trying to say (in a later email) that I think it might be a good
> compromise to opportunistically freeze if we're dirtying the page
> anyway, but not optimize WAL emission etc. That's a pretty simple
> change, and it'd address a lot of the potential performance regressions,
> while still freezing for the "first" vacuum in insert only workloads.

If we have truly insert-only tables, then doesn't freezing every tuple
during vacuum actually decrease total vacuum cost (perhaps
significantly), since otherwise every vacuum has to keep scanning the
heap for dead tuples on pages where we know there are none? Those
pages could conceptually be frozen and ignored, but are not frozen
because of the default behavior, correct?
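
To make the arithmetic behind that question concrete, here is a toy
model. It is only a sketch of the premise as stated above (every
unfrozen page is rescanned by every vacuum, frozen pages are skipped).
It is not how PostgreSQL accounts for vacuum cost; it deliberately
ignores the visibility map, WAL volume, and the cost of freezing itself.
The function name and parameters are hypothetical.

```python
def pages_scanned(num_vacuums, new_pages_per_cycle, freeze_on_first_vacuum):
    """Total heap pages read across num_vacuums vacuums of an insert-only table.

    Assumption (from the premise above, not from PostgreSQL internals):
    a vacuum reads every page that is not yet frozen and skips frozen ones.
    """
    total_scanned = 0
    unfrozen_pages = 0
    for _ in range(num_vacuums):
        unfrozen_pages += new_pages_per_cycle  # pages filled since the last vacuum
        total_scanned += unfrozen_pages        # every unfrozen page is read
        if freeze_on_first_vacuum:
            unfrozen_pages = 0                 # frozen now, skipped by later vacuums
    return total_scanned


lazy = pages_scanned(10, 1000, freeze_on_first_vacuum=False)
eager = pages_scanned(10, 1000, freeze_on_first_vacuum=True)
print(lazy, eager)
```

Under these assumptions the work without freezing grows quadratically
with the number of vacuums, while with freezing on the first pass it
grows linearly.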

We have tables that log each change to a business object (as I suspect
many transactional workloads do), and I've often thought that
immediately freezing every page as soon as it fills up would be a real
win for us.

If that's all true, it seems to me that removing that part of the
patch significantly lowers its value.

If we opportunistically freeze only if we're already dirtying a page,
would that help a truly insert-only workload? E.g., are there hint
bits on the page that would need to change the first time we vacuum a
full page with no dead tuples? I would have assumed the answer was
"no" (since if so, I think it would follow that _all_ pages need to be
updated the first time they're vacuumed?). But if that's the case,
then this kind of opportunistic freezing wouldn't help this kind of
workload. Maybe there's something I'm misunderstanding about how
vacuum works though.

Thanks,
James
