Re: Berserk Autovacuum (let's save next Mandrill)

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Date: 2019-03-27 23:36:24
Message-ID: CAKJS1f86--kkm7m2ZJDWZsJ3mVfxTKTnXJ8z70XrDwL0to8HGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 28 Mar 2019 at 11:01, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>
> On 2019-Mar-28, Darafei "Komяpa" Praliaskouski wrote:
> > "Nearing wraparound" is too late already. In Amazon, reading table from gp2
> > after you exhausted your IOPS burst budget is like reading a floppy drive,
> > you have to freeze a lot earlier than you hit several terabytes of unfrozen
> > data, or you're dead like Mandrill's Search and Url tables from the link I
> > shared.
>
> OK, then start freezing tuples in the cheap mode (skip index updates)
> earlier than that. I suppose a good question is when to start.

I thought recently that it would be good to have some sort of
pro-active auto-vacuum mode that made use of idle workers. Probably
there would need to be some mode flag that mentioned which workers
were in proactive mode so that these could be cancelled when more
pressing work came in. I don't have an idea exactly of what
"pro-active" would actually be defined as, but I know that when the
single transaction ID is consumed that causes terra bytes of tables to
suddenly need an anti-wraparound vacuum, then it's not a good
situation to be in. Perhaps getting to some percentage of
autovacuum_freeze_max_age could be classed as pro-active.

> I wonder if Mandrill's problem is related to Mailchimp raising the
> freeze_max_age to a point where autovac did not have enough time to
> react with an emergency vacuum. If you keep raising that value because
> the vacuums cause problems for you (they block DDL), there's something
> wrong.

I have seen some very high autovacuum_freeze_max_age settings
recently. It would be interesting to know what they had theirs set to.
I see they mentioned "Search and Url tables". I can imagine "search"
never needs any UPDATEs, so quite possibly those were append-only, in
which case the anti-wraparound vacuum would have had quite a lot of
work on its hands since possibly every page needed frozen. A table
receiving regular auto-vacuums from dead tuples would likely get some
pages frozen during those.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-03-27 23:49:02 Re: Should the docs have a warning about pg_stat_reset()?
Previous Message Chapman Flack 2019-03-27 23:27:23 Re: Fix XML handling with DOCTYPE