Re: autovacuum not prioritising for-wraparound tables

From: Jim Nasby <jim(at)nasby(dot)net>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: autovacuum not prioritising for-wraparound tables
Date: 2013-01-30 21:05:52
Message-ID: 51098B30.9080208@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/25/13 11:56 AM, Christopher Browne wrote:
> With a little bit of noodling around, here's a thought for a joint function
> that I*think* has reasonably common scales:
>
> f(deadtuples, relpages, age) =
> deadtuples/relpages + e ^ (age*ln(relpages)/2^32)

Be careful with dead/relpages, because dead tuples increase relpages as
well. The effect is extremely noticeable on frequently hit tables that
need to be kept small. If you want to have a deadtuples/size metric, I
think it would be far better to do deadtuples/non_bloated_table_size.

Someone else in the thread mentioned that what we really need to be
watching aren't raw values, but trends. Or you can think of it as
watching first (or even second) derivatives if you like. I couldn't
agree more. I believe there are several parts of Postgres that end up
with a bunch of hard to tune GUCs specifically because we're measuring
the wrong things.

Take freezing for example. Since the only reason to freeze is XID wrap
then the *ideal* time to start a freeze vacuum on a table is so that the
vacuum would end *exactly* as we were about to hit XID wrap.

Obviously that's a completely impractical goal to hit, but notice the
simplicity of the goal: we only care about the vacuum ending right
before we'd hit XID wrap. The only way to do that is to monitor how fast
vacuums are running, how fast XIDs are being consumed, and how quickly
the oldest XID in each table is advancing. Notice that all of those
measurements are time derivatives.

From a more practical standpoint, I think it would be extremely useful
to have a metric that showed how quickly a table churned. Something like
dead tuples per time period. Comparing that to the non-bloated table
size should give a very strong indication of how critical frequent
vacuums on that table are.

I don't have a good metric in mind for freeze right now, but I do want
to mention a use case that I don't think has come up before. When
building a londiste slave (and presumably all the other trigger
replication systems suffer from this), each table is copied over in a
single transaction, and then updates start flowing in for that table.
That can easily result in a scenario where you have an enormous volume
of tuples that will all need freezing at almost exactly the same time.
It would be nice if we could detect such a condition and freeze those
tuples over time, instead of trying to freeze all of them in one shot.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2013-01-30 21:11:11 Re: Hm, table constraints aren't so unique as all that
Previous Message Jim Nasby 2013-01-30 20:12:16 Re: Hm, table constraints aren't so unique as all that