Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

From: David Gould <daveg(at)sonic(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alina Alexeeva <alexeeva(at)adobe(dot)com>, Ullas Lakkur Raghavendra <lakkurra(at)adobe(dot)com>
Subject: Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
Date: 2018-03-14 01:10:14
Message-ID: 20180313181014.47d9f040@engels
Lists: pgsql-hackers

On Tue, 13 Mar 2018 11:29:03 -0400
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> David Gould <daveg(at)sonic(dot)net> writes:
> > I have attached the patch we are currently using. It applies to 9.6.8. I
> > have versions for older releases in 9.4, 9.5, 9.6. It fails to apply to 10,
> > and presumably HEAD, but I can update it if there is any interest.
>
> > The patch has three main features:
> > - Impose an ordering on the autovacuum workers' worklist to avoid
> > the need for rechecking statistics to skip already vacuumed tables.
> > - Reduce the frequency of statistics refreshes
> > - Remove the AutovacuumScheduleLock
>
> As per the earlier thread, the first aspect of that needs more work to
> not get stuck when the worklist has long tasks near the end. I don't
> think you're going to get away with ignoring that concern.

I agree that the concern needs to be addressed. The other concern is that
sorting by oid is fairly arbitrary; the essential part is that there is
some determinate order.

> Perhaps we could sort the worklist by decreasing table size? That's not
> an infallible guide to the amount of time that a worker will need to
> spend, but it's sure safer than sorting by OID.

That is better. I'll modify it to also prioritize tables that have both
relpages = 0 and reltuples = 0, which usually means the table has no stats at
all. Maybe use oid to break ties.
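
To make the proposed ordering concrete, here is a rough sketch of a qsort
comparator, not code from the patch. The WorklistEntry struct and all function
names are hypothetical, and relpages stands in for "table size" as an
assumption. Tables with no stats come first, then larger tables before smaller
ones, with oid as the tie-breaker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical worklist entry; NOT the struct autovacuum actually uses. */
typedef struct WorklistEntry
{
    unsigned int oid;        /* table oid, used only to break ties */
    unsigned int relpages;   /* pg_class.relpages, a proxy for table size */
    double       reltuples;  /* pg_class.reltuples */
} WorklistEntry;

/* Both counters at zero usually means the table has no stats at all. */
bool
entry_has_no_stats(const WorklistEntry *e)
{
    return e->relpages == 0 && e->reltuples == 0;
}

/*
 * Order: tables without stats first, then by decreasing relpages,
 * then by increasing oid so the order is fully determinate.
 */
int
worklist_cmp(const void *a, const void *b)
{
    const WorklistEntry *x = (const WorklistEntry *) a;
    const WorklistEntry *y = (const WorklistEntry *) b;
    bool        x_nostats = entry_has_no_stats(x);
    bool        y_nostats = entry_has_no_stats(y);

    if (x_nostats != y_nostats)
        return x_nostats ? -1 : 1;
    if (x->relpages != y->relpages)
        return (x->relpages > y->relpages) ? -1 : 1;
    if (x->oid != y->oid)
        return (x->oid < y->oid) ? -1 : 1;
    return 0;
}

/* Sort the worklist in place using the ordering above. */
void
sort_worklist(WorklistEntry *items, size_t n)
{
    assert(items != NULL || n == 0);
    if (n > 1)
        qsort(items, n, sizeof(WorklistEntry), worklist_cmp);
}
```

Because the oid tie-break makes the comparator a total order, every worker
that sorts the same set of entries ends up with the same sequence, which is
the property the ordering is meant to provide.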

> Alternatively, if we decrease the frequency of stats refreshes, how
> much do we even need to worry about reordering the worklist?

The stats refresh in the current scheme is needed to make sure that two
different workers don't vacuum the same table in close succession. It
doesn't actually work, and it costs the earth. The patch imposes an ordering
to prevent workers from trying to claim recently vacuumed tables. This removes
the need for the stats refresh.

> In any case, I doubt anyone will have any appetite for back-patching
> such a change. I'd recommend that you clean up your patch and rebase
> to HEAD, then submit it into the September commitfest (either on a
> new thread or a continuation of the old #13750 thread, not this one).

I had in mind to make a more comprehensive patch to try to make autovacuum
more responsive when there are lots of tables, but was a bit shy of the
reception. I'll try again with this one (in a new thread) based on the
suggestions. Thanks!

-dg

--
David Gould daveg(at)sonic(dot)net
If simplicity worked, the world would be overrun with insects.
