Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: David Gould <daveg(at)sonic(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alina Alexeeva <alexeeva(at)adobe(dot)com>, Ullas Lakkur Raghavendra <lakkurra(at)adobe(dot)com>
Subject: Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
Date: 2018-03-12 22:05:01
Message-ID: 27818.1520892301@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> Re-reading that thread, it seems like we should have applied Jeff's
> initial trivial patch[1] (to not hold AutovacuumScheduleLock across
> table_recheck_autovac) rather than waiting around for a super duper
> improvement to get agreed on. I'm a bit tempted to go do that;
> if nothing else, it seems simple enough to back-patch, unlike most
> of the rest of what was discussed.

Jeff mentioned that that patch wasn't entirely trivial to rebase over
15739393e, and I now see why: in the code structure as it stands,
we don't know soon enough whether the table is shared. In the
attached rebase, I solved that with the brute-force method of just
reading the pg_class tuple an extra time. I think this is probably
good enough, really. I thought about having do_autovacuum pass down
the tuple to table_recheck_autovac so as to not increase the net
number of syscache fetches, but I'm slightly worried about whether
we could be passing a stale pg_class tuple to table_recheck_autovac
if we do it like that. OTOH, that just raises the question of why
we are doing any of this while holding no lock whatever on the target
table :-(. I'm content to leave that question to the major redesign
that was speculated about in the bug #13750 thread.

This also corrects the inconsistency that at the bottom, do_autovacuum
clears wi_tableoid while taking AutovacuumLock, not AutovacuumScheduleLock
as is the documented lock for that field. There's no actual bug there,
but cleaning this up might provide a slight improvement in concurrency
for operations that need AutovacuumLock but aren't looking at other
processes' wi_tableoid. (Alternatively, we could decide that there's
no real need anymore for the separate AutovacuumScheduleLock, but that's
more churn than I wanted to deal with here.)

I think this is OK to commit/backpatch --- any objections?

regards, tom lane

Attachment Content-Type Size
vac_move_lock-2.patch text/x-diff 5.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-03-12 22:20:23 explain with costs in subselect.sql
Previous Message Christos Maris 2018-03-12 21:34:10 Re: Google Summer of Code: Potential Applicant