Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: David Steele <david(at)pgmasters(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
Date: 2018-03-20 20:14:04
Message-ID: 3b3d8eac-c709-0d25-088e-b98339a1b28a@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

So here is an updated version of the patch/fix, addressing the remaining
issues in v3 posted by Tom in November.

The 0001 part is actually a bugfix in bloom and spgist index AM, which
did something like this:

reltuples = IndexBuildHeapScan(...)

result->heap_tuples = result->index_tuples = reltuples;

That is, these two AMs simply used the number of heap rows for the
index. That does not work for partial indexes, of course, where the
correct index reltuples value is likely much lower.

0001 fixes this by tracking the number of actually indexed rows in the
build states, just like in the other index AMs.

A VACUUM or ANALYZE will fix the estimate, of course, but for tables
that are not changing very much it may take quite a while. So I think
this is something we definitely need to back-patch.

The 0002 part is the main part, unifying the definition of reltuples on
three main places:

a) acquire_sample_rows (ANALYZE)
b) lazy_scan_heap (VACUUM)
c) IndexBuildHeapRangeScan (CREATE INDEX)

As the ANALYZE case seems the most constrained, the other two places
were updated to use the same criteria for which rows to include in the
reltuples estimate:

* HEAPTUPLE_LIVE
* HEAPTUPLE_INSERT_IN_PROGRESS (same transaction)
* HEAPTUPLE_DELETE_IN_PROGRESS (not the same trasaction)

This resolves the issue with oscillating reltuples estimates, produced
by VACUUM and ANALYZE (with many non-removable dead tuples).

I've checked all IndexBuildHeapRangeScan callers, and none of them is
using the reltuples estimate for anything except for passing it to
index_update_stats. Aside from the bug fixed in 0001, of course.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
0001-Compute-index_tuples-correctly-in-bloom-and-spgist.patch.gz application/gzip 1.4 KB
0002-Unify-the-definition-of-reltuples-in-VACUUM-ANALYZE-.patch.gz application/gzip 4.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-03-20 20:28:48 Re: configure's checks for --enable-tap-tests are insufficient
Previous Message Christoph Berg 2018-03-20 20:00:41 Re: configure's checks for --enable-tap-tests are insufficient