Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

From: David Gould <daveg(at)sonic(dot)net>
To: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Alina Alexeeva <alexeeva(at)adobe(dot)com>, Ullas Lakkur Raghavendra <lakkurra(at)adobe(dot)com>
Subject: Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
Date: 2018-03-02 22:00:37
Message-ID: 20180302140037.00af305d@engels
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 2 Mar 2018 18:47:44 +0300
Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru> wrote:

> The calculation I made for the first step applies to the next steps too,
> with minor differences. So, the estimate increases at each step. Just
> out of interest, I plotted the reltuples for 60 steps, and it doesn't
> look like it's going to converge anytime soon (see attached).
> Looking at the formula, this overshoot term is created when we multiply
> the old density by the new number of pages. I'm not sure how to fix
> this. I think we could average the number of tuples, not the densities.
> The attached patch demonstrates what I mean.

I'm confused at this point, I provided a patch that addresses this and a
test case. We seem to be discussing everything as if we first noticed the
issue. Have you reviewed the patch and and attached analysis and tested it?
Please commment on that?

Thanks.

Also, here is a datapoint that I found just this morning on a clients
production system:

INFO: "staging_xyz": scanned 30000 of pages, containing 63592 live rows and 964346 dead rows;
30000 rows in sample, 1959918155 estimated total rows

# select (50000953.0/30000*63592)::int as nrows;
nrows
-----------
105988686

This tables reltuples is 18 times the actual row count. It will never converge
because with 50000953 pages analyze can only adjust reltuples by 0.0006 each time.

It will also almost never get vacuumed because the autovacuum threshold of
0.2 * 1959918155 = 391983631 about 3.7 times larger than the actual row count.

The submitted patch is makes analyze effective in setting reltuples to within
a few percent of the count(*) value.

-dg

--
David Gould daveg(at)sonic(dot)net
If simplicity worked, the world would be overrun with insects.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2018-03-02 22:01:45 Re: Online enabling of checksums
Previous Message Tom Lane 2018-03-02 21:56:32 Re: heap_lock_updated_tuple_rec can leak a buffer refcount