Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

From: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
To: David Gould <daveg(at)sonic(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Alina Alexeeva <alexeeva(at)adobe(dot)com>, Ullas Lakkur Raghavendra <lakkurra(at)adobe(dot)com>
Subject: Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
Date: 2018-02-28 12:55:19
Message-ID: fa8122c2-ea7a-ff38-49ac-e6616aec38ea@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi David,

I was able to reproduce the problem using your script.
analyze_counts.awk is missing, though.

The idea of using the result of ANALYZE as-is, without additional
averaging, was discussed when vac_estimate_reltuples() was introduced
originally. Ultimately, it was decided not to do so. You can find the
discussion in this thread:
https://www.postgresql.org/message-id/flat/BANLkTinL6QuAm_Xf8teRZboG2Mdy3dR_vw%40mail(dot)gmail(dot)com#BANLkTinL6QuAm_Xf8teRZboG2Mdy3dR_vw(at)mail(dot)gmail(dot)com

The core problem here seems to be that this calculation of moving
average does not converge in your scenario. It can be shown that when
the number of live tuples is constant and the number of pages grows, the
estimated number of tuples will increase at each step. Do you think we
can use some other formula that would converge in this scenario, but
still filter the noise in ANALYZE results? I couldn't think of one yet.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Banck 2018-02-28 13:28:41 Re: [PoC PATCH] Parallel dump to /dev/null
Previous Message Anastasia Lubennikova 2018-02-28 12:11:33 Function to track shmem reinit time