Re: Autovacuum on partitioned table (autoanalyze)

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: yuzuko <yuzukohosoya(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>
Subject: Re: Autovacuum on partitioned table (autoanalyze)
Date: 2020-04-25 14:13:20
Message-ID: 20200425141320.GR28974@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 18, 2020 at 11:30:39AM -0500, Justin Pryzby wrote:
> In the past, I think there's was talk that maybe someone would invent a clever
> way to dynamically combine all the partitions' statistics, so analyzing the
> parent wasn't needed. [...]

I happened across the thread I was referring to:
https://www.postgresql.org/message-id/7363.1426537103@sss.pgh.pa.us

I'm not opposed to doing things the currently-proposed way (trigger analyze of
partitioned tables based on updates, same as nonpartitioned tables), but we
should think if it's worth doing something totally different, like what Tom
proposed.

Robert had concerns that it would increase planning time. I imagine that
argument is even stronger now, since PG12 has *less* planning time for large
heirarchies (428b260f8) and advertizes support for "thousands" of partitions.

Tom said:
> we would automatically get statistics that account for
> partitions being eliminated by constraint exclusion, because only the
> non-eliminated partitions are present in the appendrel. And second,

That's a pretty strong benefit. I don't know if there's a good way to support
both(either) ways of doing things. Like maybe a reloption that allows
triggering autovacuum on partitioned tables, but if no statistics exist on a
partitioned table, then the planner would dynamically determine the selectivity
by decending into child statistics (Tom's way). I think the usual way this
would play out is that someone with a small partition heirarchies would
eventually complain about high planning time and then we'd suggest implementing
a manual ANALYZE job.

I'm not sure it's good to support two ways anyway, since 1) I think that gives
different (better) statistics Tom's way (due to excluding stats of excluded
partitions); 2) there's not a good way to put an ANALYZE job in place and then
get rid of parent stats (have to DELETE FROM pg_statistic WHERE
starelid='...'::regclass; 3) if someone implements an ANALYZE job, but they
disable it or it stops working then they have outdated stats forever;

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-04-25 14:36:06 Re: Anybody want to check for Windows timezone updates?
Previous Message Tomas Vondra 2020-04-25 12:59:07 Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays