Re: Autovacuum on partitioned table (autoanalyze)

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: yuzuko <yuzukohosoya(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>
Subject: Re: Autovacuum on partitioned table (autoanalyze)
Date: 2020-04-30 06:56:00
Message-ID: CA+HiwqGASrBs7DQuo6yNOYqQGKLXJh7RfRKUe9-vjLaEXxLUmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 25, 2020 at 11:13 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> On Wed, Mar 18, 2020 at 11:30:39AM -0500, Justin Pryzby wrote:
> > In the past, I think there's was talk that maybe someone would invent a clever
> > way to dynamically combine all the partitions' statistics, so analyzing the
> > parent wasn't needed. [...]
>
> I happened across the thread I was referring to:
> https://www.postgresql.org/message-id/7363.1426537103@sss.pgh.pa.us
>
> I'm not opposed to doing things the currently-proposed way (trigger analyze of
> partitioned tables based on updates, same as nonpartitioned tables), but we
> should think if it's worth doing something totally different, like what Tom
> proposed.
>
> Robert had concerns that it would increase planning time. I imagine that
> argument is even stronger now, since PG12 has *less* planning time for large
> heirarchies (428b260f8) and advertizes support for "thousands" of partitions.
>
> Tom said:
> > we would automatically get statistics that account for
> > partitions being eliminated by constraint exclusion, because only the
> > non-eliminated partitions are present in the appendrel. And second,
>
> That's a pretty strong benefit. I don't know if there's a good way to support
> both(either) ways of doing things. Like maybe a reloption that allows
> triggering autovacuum on partitioned tables, but if no statistics exist on a
> partitioned table, then the planner would dynamically determine the selectivity
> by decending into child statistics (Tom's way). I think the usual way this
> would play out is that someone with a small partition heirarchies would
> eventually complain about high planning time and then we'd suggest implementing
> a manual ANALYZE job.
>
> I'm not sure it's good to support two ways anyway, since 1) I think that gives
> different (better) statistics Tom's way (due to excluding stats of excluded
> partitions); 2) there's not a good way to put an ANALYZE job in place and then
> get rid of parent stats (have to DELETE FROM pg_statistic WHERE
> starelid='...'::regclass; 3) if someone implements an ANALYZE job, but they
> disable it or it stops working then they have outdated stats forever;

Thanks for sharing that thread, had not seen it before.

I remember discussing with Alvaro and Hosoya-san an approach of
generating the whole-tree pg_statistics entries by combining the
children's entries, not during planning as the linked thread
discusses, but inside autovacuum. The motivation for that design was
the complaint that we scan the children twice with the current method
of generating whole-tree statistics -- first to generate their own
statistics and then again to generate the parent's.

Aside from how hard it would be to actually implement, that approach
also doesn't address the concern about when to generate the whole-tree
statistics. Because the linked thread mentions getting rid of the
whole-tree statistics altogether, there is no such concern if we go
its way. Although I do agree with Robert's assertion on that thread
that making every query on a parent a bit slower would not be a good
compromise.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-04-30 07:00:49 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message Masahiko Sawada 2020-04-30 06:38:04 Re: Fixes for two separate bugs in nbtree VACUUM's page deletion