Re: Add important info about ANALYZE after create Functional Index

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: Add important info about ANALYZE after create Functional Index
Date: 2020-10-28 23:02:11
Message-ID: 20201028230211.dxbjw6mffq37mudu@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 28, 2020 at 05:43:08PM -0300, Fabrízio de Royes Mello wrote:
>On Wed, Oct 28, 2020 at 4:35 PM Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
>wrote:
>>
>> I don't think anyone proposed to do this through autovacuum. There was a
>> reference to auto-analyze but I think that was meant as 'run analyze
>> automatically.' Which would work in transactions just fine, I think.
>>
>
>Maybe I was not very clear at the beginning so will try to clarify my
>thoughts:
>
>1) We should add notes on our docs about the need to issue ANALYZE after
>creating indexes using expressions and create extended statistics. Nikolay
>sent a patch upthread and we can work on it and back patch.
>

+1

>2) REINDEX CONCURRENTLY does not keep statistics (pg_statistc) like a
>regular REINDEX for indexes using expressions and to me it's a bug. Michael
>pointed out upthread that maybe we should rework a bit
>index_concurrently_swap() to copy statistics from old index to new one.
>

Yeah. Not sure it counts as a bug, but I see what you mean - it's
definitely an unexpected/undesirable difference in behavior between
plain REINDEX and concurrent one.

>
>> But I agree it'd likely be a more complicated patch than it might seem
>> at first glance.
>>
>
>If we think about a way to kick AutoAnalyze for sure it will be a more
>complicated task but IMHO for now we can do it simply just by copying
>statistics like I mentioned above.
>

I very much doubt just we can rely on autoanalyze here. For one, it'll
have issues with transactions, as Tom already pointed out elsewhere in
this thread. So if you do a reindex after a bulk load in a transaction,
followed by some report queries, autoanalyze is not going to help.

But it has another issue - there may not be any free autovacuum workers,
so it'd have to wait for unknown amount of time. In fact, it'd have to
wait for the autovacuum worker to actually do the analyze, otherwise we
could still have unpredictable behavior for queries immediately after
the REINDEX, even outside transactions. That's not good, so this would
have to do an actual analyze I think.

But as Tom pointed out, the automatic analyze may be against wishes of
some users, and there are other similar cases that don't trigger analyze
(CREATE STATISTICS). So not sure about this.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Victor Yegorov 2020-10-28 23:05:01 Re: Deleting older versions in unique indexes to avoid page splits
Previous Message Fabrízio de Royes Mello 2020-10-28 20:43:08 Re: Add important info about ANALYZE after create Functional Index