Re: n_mod_since_analyze isn't reset at table truncation

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: n_mod_since_analyze isn't reset at table truncation
Date: 2021-03-05 09:07:05
Message-ID: dc844936-cbb6-69c0-73f4-8fbd01774181@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021/03/05 15:59, Julien Rouhaud wrote:
> On Fri, Mar 05, 2021 at 03:31:33PM +0900, Fujii Masao wrote:
>>
>> On 2021/03/04 12:40, Julien Rouhaud wrote:
>>>> In that case, conversely, we want to trigger autoanalyze ASAP because the contents in the table was changed very much?
>>>
>>> We might want, but wouldn't keeping the current n_mod_since_analyze would make
>>> things unpredictable?
>>
>> I don't think so. autoanalyze still works based on the number of
>> modifications since last analyze, so ISTM that's predictable...
>
> If we keep n_mod_since_analyze, autoanalyze might trigger after the first write
> or might wait for a full cycle of writes, depending on the kept value. So yes
> it can make autoanalyze triggers earlier in some cases, but that's not
> predictable from the TRUNCATE even point of view.
>
>>> Also the selectivity estimation functions already take into account the actual
>>> relation size, so the estimates aren't completely off after a truncate.
>>
>> But the statistics is out of date and which may affect the planning badly?
>> Also if generic plan is used, it's not updated until next analyze, i.e.,
>> generic plan created based on old statistics is used for a while.
>>
>> So I'm still not sure why you want to defer autoanalyze in that case.
>
> I don't especially want to defer autoanalyze in that case. But an autoanalyze
> happening quickly after a TRUNCATE is critical for performance, I'd prefer to
> find a way to trigger autoanalyze reliably.

One just idea is to make TRUNCATE increase n_mod_since_analyze by
the number of records to truncate. That is, we treat TRUNCATE
in the same way as "DELETE without WHERE".

If the table has lots of records and is truncated, n_mod_since_analyze
will be increased very much and which would trigger autoanalyze soon.
This might be expected behavior because the statistics collected before
truncate is very "different" from the status of the table after truncate.

OTOH, if the table is very small, TRUNCATE doesn't increase
n_mod_since_analyze so much. So analyze might not be triggered soon.
But this might be ok because the statistics collected before truncate is
not so "different" from the status of the table after truncate.

I'm not sure how much this idea is "reliable" and would be helpful in
practice, though.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-03-05 09:08:19 Re: 011_crash_recovery.pl intermittently fails
Previous Message Magnus Hagander 2021-03-05 09:03:46 Re: PROXY protocol support