Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic
Date: 2021-06-08 19:27:11
Message-ID: 20210608192711.GA21312@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 08, 2021 at 12:34:04PM -0500, Justin Pryzby wrote:
> On Tue, Jun 08, 2021 at 12:06:02PM -0400, Alvaro Herrera wrote:
> > On 2021-Jun-06, Justin Pryzby wrote:
> >
> > > However, I also found an autovacuum chewing 100% CPU, and it appears the
> > > problem is actually because autovacuum has locked a page of pg-statistic, and
> > > every other process then gets stuck waiting in the planner. I checked a few
> > > and found these:
> >
> > Hmm ... I wonder if this could be related to commits d9d076222f5b,
> > c98763bf51bf, etc. I don't have any connecting thoughts other than the
> > tuple visibility code being involved. Do you see any procs with the
> > PROC_IN_SAFE_IC flag set?
>
> I do have a reindex script which runs CIC, and it does looks suspicious.
>
> <<Mon Jun 7 22:00:54 CDT 2021: starting db=ts
> ...
> Mon Jun 7 22:02:57 CDT 2021: ts: pg_statio_all_tables_snap: pg_statio_all_tables_snap_t_idx(reindex non-partitioned)...
> ERROR: canceling statement due to statement timeout
> reindex: warning, dropping invalid/unswapped index: pg_statio_all_tables_snap_t_idx_ccnew
> Mon Jun 7 23:02:57 CDT 2021: ts: pg_statio_all_tables_snap: pg_toast.pg_toast_33011_index(reindex system)...
> Mon Jun 7 23:02:57 CDT 2021: ts: pg_statio_all_indexes_snap: pg_statio_all_indexes_snap_t_idx(reindex non-partitioned)...
> ERROR: canceling statement due to statement timeout
> reindex: warning, dropping invalid/unswapped index: pg_statio_all_indexes_snap_t_idx_ccnew
> Tue Jun 8 00:02:57 CDT 2021: ts: pg_statio_all_indexes_snap: pg_toast.pg_toast_33020_index(reindex system)...
> Tue Jun 8 01:21:20 CDT 2021: ts: permissions: perm_group_idx(reindex non-partitioned)...
> ERROR: canceling statement due to statement timeout
> reindex: warning, dropping invalid/unswapped index: perm_group_idx_ccnew
> Tue Jun 8 02:21:20 CDT 2021: ts: permissions: perm_user_idx(reindex non-partitioned)...
> ERROR: canceling statement due to statement timeout
> reindex: warning, dropping invalid/unswapped index: perm_user_idx_ccnew
> Tue Jun 8 03:21:20 CDT 2021: ts: permissions: pg_toast.pg_toast_33577_index(reindex system)...
> Tue Jun 8 03:21:20 CDT 2021: ts: patchfiles: patchfiles_filename_idx(reindex non-partitioned)...
> ERROR: canceling statement due to statement timeout
> reindex: warning, dropping invalid/unswapped index: patchfiles_filename_idx_ccnew
> Tue Jun 8 04:21:21 CDT 2021: ts: patchfiles: patchfiles_pkey(reindex non-partitioned)...
> ERROR: canceling statement due to statement timeout
> reindex: warning, dropping invalid/unswapped index: patchfiles_pkey_ccnew
>
> => It's strange that these timed out, since the statio tables are less than
> 15MB, and permissions and patchfiles are both under 100kB.
>
> And it seems to say that it time out after less than 1sec.

Oops, no: it timed out after 3600sec, as requested.

> They're running this:
> | PGOPTIONS="--deadlock_timeout=333ms -cstatement-timeout=3600s" psql -c "REINDEX INDEX CONCURRENTLY $i"
> And if it times out, it then runs: $PSQL "DROP INDEX CONCURRENTLY $bad"
...
> $ date -d @1623121264
> Mon Jun 7 22:01:04 CDT 2021

Which is probably because the reindex was waiting for the vacuum process to
finish (or maybe waiting on the page that vacuum had locked?).

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2021-06-08 19:28:28 Re: Make unlogged table resets detectable
Previous Message David Christensen 2021-06-08 19:25:49 Re: DELETE CASCADE