Re: Automatic aggressive vacuum on almost frozen table takes too long

From: Mikhail Balayan <mv(dot)balayan(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Automatic aggressive vacuum on almost frozen table takes too long
Date: 2023-02-20 05:29:16
Message-ID: CAC2oM1ZjGKEwoOU5KQW9YZRJ8SuZtFZ1+cFbcyFVC9=Bgpx6XA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

>
> >> Can you run amcheck's bt_index_check() routine against some of the
> >> indexes you've shown? There is perhaps some chance that index
> >> corruption exists and causes VACUUM to take a very long time to delete
> >> index pages. This is pretty much a wild guess, though.

Unfortunately I can't, we haven't enabled this extension yet. And since
this is a production, I'm not ready to turn it on right away. But I can say
that this theory is unlikely, since this problem occurs on different
sites. Here's an example of an output from another site where table size is
46.5 GB (again, data only), indexes 107GB, toast 62MB:

automatic aggressive vacuum of table
"appdbname2.appschemaname.applications": index scans: 1
pages: 0 removed, 6091646 remain, 0 skipped due to pins, 6086395 skipped
frozen
tuples: 2344 removed, 35295654 remain, 0 are dead but not yet removable,
oldest xmin: 213412878
buffer usage: 251980554 hits, 14462331 misses, 18844 dirtied
avg read rate: 12.018 MB/s, avg write rate: 0.016 MB/s
system usage: CPU: user: 7734.43 s, system: 178.98 s, elapsed: 9401.36 s

Here again we see that there are 5251 blocks that need to be cleaned
(6091646 - 6086395), buffer usage is 266461729 blocks or ~ 2 TB and
processing time is 2.5h+.

>> It's possible that VACUUM had to wait a long time for a cleanup lock
> on one individual heap page here

If such a scenario is possible, it makes sense to add information about the
blocking waiting time to the output. Something like:
system usage: CPU: user: 7734.43 s, system: 178.98 s, lock_wait:
1234.56 s, elapsed: 9401.36 s

Mikhail

On Sat, 18 Feb 2023 at 05:35, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Thu, Feb 16, 2023 at 5:40 PM Mikhail Balayan <mv(dot)balayan(at)gmail(dot)com>
> wrote:
> >> >> Do you have any non-btree indexes on the table? Can you show us the
> details of the
> >> >> table, including all of its indexes? In other words, can you show
> "\d applications" output from psql?
> >
> > Only btree indexes. Please find the full table schema below:
>
> It's possible that VACUUM had to wait a long time for a cleanup lock
> on one individual heap page here, which could have added a long delay.
> But...that doesn't seem particularly likely.
>
> Can you run amcheck's bt_index_check() routine against some of the
> indexes you've shown? There is perhaps some chance that index
> corruption exists and causes VACUUM to take a very long time to delete
> index pages. This is pretty much a wild guess, though.
>
> --
> Peter Geoghegan
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Masahiko Sawada 2023-02-20 08:22:50 Re: Support logical replication of DDLs
Previous Message Pavel Stehule 2023-02-20 04:54:01 Re: A performance issue in ROW_NUMBER() OVER(ORDER BY NULL) [27 times slow than OVER()] V14.5