Re: autovacuum big table taking hours and sometimes seconds

From: Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: autovacuum big table taking hours and sometimes seconds
Date: 2019-02-07 11:55:34
Message-ID: CA+t6e1kH8KtHDEWHVnGq3As1+E7Wut0qk9M-W5G+0L-6EVYgrA@mail.gmail.com
Lists: pgsql-performance

Just to make sure that I understood:
- By increasing the cost_limit or decreasing the page cost parameters we
can decrease the time it takes the autovacuum process to vacuum a specific
table.
- The vacuum threshold/scale settings decide how often the table will be
vacuumed, not how long the vacuum should take.

I have 3 questions:
1) To what value do you recommend increasing the vacuum cost_limit? Does
2000 seem reasonable? Or is it better to leave it at the default and assign
a specific value for big tables?
2) When the autovacuum reaches the cost_limit while vacuuming a specific
table, does it wait nap_time seconds and then continue working on the same
table?
3) I have a table that keeps growing (not fast, because I set the
vacuum_scale_factor to 0 and the autovacuum_vacuum_threshold to 10000). If
the table keeps growing, does that mean I should try to increase the cost
limit? Do you see any other option? The table represents sessions of my
system, so basically from my point of view I should have almost the same
number of sessions every day and the table shouldn't grow dramatically, but
before changing the vacuum threshold/factor it did. As I mentioned in my
first comment, there is a bytea column, so the TOAST table is the
problematic one here.
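For what it's worth, if per-table tuning turns out to be the way to go, I
understand the cost limit can also be overridden per table (and for its
TOAST table) via storage parameters. A sketch only; orig_table and the
value 2000 are placeholders from question 1, not recommendations:

```sql
-- Sketch: override the autovacuum cost limit for one table and its
-- TOAST table, instead of raising the global vacuum_cost_limit.
ALTER TABLE orig_table SET (autovacuum_vacuum_cost_limit = 2000);
ALTER TABLE orig_table SET (toast.autovacuum_vacuum_cost_limit = 2000);
```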

On Thu, 7 Feb 2019 at 0:34, David Rowley <
david(dot)rowley(at)2ndquadrant(dot)com> wrote:

> On Thu, 7 Feb 2019 at 02:34, Mariel Cherkassky
> <mariel(dot)cherkassky(at)gmail(dot)com> wrote:
> > As I said, I set the next settings for the toasted table :
> >
> > alter table orig_table set (toast.autovacuum_vacuum_scale_factor = 0);
> >
> > alter table orig_table set (toast.autovacuum_vacuum_threshold =10000);
>
> These settings don't control how fast auto-vacuum runs, just when it
> should run.
>
> > Can you explain a bit more why you concluded that the autovacuum spent
> its time sleeping?
>
> Yeah, look at the following settings.
>
> vacuum_cost_limit | 200
> vacuum_cost_page_dirty | 20
> vacuum_cost_page_hit | 1
> vacuum_cost_page_miss | 10
> autovacuum_vacuum_cost_delay | 20ms
>
> I've tagged on the default setting for each of these. Both vacuum and
> auto-vacuum keep score of how many points they've accumulated while
> running. 20 points for dirtying a page, 10 for a read that's not found
> to be in shared_buffers, 1 for reading a buffer from shared buffers.
> When vacuum_cost_limit points are reached (or
> autovacuum_vacuum_cost_limit if not -1) auto-vacuum sleeps for
> autovacuum_vacuum_cost_delay, normal manual vacuums sleep for
> vacuum_cost_delay.
>
> In one of the log entries you saw:
>
> > buffer usage: 4808221 hits, 6404148 misses, 6152603 dirtied
> > avg read rate: 2.617 MiB/s, avg write rate: 2.514 MiB/s
> > system usage: CPU 148.65s/70.06u sec elapsed 19119.55 sec
>
> Doing a bit of maths to see how much time that vacuum should have slept
> for:
>
> postgres=# select (4808221 * 1 + 6404148 * 10 + 6152603 * 20) / 200.0
> * 20 / 1000;
> ?column?
> --------------------
> 19190.176100000000
>
> That's remarkably close to the actual time of 19119.55 sec. If you do
> the same for the other 3 vacuums then you'll see the same close match.
>
> > I see the autovacuum statistics in the logs; how can I check whether the
> workers are busy most of the time?
>
> It would be nice if there was something better, but periodically doing:
>
> SELECT count(*) FROM pg_stat_activity where query like 'autovacuum%';
>
> will work.
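> Since PostgreSQL 9.6 there's also the pg_stat_progress_vacuum view, which
> shows each vacuum worker's progress directly (a sketch; column list from
> memory, check the docs for your version):
>
> ```sql
> -- One row per backend (including autovacuum workers) currently vacuuming.
> SELECT relid::regclass, phase, heap_blks_scanned, heap_blks_total
> FROM pg_stat_progress_vacuum;
> ```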
>
> > My vacuum limit is 200 right now. Basically, if vacuum runs on my toasted
> table and reaches 200 but doesn't finish cleaning all the dead tuples,
> should it continue cleaning after the nap, or wait until the
> vacuum_threshold is hit again?
>
> You're confusing nap time with something else, maybe with the speed at
> which vacuum runs? Napping is just the time auto-vacuum waits between
> checks for new tables to work on. Having auto-vacuum run so slowly is a
> probable cause of still having dead tuples after the vacuum, likely
> because they became dead after the vacuum started.
>
> I'd recommend reading the manual or Tomas Vondra's blog about vacuum
> costs. It's not overly complex once you understand what each of the
> vacuum settings does.
>
> --
> David Rowley http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
