Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Gunther <raj(at)gusw(dot)net>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.
Date: 2019-02-24 06:41:08
Message-ID: CAKJS1f_kxF=3K=D8MXa2NwdycwKJsNqVXNXyeDWv3t9f3p6J2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sun, 24 Feb 2019 at 10:06, Gunther <raj(at)gusw(dot)net> wrote:
> I am using an SQL queue for distributing work to massively parallel workers. Workers come in servers with 12 parallel threads. One of those worker sets handles 7 transactions per second. If I add a second one, for 24 parallel workers, it scales to 14 /s. Even a third, for 36 parallel workers, I can add to reach 21 /s. If I try a fourth set, 48 workers, I end up in trouble. But that isn't even so much my problem rather than the fact that in short time, the performance will deteriorate, and it looks like that is because the queue index deteriorates and needs a REINDEX.

It sounds very much like auto-vacuum is simply unable to keep up with
the rate at which the table is being updated. Please be aware, that
by default, auto-vacuum is configured to run fairly slowly so as not
to saturate low-end machines.

vacuum_cost_limit / autovacuum_vacuum_cost limit control how many
"points" the vacuum process can accumulate before it will perform an
autovacuum_vacuum_cost_delay / vacuum_cost_delay.

Additionally, after an auto-vacuum run completes it will wait for
autovacuum_naptime before checking again if any tables require some
attention.

I think you should be monitoring how many auto-vacuums workers are
busy during your runs. If you find that the "queue" table is being
vacuumed almost constantly, then you'll likely want to increase
vacuum_cost_limit / autovacuum_vacuum_cost_limit. You could get an
idea of how often this table is being auto-vacuumed by setting
log_autovacuum_min_duration to 0 and checking the logs. Another way
to check would be to sample what: SELECT query FROM pg_stat_activity
WHERE query LIKE 'autovacuum%'; returns. You may find that all of the
workers are busy most of the time. If so, that indicates that the
cost limits need to be raised.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Justin Pryzby 2019-02-24 07:03:58 Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.
Previous Message Gunther 2019-02-24 02:55:00 Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.