Transaction held open by autoanalyze can be a bottleneck

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Transaction held open by autoanalyze can be a bottleneck
Date: 2017-05-10 20:09:38
Message-ID: CAMkU=1zcY7m123u=4R4pbFp=7UOQ01AGxymTspBeDqK30m_sOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Autovacuum's analyze starts a transaction when it starts on a table, and
holds it for the duration. This holds back the xmin horizon.

On a TPC-B-like benchmark, this can be a problem. While it is
autoanalyzing pgbench_accounts and pgbench_history, dead-but-for-analyze
tuples accumulate rapidly in pgbench_tellers and pgbench_branches. Now the
UPDATES to those tables have to walk the unprunable HOT chains to find
their tuples to update, greatly slowing them down.

The analyze actually takes quite a while, because it is frequently setting
hint bits and so dirtying pages and so sleeping
for autovacuum_vacuum_cost_delay.

If I set autovacuum_vacuum_cost_delay=0, then tps averaged over an hour
goes from 12,307.6 to 24,955.2. I can get a similar gain just by changing
the relopts for those two tables to autovacuum_analyze_threshold =
2000000000. I don't think these are particularly attractive solutions, but
they do demonstrate the nature of the problem.

Does analyze need all of its work done under the same transaction? Is
there an elegant way to make it periodically discard the transaction and
get a new one, so that the xmin horizon can advance? I think doing so every
time vacuum_delay_point decides to sleep would be a good time to do that,
but that would expand its contract quite a bit. And it is probably possible
to have analyze take a long time without ever deciding to sleep, so doing
it there would not be a fully general solution.

Cheers,

Jeff

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bossart, Nathan 2017-05-10 20:10:48 [Proposal] Allow users to specify multiple tables in VACUUM commands
Previous Message Joe Conway 2017-05-10 20:09:36 Re: Should pg_current_wal_location() become pg_current_wal_lsn()