old_snapshot_threshold vs indexes

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: old_snapshot_threshold vs indexes
Date: 2019-06-20 13:21:09
Message-ID: CA+hUKGKT8oTkp5jw_U4p0S-7UG9zsvtw_M47Y285bER6a2gD+g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

I ran into someone with a system where big queries scanning 8GB+ of
all-in-cache data took consistently ~2.5x longer on a primary server
than on a replica. Both servers had concurrent activity on them but
plenty of spare capacity and similar specs. After some investigation
it turned out that on the primary there were (1) some select()
syscalls waiting for 1ms, which might indicate contended
SpinLockAcquire() back-offs, and (2) a huge amount of time spent in:

+ 93,31% 0,00% postgres postgres [.] index_getnext
+ 93,30% 0,00% postgres postgres [.] index_fetch_heap
+ 81,66% 0,01% postgres postgres [.] heap_page_prune_opt
+ 75,85% 0,00% postgres postgres [.] TransactionIdLimitedForOldSnapshots
+ 75,83% 0,01% postgres postgres [.] RelationHasUnloggedIndex
+ 75,79% 0,00% postgres postgres [.] RelationGetIndexList
+ 75,79% 75,78% postgres postgres [.] list_copy

The large tables in question have around 30 indexes. I see that
heap_page_prune_opt()'s call to TransactionIdLimitedForOldSnapshots()
acquires a couple of system-wide spinlocks, and also tests
RelationAllowsEarlyPruning() which calls RelationHasUnloggedIndex()
which says:

* Tells whether any index for the relation is unlogged.
*
* Note: There doesn't seem to be any way to have an unlogged index attached
* to a permanent table, but it seems best to keep this general so that it
* returns sensible results even when they seem obvious (like for an unlogged
* table) and to handle possible future unlogged indexes on permanent tables.

It calls RelationGetIndexList() which conses up a new copy of the list
every time, so that we can spin through it looking for unlogged
indexes (and in this user's case there are none). I didn't try to
poke at this in lab conditions, but from a glance a the code, I guess
heap_page_prune_opt() is running for every index tuple except those
that reference the same heap page as the one before, so I guess it
happens a lot if the heap is not physically correlated with the index
keys. Ouch.

--
Thomas Munro
https://enterprisedb.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2019-06-20 13:42:14 Re: check_recovery_target_lsn() does a PG_CATCH without a throw
Previous Message Jesper Pedersen 2019-06-20 13:20:39 Re: Index Skip Scan