Re: Postgres is not able to handle more than 4k tables!?

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Postgres is not able to handle more than 4k tables!?
Date: 2020-07-09 06:31:06
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 09.07.2020 03:49, tsunakawa(dot)takay(at)fujitsu(dot)com wrote:
> From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
>> Autovacuum is periodically processing all this 5k relations (because
>> them are actively updated).
>> And as far as most of this tables are small enough autovacuum complete
>> processing of them almost in the same time.
>> As a result autovacuum workers produce ~5k invalidation messages in
>> short period of time.
> How about trying CREATE/ALTER TABLE WITH (vacuum_truncate = off)? It's available since PG 12. It causes autovacuum to not truncate the relation. It's the relation truncation what produces those shared invalidation messages.

Invalidation messages are also caused by statistic update:

#0  0x000055a85f4f5fd6 in RegisterCatcacheInvalidation (cacheId=49,
hashValue=715727843, dbId=12443)
    at inval.c:483
#1  0x000055a85f4f4dc2 in PrepareToInvalidateCacheTuple
(relation=0x7f45a34ce5a0, tuple=0x7ffc75bebc70,
    newtuple=0x7f4598e75ef8, function=0x55a85f4f5fc0
<RegisterCatcacheInvalidation>) at catcache.c:1830
#2  0x000055a85f4f6b21 in CacheInvalidateHeapTuple
(relation=0x7f45a34ce5a0, tuple=0x7ffc75bebc70,
    newtuple=0x7f4598e75ef8) at inval.c:1149
#3  0x000055a85f016372 in heap_update (relation=0x7f45a34ce5a0,
    newtup=0x7f4598e75ef8, cid=0, crosscheck=0x0, wait=1 '\001',
    lockmode=0x7ffc75bebce8) at heapam.c:4245
#4  0x000055a85f016f98 in simple_heap_update (relation=0x7f45a34ce5a0,
    tup=0x7f4598e75ef8) at heapam.c:4490
#5  0x000055a85f153ec5 in update_attstats (relid=16384, inh=0 '\000',
natts=1, vacattrstats=0x55a860f0fba0)
    at analyze.c:1619
#6  0x000055a85f151898 in do_analyze_rel (onerel=0x7f45a3480080,
options=98, params=0x55a860f0f028,
    va_cols=0x0, acquirefunc=0x55a85f15264e <acquire_sample_rows>,
relpages=26549, inh=0 '\000',
    in_outer_xact=0 '\000', elevel=13) at analyze.c:562
#7  0x000055a85f150be1 in analyze_rel (relid=16384,
relation=0x7ffc75bec370, options=98,
    params=0x55a860f0f028, va_cols=0x0, in_outer_xact=0 '\000',
bstrategy=0x55a860f0f0b8) at analyze.c:257
#8  0x000055a85f1e1589 in vacuum (options=98, relation=0x7ffc75bec370,
relid=16384, params=0x55a860f0f028,
    va_cols=0x0, bstrategy=0x55a860f0f0b8, isTopLevel=1 '\001') at
#9  0x000055a85f2fd92a in autovacuum_do_vac_analyze (tab=0x55a860f0f020,
    at autovacuum.c:2874
#10 0x000055a85f2fcccb in do_autovacuum () at autovacuum.c:2374
>> But then we come to the next show stopper: NUM_LOCK_PARTITIONS.
>> It is also hardcoded and can't be changed without recompilation:
>> Having just 16 LW-Locks greatly increase conflict probability (taken in
>> account that there are 5k tables and totally about 25k relations).
>> It cause huge lw-lock acquisition time for heap_open and planning stage
>> of some queries is increased from milliseconds to several minutes!
>> Koda!
> The vacuum's relation truncation is also the culprit here, and it can be eliminated by the above storage parameter. It acquires Access Exclusive lock on the relation. Without the strong Access Exclusive lock, just running DML statements use the fast path locking, which doesn't acquire the lock manager partition lock.

Looks like it is not true (at lest for PG9.6):

#0  0x00007fa6d30da087 in semop () from /lib64/
#1  0x0000000000682241 in PGSemaphoreLock
(sema=sema(at)entry=0x7fa66f5655d8) at pg_sema.c:387
#2  0x00000000006ec6eb in LWLockAcquire (lock=lock(at)entry=0x7f23b544f800,
mode=mode(at)entry=LW_EXCLUSIVE) at lwlock.c:1338
#3  0x00000000006e5560 in LockAcquireExtended
(locktag=locktag(at)entry=0x7ffd94883fa0, lockmode=lockmode(at)entry=1,
sessionLock=sessionLock(at)entry=0 '\000', dontWait=dontWait(at)entry=0
'\000', reportMemoryError=reportMemoryError(at)entry=1 '\001',
locallockp=locallockp(at)entry=0x7ffd94883f98) at lock.c:962
#4  0x00000000006e29f6 in LockRelationOid (relid=87103837, lockmode=1)
at lmgr.c:113
#5  0x00000000004a9f55 in relation_open (relationId=87103837,
lockmode=lockmode(at)entry=1) at heapam.c:1131
#6  0x00000000004bdc66 in index_open (relationId=<optimized out>,
lockmode=lockmode(at)entry=1) at indexam.c:151
#7  0x000000000067be58 in get_relation_info (root=root(at)entry=0x3a1a758,
relationObjectId=72079078, inhparent=<optimized out>,
rel=rel(at)entry=0x3a2d460) at plancat.c:183
#8  0x000000000067ef45 in build_simple_rel (root=root(at)entry=0x3a1a758,
relid=2, reloptkind=reloptkind(at)entry=RELOPT_BASEREL) at relnode.c:148

Please notice  lockmode=1 (AccessShareLock)

> The long lwlock wait is a sad story. The victim is probably exclusive lockers. When someone holds a shared lock on a lwlock, the exclusive locker has to wait. That's OK. However, if another share locker comes later, it acquires the lwlock even though there're waiting exclusive lockers. That's unfair, but this is the community decision.
Yes, I also think that it is the reason of the problem.
Alexandr Korotokov has implemented fair LW-Locks which eliminate such
kind of problems in some scenarios.
May it also can help here.

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-07-09 06:33:28 Re: A patch for get origin from commit_ts.
Previous Message Amit Khandekar 2020-07-09 06:27:34 Re: calling procedures is slow and consumes extra much memory against calling function