Re: Unresolved repliaction hang and stop problem.

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: alvherre(at)alvh(dot)no-ip(dot)org
Cc: klasahubert(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, amit(dot)kapila16(at)gmail(dot)com, lukasz(dot)biegaj(at)unitygroup(dot)com, krzysztof(dot)kois(at)unitygroup(dot)com
Subject: Re: Unresolved repliaction hang and stop problem.
Date: 2021-06-17 01:58:44
Message-ID: 20210617.105844.489549862805876505.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Wed, 16 Jun 2021 18:28:28 -0400, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote in
> On 2021-Jun-16, Ha Ka wrote:
> # Children Self Command Shared Object Symbol
> # ........ ........ ........ ............. ..................................
> #
> 100.00% 0.00% postgres postgres [.] exec_replication_command
> |
> ---exec_replication_command
> WalSndLoop
> XLogSendLogical
> LogicalDecodingProcessRecord
> |
> --99.51%--ReorderBufferQueueChange
> |
> |--96.06%--hash_seq_search
> |
> |--1.78%--ReorderBufferSerializeTXN
> | |
> | --0.52%--errstart
> |
> --0.76%--deregister_seq_scan
>
> What this tells me is that ReorderBufferQueueChange is spending a lot of
> time doing hash_seq_search, which probably is the one in
> ReorderBufferTXNByXid.

I don't see a call to hash_*seq*_search there. Instead, I see one in
ReorderBufferCheckMemoryLimit().

I added an elog line in hash_seq_search that is reached only when it
is called under ReorderBufferQueueChange, then set
logical_decoding_work_mem to 64kB.

Running the following commands calls hash_seq_search (relatively) frequently.

pub=# create table t1 (a int primary key);
pub=# create publication p1 for table t1;
sub=# create table t1 (a int primary key);
sub=# create subscription s1 connection 'host=/tmp port=5432' publication p1;
pub=# insert into t1 (select a from generate_series(0, 9999) a);

The insert above makes 20 calls to ReorderBufferLargestTXN() (via
ReorderBufferCheckMemoryLimit()), each of which loops over hash_seq_search.

/*
 * Find the largest transaction (toplevel or subxact) to evict (spill to disk).
 *
 * XXX With many subtransactions this might be quite slow, because we'll have
 * to walk through all of them. There are some options how we could improve
 * that: (a) maintain some secondary structure with transactions sorted by
 * amount of changes, (b) not looking for the entirely largest transaction,
 * but e.g. for transaction using at least some fraction of the memory limit,
 * and (c) evicting multiple transactions at once, e.g. to free a given portion
 * of the memory limit (e.g. 50%).
 */
static ReorderBufferTXN *
ReorderBufferLargestTXN(ReorderBuffer *rb)

This looks like a likely culprit. The perf line for
ReorderBufferSerializeTXN supports this hypothesis.
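
To illustrate why this pattern can become expensive, here is a minimal toy
model in C. It is not PostgreSQL code: ToyTxn, ToyBuffer, toy_largest_txn and
toy_queue_change are hypothetical names, the array stands in for the hash
table, and the "limit" field only plays the role of logical_decoding_work_mem.
The sketch just counts how many entries are visited when every over-limit
change triggers a full scan for the largest transaction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a ReorderBufferTXN: only its size is tracked. */
typedef struct ToyTxn
{
    size_t size;              /* bytes of changes currently held in memory */
} ToyTxn;

/* Hypothetical stand-in for the ReorderBuffer. */
typedef struct ToyBuffer
{
    ToyTxn *txns;             /* all known (sub)transactions */
    size_t ntxns;             /* number of entries in txns */
    size_t total;             /* sum of txns[i].size */
    size_t limit;             /* plays the role of logical_decoding_work_mem */
    unsigned long visited;    /* entries examined while searching */
    unsigned long evictions;  /* number of transactions "spilled" */
} ToyBuffer;

/* Linear scan for the largest transaction; models the hash_seq_search loop. */
static ToyTxn *
toy_largest_txn(ToyBuffer *rb)
{
    ToyTxn *largest = NULL;

    for (size_t i = 0; i < rb->ntxns; i++)
    {
        rb->visited++;
        if (largest == NULL || rb->txns[i].size > largest->size)
            largest = &rb->txns[i];
    }
    return largest;
}

/* Add one change to transaction idx, then evict while at or over the limit. */
static void
toy_queue_change(ToyBuffer *rb, size_t idx, size_t nbytes)
{
    assert(idx < rb->ntxns);
    rb->txns[idx].size += nbytes;
    rb->total += nbytes;

    while (rb->total >= rb->limit)
    {
        ToyTxn *victim = toy_largest_txn(rb);

        if (victim == NULL || victim->size == 0)
            break;            /* nothing left to evict */

        rb->total -= victim->size;
        victim->size = 0;     /* pretend the changes were spilled to disk */
        rb->evictions++;
    }
}
```

In this model every change that reaches the limit costs one pass over all
entries, so with N (sub)transactions and memory use hovering near the limit,
the work per queued change grows roughly linearly with N.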

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
