Re: Unresolved replication hang and stop problem.

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Lukasz Biegaj <lukasz(dot)biegaj(at)unitygroup(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Krzysztof Kois <krzysztof(dot)kois(at)unitygroup(dot)com>
Subject: Re: Unresolved replication hang and stop problem.
Date: 2021-05-04 14:35:05
Message-ID: 20210504143505.GA21686@alvherre.pgsql
Lists: pgsql-hackers

Hi Lukasz, thanks for following up.

On 2021-May-04, Lukasz Biegaj wrote:

> The problem is as described in https://www.postgresql.org/message-id/flat/8bf8785c-f47d-245c-b6af-80dc1eed40db%40unitygroup.com
>
> It does occur on two separate production clusters and one test cluster - all
> belonging to the same customer, although processing slightly different data
> (it's an e-commerce store with multiple languages and separate production
> databases for each language).

I think the best next move would be to make certain that the problem is
what we think it is, so that we can discuss whether Amit's commit is an
appropriate fix. I would suggest doing that by running the problematic
workload in the test system under "perf record -g" and then getting a
report with "perf report -g", which should hopefully give enough of a
clue. (Sometimes the reports are much better if you use a binary that
was compiled with -fno-omit-frame-pointer, so if you're in a position to
try that, it might be useful -- or apparently you could try "perf record
--call-graph dwarf" or "perf record --call-graph lbr", depending on your
hardware and build.)
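
To make the suggestion above concrete, here is a rough sketch of the
command lines involved. It only prints the commands so you can review
them before running anything. The PID, the 60-second sampling window,
the perf.data file name and the helper function names are all
placeholders of mine, not something taken from your setup.

```shell
#!/bin/sh
# Sketch only: prints the perf commands for review instead of running them.
# The PID, the 60-second window and the perf.data file name are assumptions.

# mode: "fp" (frame pointers, i.e. plain -g; works best with a binary
#       built with -fno-omit-frame-pointer), "dwarf", or "lbr"
perf_record_cmd() {
    mode="$1"
    pid="$2"
    case "$mode" in
        fp)        graph="-g" ;;
        dwarf|lbr) graph="--call-graph $mode" ;;
        *)         echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # -p attaches to a running process; "sleep 60" bounds the sampling window.
    echo "perf record $graph -p $pid -o perf.data -- sleep 60"
}

perf_report_cmd() {
    # -i names the recorded data file; --stdio gives plain-text output
    # that is easy to paste into an email.
    echo "perf report -g -i perf.data --stdio"
}

# Placeholder PID of the backend that appears stuck.
PID="${PID:-12345}"
perf_record_cmd fp "$PID"
perf_report_cmd
```

Start the workload first, then run the printed "perf record" line
against the process that is consuming CPU, and send the output of the
"perf report" line. Profiling another process usually needs suitable
privileges (for example root or the same OS user as the server).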

Also, I would be much more comfortable about proposing to backpatch such
an invasive change if you could confirm that in pg10 the same workload
does not cause the problem. If pg10 is unaffected, then it'd be clear
we're talking about a regression.

--
Álvaro Herrera Valdivia, Chile
"I'm always right, but sometimes I'm more right than other times."
(Linus Torvalds)
