From: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Logical replication prefetch
Date: 2025-07-13 12:29:35
Message-ID: 6faa3037-609e-4cd2-a4f8-c97fc5cb390b@garret.ru
Lists: pgsql-hackers
On 13/07/2025 9:28 am, Amit Kapila wrote:
> I didn't understand your scenario. pa_launch_parallel_worker() should
> spawn a new worker only if all the workers in the pool are busy, and
> then it will free the worker if the pool already has enough workers.
> So, do you mean to say that the workers in the pool are always busy in
> your workload, which leads to spawn/exit of new workers? Can you please
> explain your scenario in some more detail?
>
The current LR apply logic does not work well for applying small OLTP
transactions.
First of all, by default the reorder buffer at the publisher buffers them
and so prevents parallel apply at the subscriber.
The publisher switches to streaming mode only if a transaction is too large
or `debug_logical_replication_streaming=immediate` is set.
But even if we force the publisher to stream short transactions, the
subscriber will try to launch a new parallel apply worker for each
transaction (if all existing workers are busy).
If there are 100 active backends at the publisher, then the subscriber will
try to launch 100 parallel apply workers.
Most likely this fails because of the limit on the maximal number of
workers, in which case the leader serializes such transactions.
So if there are 100 streamed transactions and 10 parallel apply workers,
then 10 transactions are applied in parallel and 90 are serialized to disk.
This is not very efficient for short transactions; it would be better to
wait for a while until one of the workers becomes vacant.
But the worst thing happens when a parallel apply worker completes its
transaction. If the number of parallel apply workers in the pool exceeds
`max_parallel_apply_workers_per_subscription / 2`,
then this parallel apply worker is terminated. So instead of having
`max_parallel_apply_workers_per_subscription` workers applying
transactions at maximal possible speed, with a leader which distributes
transactions between them and stops receiving new data from the publisher
when there is no vacant worker, we get a leader which serializes
transactions and writes them to disk
(and then inevitably reads them back from disk) while permanently
starting and terminating parallel apply worker processes. It leads to
awful performance.
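
Just to illustrate the effect, here is a toy simulation (not the actual
PostgreSQL code; the pool policy is only paraphrased from the behaviour
described above, and all names and numbers are made up) of 100 backends
streaming short transactions to a pool limited to 10 workers:

/*
 * Toy simulation of the pool policy described above (NOT PostgreSQL code):
 * a worker is terminated after finishing its transaction whenever the pool
 * holds more than max_workers/2 workers, and a transaction is serialized
 * to disk when no worker slot is available.
 */
#include <stdio.h>

#define MAX_WORKERS  10     /* max_parallel_apply_workers_per_subscription */
#define BACKENDS     100    /* concurrent short transactions per round */
#define ROUNDS       100

int
main(void)
{
    int pool = 0;           /* workers currently alive (busy + idle) */
    int idle = 0;           /* idle workers available for reuse */
    int started = 0, stopped = 0, serialized = 0;

    for (int r = 0; r < ROUNDS; r++)
    {
        int running = 0;

        /* BACKENDS short transactions arrive while workers are still busy */
        for (int t = 0; t < BACKENDS; t++)
        {
            if (idle > 0)
            {
                idle--;             /* reuse an idle worker */
                running++;
            }
            else if (pool < MAX_WORKERS)
            {
                pool++;             /* all busy: spawn one more worker */
                started++;
                running++;
            }
            else
                serialized++;       /* no slot: leader spills to disk */
        }

        /* the short transactions finish almost immediately */
        for (int t = 0; t < running; t++)
        {
            if (pool > MAX_WORKERS / 2)
            {
                pool--;             /* pool "too large": terminate worker */
                stopped++;
            }
            else
                idle++;             /* keep the worker for reuse */
        }
    }

    printf("workers started=%d, terminated=%d, transactions serialized=%d\n",
           started, stopped, serialized);
    return 0;
}

With these made-up numbers the model ends up with roughly 500 worker
starts/stops and 9000 spilled transactions for 10000 short transactions,
which is exactly the churn described above.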
Certainly, the originally intended use case was different: parallel apply
is performed only for large transactions. The number of such transactions
is not so big,
so there should be enough parallel apply workers in the pool to process
them. And if there are not enough workers, it is not a problem to spawn a
new one and terminate
it after completion of the transaction (because the transaction is long,
the overhead of spawning a process is not so large compared with redoing a
large transaction).
But if we want to efficiently replicate OLTP workload, then we
definitely need some other approach.
Prefetch is actually more compatible with current implementation because
prefetch operations don't need to be grouped by transaction and can be
executed by any prefetch worker.
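
For example (just a hypothetical sketch, not necessarily what the patch
does), the leader could route each prefetch request to any worker by
hashing the relation id and key, so no transaction context is needed and
requests touching the same row still land in the same worker's queue:

/*
 * Hypothetical sketch: prefetch requests carry no transaction context,
 * so the leader can hand each of them to any prefetch worker.
 */
#include <stdio.h>
#include <stdint.h>

#define N_PREFETCH_WORKERS 4

typedef struct PrefetchRequest
{
    uint32_t    relid;          /* target relation */
    uint64_t    key;            /* e.g. hash of the replica identity */
} PrefetchRequest;

/* pick a worker for the request; no grouping by transaction is required */
static int
choose_prefetch_worker(const PrefetchRequest *req)
{
    uint64_t    h = req->key ^ ((uint64_t) req->relid << 32);

    return (int) (h % N_PREFETCH_WORKERS);
}

int
main(void)
{
    PrefetchRequest reqs[] = {
        {16384, 1}, {16384, 2}, {16385, 1}, {16384, 1}, {16385, 7}
    };

    for (size_t i = 0; i < sizeof(reqs) / sizeof(reqs[0]); i++)
        printf("relid=%u key=%llu -> worker %d\n",
               reqs[i].relid, (unsigned long long) reqs[i].key,
               choose_prefetch_worker(&reqs[i]));
    return 0;
}

The point is only that, unlike apply of a streamed transaction, such a
request can be picked up by whichever prefetch worker happens to be free.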