Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2019-09-16 16:54:32
Message-ID: b25ce80e-f536-78c8-d5c8-a5df3e230785@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>>
>> FWIW my understanding is that the speedup comes mostly from
>> elimination of
>> the serialization to a file. That however requires savepoints to handle
>> aborts of subtransactions - I'm pretty sure I'd be trivial to create a
>> workload where this will be much slower (with many aborts of large
>> subtransactions).
>>

Yes, and it was my main motivation to eliminate that extra serialization
to file. I've experimented a bit with large transactions + savepoints +
aborts and ended up with a following query (the same schema as before
with 600k rows):

BEGIN;
SAVEPOINT s1;
UPDATE large_test SET num1 = num1 + 1, num2 = num2 + 1, num3 = num3 + 1;
SAVEPOINT s2;
UPDATE large_test SET num1 = num1 + 1, num2 = num2 + 1, num3 = num3 + 1;
SAVEPOINT s3;
UPDATE large_test SET num1 = num1 + 1, num2 = num2 + 1, num3 = num3 + 1;
ROLLBACK TO SAVEPOINT s3;
ROLLBACK TO SAVEPOINT s2;
ROLLBACK TO SAVEPOINT s1;
END;

It looks like the worst case scenario, as we do a lot of work and then
abort all subxacts one by one. As expected,it takes much longer (up to
x30) to process using background worker instead of spilling to file.
Surely, it is much easier to truncate a file, than apply all changes +
abort. However, I guess that this kind of load pattern is not the most
typical for real-life applications.

Also this test helped me to find a bug in my current savepoints routine,
so new patch is attached.

On 30.08.2019 18:59, Konstantin Knizhnik wrote:
>
> I think that instead of defining savepoints it is simpler and more
> efficient to use
>
> BeginInternalSubTransaction +
> ReleaseCurrentSubTransaction/RollbackAndReleaseCurrentSubTransaction
>
> as it is done in PL/pgSQL (pl_exec.c).
> Not sure if it can pr
>

Both BeginInternalSubTransaction and DefineSavepoint use
PushTransaction() internally for a normal subtransaction start. So they
seems to be identical from the performance perspective, which is also
stated in the comment section:

/*
 * BeginInternalSubTransaction
 *        This is the same as DefineSavepoint except it allows
TBLOCK_STARTED,
 *        TBLOCK_IMPLICIT_INPROGRESS, TBLOCK_END, and TBLOCK_PREPARE
states,
 *        and therefore it can safely be used in functions that might
be called
 *        when not inside a BEGIN block or when running deferred
triggers at
 *        COMMIT/PREPARE time.  Also, it automatically does
 *        CommitTransactionCommand/StartTransactionCommand instead of
expecting
 *        the caller to do it.
 */

Please, correct me if I'm wrong.

Anyway, I've performed a profiling of my apply worker (flamegraph is
attached) and it spends the vast amount of time (>90%) applying changes.
So the problem is not in the savepoints their-self, but in the fact that
we first apply all changes and then abort all the work. Not sure, that
it is possible to do something in this case.

Regards

--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

Attachment Content-Type Size
worker_aborts_perf.zip application/zip 69.8 KB
v2-0011-BGWorkers-pool-for-streamed-transactions-apply-wi.patch text/x-patch 60.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-09-16 17:01:15 Re: POC: Cleaning up orphaned files using undo logs
Previous Message Robert Haas 2019-09-16 16:23:28 Re: block-level incremental backup