Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-03-05 17:50:32
Message-ID: 20200305175032.4iolgyumq4aomiwu@development
Lists: pgsql-hackers

On Wed, Mar 04, 2020 at 10:28:32AM +0530, Amit Kapila wrote:
>On Wed, Mar 4, 2020 at 3:16 AM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>
>> Hi,
>>
>> I started looking at this patch series again, hoping to get it moving
>> for PG13.
>>
>
>It is good to keep moving this forward, but there are quite a few
>problems with the design which need a broader discussion. Some that
>I recall are:
>a. Handling of abort of concurrent transactions. There is some code
>in the patch which might work, but there was not much discussion when
>it was posted.
>b. Handling of partial tuples (while streaming, we came to know that
>toast tuple is not complete or speculative insert is incomplete). For
>this also, we have proposed a few solutions which need further
>discussion. One of those is implemented in the patch series.
>c. We might also need some handling for replication origins.
>d. Try to minimize the performance overhead of WAL logging for
>invalidations. We discussed different solutions for this and
>implemented one of those.
>e. How to skip already streamed transactions.
>
>There might be a few more which I can't recall now. Apart from this,
>I haven't done any detailed review of subscriber-side implementation
>where we write streamed transactions to file. All of this will need
>much more discussion and review before we can say it is ready to
>commit, so I thought it might be better to pick it up for PG14 and
>focus on other things that have a better chance for PG13 especially
>because not all the problems were solved/discussed before the last CF.
>However, it is a good idea to keep moving this and have a discussion
>on some of these issues.
>

Sure, there's a lot to discuss. And it's possible (likely) it's not
feasible to get this into PG13. But I think it's still worth discussing
it, instead of just punting it into the next CF right away.

>> There's been a tremendous amount of work done since I last
>> worked on it, and a lot was discussed on this thread, so it'll take a
>> while to get familiar with the new code ...
>>
>> The first thing I realized is that WAL-logging of assignments in v12 does
>> both the "old" logging (using dedicated message) and "new" with
>> toplevel-XID embedded in the first message. Yes, the patch was wrong,
>> because it eliminated all calls to ProcArrayApplyXidAssignment() and so
>> it was trivial to crash the replica due to KnownAssignedXids overflow.
>> But I don't think re-introducing XLOG_XACT_ASSIGNMENT message is the
>> right fix.
>>
>> I actually proposed doing this (having both ways to log assignments) so
>> that there's no regression risk with (wal_level < logical). But IIRC
>> Andres objected to it, arguing that we should not log the same piece
>> of information in two very different ways at the same time (IIRC it was
>> discussed at the FOSDEM dev meeting, so I don't have a link to share).
>> And I do agree with him ...
>>
>
>So, aren't we worried about the overhead of the amount of WAL and
>performance impact for the transactions? We might want to check the
>pgbench read-write test to see if that will add any significant
>overhead.
>

Well, sure. I agree we need to see how this affects performance, and
I'll do some benchmarks (I think I did that when submitting the patch,
but I don't recall the numbers / details).

Isn't it a bit strange to log stuff twice, though, if we worry about
performance? Surely that's more expensive than logging it just once. Of
course, it might be useful if most systems need just the "old" way.

I know it's going to be a bit hand-wavy, but I think embedding the
assignments into existing WAL messages is about the cheapest way to log
this. I would not expect this to be measurably more expensive than what
we have now, but I might be wrong.
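
To make the hand-waving slightly more concrete, here is a back-of-envelope
sketch of the extra WAL volume under each scheme. It is not a measurement:
the 4-byte XID, the 24-byte per-record header, and the batch of 64 subxact
XIDs per dedicated assignment record are all assumptions chosen for
illustration, and the function names are made up for this example. Real
numbers would have to come from a benchmark.

```python
# Rough model of extra WAL bytes needed to log subxact assignments.
# Every constant here is an illustrative assumption, not a measured value.
XID_BYTES = 4             # assumed size of one transaction ID
RECORD_HEADER_BYTES = 24  # assumed fixed overhead of a separate WAL record
BATCH = 64                # assumed sub-XIDs carried by one dedicated record


def dedicated_bytes(nsubxacts: int) -> int:
    """'Old' way: one dedicated assignment record per full batch of subxacts."""
    nrecords = nsubxacts // BATCH
    # header + toplevel XID + 4-byte count + BATCH sub-XIDs per record
    per_record = RECORD_HEADER_BYTES + XID_BYTES + 4 + BATCH * XID_BYTES
    return nrecords * per_record


def embedded_bytes(nsubxacts: int) -> int:
    """'New' way: toplevel XID embedded in the first record of each subxact."""
    return nsubxacts * XID_BYTES


def both_bytes(nsubxacts: int) -> int:
    """Logging the assignment both ways at once, as v12 of the patch does."""
    return dedicated_bytes(nsubxacts) + embedded_bytes(nsubxacts)


if __name__ == "__main__":
    for n in (10, 1000):
        print(n, dedicated_bytes(n), embedded_bytes(n), both_bytes(n))
```

Under these assumptions the two schemes cost about the same per subxact
once a transaction has many subxacts, and logging both roughly doubles the
overhead. For transactions with only a few subxacts the embedded variant
adds a handful of bytes where the dedicated record would add nothing.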

>> The question is, why couldn't the replica use the same assignment info
>> we already write for logical decoding?
>>
>
>I haven't thought about it in detail, but we can think along those lines
>if the performance overhead is in the acceptable range.
>

OK, let me do some measurements ...

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
