Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-06-17 04:39:33
Message-ID: CAFiTN-uct6cXApRG5OVyRTyvx0-XMQoDaRk68a5n+rR6EVJzPw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 17, 2020 at 9:33 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Jun 16, 2020 at 7:49 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Tue, Jun 9, 2020 at 3:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Jun 8, 2020 at 11:53 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > I think one of the usages we still need is in ReorderBufferForget
> > > > because it can be called when we skip processing the txn. See the
> > > > comments in DecodeCommit where we call this function. If I am
> > > > correct, we probably need to collect all invalidations in
> > > > ReorderBufferTxn, just as we collect tuplecids, and use them here.
> > > > We can do the same during processing of XLOG_XACT_INVALIDATIONS.
> > > >
> > >
> > > One more point related to this is that after this patch series, we
> > > need to consider executing all invalidations during transaction
> > > abort. It is possible that, due to memory overflow, we have already
> > > processed some of the messages, including a few XACT_INVALIDATION
> > > messages, so to avoid cache pollution we need to execute all of
> > > them on abort. We do a similar thing in Rollback/Rollback To
> > > Savepoint; see AtEOXact_Inval and AtEOSubXact_Inval.
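
(To make sure we mean the same thing, a rough sketch of the collection
described above could look like the below. This is illustrative only:
the helper name is made up, though it reuses the existing
ninvalidations/invalidations fields of ReorderBufferTxn that today hold
the messages taken from the commit record.)

/*
 * Sketch only: accumulate the messages carried by an
 * XLOG_XACT_INVALIDATIONS record into the transaction's own list, so
 * that ReorderBufferForget() can execute them even when we skip
 * processing the transaction.
 */
static void
ReorderBufferAccumulateInvalidations(ReorderBuffer *rb, TransactionId xid,
                                     XLogRecPtr lsn, Size nmsgs,
                                     SharedInvalidationMessage *msgs)
{
    ReorderBufferTxn *txn;
    Size        sz = nmsgs * sizeof(SharedInvalidationMessage);

    txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);

    if (txn->invalidations == NULL)
        txn->invalidations = (SharedInvalidationMessage *)
            MemoryContextAlloc(rb->context, sz);
    else
        txn->invalidations = (SharedInvalidationMessage *)
            repalloc(txn->invalidations,
                     txn->ninvalidations *
                     sizeof(SharedInvalidationMessage) + sz);

    memcpy(txn->invalidations + txn->ninvalidations, msgs, sz);
    txn->ninvalidations += nmsgs;
}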
> >
> > I have analyzed this further and I think there is a problem with
> > that. If, instead of keeping each invalidation as an individual
> > change, we combine them in ReorderBufferTxn's invalidation list,
> > then what happens if a (sub)transaction is aborted? In that case we
> > will end up executing invalidations for (sub)transactions that never
> > polluted the cache, because we never streamed them. So this will
> > affect the normal case where we haven't streamed the transaction:
> > every time a transaction aborts, we would execute all the
> > invalidations it logged. One way around this is to build the list at
> > the sub-transaction level and, just before sending the transaction
> > (on commit), combine all the (sub)transactions' invalidation lists.
> > But since we already have the invalidations in the commit record, I
> > think there is no point in adding this complexity.
> > My main worry is about streamed transactions, where the problem is:
> > - On the arrival of an individual invalidation, we cannot directly
> > add it to the top-level transaction's invalidation list, because if
> > a subtransaction aborts before we stream (or we stream directly on
> > commit), we end up with an unnecessarily long list of invalidations
> > generated by aborted subtransactions.
> >
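(For clarity, the per-subtransaction variant described above might look
roughly like this. Only an illustration, not part of the patch: by the
time we send the transaction, txn->subtxns holds only the
subtransactions that are not known to be aborted, so aborted ones would
not contribute.)

/*
 * Sketch only: just before sending the transaction (on commit), fold
 * each remaining subtransaction's invalidation list into the top-level
 * transaction's list.
 */
static void
ReorderBufferCombineInvalidations(ReorderBuffer *rb, ReorderBufferTxn *txn)
{
    dlist_iter  iter;

    dlist_foreach(iter, &txn->subtxns)
    {
        ReorderBufferTxn *subtxn;
        Size        sz;

        subtxn = dlist_container(ReorderBufferTxn, node, iter.cur);

        if (subtxn->ninvalidations == 0)
            continue;

        sz = (txn->ninvalidations + subtxn->ninvalidations) *
            sizeof(SharedInvalidationMessage);

        if (txn->invalidations == NULL)
            txn->invalidations = (SharedInvalidationMessage *)
                MemoryContextAlloc(rb->context, sz);
        else
            txn->invalidations = (SharedInvalidationMessage *)
                repalloc(txn->invalidations, sz);

        memcpy(txn->invalidations + txn->ninvalidations,
               subtxn->invalidations,
               subtxn->ninvalidations * sizeof(SharedInvalidationMessage));
        txn->ninvalidations += subtxn->ninvalidations;
    }
}
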
>
> Is there any problem you see with this, or are you concerned about
> efficiency? Please note that we already do something similar in
> ReorderBufferForget, so if your concern is efficiency, it applies to
> the existing cases as well. I think we can improve it later in many
> ways if we want (you have already suggested one); at this point the
> main thing is correctness, and aborts are not frequent enough to
> worry too much about their performance.

As of now, I don't see a problem; I was just concerned about processing
more invalidation messages in the abort case than the current code
does, even when streaming is off or the transaction was never streamed
because the memory limit was not crossed. But I agree that this happens
only on abort, so I will work on this, and later we can test the
performance.
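
Concretely, what I have in mind is roughly the change below to
ReorderBufferAbort(). This is a sketch against the current code,
assuming ReorderBufferExecuteInvalidations keeps its present shape of
looping over txn->invalidations and calling
LocalExecuteInvalidationMessage; a real patch would also need the
transaction-context handling that ReorderBufferImmediateInvalidation
does.

void
ReorderBufferAbort(ReorderBuffer *rb, TransactionId xid, XLogRecPtr lsn)
{
    ReorderBufferTxn *txn;

    txn = ReorderBufferTXNByXid(rb, xid, false, NULL, InvalidXLogRecPtr,
                                false);

    /* unknown, nothing to remove */
    if (txn == NULL)
        return;

    /*
     * New: parts of this transaction may already have been decoded and
     * streamed, executing some XACT_INVALIDATION messages along the
     * way; execute everything we collected so the caches are not left
     * polluted, mirroring AtEOXact_Inval() on the backend side.
     */
    if (txn->ninvalidations > 0)
        ReorderBufferExecuteInvalidations(rb, txn);

    /* cosmetic... */
    txn->final_lsn = lsn;

    /* remove potential on-disk data, and deallocate */
    ReorderBufferCleanupTXN(rb, txn);
}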

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
