From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Cc: | Erik Rijkers <er(at)xs4all(dot)nl>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
Date: | 2020-07-06 06:01:06 |
Message-ID: | CAA4eK1LKHs4Jd-6XvLdManqw7YmeC6jg+dFeF=SJxxPFC9UAHQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Jul 5, 2020 at 4:47 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Sat, Jul 4, 2020 at 11:35 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
>
> > 9.
> > +ReorderBufferHandleConcurrentAbort(ReorderBuffer *rb, ReorderBufferTXN *txn,
> > {
> > ..
> > + ReorderBufferToastReset(rb, txn);
> > + if (specinsert != NULL)
> > + ReorderBufferReturnChange(rb, specinsert);
> > ..
> > }
> >
> > Why do we need to do these here when we wouldn't have been done for
> > any exception other than ERRCODE_TRANSACTION_ROLLBACK?
>
> Because we are handling this exception "ERRCODE_TRANSACTION_ROLLBACK"
> gracefully and we are continuing with further decoding so we need to
> return this change back.
>
Okay, then I suggest we should do these before calling stream_stop and
also move ReorderBufferResetTXN after calling stream_stop to follow a
pattern similar to try block unless there is a reason for not doing
so. Also, it would be good if we can initialize specinsert with NULL
after returning the change as we are doing at other places.
> > 10. I have got the below failure once. I have not investigated this
> > in detail as the patch is still under progress. See, if you have any
> > idea?
> > # Failed test 'check extra columns contain local defaults'
> > # at t/013_stream_subxact_ddl_abort.pl line 81.
> > # got: '2|0'
> > # expected: '1000|500'
> > # Looks like you failed 1 test of 2.
> > make[2]: *** [check] Error 1
> > make[1]: *** [check-subscription-recurse] Error 2
> > make[1]: *** Waiting for unfinished jobs....
> > make: *** [check-world-src/test-recurse] Error 2
>
> Even I got the failure once and after that, it did not reproduce. I
> have executed it multiple time but it did not reproduce again. Are
> you able to reproduce it consistently?
>
No, I am also not able to reproduce it consistently but I think this
can fail if a subscriber sends the replay_location before actually
replaying the changes. First, I thought that extra send_feedback we
have in apply_handle_stream_commit might have caused this but I guess
that can't happen because we need the commit time location for that
and we are storing the same at the end of apply_handle_stream_commit
after applying all messages. I am not sure what is going on here. I
think we somehow need to reproduce this or some variant of this test
consistently to find the root cause.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Dilip Kumar | 2020-07-06 06:13:59 | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
Previous Message | Fujii Masao | 2020-07-06 05:29:28 | Re: track_planning causing performance regression |