From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Cc: | Erik Rijkers <er(at)xs4all(dot)nl>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |
Date: | 2020-07-13 06:00:23 |
Message-ID: | CAA4eK1JEA4XvibjQKvpQrCD0qJ6h6dfkaVPGmpZed2hmBjgT-Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Jul 13, 2020 at 10:47 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Sun, Jul 12, 2020 at 9:56 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Mon, Jul 6, 2020 at 11:43 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Jul 6, 2020 at 11:31 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > >
> > > > > > > 10. I got the failure below once. I have not investigated it
> > > > > > > in detail because the patch is still in progress. Do you have any
> > > > > > > idea what causes it?
> > > > > > # Failed test 'check extra columns contain local defaults'
> > > > > > # at t/013_stream_subxact_ddl_abort.pl line 81.
> > > > > > # got: '2|0'
> > > > > > # expected: '1000|500'
> > > > > > # Looks like you failed 1 test of 2.
> > > > > > make[2]: *** [check] Error 1
> > > > > > make[1]: *** [check-subscription-recurse] Error 2
> > > > > > make[1]: *** Waiting for unfinished jobs....
> > > > > > make: *** [check-world-src/test-recurse] Error 2
> > > > >
> > > > > > I also hit this failure once, and it did not reproduce afterwards.
> > > > > > I have run the test multiple times, but it has not failed again.
> > > > > > Are you able to reproduce it consistently?
> > > > >
> > > >
...
..
> >
> > I think the reason for the failure is that we are not setting
> > remote_final_lsn in streaming mode. I added extra logging and reran
> > the test; the logs show that some of the logical WAL changes were not
> > replayed because of the following check in
> > should_apply_changes_for_rel:
> >
> >     return (rel->state == SUBREL_STATE_READY ||
> >             (rel->state == SUBREL_STATE_SYNCDONE &&
> >              rel->statelsn <= remote_final_lsn));
> >
> > I still need to analyze in detail why this fails only in some cases.
> > Most of the time rel->state is SUBREL_STATE_READY, so the check
> > passes, but whenever the state is SUBREL_STATE_SYNCDONE the check
> > fails because we never update remote_final_lsn. I will try setting
> > this value in apply_handle_stream_commit and see whether the test
> > still fails.
>
> I have verified that after setting remote_final_lsn in
> apply_handle_stream_commit, the regression failure did not occur in
> over 70 runs, whereas without that change it failed 6 times in 50 runs.
>
Your analysis and fix seem correct to me.
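
For anyone following along, the effect of the quoted check can be reproduced in isolation. What follows is only a minimal standalone sketch, not the patch itself: RelEntry, should_apply and stream_commit_applies are hypothetical stand-ins for the real apply-worker structures and functions, and the sketch assumes remote_final_lsn is simply left invalid (zero) when streaming mode does not set it, whereas in the real worker it could also hold a stale value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL definitions (assumptions). */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr      ((XLogRecPtr) 0)
#define SUBREL_STATE_SYNCDONE  's'
#define SUBREL_STATE_READY     'r'

/* Hypothetical, reduced version of a subscriber-side relation entry. */
typedef struct RelEntry
{
    char        state;      /* sync state of the relation */
    XLogRecPtr  statelsn;   /* LSN at which that state was reached */
} RelEntry;

/* Models the apply worker's notion of the remote commit LSN. */
static XLogRecPtr remote_final_lsn = InvalidXLogRecPtr;

/* Same condition as the check quoted from should_apply_changes_for_rel. */
static bool
should_apply(const RelEntry *rel)
{
    return (rel->state == SUBREL_STATE_READY ||
            (rel->state == SUBREL_STATE_SYNCDONE &&
             rel->statelsn <= remote_final_lsn));
}

/*
 * Hypothetical model of applying a streamed transaction at commit time.
 * With set_final_lsn == false, remote_final_lsn stays invalid, which is
 * the assumed behaviour before the fix.
 */
static bool
stream_commit_applies(const RelEntry *rel, XLogRecPtr commit_lsn,
                      bool set_final_lsn)
{
    assert(rel != NULL);

    remote_final_lsn = InvalidXLogRecPtr;
    if (set_final_lsn)
        remote_final_lsn = commit_lsn;

    return should_apply(rel);
}
```

In this model, a relation in SUBREL_STATE_READY is applied whether or not remote_final_lsn is set, while a relation in SUBREL_STATE_SYNCDONE is skipped unless remote_final_lsn has been set to a commit LSN at or beyond its statelsn. That matches the intermittent nature of the failure: it only shows up when the commit arrives while the table is still in SYNCDONE.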
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com