Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Noah Misch <noah(at)leadboat(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-12-09 09:26:25
Message-ID: 20201209092625.GA1685466@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 02, 2020 at 01:50:25PM +0530, Amit Kapila wrote:
> On Wed, Dec 2, 2020 at 1:20 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > On Mon, Nov 30, 2020 at 6:49 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > On Mon, Nov 30, 2020 at 3:14 AM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > > > > On Mon, Sep 07, 2020 at 12:00:41PM +0530, Amit Kapila wrote:
> > > > > > Thanks, I have pushed the last patch. Let's wait for a day or so to
> > > > > > see the buildfarm reports
> > > > >
> > > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2020-09-08%2006%3A24%3A14
> > > > > failed the new 015_stream.pl test with the subscriber looping like this:
> > > > >
> > > > > 2020-09-08 11:22:49.848 UTC [13959252:1] LOG: logical replication apply worker for subscription "tap_sub" has started
> > > > > 2020-09-08 11:22:54.045 UTC [13959252:2] ERROR: could not open temporary file "16393-510.changes.0" from BufFile "16393-510.changes": No such file or directory
> > > > > 2020-09-08 11:22:54.055 UTC [7602182:1] LOG: logical replication apply worker for subscription "tap_sub" has started
> > > > > 2020-09-08 11:22:54.101 UTC [31785284:4] LOG: background worker "logical replication worker" (PID 13959252) exited with exit code 1
> > > > > 2020-09-08 11:23:01.142 UTC [7602182:2] ERROR: could not open temporary file "16393-510.changes.0" from BufFile "16393-510.changes": No such file or directory
> > > > > ...

> > > > The above kind of error can happen due to the following reasons: (a)
> > > > the first time we sent the stream and created the file and that got
> > > > removed before the second stream reached the subscriber. (b) from the
> > > > publisher-side, we never sent the indication that it is the first
> > > > stream and the subscriber directly tries to open the file thinking it
> > > > is already there.

Further testing showed it was a file location problem, not a deletion problem.
The worker tried to open
base/pgsql_tmp/pgsql_tmp9896408.1.sharedfileset/16393-510.changes.0, but these
were the files actually existing:

[nm(at)power-aix 0:2 2020-12-08T13:56:35 64gcc 0]$ ls -la $(find src/test/subscription/tmp_check -name '*sharedfileset*')
src/test/subscription/tmp_check/t_015_stream_subscriber_data/pgdata/base/pgsql_tmp/pgsql_tmp9896408.0.sharedfileset:
total 408
drwx------ 2 nm usr 256 Dec 08 03:20 .
drwx------ 4 nm usr 256 Dec 08 03:20 ..
-rw------- 1 nm usr 207806 Dec 08 03:20 16393-510.changes.0

src/test/subscription/tmp_check/t_015_stream_subscriber_data/pgdata/base/pgsql_tmp/pgsql_tmp9896408.1.sharedfileset:
total 0
drwx------ 2 nm usr 256 Dec 08 03:20 .
drwx------ 4 nm usr 256 Dec 08 03:20 ..
-rw------- 1 nm usr 0 Dec 08 03:20 16393-511.changes.0

> > I have executed "make check" in the loop with only this file. I have
> > repeated it 5000 times but no failure, I am wondering shall we try to
> > execute in the same machine in a loop where it failed once?
>
> Yes, that might help. Noah, would it be possible for you to try that

The problem is xidhash using strcmp() to compare keys; it needs memcmp(). For
this to matter, xidhash must contain more than one element. Existing tests
rarely exercise the multi-element scenario. Under heavy load, on this system,
the test publisher can have two active transactions at once, in which case it
does exercise multi-element xidhash. (The publisher is sensitive to timing,
but the subscriber is not; once WAL contains interleaved records of two XIDs,
the subscriber fails every time.) This would be much harder to reproduce on a
little-endian system, where strcmp(&xid, &xid_plus_one)!=0. On big-endian,
every small XID has zero in the first octet; they all look like empty strings.

The attached patch has the one-line fix and some test suite changes that make
this reproduce frequently on any big-endian system. I'm currently planning to
drop the test suite changes from the commit, but I could keep them if folks
like them. (They'd need more comments and timeout handling.)

Attachment Content-Type Size
xidhash-blobs-v1.patch text/plain 3.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2020-12-09 09:41:54 Re: Change default of checkpoint_completion_target
Previous Message Dilip Kumar 2020-12-09 09:07:44 Re: Parallel INSERT (INTO ... SELECT ...)