Re: speed up a logical replica setup

From: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-01-05 06:49:14
Message-ID: CANhcyEXjOcnOJKdOfK8Z1+bbymuZzwSjrppbt4FKHbrDD_t7Mg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 4 Jan 2024 at 16:46, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Jan 4, 2024 at 12:22 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> >
> > Hi,
> > I was testing the patch with following test cases:
> >
> > Test 1 :
> > - Create a 'primary' node
> > - Setup physical replica using pg_basebackup "./pg_basebackup –h
> > localhost –X stream –v –R –W –D ../standby "
> > - Insert data before and after pg_basebackup
> > - Run pg_subscriber and then insert some data to check logical
> > replication "./pg_subscriber –D ../standby -S “host=localhost
> > port=9000 dbname=postgres” -P “host=localhost port=9000
> > dbname=postgres” -d postgres"
> > - Also check pg_publication, pg_subscriber and pg_replication_slots tables.
> >
> > Observation:
> > Data is not lost. Replication is happening correctly. Pg_subscriber is
> > working as expected.
> >
> > Test 2:
> > - Create a 'primary' node
> > - Use normal pg_basebackup but don’t set up Physical replication
> > "./pg_basebackup –h localhost –v –W –D ../standby"
> > - Insert data before and after pg_basebackup
> > - Run pg_subscriber
> >
> > Observation:
> > Pg_subscriber command is not completing and is stuck with following
> > log repeating:
> > LOG: waiting for WAL to become available at 0/3000168
> > LOG: invalid record length at 0/3000150: expected at least 24, got 0
> >
>
> I think probably the required WAL is not copied. Can you use the -X
> option to stream WAL as well and then test? But I feel in this case
> also, we should wait for some threshold time and then exit with
> failure, removing new objects created, if any.

I have tested with -X stream option in pg_basebackup as well. In this
case also the pg_subscriber command is getting stuck.
logs:
2024-01-05 11:49:34.436 IST [61948] LOG: invalid resource manager ID
102 at 0/3000118
2024-01-05 11:49:34.436 IST [61948] LOG: waiting for WAL to become
available at 0/3000130

>
> > Test 3:
> > - Create a 'primary' node
> > - Use normal pg_basebackup but don’t set up Physical replication
> > "./pg_basebackup –h localhost –v –W –D ../standby"
> > -Insert data before pg_basebackup but not after pg_basebackup
> > -Run pg_subscriber
> >
> > Observation:
> > Pg_subscriber command is not completing and is stuck with following
> > log repeating:
> > LOG: waiting for WAL to become available at 0/3000168
> > LOG: invalid record length at 0/3000150: expected at least 24, got 0
> >
>
> This is similar to the previous test and you can try the same option
> here as well.
For this test as well tried with -X stream option in pg_basebackup.
It is getting stuck here as well with similar log.

Will investigate the issue further.

Thanks and regards
Shlok Kyal

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2024-01-05 07:11:33 Re: Adding facility for injection points (or probe points?) for more advanced tests
Previous Message Nathan Bossart 2024-01-05 06:14:15 Re: verify predefined LWLocks have entries in wait_event_names.txt