Re: pg_receivewal starting position

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: pg_receivewal starting position
Date: 2021-08-28 12:10:25
Message-ID: CALj2ACW6YA0SfqwPm1-FRRdeNj1e9isiSax3jZ8MmyQYYQQWNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 26, 2021 at 5:45 PM Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> wrote:
> order to fail early if the replication slot doesn't exist, so please find
> attached v2 for that.

Thanks for the patches. Here are some comments:

1) While the intent of these patches looks good, I have following
concern with new replication command READ_REPLICATION_SLOT: what if
the pg_receivewal exits (because user issued a SIGINT or for some
reason) after flushing the received WAL to disk, before it sends
sendFeedback to postgres server's walsender so that it doesn't get a
chance to update the restart_lsn in the replication slot via
PhysicalConfirmReceivedLocation. If the pg_receivewal is started
again, isn't it going to get the previous restart_lsn and receive the
last chunk of flushed WAL again?

2) What is the significance of READ_REPLICATION_SLOT for logical
replication slots? I read above that somebody suggested to restrict
the walsender to handle READ_REPLICATION_SLOT for physical replication
slots so that the callers will see a command failure. But I tend to
think that it is clean to have this command common for both physical
and logical replication slots and the callers can have an Assert(type
== 'physical').

3) Isn't it useful to send active, active_pid info of the replication
slot via READ_REPLICATION_SLOT? pg_receivewal can use Assert(active ==
true && active_pid == getpid()) as an assertion to ensure that it is
the sole owner of the replication slot? Also, is it good send
wal_status info

4) I think below messages should start with lower case letter and also
there are some typos:
+ pg_log_warning("Could not fetch the replication_slot \"%s\" information "
+ pg_log_warning("Server does not suport fetching the slot's position, "
something like:
+ pg_log_warning("could not fetch replication slot \"%s\" information, "
+ "resuming from current server position instead", replication_slot);
+ pg_log_warning("server does not support fetching replication slot
information, "
+ "resuming from current server position instead");

5) How about emitting the above messages in case of "verbose"?

6) How about an assertion like below?
+ if (stream.startpos == InvalidXLogRecPtr)
+ {
+ stream.startpos = serverpos;
+ stream.timeline = servertli;
+ }
+
+Assert(stream.startpos != InvalidXLogRecPtr)>>

7) How about we let pg_receivewal use READ_REPLICATION_SLOT as an option?

8) Just an idea, how about we store pg_receivewal's lastFlushPosition
in a file before pg_receivewal exits and compare it with the
restart_lsn that it received from the replication slot, if
lastFlushPosition == received_restart_lsn well and good, if not, then
something would have happened and we always start at the
lastFlushPosition ?

Regards,
Bharath Rupireddy.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2021-08-28 12:10:55 pg_receivewal: remove extra conn = NULL; in StreamLog
Previous Message Zhihong Yu 2021-08-28 11:33:59 Re: Gather performance analysis