Re: min_safe_lsn column in pg_replication_slots view

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org
Subject: Re: min_safe_lsn column in pg_replication_slots view
Date: 2020-06-18 06:22:16
Message-ID: 20200618.152216.1889381045723164350.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the comments.

At Wed, 17 Jun 2020 21:37:55 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
> On 2020/06/15 16:35, Kyotaro Horiguchi wrote:
> > At Mon, 15 Jun 2020 13:44:31 +0900, Michael Paquier
> > <michael(at)paquier(dot)xyz> wrote in
> >> On Mon, Jun 15, 2020 at 12:40:03PM +0900, Fujii Masao wrote:
> >>> BTW, I just wonder why each row in pg_replication_slots needs to have
> >>> min_safe_lsn column? Basically min_safe_lsn should be the same between
> >>> every replication slots.
> >>
> >> Indeed, that's confusing in its current shape. I would buy putting
> >> this value into pg_replication_slots if there were some consistency of
> >> the data to worry about because of locking issues, but here this data
> >> is controlled within info_lck, which is out of the the repslot LW
> >> lock. So I think that it is incorrect to put this data in this view
> >> and that we should remove it, and that instead we had better push for
> >> a system view that maps with the contents of XLogCtl.
>
> Agreed. But as you know it's too late to do that for v13...
> So firstly I'd like to fix the issues in pg_replication_slots view,
> and then we can improve the things later for v14 if necessary.
>
>
> > It was once the difference from the safe_lsn to restart_lsn which is
> > distinct among slots. Then it was changed to the safe_lsn. I agree to
> > the discussion above, but it is needed anywhere since no one can know
> > the margin until the slot goes to the "lost" state without it. (Note
> > that currently even wal_status and min_safe_lsn can be inconsistent in
> > a line.)
> > Just for the need for table-consistency and in-line consistency, we
> > could just remember the result of XLogGetLastRemovedSegno() around
> > taking info_lock in the function. That doesn't make a practical
> > difference but makes the view look consistent.
>
> Agreed. Thanks for the patch. Here are the review comments:
>
>
> Not only the last removed segment but also current write position
> should be obtain at the beginning of pg_get_replication_slots()
> and should be given to GetWALAvailability(), for the consistency?

You are right. Though I faintly thought that I didn't need that since
WriteRecPtr doesn't move by so wide steps as removed_segment, actually
it moves.

> Even after applying your patch, min_safe_lsn is calculated for
> each slot even though the calculated result is always the same.
> Which is a bit waste of cycle. We should calculate min_safe_lsn
> at the beginning and use it for each slot?

Agreed. That may results in a wastful calculation but it's better than
repeated wasteful calculations.

> XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,
>
> Isn't it better to use 1 as the second argument of the above,
> in order to address the issue that I reported upthread?
> Otherwise, the WAL file name that pg_walfile_name(min_safe_lsn)
> returns
> would be confusing.

Mmm. pg_walfile_name seems too specialize to
pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
segment boundaries.) I'm not willing to do that only to follow such
suspicious(?) specification, but surely it would practically be better
doing that. Please find the attached first patch. I found that
there's no reason to hide min_safe_lsn when wal_status has certain
values. So I changed it to be shown always.

By the way, I noticed that when a replication slot reserves all
existing WAL segments, checkpoint cannot remove a file and
lastRemovedSegment is left being 0. The second attached forces
RemoveOldXlogFiles to initialize the variable even when no WAL
segments are removed. It puts no additional loads on file system
since the directory is scanned anyway. My old proposal to
unconditionally initialize it separately from checkpoint was rejected,
but I think this is acceptable.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v2-0001-Make-pg_stat_replication-view-consistent.patch text/x-patch 6.1 KB
v2-0002-Forcibly-initialize-lastRemovedSegNo-at-the-first.patch text/x-patch 2.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message 李杰 (慎追) 2020-06-18 06:37:43 回复:回复:回复:回复:how to create index concurrently on partitioned table
Previous Message Thomas Munro 2020-06-18 06:05:50 Re: Fast DSM segments