Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, marvin_liang(at)qq(dot)com, actyzhang(at)outlook(dot)com
Subject: Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Date: 2022-03-17 08:23:55
Message-ID: CALj2ACVBST5Us6-eDz4q_Gem3rUHSC7AYNOB7tmp9Yqq6PHsXw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 17, 2022 at 10:48 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> It still suggests unspecifiable end-LSN..
>
> > select * from pg_get_wal_records_info('4/4B28EB68', '4/4C000060');
> > ERROR: cannot accept future end LSN
> > DETAIL: Last known WAL LSN on the database system is 4/4C000060.

Thanks, Kyotaro-san. We can change the detail message to show (current
flush LSN or last replayed LSN, minus 1); that's what I've done in v11,
posted upthread at [1]. The underlying problem is that all the
pg_walinspect functions wait for the first valid record in
read_local_xlog_page(), reached via
InitXLogReaderState()->XLogFindNextRecord(); see [2].
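
To make the "minus 1" arithmetic concrete, here is a small standalone
sketch, not code from the patch. It assumes only the textual LSN form
shown in this thread (two hexadecimal halves separated by a slash); the
toy_* names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse an LSN written as "hi/lo" (two hex halves) into a 64-bit value.
 * Returns 0 on success and -1 if the text does not match that shape.
 */
static int
toy_parse_lsn(const char *text, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;

    assert(text != NULL && lsn != NULL);
    if (sscanf(text, "%X/%X", &hi, &lo) != 2)
        return -1;
    *lsn = ((uint64_t) hi << 32) | lo;
    return 0;
}

/* Format a 64-bit LSN back into the "hi/lo" hex form. */
static void
toy_format_lsn(uint64_t lsn, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    snprintf(buf, buflen, "%X/%X",
             (unsigned int) (lsn >> 32), (unsigned int) (lsn & 0xFFFFFFFFu));
}

/*
 * Given the flush (or last replayed) LSN as text, write the LSN one
 * byte before it into buf. Returns -1 on bad input or if the LSN is 0.
 */
static int
toy_last_known_lsn(const char *flush_text, char *buf, size_t buflen)
{
    uint64_t flush;

    if (toy_parse_lsn(flush_text, &flush) != 0 || flush == 0)
        return -1;
    toy_format_lsn(flush - 1, buf, buflen);
    return 0;
}
```

With the flush LSN from [2] as input, toy_last_known_lsn() produces the
value that appears in the DETAIL line there.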

We have two options:
1) Just document the behaviour: "pg_walinspect functions will wait for
the first valid WAL record if none is found after the specified
input LSN/start LSN." This is easier, but some may see the waiting as a
problem.
2) Add read_local_xlog_page_2, which, unlike read_local_xlog_page and
like pg_waldump's WALDumpReadPage, doesn't wait for a future WAL LSN.
The new function would look almost identical to read_local_xlog_page
except for the wait (pg_usleep) loop, so we can avoid code duplication
by moving the read_local_xlog_page code into a static function
read_local_xlog_page_guts(existing params, bool wait):

read_local_xlog_page(params)
    read_local_xlog_page_guts(existing params, true);

read_local_xlog_page_2(params)
    read_local_xlog_page_guts(existing params, false);

read_local_xlog_page_guts:
    if (wait) wait for future WAL; ---> existing pg_usleep code in
                                        read_local_xlog_page.
    else return;
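
As a rough illustration of option 2, here is a self-contained toy
model of the "guts plus two thin wrappers" shape. It is a sketch under
stated assumptions, not the real xlogutils.c code: the toy_* names, the
integer LSNs, and the simulated sleep are all hypothetical. It follows
the prose description above, so the existing entry point keeps waiting
and the new variant returns immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ToyLsn;

/* Hypothetical stand-in for the server's current flush position. */
static ToyLsn toy_flush_lsn = 100;

/* Number of simulated sleeps, so callers can observe the waiting. */
static int toy_sleep_count = 0;

/*
 * Stand-in for the pg_usleep() step of the wait loop. To keep the
 * sketch deterministic, each "sleep" pretends 50 more bytes of WAL
 * were flushed in the meantime.
 */
static void
toy_sleep_and_advance(void)
{
    toy_sleep_count++;
    toy_flush_lsn += 50;
}

/*
 * Shared body: succeed (0) once WAL up to target_end is available.
 * If it is not available yet, either keep waiting or give up with -1,
 * depending on the wait flag.
 */
static int
toy_read_page_guts(ToyLsn target_end, bool wait)
{
    assert(target_end > 0);

    while (target_end > toy_flush_lsn)
    {
        if (!wait)
            return -1;          /* future WAL requested; do not block */
        toy_sleep_and_advance();
    }
    return 0;
}

/* Existing behaviour: block until the requested WAL shows up. */
int
toy_read_page(ToyLsn target_end)
{
    return toy_read_page_guts(target_end, true);
}

/* New variant: never wait for future WAL. */
int
toy_read_page_no_wait(ToyLsn target_end)
{
    return toy_read_page_guts(target_end, false);
}
```

Keeping the flag private to the static function means existing callers
of the waiting entry point would not need to change.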

I'm fine either way. Please let me know your thoughts.

[1] https://www.postgresql.org/message-id/CALj2ACU8XjbYbMwh5x6hEUJdpRoG9%3DPO52_tuOSf1%3DMO7WtsmQ%40mail.gmail.com
[2]
postgres=# select pg_current_wal_flush_lsn();
 pg_current_wal_flush_lsn
--------------------------
 0/1624430
(1 row)

postgres=# select * from pg_get_wal_record_info('0/1624430');
ERROR: cannot accept future input LSN
DETAIL: Last known WAL LSN on the database system is 0/162442F.
postgres=# select * from pg_get_wal_record_info('0/162442f');
---> waits for the first valid record in read_local_xlog_page.

Regards,
Bharath Rupireddy.
