Re: Wrong statistics for size of XLOG_SWITCH during pg_waldump.

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: movead(dot)li(at)highgo(dot)ca
Cc: andres(at)anarazel(dot)de, michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org, ahsan(dot)hadi(at)highgo(dot)ca
Subject: Re: Wrong statistics for size of XLOG_SWITCH during pg_waldump.
Date: 2020-10-15 08:32:10
Message-ID: 20201015.173210.1595235318844221199.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 15 Oct 2020 12:56:02 +0800, "movead(dot)li(at)highgo(dot)ca" <movead(dot)li(at)highgo(dot)ca> wrote in
> Thanks for all the suggestions.
>
> >Yeah. In its current shape, it means that only pg_waldump would be
> >able to know this information. If you make this information part of
> >xlogdesc.c, any consumer of the WAL record descriptions would be able
> >to show this information, so it would provide a consistent output for
> >any kind of tools.
> I have change the implement, move some code into xlog_desc().

Andres suggested that we don't need that description with per-record
basis. Do you have a reason to do that? (For clarity, I'm not
suggesting that you should reving it.)

> >On top of that, it seems to me that the calculation used in the patch
> >is wrong in two aspects at quick glance:
> >1) startSegNo and endSegNo point always to two different segments with
> >a XLOG_SWITCH record, so you should check that ReadRecPtr is not at a
> >segment border instead before extracting SizeOfXLogLongPHD, no?
> Yes you are right, and I use 'record->EndRecPtr - 1' instead.

+ XLByteToSeg(record->EndRecPtr - 1, endSegNo, record->segcxt.ws_segsize);

We use XLByteToPrevSeg instead for this purpose.

> >2) This stuff should also check after the case of a WAL *page* border
> >where you'd need to adjust based on SizeOfXLogShortPHD instead.
> The 'junk_len -= SizeOfXLogLongPHD' code is considered for when the
> remain size of a wal segment can not afford a XLogRecord struct for
> XLOG_SWITCH, it needn't care *page* border.
>
> >I'm not sure the exact motive of this proposal, but if we show the
> >wasted length in the stats result, I think it should be other than
> >existing record types.
> Yes agree, and now it looks like below as new patch:

You forgot to add a correction for short headers.

> movead(at)movead-PC:/h2/pg/bin$ ./pg_waldump -p ../walbk/ -s 0/3000000 -e 0/6000000 -z
> Type N (%) Record size (%) FPI size (%) Combined size (%)
> ---- - --- ----------- --- -------- --- ------------- ---
> XLOG 5 ( 31.25) 300 ( 0.00) 0 ( 0.00) 300 ( 0.00)
> XLOGSwitchJunk 3 ( 18.75) 50330512 (100.00) 0 ( 0.00) 50330512 (100.00)
>
>
> movead(at)movead-PC:/h2/pg/bin$ ./pg_waldump -p ../walbk/ -s 0/3000000 -e 0/6000000 --stat=record
> XLOG/SWITCH 3 ( 18.75) 72 ( 0.00) 0 ( 0.00) 72 ( 0.00)
> XLOG/SWITCH_JUNK 3 ( 18.75) 50330512 (100.00) 0 ( 0.00) 50330512 (100.00)
>
> The shortcoming now is I do not know how to handle the 'count' of SWITCH_JUNK
> in pg_waldump results. Currently I regard SWITCH_JUNK as one count.

+ if(RM_XLOG_ID == rmid && XLOG_SWITCH == info)

We need a comment for the special code path.
We don't follow this kind of convension. Rather use "variable =
constant".

+ {
+ junk_len = xlog_switch_junk_len(record);
+ stats->count_xlog_switch++;
+ stats->junk_size += junk_len;

junk_len is used only the last line above. We don't need that
temporary variable.

+ * If the wal switch record spread on two segments, we should extra minus the

This comment is sticking out of 80-column border. However, I'm not
sure if we have reached a conclustion about the target console-width.

+ info = (rj << 4) & ~XLR_INFO_MASK;
+ switch_junk_id = "XLOG/SWITCH_JUNK";
+ if( XLOG_SWITCH == info && stats->count_xlog_switch > 0)

This line order is strange. At least switch_junk_id is used only in
the if-then block.

I'm not confindent on which is better, but I feel that this is not a
work for display side, but for the recorder side like attached.

> >By the way, I noticed that --stats=record shows two lines for
> >Transaction/COMMIT. The cause is that XLogDumpCountRecord assumes the
> >all four bits of xl_info is used to identify record id.
> Thanks,I didn't notice it before, and your patch added into v3 patch attached.

Sorry for the confusion, but it would be a separate topic even if we
are going to fix that..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
fix_waldump_size_for_wal_switch_v3-1.patch text/x-patch 4.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-10-15 08:32:46 Re: gs_group_1 crashing on 13beta2/s390x
Previous Message Michael Paquier 2020-10-15 08:12:03 Re: setLastTid() and currtid()