Re: Attempt to consolidate reading of XLOG page

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Attempt to consolidate reading of XLOG page
Date: 2019-09-26 12:08:33
Message-ID: 75115.1569499713@antos
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> On 2019-Sep-24, Antonin Houska wrote:
> > Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> > > If you don't have any strong dislikes for these changes, I'll push this
> > > part and let you rebase the remains on top.
> >
> > No objections here.
> oK, pushed. Please rebase the other parts.


> I made one small adjustment: in read_local_xlog_page() there was one
> *readTLI output parameter that was being changed to a local variable
> plus later assigment to the output struct member; I changed the code to
> continue to assign directly to the output variable instead. There was
> an error case in which the TLI was not assigned to; I suppose this
> doesn't really change things (we don't examine the TLI in that case, do
> we?), but it seemed dangerous to leave like that.

I used the local variable to make some expressions simpler, but missed the
fact that this way I can leave the ws_tli field unassigned if the function
returns prematurely. Now that I look closer, I see that it can be a problem -
in the case of ERROR, XLogReadRecord() does reset the state, but it does not
reset the TLI:

* Invalidate the read state. We might read from a different source after
* failure.

Thus the TLI appears to be important even on ERROR, and what you've done is
correct. Thanks for fixing that.

One comment on the remaining part of the series:

Before this refactoring, the walsender.c:XLogRead() function contained these

* After reading into the buffer, check that what we read was valid. We do
* this after reading, because even though the segment was present when we
* opened it, it might get recycled or removed while we read it. The
* read() succeeds in that case, but the data we tried to read might
* already have been overwritten with new WAL records.
XLByteToSeg(startptr, segno, segcxt->ws_segsize);
CheckXLogRemoved(segno, ThisTimeLineID);

but they don't fit into the new, generic implementation, so I copied these
lines to the two places right after the call of the new XLogRead(). However I
was not sure if ThisTimeLineID was ever correct here. It seems the original
walsender.c:XLogRead() implementation did not update ThisTimeLineID (and
therefore neither the new callback WalSndSegmentOpen() does), so both
logical_read_xlog_page() and XLogSendPhysical() could read the data from
another (historic) timeline. I think we should check the segment we really
read data from:

CheckXLogRemoved(segno, sendSeg->ws_tli);

The rebased code is attached.

Antonin Houska

Attachment Content-Type Size
v07-0005-Use-only-xlogreader.c-XLogRead.patch text/x-diff 18.7 KB
v07-0006-Remove-the-old-implemenations-of-XLogRead.patch text/x-diff 13.8 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-09-26 13:06:27 Re: Refactoring of connection with password prompt loop for frontends
Previous Message Alexey Kondratov 2019-09-26 12:08:22 Re: [Patch] pg_rewind: options to use restore_command from recovery.conf or command line