Re: Attempt to consolidate reading of XLOG page

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Attempt to consolidate reading of XLOG page
Date: 2019-11-20 14:50:29
Message-ID: 88183.1574261429@antos
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> On 2019-Nov-12, Antonin Houska wrote:
> > ok, the next version uses explicit lseek(). Maybe the fact that XLOG is mostly
> > read sequentially (i.e. without frequent seeks) is the reason pread() has't
> > been adopted so far.
> I don't quite understand why you backed off from switching to pread. It
> seemed a good change to me.

I agreed with Michael that it makes comparison of the old and new code more
difficult, and I also thought that his arguments about performance might be
worthwhile because WAL reading is mostly sequential and does not require many
seeks. However things appear to be more complex, see below.

> Here's a few edits on top of your latest.

> ...

I agree with your renamings.

> Change xlr_seg to be a struct rather than pointer to struct. It seems a
> bit dangerous to me to return a pointer that we don't know is going to
> be valid at raise-error time. Struct assignment works fine for the
> purpose.


> I would only like to switch this back to pg_pread() (from seek/read) and
> I'd be happy to commit this.

I realized that, starting from commit 709d003fbd98b975a4fbcb4c5750fa6efaf9ad87
we use the WALOpenSegment.ws_off field incorrectly in
walsender.c:XLogRead(). In that commit we used this field to replace

@@ -156,10 +165,9 @@ struct XLogReaderState
char *readBuf;
uint32 readLen;

- /* last read segment, segment offset, TLI for data currently in readBuf */
- XLogSegNo readSegNo;
- uint32 readOff;
- TimeLineID readPageTLI;
+ /* last read XLOG position for data currently in readBuf */
+ WALSegmentContext segcxt;
+ WALOpenSegment seg;

* beginning of prior page read, and its TLI. Doesn't necessarily

Thus we cannot use it in XLogRead() to track the current position in the
segment file. Although walsender.c:XLogRead() misses this point, it's not
broken because walsender.c does not use XLogReaderState at all.

So if explicit lseek() should be used, another field should be added to
WALOpenSegment. I failed to do so when removing the pg_pread() call from the
patch, and that was the reason for the problem reported here:

Thus the use of pg_pread() makes the code quite a bit simpler, so I
re-introduced it. If you decide that an explicit lseek() should be used yet,
just let me know.

> What is logical_read_local_xlog_page all about? Seems useless. Let's
> get rid of it.

It seems so. Should I post a patch for that?

Antonin Houska

Attachment Content-Type Size
v11-0005-Use-only-xlogreader.c-XLogRead.patch text/x-diff 23.0 KB
v11-0006-Remove-the-old-implemenations-of-XLogRead.patch text/x-diff 10.4 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2019-11-20 14:52:03 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message Alvaro Herrera 2019-11-20 13:54:11 Re: logical decoding : exceeded maxAllocatedDescs for .spill files