Re: Remove page-read callback from XLogReaderState.

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: hlinnaka(at)iki(dot)fi
Cc: andres(at)anarazel(dot)de, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Remove page-read callback from XLogReaderState.
Date: 2019-08-22 01:43:52
Message-ID: 20190822.104352.26342272.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for the suggestion, Heikki.

At Mon, 29 Jul 2019 22:39:57 +0300, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote in <e1ecb53b-663d-98ed-2249-dfa30a74f8c1(at)iki(dot)fi>
> On 12/07/2019 10:10, Kyotaro Horiguchi wrote:
> >> Just FYI, to me this doesn't clearly enough look like an improvement,
> >> for a change of this size.
> > Thanks for the opiniton. I kinda agree about size but it is a
> > decision between "having multiple callbacks called under the
> > hood" vs "just calling a series of functions". I think the
> > patched XlogReadRecord is easy to use in many situations.
> > It would be better if I could completely refactor the function
> > without the syntax tricks but I think the current patch is still
> > smaller and clearer than overhauling it.
>
> I like the idea of refactoring XLogReadRecord() to not use a callback,
> and return a XLREAD_NEED_DATA value instead. It feels like a nicer,
> easier-to-use, interface, given that all the page-read functions need
> quite a bit of state and internal logic themselves. I remember that I
> felt that that would be a nicer interface when we originally extracted
> xlogreader.c into a reusable module, but I didn't want to make such
> big changes to XLogReadRecord() at that point.
>
> I don't much like the "continuation" style of implementing the state
> machine. Nothing wrong with such a style in principle, but we don't do
> that anywhere else, and the macros seem like overkill, and turning the

Agreed that it's a kind of ugly. I could overhaul the logic to
reduce state variables, but I thought that it would make the
patch hardly reviewable.

The "continuation" style was intended to impact the main path's
shape as small as possible. For the same reason I made variables
static instead of using individual state struct or reducing state
variables. (And it the style was fun for me:p)

> local variables static is pretty ugly. But I think XLogReadRecord()
> could be rewritten into a more traditional state machine.
>
> I started hacking on that, to get an idea of what it would look like
> and came up with the attached patch, to be applied on top of all your
> patches. It's still very messy, it needs quite a lot of cleanup before
> it can be committed, but I think the resulting switch-case state
> machine in XLogReadRecord() is quite straightforward at high level,
> with four states.

Sorry for late reply. It seems less messy than I thought it could
be if I refactored it more aggressively.

> I made some further changes to the XLogReadRecord() interface:
>
> * If you pass a valid ReadPtr (i.e. the starting point to read from)
> * argument to XLogReadRecord(), it always restarts reading from that
> * record, even if it was in the middle of reading another record
> * previously. (Perhaps it would be more convenient to provide a separate
> * function to set the starting point, and remove the RecPtr argument
> * from XLogReadRecord altogther?)

Seems reasonable. randAccess property was replaced with the
state.PrevRecPtr = Invalid. It is easier to understand for me.

> * XLogReaderState->readBuf is now allocated and controlled by the
> * caller, not by xlogreader.c itself. When XLogReadRecord() needs data,
> * the caller makes the data available in readBuf, which can point to the
> * same buffer in all calls, or the caller may allocate a new buffer, or
> * it may point to a part of a larger buffer, whatever is convenient for
> * the caller. (Currently, all callers just allocate a BLCKSZ'd buffer,
> * though). The caller also sets readPagPtr, readLen and readPageTLI to
> * tell XLogReadRecord() what's in the buffer. So all these read* fields
> * are now set by the caller, XLogReadRecord() only reads them.

The caller knows how many byes to be read. So the caller provides
the required buffer seems reasonable.

> * In your patch, if XLogReadRecord() was called with state->readLen ==
> * -1, XLogReadRecord() returned an error. That seemed a bit silly; if an
> * error happened while reading the data, why call XLogReadRecord() at
> * all? You could just report the error directly. So I removed that.

Agreed. I forgot to move the error handling to more proper location.

> I'm not sure how intelligible this patch is in its current state. But
> I think the general idea is good. I plan to clean it up further next
> week, but feel free to work on it before that, either based on this
> patch or by starting afresh from your patch set.

I think you diff is intelligible enough for me. I'll take this if
you haven't done. Anyway I'm staring on this.

Thanks!

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2019-08-22 02:40:30 Re: Why overhead of SPI is so large?
Previous Message Tsunakawa, Takayuki 2019-08-22 00:27:20 RE: Why overhead of SPI is so large?