Re: WIP: WAL prefetch (another approach)

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: WAL prefetch (another approach)
Date: 2020-04-08 11:27:56
Message-ID: CA+hUKGKGhsRHhHJ4ybftCOjRxErnJYj-yOVamQaAsNmUBx4Aqg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 8, 2020 at 12:52 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> * he gave some feedback on the read_local_xlog_page() modifications: I
> probably need to reconsider the change to logical.c that passes NULL
> instead of cxt to the read_page callback; and the switch statement in
> read_local_xlog_page() probably should have a case for the preexisting
> mode

So... logical.c wants to give its LogicalDecodingContext to any
XLogPageReadCB you give it, via "private_data"; that is, it really
only accepts XLogPageReadCB implementations that understand that (or
ignore it). What I want to do is give every XLogPageReadCB the chance
to have its own state that it is in control of (to receive settings
specific to the implementation, or whatever), that you supply along
with it. We can't do both kinds of things with private_data, so I
have added a second member read_page_data to XLogReaderState. If you
pass in read_local_xlog_page as read_page, then you can optionally
install a pointer to XLogReadLocalOptions as reader->read_page_data,
to activate the new behaviours I added for prefetching purposes.
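
To make the two-pointer arrangement concrete, here is a minimal sketch.
It is not the patch itself: every Mock* name is a hypothetical stand-in
for the real struct or callback, and the "nowait" field is only an
assumed example of a setting an implementation might want to receive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; the real PostgreSQL structs have many more fields. */
typedef struct MockReaderState MockReaderState;
typedef int (*MockPageReadCB) (MockReaderState *reader);

/* Hypothetical per-callback options, loosely modelled on XLogReadLocalOptions. */
typedef struct MockReadLocalOptions
{
    bool        nowait;     /* assumed setting: don't wait for more WAL */
} MockReadLocalOptions;

struct MockReaderState
{
    MockPageReadCB read_page;
    void       *private_data;   /* owned by the caller, e.g. a decoding context */
    void       *read_page_data; /* owned by the read_page implementation; may be NULL */
};

/* Returns 1 if the callback would wait for more data, 0 if it would not. */
static int
mock_read_local_page(MockReaderState *reader)
{
    const MockReadLocalOptions *options;

    assert(reader != NULL);
    options = reader->read_page_data;

    /* No options installed: keep the preexisting (waiting) behaviour. */
    if (options == NULL)
        return 1;

    return options->nowait ? 0 : 1;
}
```

The point of the split is that private_data stays free for whatever the
caller (logical.c, say) wants to put there, while the callback finds its
own settings in read_page_data and falls back to the old behaviour when
nothing is installed.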

While working on that, I realised the readahead XLogReader was
breaking a rule expressed in XLogReadDetermineTimeLine(). Timelines
are really confusing and there were probably several subtle or not too
subtle bugs there. So I added an option to skip all of that logic,
and just say "I command you to read only from TLI X". It reads the
same TLI as recovery is reading, until it hits the end of readable
data and that causes prefetching to shut down. Then the main recovery
loop resets the prefetching module when it sees a TLI switch, so then
it starts up again. This seems to work reliably, but I've obviously
had limited time to test. Does this scheme sound sane?
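
The shut-down-and-reset cycle described above could be modelled roughly
like this. It is a toy sketch under my own assumptions, not code from
the patch: the Mock* types and functions are hypothetical, LSNs are
plain integers, and "end of readable data" is passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t MockTimeLineID;

typedef struct MockPrefetcher
{
    MockTimeLineID tli;     /* the only timeline we are allowed to read */
    uint64_t    next_lsn;   /* next position to read ahead from */
    bool        active;     /* false once we hit the end of readable data */
} MockPrefetcher;

/* (Re)start prefetching on one fixed timeline. */
static void
mock_prefetcher_reset(MockPrefetcher *p, MockTimeLineID tli, uint64_t start_lsn)
{
    assert(p != NULL);
    p->tli = tli;
    p->next_lsn = start_lsn;
    p->active = true;
}

/* Read ahead by one step; shut down at the end of data on our timeline. */
static bool
mock_prefetcher_advance(MockPrefetcher *p, uint64_t end_of_tli_data)
{
    if (!p->active)
        return false;
    if (p->next_lsn >= end_of_tli_data)
    {
        p->active = false;
        return false;
    }
    p->next_lsn++;
    return true;
}

/* Called by the recovery loop per replayed record; resets on a TLI switch. */
static void
mock_recovery_saw_record(MockPrefetcher *p, MockTimeLineID record_tli, uint64_t lsn)
{
    if (record_tli != p->tli)
        mock_prefetcher_reset(p, record_tli, lsn);
}
```

Note that the prefetcher never decides by itself which timeline to
follow; it only stops, and the recovery loop is the one that restarts it
on the new TLI.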

I think this is basically committable (though of course I wish I had
more time to test and review). Ugh. Feature freeze in half an hour.

Attachment Content-Type Size
v7-0001-Rationalize-GetWalRcv-Write-Flush-RecPtr.patch text/x-patch 12.4 KB
v7-0002-Add-pg_atomic_unlocked_add_fetch_XXX.patch text/x-patch 3.3 KB
v7-0003-Allow-XLogReadRecord-to-be-non-blocking.patch text/x-patch 14.5 KB
v7-0004-Prefetch-referenced-blocks-during-recovery.patch text/x-patch 60.8 KB
