|From:||Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>|
|Cc:||andres(at)anarazel(dot)de, michael(dot)paquier(at)gmail(dot)com, sfrost(at)snowman(dot)net, nag1010(at)gmail(dot)com, jdnelson(at)dyn(dot)com, pgsql-hackers(at)postgresql(dot)org|
|Subject:||Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?|
|Views:||Raw Message | Whole Thread | Download mbox|
Thank you for adding the detailed comment and commiting.
At Sat, 5 May 2018 01:49:31 +0300, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote in <47215279-228d-f30d-35d1-16af695e53f3(at)iki(dot)fi>
> On 04/05/18 10:05, Heikki Linnakangas wrote:
> > On 24/04/18 13:57, Kyotaro HORIGUCHI wrote:
> >> At Mon, 23 Apr 2018 03:41:47 -0400, Heikki Linnakangas
> >> <hlinnaka(at)iki(dot)fi> wrote in
> >> <89e33d4f-5c75-0738-3dcb-44c4df59e920(at)iki(dot)fi>
> >>> Looking at the patch linked above
> >>> (https://www.postgresql.org/message-id/20171026.190551.208996945.horiguchi.kyotaro%40lab.ntt.co.jp):
> >>>> --- a/src/backend/access/transam/xlog.c
> >>>> +++ b/src/backend/access/transam/xlog.c
> >>>> @@ -11693,6 +11693,10 @@ retry:
> >>>> Assert(reqLen <= readLen);
> >>>> *readTLI = curFileTLI;
> >>>> +
> >>>> + if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr,
> >>>> readBuf))
> >>>> + goto next_record_is_invalid;
> >>>> +
> >>>> return readLen;
> >>>> next_record_is_invalid:
> >>> What is the point of adding this XLogReaderValidatePageHeader() call?
> >>> The caller of this callback function, ReadPageInternal(), checks the
> >>> page header anyway. Earlier in this thread, you said:
> >> Without the lines, server actually fails to start replication.
> >> (I try to remember the details...)
> >> The difference is whether the function can cause retry for the
> >> same portion of a set of continued records (without changing the
> >> plugin API). XLogPageRead can do that. On the other hand all
> >> ReadPageInternal can do is just letting the caller ReadRecords
> >> retry from the beginning of the set of continued records since
> >> the caller side handles only complete records.
> >> In the failure case, in XLogPageRead, when read() fails, it can
> >> try the next source midst of a continued records. On the other
> >> hand if the segment was read but it was recycled one, it passes
> >> "success" to ReadPageInternal and leads to retring from the
> >> beginning of the recrod. Infinite loop.
> > I see. You have the same problem if you have a WAL file that's corrupt
> > in some other way, but one of the sources would have a correct copy.
> > ValidXLogPageHeader() only checks the page header, after all. But
> > unlike
> > a recycled WAL segment, that's not supposed to happen as part of
> > normal
> > operation, so I guess we can live with that.
Anyway we read successive complete records from different sources
so I think if such curruption causes a problem we would have
faced the same problem even without this.
> Pushed this now, after adding some comments. Thanks!
> >> Calling XLogReaderValidatePageHeader in ReadPageInternal is
> >> redundant, but removing it may be interface change of xlogreader
> >> plugin and I am not sure that is allowed.
> > We should avoid doing that in back-branches, at least. But in
> > 'master',
> > I wouldn't mind redesigning the API. Dealing with all the retrying is
> > pretty complicated as it is, if we can simplify that somehow, I think
> > that'd be good.
> I spent some time musing on what a better API might look like. We
> could remove the ReadPage callback, and instead have XLogReadRecord
> return a special return code to mean "give me more data". I'm thinking
> of something like:
Sounds reasonable. That makes an additional return/call iteration
to XLogReadRecord but it would be ignorable comparing to the cost
> /* return code of XLogReadRecord() */
> typedef enum
> XLREAD_INVALID_RECORD, /* a record was read, but it was corrupt */
> XLREAD_INVALID_PAGE, /* the page that was supplied looks invalid. */
> XLREAD_NEED_DATA, /* caller should place more data in buffer, and
> retry */
> } XLogReadRecord_Result;
> And the calls to XLogReadRecord() would look something like this:
> rc = XLogReadRecord(reader, startptr, errormsg);
> if (rc == XLREAD_SUCCESS)
> /* great, got record */
> if (rc == XLREAD_INVALID_PAGE || XLREAD_INVALID_RECORD)
> elog(ERROR, "invalid record");
> if (rc == XLREAD_NEED_DATA)
> * Read a page from disk, and place it into reader->readBuf
> XLogPageRead(reader->readPagePtr, /* page to read */
> reader->reqLen /* # of bytes to read */ );
> * Now that we have read the data that XLogReadRecord()
> * requested, call it again.
> So instead of having a callback, XLogReadRecord() would return
> XLREAD_NEED_DATA. The caller would then need to place that data into
> the buffer, and call it again. If a record spans multiple pages,
> XLogReadRecord() would return with XLREAD_NEED_DATA multiple times, to
> read each page.
It seems easier to understand at a glance. In short, I like it.
> The important difference for the bug we're discussing on this thread
> is is that if you passed an invalid page to XLogReadRecord(), it would
> return with XLREAD_INVALID_PAGE. You could then try reading the same
> page from a different source, and call XLogReadRecord() again, and it
> could continue where it was left off, even if it was in the middle of
> a continuation record.
We will have more control on how to continue reading WAL with
this than the current code and it seems preferable as the whole.
> This is clearly not backpatchable, but maybe something to think about
> for v12.
Are you planning to working on this shortly?
NTT Open Source Software Center
|Next Message||Ralf Jung||2018-05-14 09:15:44||"REVOKE ... ON DATABASE template1 ..." has no effect|
|Previous Message||Tom Lane||2018-05-14 05:18:42||Re: Abnormal JSON query performance|
|Next Message||Pavan Deolasee||2018-05-14 07:44:22||Re: PANIC during crash recovery of a recently promoted standby|
|Previous Message||Simon Muller||2018-05-14 06:35:34||Re: Allow COPY's 'text' format to output a header|