Re: prevent immature WAL streaming

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>
Cc: "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "mengjuan(dot)cmj(at)alibaba-inc(dot)com" <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, "Jakub(dot)Wartak(at)tomtom(dot)com" <Jakub(dot)Wartak(at)tomtom(dot)com>
Subject: Re: prevent immature WAL streaming
Date: 2021-08-26 03:24:48
Message-ID: D1901DF4-B986-402A-A128-2D1DD2637016@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8/25/21, 5:40 PM, "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> At Wed, 25 Aug 2021 18:18:59 +0000, "Bossart, Nathan" <bossartn(at)amazon(dot)com> wrote in
>> Let's say we have the following situation (F = flush, E = earliest
>> registered boundary, and L = latest registered boundary), and let's
>> assume that each segment has a cross-segment record that ends in the
>> next segment.
>>
>> F E L
>> |-----|-----|-----|-----|-----|-----|-----|-----|
>> 1 2 3 4 5 6 7 8
>>
>> Then, we write out WAL to disk and create .ready files as needed. If
>> we didn't flush beyond the latest registered boundary, the latest
>> registered boundary now becomes the earliest boundary.
>>
>> F E
>> |-----|-----|-----|-----|-----|-----|-----|-----|
>> 1 2 3 4 5 6 7 8
>>
>> At this point, the earliest segment boundary past the flush point is
>> before the "earliest" boundary we are tracking.
>
> We know we have some cross-segment records in the regin [E L] so we
> cannot add a .ready file if flush is in the region. I haven't looked
> the latest patch (or I may misunderstand the discussion here) but I
> think we shouldn't move E before F exceeds previous (or in the first
> picture above) L. Things are done that way in my ancient proposal in
> [1].

The strategy in place ensures that we track a boundary that doesn't
change until the flush position passes it as well as the latest
registered boundary. I think it is important that any segment
boundary tracking mechanism does at least those two things. I don't
see how we could do that if we didn't update E until F passed both E
and L.

Nathan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ajin Cherian 2021-08-26 03:51:06 Re: Failure of subscription tests with topminnow
Previous Message Amit Kapila 2021-08-26 03:19:56 Re: row filtering for logical replication