Re: archive recovery fetching wrong segments

From: Grigory Smolkin <g(dot)smolkin(at)postgrespro(dot)ru>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: archive recovery fetching wrong segments
Date: 2020-04-06 19:23:55
Message-ID: 3f2465c0-92c2-1497-f987-ad26b3bb7b20@postgrespro.ru
Lists: pgsql-hackers


On 4/6/20 9:17 PM, David Steele wrote:
> Hi Grigory,

Hello!
>
> On 4/5/20 8:02 PM, Grigory Smolkin wrote:
>> Hello, hackers!
>>
>> I'm investigating complaints from our clients about archive recovery
>> being very slow, and I've noticed some really strange and, I think,
>> very dangerous recovery behavior.
>>
>> When running multi-timeline archive recovery, for every requested
>> segno the startup process iterates through every timeline in the
>> restore target timeline history, starting from the highest timeline
>> and ending with the current one, and tries to fetch the segno in
>> question from that timeline.
>
> <snip>
>
>> Is there a reason behind this behavior?
>>
>> I've also attached a patch that fixed this issue for me, but I'm
>> not sure that the chosen approach is sound and doesn't break anything.
>
> This sure looks like [1] which has a completed patch nearly ready to
> commit. Can you confirm and see if the proposed patch looks good?

Well, I've been testing it all day, and so far nothing is broken.

But this foreach loop (xlog.c:3777) looks very strange to me. It is not
robust: we blindly go over the timelines and feed recovery some files,
hoping they are the right ones. I think we can do better, because:
1. we know whether or not we are running multi-timeline recovery;
2. we know the next timeline ID and can calculate the switchpoint segment;
3. so we can make an informed decision about which timeline we must
request files from now.

I will work on it.
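
A rough sketch of the idea in points 1-3, only to illustrate it. This is
not the actual xlog.c code and not the patch from [1]: the types
(HistEntry, Lsn, Tli) and the function tli_for_segment() are hypothetical
stand-ins, and the segment size is assumed to be the default 16MB. The
history array is assumed to be ordered newest timeline first, like the
list the foreach loop walks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real PostgreSQL types. */
typedef uint64_t Lsn;
typedef uint32_t Tli;

typedef struct
{
    Tli tli;    /* timeline ID */
    Lsn begin;  /* LSN at which this timeline starts (its switchpoint) */
} HistEntry;

/* Assumption: default WAL segment size of 16MB. */
#define SEG_SIZE ((Lsn) 16 * 1024 * 1024)

/* Number of the segment that contains the given LSN. */
static uint64_t
seg_of(Lsn lsn)
{
    return lsn / SEG_SIZE;
}

/*
 * Return the timeline to request segment 'segno' from.
 *
 * 'hist' is ordered newest timeline first. A timeline cannot contain a
 * segment that lies entirely before its own switchpoint segment, so the
 * first entry whose starting segment is <= segno is the newest timeline
 * that can own the segment.
 */
static Tli
tli_for_segment(const HistEntry *hist, size_t n, uint64_t segno)
{
    assert(hist != NULL && n > 0);

    for (size_t i = 0; i < n; i++)
    {
        if (seg_of(hist[i].begin) <= segno)
            return hist[i].tli;
    }

    /* Nothing matched: fall back to the oldest timeline in the history. */
    return hist[n - 1].tli;
}
```

With a history of TLI 3 starting in segment 5, TLI 2 starting in segment 2,
and TLI 1 starting at 0, this picks TLI 1 for segments 0-1, TLI 2 for
segments 2-4, and TLI 3 from segment 5 on. It deliberately ignores the
case where the chosen file is missing from the archive; real code would
still need some fallback for that.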

--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
