Re: Improving replay of XLOG_BTREE_VACUUM records

From: Vladimir Borodin <root(at)simply(dot)name>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kevin Grittner <kgrittn(at)ymail(dot)com>
Subject: Re: Improving replay of XLOG_BTREE_VACUUM records
Date: 2015-05-03 09:27:29
Message-ID: 5D3B0EA2-53CA-4B58-B819-5C8A57BD120E@simply.name
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, Jim.

Thanks for review.

> 2 мая 2015 г., в 2:10, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> написал(а):
>
> On 5/1/15 11:19 AM, Vladimir Borodin wrote:
>> There are situations in which vacuuming big btree index causes stuck in
>> WAL replaying on hot standby servers for quite a long time. I’ve
>> described the problem in more details in this thread [0]. Below in that
>> thread Kevin Grittner proposed a good way for improving btree scans so
>> that btree vacuuming logic could be seriously simplified. Since I don’t
>> know when that may happen I’ve done a patch that makes some improvement
>> right now. If Kevin or someone else would expand [1] for handling all
>> types of btree scans, I suppose, my patch could be thrown away and
>> vacuuming logic should be strongly rewritten.
>
> This looks like a good way to address this until the more significant work can be done.
>
> I'm not a fan of "RBM_ZERO_NO_BM_VALID"; how about RBM_ZERO_BM_INVALID? or BM_NOT_VALID? Or maybe I'm just trying to impose too much English on the code; I see the logic to NO_BM_VALID…

Perhaps, RBM_ZERO_NO_BM_VALID is not so good (it makes more difficult to grep BM_VALID in code), but I don’t actually like BM_INVALID and BM_NOT_VALID, sorry :( But I also don’t insist on NO_BM_VALID, any other suggestions?

>
> + * RBM_ZERO_NO_BM_VALID is the same as RBM_ZERO_AND_LOCK, but does not set
> + * BM_VALID bit before returning buffer so that noone could pin it.
>
> It would be better to explain why we want that mode. How about:
>
> RBM_ZERO_NO_BM_VALID is the same as RBM_ZERO_AND_LOCK but does not set BM_VALID before returning the buffer. This is done to ensure that no one can pin the buffer without actually reading the buffer contents in. This is necessary while replying XLOG_BTREE_VACUUM records in hot standby.

Good point, fixed in attached patch.

>
> + if (mode == RBM_ZERO_NO_BM_VALID)
> + TerminateBufferIO(bufHdr, false, 0);
> + else
> + TerminateBufferIO(bufHdr, false, BM_VALID);
>
> Simply passing in a 0 seems a bit odd to me; is there anywhere else we do that?

Yes, it is done the same way in FlushBuffer function [0]. Also comments before TerminateBufferIO say that 0 is expected value for third argument.

[0] http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/storage/buffer/bufmgr.c;h=a68eae81695f3f3b711d35d6c910e46b031f1cbc;hb=HEAD#l2426 <http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/storage/buffer/bufmgr.c;h=a68eae81695f3f3b711d35d6c910e46b031f1cbc;hb=HEAD#l2426>

> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com

--
May the force be with you…
https://simply.name

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sergey Grinko 2015-05-03 10:15:28 Re: Loss of some parts of the function definition
Previous Message Joe Wildish 2015-05-03 09:24:46 Re: Implementing SQL ASSERTION