Re: Connections hang indefinitely while taking a gin index's LWLock buffer_content lock

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: aekorotkov(at)gmail(dot)com, Andres Freund <andres(at)anarazel(dot)de>, chjischj(at)163(dot)com, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Connections hang indefinitely while taking a gin index's LWLock buffer_content lock
Date: 2018-12-07 09:14:31
Message-ID: 9E342A95-F611-4658-BCDE-67DD5036C59B@yandex-team.ru
Lists: pgsql-hackers

Hi!

> On Dec 7, 2018, at 2:50, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Thu, Dec 6, 2018 at 12:51 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>>
>> However, I'd like to note that 218f51584d5 introduces two changes:
>> 1) Cleanup locking only if there pages to delete
>> 2) Cleanup locking only subtree root
>> The 2nd one is broken. But the 1st one seems still good for me and
>> useful, because in vast majority of cases vacuum doesn't delete any
>> index pages. So, I propose to revert 218f51584d5, but leave there
>> logic, which locks root for cleanup only once there are pages to
>> delete. Any thoughts?
>
> Can you post a patch that just removes the 2nd part, leaving the
> still-correct first part?

I like the idea of taking the cleanup lock only if there are pages to delete. It would still solve the original problem that motivated the proposals for GIN VACUUM enhancements.

But I must note that there is one more problem: ginVacuumPostingTreeLeaves() does not ensure that all pages produced by concurrent splits are visited. It copies the downlink block numbers to a temporary array and then unlocks the parent, so a page split off after that point never appears in the array. This is not a correct way to scan a posting tree for cleanup.

Please find attached a diff with the following changes:
1. Always take the root cleanup lock before deleting pages
2. Check for concurrent splits after scanning a page
Please note that neither applying this diff nor reverting 218f51584d5 will fix the bug with page-delete redo locking on a standby.

Best regards, Andrey Borodin.

Attachment: gin_root_lock.diff (application/octet-stream, 2.6 KB)
