Re: Proposal: Log inability to lock pages during vacuum

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Log inability to lock pages during vacuum
Date: 2014-11-06 20:55:37
Message-ID: 545BE049.4090500@BlueTreble.com
Lists: pgsql-hackers

On 10/29/14, 11:49 AM, Jim Nasby wrote:
> On 10/21/14, 6:05 PM, Tom Lane wrote:
>> Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> writes:
>>> - What happens if we run out of space to remember skipped blocks?
>>
>> You forget some, and are no worse off than today. (This might be an
>> event worthy of logging, if the array is large enough that we don't
>> expect it to happen often ...)
>
> Makes sense. I'll see if there's some reasonable way to retry pages when the array fills up.
>
> I'll make the array 2k in size; that allows for 512 pages without spending a bunch of memory.

Attached is a patch for this. It also adds logging of unobtainable cleanup locks, and refactors scanning a page for vacuum into its own function.

Anyone reviewing this might want to look at https://github.com/decibel/postgres/commit/69ab22f703d577cbb3d8036e4e42563977bcf74b, which is the refactor with no whitespace changes.

I've verified this works correctly by connecting to a backend with gdb and halting it with a page pinned. Both vacuum and vacuum freeze on that table do what's expected, but I also get this warning (which AFAICT is a false positive):

decibel(at)decina(dot)local=# vacuum verbose i;
INFO: vacuuming "public.i"
INFO: "i": found 0 removable, 399774 nonremovable row versions in 1769 out of 1770 pages
DETAIL: 200000 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
Retried cleanup lock on 0 pages, retry failed on 1, skipped retry on 0.
CPU 0.00s/0.06u sec elapsed 12.89 sec.
WARNING: buffer refcount leak: [105] (rel=base/16384/16385, blockNum=0, flags=0x106, refcount=2 1)
VACUUM

I am doing a simple static allocation of retry_pages[]; my understanding is that it will only exist for the duration of this function, so it's OK. If not, I'll palloc it. If it is OK, then I'll do the same for the freeze array.
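
For illustration only, here is a minimal standalone sketch of the fixed-size retry array discussed above. It is not taken from the attached patch. BlockNumber is redefined locally as a 32-bit unsigned integer so the example compiles without PostgreSQL headers, and RetryList, retry_list_init and retry_list_remember are hypothetical names. It assumes the "forget some" behaviour Tom described when the array fills up, and keeps a count of forgotten blocks so the event could be logged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's BlockNumber (assumed 32-bit unsigned). */
typedef uint32_t BlockNumber;

/* 2k of 4-byte block numbers gives room for 512 pages. */
#define RETRY_ARRAY_BYTES 2048
#define MAX_RETRY_PAGES   (RETRY_ARRAY_BYTES / sizeof(BlockNumber))

/* Hypothetical container for pages whose cleanup lock was unavailable. */
typedef struct RetryList
{
    BlockNumber pages[MAX_RETRY_PAGES];
    size_t      npages;      /* entries currently remembered */
    size_t      nforgotten;  /* blocks dropped because the array was full */
} RetryList;

void
retry_list_init(RetryList *list)
{
    assert(list != NULL);
    list->npages = 0;
    list->nforgotten = 0;
}

/*
 * Remember a skipped block. Returns false if the array is already full,
 * in which case the block is simply forgotten (no worse off than today)
 * and only counted.
 */
bool
retry_list_remember(RetryList *list, BlockNumber blkno)
{
    assert(list != NULL);
    if (list->npages >= MAX_RETRY_PAGES)
    {
        list->nforgotten++;
        return false;
    }
    list->pages[list->npages++] = blkno;
    return true;
}
```

A caller would declare a RetryList as a local variable in the scanning function, so it only lives for the duration of that call, and walk pages[0 .. npages-1] afterwards to retry each cleanup lock.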
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Attachment Content-Type Size
0001-Vacuum-cleanup-lock-retry.patch text/plain 34.9 KB
