Re: optimizing vacuum truncation scans

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: optimizing vacuum truncation scans
Date: 2015-04-20 06:50:22
Message-ID: CAMkU=1xei3Ge=B3to2rzrCDAUTv4Ym=qFyD7WMOS=vqZeBxurQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Apr 19, 2015 at 10:38 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
wrote:

> On 4/19/15 9:09 PM, Jeff Janes wrote:
>
>> I did literally the simplest thing I could think of as a proof of
>> concept patch, to see if it would actually fix things. I just jumped
>> back a certain number of blocks occasionally and prefetched them
>> forward, then resumed the regular backward scan. The patch and driving
>> script are attached.
>>
>
> Shouldn't completely empty pages be set as all-visible in the VM? If so,
> can't we just find the largest not-all-visible page and move forward from
> there, instead of moving backwards like we currently do?
>

If the entire table is all-visible, we would be starting from the
beginning, even though the beginning of the table still has read-only
tuples present.

> For that matter, why do we scan backwards anyway? The comments don't
> explain it, and we have nonempty_pages as a starting point, so why don't we
> just scan forward? I suspect that eons ago we didn't have that and just
> blindly reverse-scanned until we finally hit a non-empty buffer...

nonempty_pages is not concurrency safe, as the pages could become used
after vacuum passed them over but before the access exclusive lock was
grabbed for the truncation scan. But maybe the combination of the two
would work? If a page is above nonempty_pages, then anyone who wrote into
it after vacuum passed it must have cleared the VM bit. And currently I
think no one but vacuum ever sets the VM bit back on, so once cleared it
would stay cleared.
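
To make the combined idea concrete, here is a minimal sketch, not
PostgreSQL code. It assumes that vacuum marked every empty page above
nonempty_pages as all-visible (the premise in the quoted question) and
that only vacuum sets VM bits. The function name and the plain bool
array standing in for visibility map lookups are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/*
 * Hypothetical helper: vm_all_visible[blk] stands in for a visibility map
 * lookup on block blk.
 *
 * Scans forward from nonempty_pages and returns the block number just past
 * the last block whose VM bit is cleared.  Under the assumptions stated
 * above, every block at or beyond the returned value is presumed still
 * empty.  Blocks in [nonempty_pages, result) with a cleared bit were
 * written after vacuum passed them and would still need to be read to
 * decide whether they hold tuples.
 */
BlockNumber
first_presumed_empty_block(const bool *vm_all_visible,
                           BlockNumber rel_pages,
                           BlockNumber nonempty_pages)
{
    BlockNumber keep = nonempty_pages;

    assert(nonempty_pages <= rel_pages);

    for (BlockNumber blk = nonempty_pages; blk < rel_pages; blk++)
    {
        /* A cleared bit means someone touched this page after vacuum. */
        if (!vm_all_visible[blk])
            keep = blk + 1;
    }
    return keep;
}
```

Because nonempty_pages is the floor, an entirely all-visible table
yields nonempty_pages rather than block 0, which avoids the
start-from-the-beginning problem mentioned earlier.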

In any event, nonempty_pages could be used to set the guess as to how many
pages (if any) might be worth prefetching, since prefetching is not needed
for correctness.
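
A rough sketch of how that guess might bound a prefetch window during
the backward scan follows. This is an illustration only, not the
attached patch: the struct, the function name, and the chunk parameter
are all hypothetical, and the actual prefetch call is left out.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Hypothetical result type: prefetch blocks [start, start + count). */
typedef struct
{
    BlockNumber start;
    BlockNumber count;
} PrefetchWindow;

/*
 * Hypothetical helper: the backward scan is about to examine the blocks
 * below blkno.  Jump back at most chunk blocks, but never below
 * nonempty_pages, since the scan is not expected to go further than that.
 * The hint may be stale; a bad guess only costs some I/O efficiency.
 */
PrefetchWindow
guess_prefetch_window(BlockNumber blkno,
                      BlockNumber nonempty_pages,
                      BlockNumber chunk)
{
    PrefetchWindow w = {blkno, 0};
    BlockNumber floor_blk;

    assert(chunk > 0);

    /* Already at or below the hint: nothing is worth prefetching. */
    if (blkno <= nonempty_pages)
        return w;

    if (blkno - nonempty_pages > chunk)
        floor_blk = blkno - chunk;
    else
        floor_blk = nonempty_pages;

    w.start = floor_blk;
    w.count = blkno - floor_blk;
    return w;
}
```

The caller would issue prefetch requests for the window in forward
order and then resume the regular backward scan over the same blocks.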

Cheers,

Jeff
