Re: Set visibility map bit after HOT prune

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Set visibility map bit after HOT prune
Date: 2012-12-20 16:49:30
Message-ID: CABOikdN_z0ojf_sX3X3txdt7Gj=QdOYdyP6UioFpDkAQ8+DGRQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 20, 2012 at 9:23 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Dec 19, 2012 at 11:12 PM, Pavan Deolasee wrote:
>
>> I'm very reluctant to suggest that we can solve
>> this by setting aside another page-level bit to track visibility of
>> tuples for heapscans. Or even have a bit in the tuple header itself to
>> track this information at that level to avoid repeated visibility
>> check for a tuple which is known to be visible to all current and
>> future transactions.
>
> This has been suggested before, as an alternative to freezing tuples.
> It seems to have some potential although I think more thought and work
> is needed to figure out exactly where to go with it.
>

OK. I will try to read the archives to see what was actually suggested
and why it was put on the back burner. BTW, at the risk of being shot
down again, I wonder if we can push the freeze operation down to HOT
prune as well. A single WAL record could then record this action too.
It would also save us from repeatedly checking transaction status flags
in heap_freeze_tuple(). Of course, we would do all this only if HOT
prune has work of its own, and just try to piggyback on it.

I wonder if we should add a flag to heap_page_prune and try to do some
additional work if it's being called from lazy vacuum, such as setting
the VM bit and freezing tuples. IIRC I had put something like that in
the early patches, but then ripped it out for simplicity. Maybe it's
time to play with that again.
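
To make the flag idea concrete, here is a rough, self-contained sketch.
It is a toy model and not PostgreSQL code: every Toy*/TOY_* name is made
up for illustration, "dead" is a precomputed flag rather than a real
visibility check, and XID wraparound is ignored. It only shows the
shape of the idea: one pass that prunes, optionally freezes and sets
the all-visible bit, and counts a single WAL record for the combined
work, but only when prune had work of its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; none of these are PostgreSQL's real types. */
typedef uint32_t ToyXid;

typedef struct ToyTuple {
    ToyXid xmin;    /* inserting transaction id */
    bool   dead;    /* assumed already known to be removable */
    bool   frozen;  /* visible to all current and future transactions */
} ToyTuple;

#define TOY_MAX_TUPLES 8

typedef struct ToyPage {
    ToyTuple tuples[TOY_MAX_TUPLES];
    size_t   ntuples;
    bool     all_visible;   /* stands in for the VM bit */
} ToyPage;

/* Hypothetical flag bits a caller such as lazy vacuum might pass. */
#define TOY_PRUNE_FREEZE 0x01
#define TOY_PRUNE_SET_VM 0x02

typedef struct ToyPruneResult {
    int  npruned;
    int  nfrozen;
    bool vm_set;
    int  wal_records;   /* records that would be written for this call */
} ToyPruneResult;

ToyPruneResult toy_page_prune(ToyPage *page, ToyXid oldest_xmin, int flags)
{
    ToyPruneResult res = {0, 0, false, 0};
    size_t keep = 0;
    bool all_visible = true;

    assert(page != NULL);

    /* Step 1: the prune itself; drop dead tuples and compact the rest. */
    for (size_t i = 0; i < page->ntuples; i++) {
        if (page->tuples[i].dead) {
            res.npruned++;
            continue;
        }
        page->tuples[keep++] = page->tuples[i];
    }
    page->ntuples = keep;

    /* Piggyback only: if prune had nothing to do, do no extra work. */
    if (res.npruned == 0)
        return res;

    /* Step 2: optional extra work on the surviving tuples. */
    for (size_t i = 0; i < page->ntuples; i++) {
        ToyTuple *t = &page->tuples[i];
        bool old_enough = t->frozen || t->xmin < oldest_xmin;

        if (!old_enough) {
            all_visible = false;
            continue;
        }
        if ((flags & TOY_PRUNE_FREEZE) && !t->frozen) {
            t->frozen = true;
            res.nfrozen++;
        }
    }
    if ((flags & TOY_PRUNE_SET_VM) && all_visible) {
        page->all_visible = true;
        res.vm_set = true;
    }

    /* One record covers the prune plus whatever was piggybacked. */
    res.wal_records = 1;
    return res;
}
```

In this model a lazy-vacuum-like caller would pass
TOY_PRUNE_FREEZE | TOY_PRUNE_SET_VM, while an ordinary caller would
pass 0 and get plain pruning.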

In fact, I'd also suggested ripping out the line pointer scan in lazy
vacuum, since it's preceded by a HOT prune which does the bulk of the
work anyway. I remember Tom objecting to that, but can't remember why.
Will try to read up on the old thread again.

>> And we expect vacuums to be very few or none. AFAIR in pgbench, it
>> now takes hours for accounts table to get chosen for vacuum and we
>> should be happy about it. But IOS are almost impossible for pgbench
>> kind of workloads today because of our aggressive strategy to clear
>> the VM bits.
>
> IMHO, it's probably fairly hopeless to make a pure pgbench workload
> show a benefit from index-only scans. A large table under a very
> heavy, completely random write workload is just about the worst
> possible case for index-only scans. Index-only scans are a way of
> avoiding unnecessary visibility checks when the target data hasn't
> changed recently, not a magic bullet to escape all heap access. If
> the target data has changed, you're going to have to touch the heap.

Not always. Not clearing the VM bit at HOT update is one such idea we
discussed. Of course, there are open issues with that, but they are
not unsolvable. The advantage of not touching the heap is just too big to
ignore.

> And while I agree that we aren't aggressive enough in setting the VM
> bits right now, I also think it wouldn't be too hard to go too far in
> the opposite direction: we could easily spend more effort trying to
> make index-only scans effective than we could ever hope to recoup from
> the scans themselves.
>

I agree. I also started having that worry. We are at one extreme right
now and it might not help to get to the other extreme. Looks like I'm
coming around to the idea of somehow detecting whether the scan is
happening on the result relation of a ModifyTable and avoiding setting
the VM bit in that case.
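
As a rough illustration of that check, here is a minimal toy sketch.
It is not PostgreSQL code: the ToyScanContext struct, the relation ids
and both functions are hypothetical, and how a real scan would learn
about the ModifyTable result relation is left open. The assumption
behind it is that a page scanned on behalf of an UPDATE or DELETE is
likely to be modified right away, so setting the bit there would
presumably be wasted effort.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy relation identifier; 0 means "no result relation". */
typedef unsigned int ToyRelId;
#define TOY_INVALID_REL 0u

/* Hypothetical per-scan context; not a real executor structure. */
typedef struct ToyScanContext {
    ToyRelId scanned_rel;   /* relation this scan reads */
    ToyRelId result_rel;    /* target of the enclosing ModifyTable, if any */
} ToyScanContext;

/* True when the scan reads the very relation that is being modified. */
bool toy_scan_targets_result_rel(const ToyScanContext *ctx)
{
    assert(ctx != NULL);
    return ctx->result_rel != TOY_INVALID_REL &&
           ctx->scanned_rel == ctx->result_rel;
}

/*
 * Decide whether to set the all-visible bit after pruning a page.
 * Skip it when the page is not all-visible, or when the scan feeds a
 * modification of the same relation.
 */
bool toy_should_set_vm_bit(const ToyScanContext *ctx, bool page_all_visible)
{
    if (!page_all_visible)
        return false;
    return !toy_scan_targets_result_rel(ctx);
}
```

With this model, a plain SELECT scan (result_rel left at
TOY_INVALID_REL) may set the bit, while a scan feeding an UPDATE of the
same relation does not.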

Thanks,
Pavan

--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
