Re: show Heap Fetches in EXPLAIN for index-only scans

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: show Heap Fetches in EXPLAIN for index-only scans
Date: 2012-01-26 01:42:05
Message-ID: CA+TgmoYt1bD8H31MMAGy8NkEhkea2HFwe4TE4NvCV1R-srs1Rg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jan 21, 2012 at 9:50 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> A review:
>
> [ review ]

Thanks. Committed with hopefully-appropriate revisions.

> As a side-note, I noticed that I needed to run vacuum twice in a row
> to get the Heap Fetches to drop to zero.  I vaguely recall that only
> one vacuum was needed when ios first went in (and I had instrumented
> it to elog heap-fetches).  Does anyone know if this the expected
> consequence of one of the recent changes we made to vacuum?

No, that's not expected. The change we made to vacuum was to skip
pages that are busy - but it shouldn't be randomly skipping pages for
no reason. I can reproduce what you're observing, though:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0005 (HAS_FREE_LINES)
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0005 (HAS_FREE_LINES)

After updating a row in the table and checkpointing, the page the rows
was on is marked full and the page that gets the new version becomes
not-all-visible:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
TLI: 0x0001 Prune XID: 0x000003fb Flags: 0x0003 (HAS_FREE_LINES|PAGE_FULL)
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0001 (HAS_FREE_LINES)

Now I vacuum the relation and checkpoint, and the page the *new*
relation is on becomes all-visible:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0001 (HAS_FREE_LINES)
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0005 (HAS_FREE_LINES)

Now I vacuum it again and checkpoint, and now the old page also
becomes all-visible:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0005 (HAS_FREE_LINES)
TLI: 0x0001 Prune XID: 0x00000000 Flags: 0x0005 (HAS_FREE_LINES)

But it seems to me that this is expected (if non-optimal) behavior.
Only the first pass of vacuum knows how to mark pages all-visible.
After the update, the first pass of the first vacuum sees a dead tuple
on the old page and truncates it to a dead line pointer. When it
comes to the new page, it observes that the page is now all-visible
and marks it so. It then does index vacuuming and returns to the
first page, marking the dead line pointer unused. But during this
second visit to the old page, there's no possibility of marking the
page as all-visible, because the code doesn't know how to do that.
The next vacuum's first pass, however, can do so, because there are no
longer any dead tuples on the page.

We could fix this by modifying lazy_vacuum_page(): since we have to
dirty the buffer anyway, we could recheck whether all the remaining
tuples on the page are now all-visible, and if so set the visibility
map bit. This is probably desirable even apart from index-only scans,
because it will frequently save the next vacuum the cost of reading,
dirtying, and writing extra pages. There will be some incremental CPU
cost, but that seems likely to be more than repaid by the I/O savings.

Thoughts? Should we do this at all? If so, should we squeeze it into
9.2 or leave it for 9.3?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-01-26 02:10:47 Re: Group commit, revised
Previous Message Greg Smith 2012-01-26 01:17:28 Re: Vacuum rate limit in KBps