Re: [PERFORM] encouraging index-only scans

From: Jim Nasby <jim(at)nasby(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PERFORM] encouraging index-only scans
Date: 2013-09-17 16:37:35
Message-ID: 5238854F.5040708@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On 9/7/13 12:34 AM, Andres Freund wrote:
> What I was thinking of was to keep track of the oldest xids on pages
> that cannot be marked all visible. I haven't thought about the
> statistics part much, but what if we binned the space between
> [RecentGlobalXmin, ->nextXid) into 10 bins and counted the number of
> pages falling into each bin. Then after the vacuum finished we could
> compute how far RecentGlobalXmin would have to progress to make another
> vacuum worthwile by counting the number of pages from the lowest bin
> upwards and use the bin's upper limit as the triggering xid.
>
> Now, we'd definitely need to amend that scheme by something that handles
> pages that are newly written to, but it seems something like that
> wouldn't be too hard to implement and would make autovacuum more useful.

If we're binning by XID though you're still dependent on scanning to build that range. Anything that creates dead tuples will also be be problematic, because it's going to unset VM bits on you, and you won't know if it's due to INSERTS or dead tuples.

What if we maintained XID stats for ranges of pages in a separate fork? Call it the XidStats fork. Presumably the interesting pieces would be min(xmin) and max(xmax) for pages that aren't all visible. If we did that at a granularity of, say, 1MB worth of pages[1] we're talking 8 bytes per MB, or 1 XidStats page per GB of heap. (Worst case alignment bumps that up to 2 XidStats pages per GB of heap.)

Having both min(xmin) and max(xmax) for a range of pages would allow for very granular operation of vacuum. Instead of hitting every heap page that's not all-visible, it would only hit those that are not visible and where min(xmin) or max(xmax) were less than RecentGlobalXmin.

One concern is maintaining this data. A key point is that we don't have to update it every time it changes; if the min/max are only off by a few hundred XIDs there's no point to updating the XidStats page. We'd obviously need the XidStats page to be read in, but even a 100GB heap would be either 100 or 200 XidStats pages.

[1]: There's a trade-off between how much space we 'waste' on XidStats pages and how many heap pages we potentially have to scan in the range. We'd want to see what this looked like in a real system. The thing that helps here is that regardless of what the stats for a particular heap range are, you're not going to scan any pages in that range that are already all-visible.
--
Jim C. Nasby, Data Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2013-09-17 17:12:47 Re: Freezing without write I/O
Previous Message Noah Misch 2013-09-17 16:30:56 Re: relscan_details.h

Browse pgsql-performance by date

  From Date Subject
Next Message Bartłomiej Romański 2013-09-17 22:16:49 Re: Planner performance extremely affected by an hanging transaction (20-30 times)?
Previous Message Andres Freund 2013-09-17 13:43:48 Re: Cpu usage 100% on slave. s_lock problem.