Re: Clock sweep not caching enough B-Tree leaf pages?

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clock sweep not caching enough B-Tree leaf pages?
Date: 2014-04-28 18:41:38
Message-ID: CAM3SWZRnNBs7VWP2a5zUQ7j+QGuXtzNN98oW3A6VDp1B8cR5Dg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 28, 2014 at 6:02 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Also true. But the problem is that it is very rarely, if ever, the
> case that all pages are *equally* hot. On a pgbench workload, for
> example, I'm very confident that while there's not really any cold
> data, the btree roots and visibility map pages are a whole lot hotter
> than a randomly-selected heap page. If you evict a heap page, you're
> going to need it back pretty quick, because it won't be long until the
> random-number generator again chooses a key that happens to be located
> on that page. But if you evict the root of the btree index, you're
> going to need it back *immediately*, because the very next query, no
> matter what key it's looking for, is going to need that page. I'm
> pretty sure that's a significant difference.

I emphasized leaf pages because even with master the root and inner
pages are still going to be so hot as to make them constantly in
cache, at least with pgbench's use of a uniform distribution. You'd
have to have an absolutely enormous scale factor before this might not
be the case. As such, I'm not all that worried about inner pages when
performing these simple benchmarks. However, in the case of the
pgbench_accounts table, each B-Tree leaf page (leaf pages comprise
about 99.5% of the index's pages) is still going to be accessed about
six times more frequently than each heap page. That's a small enough
difference for it to easily go unappreciated, and yet a big enough
difference for it to hurt a lot.
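
A back-of-the-envelope sketch of where a factor of about six can come from. The figures below are assumptions for illustration, not taken from the message: 100,000 pgbench_accounts rows per unit of scale factor, roughly 61 heap tuples per 8kB page, and roughly 366 index entries per leaf page at the default leaf fill. With uniformly distributed keys, each lookup touches one leaf page and one heap page, so the per-page access frequency is inversely proportional to the page count.

```python
import math

# Assumed, approximate figures (hypothetical, for illustration only).
ROWS_PER_SCALE = 100_000     # pgbench_accounts rows per unit of scale factor
HEAP_TUPLES_PER_PAGE = 61    # assumed heap tuples per 8kB page
LEAF_TUPLES_PER_PAGE = 366   # assumed index entries per B-Tree leaf page


def per_page_access_ratio(scale):
    """Return (heap_pages, leaf_pages, leaf-to-heap per-page access ratio)."""
    rows = ROWS_PER_SCALE * scale
    heap_pages = math.ceil(rows / HEAP_TUPLES_PER_PAGE)
    leaf_pages = math.ceil(rows / LEAF_TUPLES_PER_PAGE)
    # Uniform keys: every lookup hits exactly one leaf page and one heap
    # page, so each page's share of accesses is 1 / (number of pages).
    p_leaf = 1 / leaf_pages
    p_heap = 1 / heap_pages
    return heap_pages, leaf_pages, p_leaf / p_heap


heap_pages, leaf_pages, ratio = per_page_access_ratio(scale=100)
print(f"heap pages: {heap_pages}, leaf pages: {leaf_pages}, ratio: {ratio:.2f}")
```

Under these assumptions the ratio does not depend on the scale factor, since it reduces to the number of entries per leaf page divided by the number of tuples per heap page.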

--
Peter Geoghegan
