Re: Clock sweep not caching enough B-Tree leaf pages?

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Ants Aasma <ants(at)cybertec(dot)at>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clock sweep not caching enough B-Tree leaf pages?
Date: 2014-04-17 20:45:38
Message-ID: CAM3SWZQ-znY56vJDkN2Vk3TKeDaovavPx2UAt65UuX+U32A4rg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 17, 2014 at 1:39 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-04-17 13:33:27 -0700, Peter Geoghegan wrote:
>> Just over 99.6% of pages (leaving aside the meta page) in the big 10
>> GB pgbench_accounts_pkey index are leaf pages.
>
> That's a rather nice number. I knew it was big, but I'd have guessed
> it'd be a percent lower.

Yes, it's usually past 99.5% for int4. It's really bad if it's as low
as 96%, and I think that often points to what are arguably bad
indexing choices, like indexing text columns that have long text
strings.

> Do you happen to have the same stat handy for a sensibly wide text or
> numeric real world index? It'd be interesting to see what the worst case
> there is.

Yes, as it happens I do:
http://www.postgresql.org/message-id/CAM3SWZTcXrdDZSpA11qZXiyo4_jtxwjaNdZpnY54yjzq7d64=A@mail.gmail.com

I was working of my Mouse Genome database, which is actually
real-world data use by medical researchers, stored in a PostgreSQL
database by those researchers and made available for the benefit of
other medical researchers.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-04-17 20:47:03 Re: assertion in 9.4 with wal_level=logical
Previous Message Joshua D. Drake 2014-04-17 20:44:21 DISCARD ALL (Again)