Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)

From: Greg Stark <stark(at)mit(dot)edu>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, kskim(at)bitnine(dot)net, Lukas Fittl <lukas(at)fittl(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Btree Index on PostgreSQL and Wiredtiger (MongoDB3.2)
Date: 2016-08-13 00:05:54
Message-ID: CAM-w4HNkyT6g8UtM8aLT=HNx6P3njMY+Xd6ktbpcJkHDiBwu_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 12, 2016 at 8:13 PM, Andrew Gierth
<andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
> No, because as the pages split, they fill more slowly (because there are
> now more pages). So on average in a large randomly filled index, pages
> spend more time nearer 50% full than 100% full. This is easy to
> demonstrate by creating a table with an indexed float8 column and adding
> batches of random() values to it, checking with pgstatindex at intervals -
> the average leaf density will rarely exceed 70%.
>
> However, worst case conditions can give lower leaf densities; obviously
> the worst case is if the data is loaded in an order and quantity that
> just happens to leave every leaf page recently split.

btree pages don't split 50/50 either. They split biased to assume the
greater side of the split will receive more inserts -- iirc 70/30. So
if they're loaded sequentially you should get a btree that's 70% full
but the worst case is in theory closer to 30% though I think the
insert order would have to be pretty perverse to be worse than 50%.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Gierth 2016-08-13 00:09:50 Re: No longer possible to query catalogs for index capabilities?
Previous Message Greg Stark 2016-08-12 23:59:51 Re: Add hint for function named "is"