Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date: 2019-03-12 19:06:29
Message-ID: CAH2-WznhWAs_ti2_UW0LMeaC_Wdnd3as6Y8Yyh=4PbUtkVG-Lg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 12, 2019 at 11:40 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Hey, I understood something today!

And I said something that could be understood!

> I think it's pretty clear that we have to view that as acceptable. I
> mean, we could reduce contention even further by finding a way to make
> indexes 40% larger, but I think it's clear that nobody wants that.
> Now, maybe in the future we'll want to work on other techniques for
> reducing contention, but I don't think we should make that the problem
> of this patch, especially because the regressions are small and go
> away after a few hours of heavy use. We should optimize for the case
> where the user intends to keep the database around for years, not
> hours.

I think so too. There is a feature in other database systems called
"reverse key indexes", which deals with this problem in a rather
extreme way. This situation is very similar to the situation with
rightmost page splits, where fillfactor is applied to pack leaf pages
full. The only difference is that there are multiple groupings, not
just one single implicit grouping (everything in the index). You could
probably make very similar observations about rightmost page splits
that apply leaf fillfactor.

Here is an example of how the largest index looks for master with the
100 warehouse case that was slightly regressed:

table_name | index_name | page_type | npages |
avg_live_items | avg_dead_items | avg_item_size
------------------+-----------------------+-----------+---------+----------------+----------------+---------------
bmsql_order_line | bmsql_order_line_pkey | R | 1 |
54.000 | 0.000 | 23.000
bmsql_order_line | bmsql_order_line_pkey | I | 11,482 |
143.200 | 0.000 | 23.000
bmsql_order_line | bmsql_order_line_pkey | L | 1,621,316 |
139.458 | 0.003 | 24.000

Here is what we see with the patch:

table_name | index_name | page_type | npages |
avg_live_items | avg_dead_items | avg_item_size
------------------+-----------------------+-----------+---------+----------------+----------------+---------------
bmsql_order_line | bmsql_order_line_pkey | R | 1 |
29.000 | 0.000 | 22.000
bmsql_order_line | bmsql_order_line_pkey | I | 5,957 |
159.149 | 0.000 | 23.000
bmsql_order_line | bmsql_order_line_pkey | L | 936,170 |
233.496 | 0.052 | 23.999

REINDEX would leave bmsql_order_line_pkey with 262 items, and we see
here that it has 233 after several hours, which is pretty good given
the amount of contention. The index actually looks very much like it
was just REINDEXED when initial bulk loading finishes, before we get
any updates, even though that happens using retail insertions.

Notice that the number of internal pages is reduced by almost a full
50% -- it's somewhat better than the reduction in the number of leaf
pages, because the benefits compound (items in the root are even a bit
smaller, because of this compounding effect, despite alignment
effects). Internal pages are the most important pages to have cached,
but also potentially the biggest points of contention.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-03-12 19:40:08 Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Previous Message Paul Ramsey 2019-03-12 19:03:24 Re: Compressed TOAST Slicing