Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date: 2019-03-22 21:15:25
Message-ID: CAH2-Wz=tDiSNcwg08EcQW6rNkMVy=9pC5YrSOh=mvwDu_P5B=w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 21, 2019 at 10:28 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> I've committed the first 4 patches. Many thanks to Heikki for his very
> valuable help! Thanks also to the other reviewers.
>
> I'll likely push the remaining two patches on Sunday or Monday.

I noticed that if I initidb and run "make installcheck" with and
without the "split after new tuple" optimization patch, the largest
system catalog indexes shrink quite noticeably:

Master
======
pg_depend_depender_index 1456 kB
pg_depend_reference_index 1416 kB
pg_class_tblspc_relfilenode_index 224 kB

Patch
=====
pg_depend_depender_index 1088 kB -- ~25% smaller
pg_depend_reference_index 1136 kB -- ~20% smaller
pg_class_tblspc_relfilenode_index 160 kB -- 28% smaller

This is interesting to me because it is further evidence that the
problem that the patch targets is reasonably common. It's also
interesting to me because we benefit despite the fact there are a lot
of duplicates in parts of these indexes; we vary our strategy at
different parts of the key space, which works well. We pack pages
tightly where they're full of duplicates, using the "single value"
strategy that I've already committed, whereas the apply the "split
after new tuple" optimization in parts of the index with localized
monotonically increasing insertions. If there were no duplicates in
the indexes, then they'd be about 40% smaller, which is exactly what
we see with the TPC-C indexes (they're all unique indexes, with very
few physical duplicates). Looks like the duplicates are mostly
bootstrap mode entries. Lots of the pg_depend_depender_index
duplicates look like "(classid, objid, objsubid)=(0, 0, 0)", for
example.

I also noticed one further difference: the pg_shdepend_depender_index
index grew from 40 kB to 48 kB. I guess that might count as a
regression, though I'm not sure that it should. I think that we would
do better if the volume of data in the underlying table was greater.
contrib/pageinspect shows that a small number of the leaf pages in the
improved cases are not very filled at all, which is more than made up
for by the fact that many more pages are packed as if they were
created by a rightmost split (262 items 24 byte tuples is exactly
consistent with that). IOW, I suspect that the extra page in
pg_shdepend_depender_index is due to a "local minimum".

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-03-22 21:20:40 Re: rename labels in heapam.c?
Previous Message Andres Freund 2019-03-22 21:12:43 Re: rename labels in heapam.c?