Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date: 2019-03-10 20:11:04
Message-ID: CAH2-WzkzJXTPpp5tKWPaa_UXfu0h80ACTD2zo7Cn_BY0tWTUEw@mail.gmail.com
Lists: pgsql-hackers

On Sun, Mar 10, 2019 at 12:53 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> Ah, yeah. Not sure. I wrote it as "searching_for_pivot_tuple" first, but
> changed to "searching_for_leaf_page" at the last minute. My thinking was
> that in the page-deletion case, you're trying to re-locate a particular
> leaf page. Otherwise, you're searching for matching tuples, not a
> particular page. Although during insertion, I guess you are also
> searching for a particular page, the page to insert to.

I prefer something like "searching_for_pivot_tuple", because it's
unambiguous. Okay with that?

> It's a hot codepath, but I doubt it's *that* hot that it matters,
> performance-wise...

I'll figure that out. I am currently looking into a regression in
workloads that fit in shared_buffers, which my micro-benchmarks didn't
catch initially. Indexes are still much smaller, but we get a ~2%
regression all the same. OTOH, we get a 7.5%+ increase in throughput
when the workload is I/O bound, and latency is generally no worse, or
even better, with any workload.

I suspect that the nice top-down approach to nbtsplitloc.c has its
costs...will let you know more when I know more.

> > The idea with pg_upgrade'd v3 indexes is, as I said a while back, that
> > they too have a heap TID attribute. nbtsearch.c code is not allowed to
> > rely on its value, though, and must use
> > minusinfkey/searching_for_pivot_tuple semantics (relying on its value
> > being minus infinity is still relying on its value being something).
>
> Yeah. I find that's a complicated way to think about it. My mental model
> is that v4 indexes store heap TIDs, and every tuple is unique thanks to
> that. But on v3, we don't store heap TIDs, and duplicates are possible.

I'll try it that way, then.
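
To make that mental model concrete, here is a rough sketch of the
comparison rule. It is not the patch's actual code: the Sketch* types
and sketch_compare() are hypothetical stand-ins, and the index key is
reduced to a single integer. On a "v4" index the heap TID acts as a
tiebreaker, so no two tuples compare equal; on a "v3" index the TID is
never consulted, so duplicates remain possible. A tuple without a heap
TID models a truncated pivot tuple, whose TID sorts as minus infinity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the real nbtree structs. */
typedef struct SketchTid
{
    uint32_t block;
    uint16_t offset;
} SketchTid;

typedef struct SketchTuple
{
    int32_t   key;      /* single integer key attribute (assumption) */
    bool      has_tid;  /* false: heap TID truncated away (minus infinity) */
    SketchTid tid;
} SketchTuple;

typedef struct SketchScanKey
{
    int32_t          key;
    const SketchTid *tid;          /* NULL: no heap TID in the search key */
    bool             heapkeyspace; /* true for "v4", false for "v3" */
} SketchScanKey;

static int
sketch_tid_cmp(const SketchTid *a, const SketchTid *b)
{
    if (a->block != b->block)
        return a->block < b->block ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}

/* Returns <0, 0, >0 when the scan key is <, =, > the index tuple. */
static int
sketch_compare(const SketchScanKey *sk, const SketchTuple *tup)
{
    assert(sk != NULL && tup != NULL);

    /* Ordinary key attribute is always compared first */
    if (sk->key != tup->key)
        return sk->key < tup->key ? -1 : 1;

    /* v3, or no TID in the search key: equal keys are simply equal */
    if (!sk->heapkeyspace || sk->tid == NULL)
        return 0;

    /* Truncated heap TID sorts before any real TID */
    if (!tup->has_tid)
        return 1;

    /* v4: heap TID is the final tiebreaker */
    return sketch_tid_cmp(sk->tid, &tup->tid);
}
```

With a v4-style scan key carrying TID (10,4), a tuple with the same key
and TID (10,3) compares as lower rather than equal; with a v3-style
scan key the same two entries are indistinguishable duplicates.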

--
Peter Geoghegan
