Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date: 2019-09-18 18:01:19
Message-ID: CAH2-WzmrcCrPDChnGTd-qwqaHhvQB5qaLbLarY_JPCVcTXdXPQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 18, 2019 at 10:43 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> This also suggests that making _bt_dedup_one_page() do raw page adds
> and page deletes to the page in shared_buffers (i.e. don't use a temp
> buffer page) could pay off. As I went into at the start of this
> e-mail, unnecessarily doing expensive things like copying large
> posting lists around is a real concern. Even if it isn't truly useful
> for _bt_dedup_one_page() to operate in a very incremental fashion,
> incrementalism is probably still a good thing to aim for -- it seems
> to make deduplication faster in all cases.
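
To make the in-place, incremental idea concrete, here is a rough C
sketch (hypothetical names, not patch code; build_posting_tuple() and
dedup_one_run_inplace() are invented for illustration, while
PageIndexMultiDelete(), PageAddItem(), and the critical-section
machinery are ordinary Postgres primitives). It merges one run of
duplicates at a time directly on the page in shared_buffers, rather
than rebuilding the whole page in a temp buffer and copying it back:

#include "postgres.h"
#include "access/itup.h"
#include "miscadmin.h"
#include "storage/bufmgr.h"
#include "storage/bufpage.h"

/* hypothetical helper, assumed defined elsewhere */
extern IndexTuple build_posting_tuple(Page page,
                                      OffsetNumber *dupoffs, int ndups);

/*
 * Sketch: merge one run of duplicate tuples into a single posting
 * list tuple, modifying the shared_buffers page directly.
 */
static void
dedup_one_run_inplace(Buffer buf, OffsetNumber *dupoffs, int ndups)
{
    Page        page = BufferGetPage(buf);
    IndexTuple  posting;

    /* Build one posting list tuple covering this run of duplicates */
    posting = build_posting_tuple(page, dupoffs, ndups);

    START_CRIT_SECTION();

    /* Remove the original duplicates from the line pointer array */
    PageIndexMultiDelete(page, dupoffs, ndups);

    /* Add the posting tuple where the first duplicate used to be */
    if (PageAddItem(page, (Item) posting, IndexTupleSize(posting),
                    dupoffs[0], false, false) == InvalidOffsetNumber)
        elog(ERROR, "failed to add posting list tuple");

    MarkBufferDirty(buf);
    /* a WAL record here would cover just this run, not the whole page */

    END_CRIT_SECTION();

    pfree(posting);
}

Because each step only moves one run's worth of tuples, the cost of a
dedup pass stays proportional to the duplicates actually merged,
instead of paying to copy every large posting list already on the page.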

I forgot to mention that I'm concerned the kill_prior_tuple/LP_DEAD
optimization could be applied less often because _bt_dedup_one_page()
operates too aggressively. That is a big part of my general concern.
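
To make that concern concrete: LP_DEAD lives on the line pointer, so
once duplicates are merged into a posting list there is a single
ItemId covering many heap TIDs, and kill_prior_tuple can no longer
mark just one of those TIDs dead. A less aggressive dedup pass might
first check how much the already-set LP_DEAD bits would free. A
minimal sketch (hypothetical function and threshold, not patch code;
the page-inspection macros are standard nbtree/bufpage ones):

#include "postgres.h"
#include "access/nbtree.h"
#include "storage/bufpage.h"

/*
 * Sketch: count items that kill_prior_tuple already marked LP_DEAD.
 * If enough items are dead, simply removing them may free sufficient
 * space, and we can avoid merging not-yet-dead duplicates into
 * posting lists (which would make future LP_DEAD setting coarser).
 */
static bool
dedup_worth_doing(Page page)
{
    BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);
    OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
    OffsetNumber offnum;
    int          ndead = 0;

    for (offnum = P_FIRSTDATAKEY(opaque);
         offnum <= maxoff;
         offnum = OffsetNumberNext(offnum))
    {
        if (ItemIdIsDead(PageGetItemId(page, offnum)))
            ndead++;
    }

    /* arbitrary threshold, purely for illustration */
    return ndead < 5;
}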

Maybe I'm wrong about this -- who knows? I definitely think that
LP_DEAD setting by _bt_check_unique() is generally a lot more
important than LP_DEAD setting by the kill_prior_tuple optimization,
and the patch won't affect unique indexes. Only very serious
benchmarking can give us a clear answer, though.

--
Peter Geoghegan
