Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date: 2019-09-25 19:14:08
Message-ID: CAH2-Wz=1FaPa2TroKYxpG7mqUJrssOfmNQOPW+vTVmG4vyXD7Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 25, 2019 at 8:05 AM Anastasia Lubennikova
<a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
> Attached is v18. In this version bt_dedup_one_page() is refactored so that:
> - no temp page is used, all updates are applied to the original page.
> - each posting tuple wal logged separately.
> This also allowed to simplify btree_xlog_dedup significantly.

This looks great! Even if it isn't faster than using a temp page
buffer, the flexibility seems like an important advantage. We can do
things like have the _bt_dedup_one_page() caller hint that
deduplication should start at a particular offset number. If that
doesn't work out by the time the end of the page is reached (whatever
"works out" may mean), then we can just start at the beginning of the
page, and work through the items we skipped over initially.
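
A rough sketch of that control flow, not the patch's code: the page is
modeled as a plain array of keys, offsets are 0-based, and "works out"
is assumed to mean "at least `needed` duplicates were found". The names
SketchPage, sketch_count_dups() and sketch_dedup_one_page() are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a leaf page: keys stored in offset order. */
typedef struct SketchPage
{
    const int  *keys;
    size_t      nitems;
} SketchPage;

/*
 * Count items in (start, end) whose key equals the key of the item just
 * before them, i.e. items that could be folded into a posting list.
 */
static size_t
sketch_count_dups(const SketchPage *page, size_t start, size_t end)
{
    size_t      merged = 0;

    for (size_t off = start + 1; off < end; off++)
    {
        if (page->keys[off] == page->keys[off - 1])
            merged++;
    }
    return merged;
}

/*
 * Start at the caller's hint. If the tail of the page does not yield
 * "needed" duplicates, go back to the start of the page and work through
 * the items that were skipped initially.
 */
static size_t
sketch_dedup_one_page(const SketchPage *page, size_t hint, size_t needed)
{
    size_t      merged;
    size_t      end;

    assert(hint <= page->nitems);

    /* First pass: from the hinted offset to the end of the page. */
    merged = sketch_count_dups(page, hint, page->nitems);
    if (merged >= needed)
        return merged;

    /*
     * Second pass: the skipped prefix. The range ends one past the hint so
     * that the pair straddling the hint boundary is also considered.
     */
    end = (hint < page->nitems) ? hint + 1 : page->nitems;
    merged += sketch_count_dups(page, 0, end);
    return merged;
}
```

Because every update is applied to the original page, the second pass
can simply pick up where the first one left off; nothing has to be
rebuilt from a temp copy.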

> > We still haven't added an "off" switch to deduplication, which seems
> > necessary. I suppose that this should look like GIN's "fastupdate"
> > storage parameter.

> Why is it necessary to save this information somewhere but rel->rd_options,
> while we can easily access this field from _bt_findinsertloc() and
> _bt_load().

Maybe, but we also need to access a flag that says it's safe to use
deduplication. Obviously deduplication is not safe for datatypes like
numeric, or for text with a nondeterministic collation, where values
that compare as equal are not necessarily bitwise identical. The "is
deduplication safe for this index?" mechanism will probably work by
doing several catalog lookups. This doesn't seem like something we
want to do very often, especially with a buffer lock held -- ideally
the result will be kept somewhere that's convenient to access.
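
To make the concern concrete, here is a minimal sketch of computing the
answer once and remembering it, so that the insert path never repeats
the lookups. Everything in it is an assumption for illustration: the
SketchIndex struct, the per-column att_bitwise_equal array standing in
for the real catalog lookups, and the function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-index state; not a real PostgreSQL struct. */
typedef struct SketchIndex
{
    int         nkeyatts;           /* number of key columns */
    const bool *att_bitwise_equal;  /* per column: equal implies identical? */
    bool        dedup_safe_valid;   /* has dedup_safe been computed yet? */
    bool        dedup_safe;         /* remembered answer */
    int         lookups;            /* times the slow path ran */
} SketchIndex;

/* Stand-in for the expensive part (several catalog lookups). */
static bool
sketch_slow_dedup_safe(SketchIndex *index)
{
    index->lookups++;
    for (int i = 0; i < index->nkeyatts; i++)
    {
        if (!index->att_bitwise_equal[i])
            return false;
    }
    return true;
}

/* Cheap accessor: runs the slow path at most once per index. */
static bool
sketch_dedup_is_safe(SketchIndex *index)
{
    assert(index != NULL);

    if (!index->dedup_safe_valid)
    {
        index->dedup_safe = sketch_slow_dedup_safe(index);
        index->dedup_safe_valid = true;
    }
    return index->dedup_safe;
}
```

Where the remembered answer actually lives (relcache, metapage, or
somewhere else) is exactly the open question; the sketch only shows why
it should be computed ahead of time rather than under a buffer lock.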

Do we want to do that separately, and have a storage parameter that
says "I would like to use deduplication in principle, if it's safe"?
Or, do we store both pieces of information together, and forbid
setting the storage parameter to on when it's known to be unsafe for
the underlying opclasses used by the index? I don't know.
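
The two alternatives could be sketched roughly like this. Both function
names are hypothetical, and neither is meant as a proposal for the
actual interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Option 1: keep the two facts separate. The storage parameter only
 * records intent ("use deduplication if it's safe"), and safety is
 * checked independently whenever the decision is made.
 */
static bool
sketch_use_dedup_separate(bool param_on, bool dedup_safe)
{
    return param_on && dedup_safe;
}

/*
 * Option 2: store one combined fact. Turning the parameter on is refused
 * when deduplication is known to be unsafe for the index's opclasses, so
 * a stored "on" always means "on and safe".
 *
 * Returns false (and leaves *stored untouched) when the request is
 * rejected.
 */
static bool
sketch_set_dedup_param(bool requested, bool dedup_safe, bool *stored)
{
    assert(stored != NULL);

    if (requested && !dedup_safe)
        return false;

    *stored = requested;
    return true;
}
```

With option 1 the parameter can default to on everywhere and simply has
no effect on unsafe indexes; with option 2 the user gets an error at the
point where the parameter is set.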

I think that you can start working on this without knowing exactly how
we'll do those catalog lookups. What you come up with has to work with
that before the patch can be committed, though.

--
Peter Geoghegan
