From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date: 2019-07-11 18:19:46
Message-ID: CAH2-WzkkCa4ES9BTt4VCe1RygOqf8ZZppvnSg_URaW7EFa1KDQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 11, 2019 at 8:34 AM Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com> wrote:
> I tried this patch and found the improvements impressive. However,
> when I tried with multi-column indexes it wasn't giving any
> improvement, is it the known limitation of the patch?

It'll only deduplicate full duplicates. It works with multi-column
indexes, provided the entire set of values is duplicated -- not just a
prefix. Prefix compression is possible, but it's more complicated. It
seems to generally require the DBA to specify a prefix length,
expressed as a number of prefix columns.
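
Roughly speaking, the rule is something like this (toy code, not the
actual patch -- simplified types, names invented here): a tuple can only
be folded into a posting list when every key column matches, so a mere
prefix match doesn't qualify.

#include <stdbool.h>
#include <stdio.h>

#define NKEYATTS 2

typedef struct IndexKey
{
    int attrs[NKEYATTS];    /* simplified: all key columns are ints */
} IndexKey;

/* Full duplicate: all key attributes equal */
static bool
keys_are_full_duplicates(const IndexKey *a, const IndexKey *b)
{
    for (int i = 0; i < NKEYATTS; i++)
    {
        if (a->attrs[i] != b->attrs[i])
            return false;
    }
    return true;
}

int
main(void)
{
    IndexKey x = {{1, 1}};
    IndexKey y = {{1, 1}};
    IndexKey z = {{1, 2}};      /* shares only a prefix with x */

    printf("x/y dedup candidate: %d\n", keys_are_full_duplicates(&x, &y)); /* 1 */
    printf("x/z dedup candidate: %d\n", keys_are_full_duplicates(&x, &z)); /* 0 */
    return 0;
}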

> I am surprised to find that such a patch is on radar since quite some
> years now and not yet committed.

The v12 work on nbtree (making heap TID a tiebreaker column) seems to
have made the general approach a lot more effective. Compression is
performed lazily, not eagerly, which seems to work a lot better.
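
To illustrate what the tiebreaker buys us, here is a toy comparator
(again simplified, not the real nbtree code): user key columns are
compared first, and the heap TID breaks any remaining tie, so every
index tuple has a well-defined, unique position.

#include <stdio.h>

typedef struct TupleRef
{
    int      key;       /* simplified single user key column */
    unsigned blkno;     /* heap block number of the TID */
    unsigned offset;    /* heap offset number of the TID */
} TupleRef;

static int
compare_tuples(const TupleRef *a, const TupleRef *b)
{
    if (a->key != b->key)
        return (a->key < b->key) ? -1 : 1;
    /* User keys equal: fall back to heap TID as the tiebreaker */
    if (a->blkno != b->blkno)
        return (a->blkno < b->blkno) ? -1 : 1;
    if (a->offset != b->offset)
        return (a->offset < b->offset) ? -1 : 1;
    return 0;
}

int
main(void)
{
    TupleRef a = {42, 10, 3};
    TupleRef b = {42, 10, 7};

    printf("%d\n", compare_tuples(&a, &b));     /* -1: TID breaks the tie */
    return 0;
}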

> + elog(DEBUG4, "insert_itupprev_to_page. compressState->ntuples %d
> IndexTupleSize %zu free %zu",
> + compressState->ntuples, IndexTupleSize(to_insert), PageGetFreeSpace(page));
> +
> and other such DEBUG4 statements are meant to be removed, right...?

I hope so too.

> /*
> * If we have only 10 uncompressed items on the full page, it probably
> * won't worth to compress them.
> */
> if (maxoff - n_posting_on_page < 10)
> return;
>
> Is this a magic number...?

I think that this should be a named constant, at the very least.
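
Something like this, say (the constant name is invented here, just to
illustrate):

/*
 * Don't bother deduplicating when fewer than this many uncompressed
 * items remain on the page; the savings are unlikely to pay for the
 * cycles.
 */
#define BTREE_COMPRESS_THRESHOLD	10

	if (maxoff - n_posting_on_page < BTREE_COMPRESS_THRESHOLD)
		return;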

> /*
> * We do not expect to meet any DEAD items, since this function is
> * called right after _bt_vacuum_one_page(). If for some reason we
> * found dead item, don't compress it, to allow upcoming microvacuum
> * or vacuum clean it up.
> */
> if (ItemIdIsDead(itemId))
> continue;
>
> This makes me wonder about those 'some' reasons.

I think that this is just defensive. Note that _bt_vacuum_one_page()
is prepared to find no dead items, even when the BTP_HAS_GARBAGE flag
is set for the page.

> Caller is responsible for checking BTreeTupleIsPosting to ensure that
> + * he will get what he expects
>
> This can be re-framed to make the caller more gender neutral.

Agreed. I also don't like anthropomorphizing code like this.

> Other than that, I am curious about the plans for its backward compatibility.

Me too. There is something about a new version 5 in comments in
nbtree.h, but the version number isn't changed. I think that we may be
able to get away with not increasing the B-Tree version from 4 to 5,
actually. Deduplication is performed lazily when it looks like we
might have to split the page, so there is no expectation that tuples
will be in either compressed or uncompressed form in any particular
context.
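
Schematically, the insert path looks something like this (toy code, not
the actual patch; the page operations are stand-ins):

#include <stdbool.h>
#include <stdio.h>

static bool
page_has_room(void)
{
    return false;               /* pretend the incoming tuple doesn't fit */
}

static bool
try_deduplicate_page(void)
{
    printf("deduplicating existing duplicates to free space\n");
    return true;                /* pretend enough space was freed */
}

static void
split_page(void)
{
    printf("splitting the page\n");
}

int
main(void)
{
    if (!page_has_room())
    {
        /* Lazy: deduplication is only attempted at this point */
        if (!try_deduplicate_page())
            split_page();
    }
    printf("inserting tuple\n");
    return 0;
}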

--
Peter Geoghegan
