Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

From: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date: 2019-08-23 11:45:47
Message-ID: d06c5e6f-52be-f63e-79cd-c7fc38f11185@postgrespro.ru
Lists: pgsql-hackers

23.08.2019 7:33, Peter Geoghegan wrote:
> On Wed, Aug 21, 2019 at 10:19 AM Anastasia Lubennikova
> <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>> I'm going to look through the patch once more to update nbtxlog
>> comments, where needed and
>> answer to your remarks that are still left in the comments.
> Have you been using amcheck's rootdescend verification?

No, I haven't checked it with the latest version yet.
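
For anyone following along, here is a minimal sketch of how rootdescend verification might be scripted. It only builds the SQL text and does not connect to a server. It assumes the amcheck extension as shipped with PostgreSQL 12, where bt_index_parent_check() takes a third "rootdescend" argument; the helper names and the index name below are hypothetical and not part of the patch or of Peter's test scripts.

```python
def _regclass_literal(index_name: str) -> str:
    """Render an index name as a SQL string literal cast to regclass."""
    # Double any embedded single quotes so the literal stays well-formed.
    escaped = index_name.replace("'", "''")
    return f"'{escaped}'::regclass"


def rootdescend_check_sql(index_name: str, heapallindexed: bool = True) -> str:
    """Build a statement that runs amcheck's parent check with rootdescend enabled.

    Signature assumed: bt_index_parent_check(index regclass,
    heapallindexed boolean, rootdescend boolean).
    """
    heap_flag = "true" if heapallindexed else "false"
    return (
        f"SELECT bt_index_parent_check({_regclass_literal(index_name)}, "
        f"{heap_flag}, true);"
    )


def index_size_sql(index_name: str) -> str:
    """Build a statement reporting the on-disk size of an index in bytes."""
    return f"SELECT pg_relation_size({_regclass_literal(index_name)});"


if __name__ == "__main__":
    # Hypothetical index name, used only for illustration.
    print(rootdescend_check_sql("pgbench_accounts_pkey"))
    print(index_size_sql("pgbench_accounts_pkey"))
```

Running the generated statements against each patch version (v5 through v8) would give one way to compare both verification results and index sizes between versions.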

> There were many large indexes that amcheck didn't detect a problem
> with. I don't yet understand what the problem is, or why we only see
> the problem for a small number of indexes. Note that all of these
> indexes passed verification with v5, so this is some kind of
> regression.
>
> I also noticed that there were some regressions in the size of indexes
> -- indexes were not nearly as small as they were in v5 in some cases.
> The overall picture was a clear regression in how effective
> deduplication is.
Do these indexes have anything in common? Maybe some specific workload?
Are there any error messages in the log?

I'd like to pin down what caused the problem.
There were several major changes between v5 and v8:
- dead tuples handling added in v6;
- _bt_split changes for posting tuples in v7;
- WAL logging of posting tuple changes in v8.

I don't think the last one could break regular indexes on master.
Do you see the same regression in v6 and v7?

> I think that it would save time if you had direct access to my test
> data, even though it's a bit cumbersome. You'll have to download about
> 10GB of dumps, which require plenty of disk space when restored:
>
>
> Want me to send this data and the associated tests script over to you?
>
Yes, I think that will help me debug the patch faster.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
