Re: Making all nbtree entries unique by having heap TIDs participate in comparisons

From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Subject: Re: Making all nbtree entries unique by having heap TIDs participate in comparisons
Date: 2018-08-02 04:48:34
Message-ID: 23dd1626-8bf3-a74c-904c-5511384bb6e3@postgrespro.ru
Lists: pgsql-hackers

I am using the v3 version of the patch for retail index tuple deletion, and
from time to time I catch a regression test failure (see attachment).
As I see in regression.diff, the problem is an unstable order of DROP
... CASCADE deletions.
Most frequently I get the failure in the 'updatable_views' test.
I check the nbtree invariants during all tests, and the index relations stay
in a consistent state the whole time.
My hypothesis is: an unstable order of logical duplicates in the indexes
on the pg_depend relation.
But the 'updatable_views' test contains no obvious sources of instability:
concurrent insertions, updates, vacuum and so on. This fact puzzles me.
Maybe you have some ideas on this problem?
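
To make the hypothesis concrete, here is a minimal standalone sketch (plain
C, not the patch's code) of the ordering rule as I understand it: when the
user-visible attributes compare as equal, the heap TID decides, so the order
in which pg_depend duplicates come back from an index scan follows their
physical placement in the heap, which may differ between test runs:

#include <stdio.h>
#include <stdint.h>

typedef struct
{
    uint32_t block;    /* heap block number */
    uint16_t offset;   /* line pointer offset within the block */
} HeapTid;

typedef struct
{
    int      key;      /* stand-in for the user-visible attributes */
    HeapTid  tid;      /* heap TID, used only to break ties */
} IndexEntry;

/* Compare keys first; fall back to the heap TID only when keys are equal */
static int
entry_compare(const IndexEntry *a, const IndexEntry *b)
{
    if (a->key != b->key)
        return (a->key < b->key) ? -1 : 1;
    if (a->tid.block != b->tid.block)
        return (a->tid.block < b->tid.block) ? -1 : 1;
    if (a->tid.offset != b->tid.offset)
        return (a->tid.offset < b->tid.offset) ? -1 : 1;
    return 0;
}

int
main(void)
{
    /* Two "duplicates": same key, different heap placement */
    IndexEntry a = {42, {10, 3}};
    IndexEntry b = {42, {7, 1}};

    /* a sorts after b purely because of its heap TID */
    printf("%d\n", entry_compare(&a, &b));    /* prints 1 */
    return 0;
}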

On 18.07.2018 00:21, Peter Geoghegan wrote:
> Attached is my v3, which has some significant improvements:
>
> * The hinting for unique index inserters within _bt_findinsertloc()
> has been restored, more or less.
>
> * Bug fix for case where left side of split comes from tuple being
> inserted. We need to pass this to _bt_suffix_truncate() as the left
> side of the split, which we previously failed to do. The amcheck
> coverage I've added allowed me to catch this issue during a benchmark.
> (I use amcheck during benchmarks to get some amount of stress-testing
> in.)
>
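
To be sure I read this point correctly, here is a rough sketch (plain integer
keys, not the real _bt_suffix_truncate()) of what the truncation has to do at
the split point; if the tuple being inserted becomes the last tuple on the
left side, it is the one that must be supplied as lastleft, otherwise the
resulting high key may fail to separate the two halves:

#include <stdio.h>
#include <stddef.h>

/*
 * Return how many leading attributes the new high key must keep so that it
 * still separates the last tuple on the left side of the split from the
 * first tuple on the right side.
 */
static size_t
truncate_prefix_len(const int *lastleft, const int *firstright, size_t natts)
{
    size_t  keep = 0;

    while (keep < natts && lastleft[keep] == firstright[keep])
        keep++;

    /* Keep the first attribute that differs (all of them if fully equal) */
    return (keep < natts) ? keep + 1 : natts;
}

int
main(void)
{
    int lastleft[]   = {1, 7, 4};
    int firstright[] = {1, 9, 2};

    /* The second attribute already differs, so two attributes suffice */
    printf("%zu\n", truncate_prefix_len(lastleft, firstright, 3));  /* 2 */
    return 0;
}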
> * New performance optimization that allows us to descend a downlink
> when its user-visible attributes have scankey-equal values. We avoid
> an unnecessary move left by using a sentinel scan tid that's less than
> any possible real heap TID, but still greater than minus infinity as far
> as _bt_compare() is concerned.
>
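
This sentinel ordering is the part I want to understand best. Here is a tiny
standalone sketch (an assumed simplified encoding, not the actual
_bt_compare() code) of the three-way ordering described above: a pivot tuple
whose heap TID has been truncated away acts as minus infinity, and the
sentinel scan TID sorts above that but below every real heap TID, so an
equal-keyed downlink can be descended without an extra move left:

#include <stdio.h>

#define TID_MINUS_INFINITY  (-1L)   /* truncated-away TID in a pivot tuple */
#define TID_SENTINEL        0L      /* below any real TID, above -infinity */

/* Real heap TIDs are encoded here as positive values */
static int
tid_compare(long scantid, long tupletid)
{
    if (scantid < tupletid)
        return -1;
    if (scantid > tupletid)
        return 1;
    return 0;
}

int
main(void)
{
    /* The sentinel sorts after a minus-infinity pivot ... */
    printf("%d\n", tid_compare(TID_SENTINEL, TID_MINUS_INFINITY));  /* 1 */
    /* ... but before any real heap TID */
    printf("%d\n", tid_compare(TID_SENTINEL, 1L));                  /* -1 */
    return 0;
}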
> I am now considering pursuing this as a project in its own right,
> which can be justified without being part of some larger effort to add
> retail index tuple deletion (e.g. by VACUUM). I think that I can get
> it to the point of being a totally unambiguous win, if I haven't
> already. So, this patch is no longer just an interesting prototype of
> a new architectural direction we should take. In any case, it has far
> fewer problems than v2.
>
> Testing the performance characteristics of this patch has proven
> difficult. My home server seems to show a nice win with a pgbench
> workload that uses a Gaussian distribution for the pgbench_accounts
> queries (script attached). That seems consistent and reproducible. My
> home server has 32GB of RAM, and has a Samsung SSD 850 EVO SSD, with a
> 250GB capacity. With shared_buffers set to 12GB, 80 minute runs at
> scale 4800 look like this:
>
> Master:
>
> 25 clients:
> tps = 15134.223357 (excluding connections establishing)
>
> 50 clients:
> tps = 13708.419887 (excluding connections establishing)
>
> 75 clients:
> tps = 12951.286926 (excluding connections establishing)
>
> 90 clients:
> tps = 12057.852088 (excluding connections establishing)
>
> Patch:
>
> 25 clients:
> tps = 17857.863353 (excluding connections establishing)
>
> 50 clients:
> tps = 14319.514825 (excluding connections establishing)
>
> 75 clients:
> tps = 14015.794005 (excluding connections establishing)
>
> 90 clients:
> tps = 12495.683053 (excluding connections establishing)
>
> I ran this twice, and got pretty consistent results each time (there
> were many other benchmarks on my home server -- this was the only one
> that tested this exact patch, though). Note that there was only one
> pgbench initialization for each set of runs. It looks like a pretty
> strong result for the patch - note that the accounts table is about
> twice the size of available main memory. The server is pretty well
> overloaded in every individual run.
>
> Unfortunately, I have a hard time showing much of any improvement on a
> storage-optimized AWS instance with EBS storage, with scaled up
> pgbench scale and main memory. I'm using an i3.4xlarge, which has 16
> vCPUs, 122 GiB RAM, and 2 SSDs in a software RAID0 configuration. It
> appears to more or less make no overall difference there, for reasons
> that I have yet to get to the bottom of. I conceived this AWS
> benchmark as something that would have far longer run times with a
> scaled-up database size. My expectation was that it would confirm the
> preliminary result, but it hasn't.
>
> Maybe the issue is that it's far harder to fill the I/O queue on this
> AWS instance? Or perhaps it's related to the higher latency of EBS,
> compared to the local SSD on my home server? I would welcome any ideas
> about how to benchmark the patch. It doesn't necessarily have to be a
> huge win for a very generic workload like the one I've tested, since
> it would probably still be enough of a win for things like free space
> management in secondary indexes [1]. Plus, of course, it seems likely
> that we're going to eventually add retail index tuple deletion in some
> form or another, which this is a prerequisite to.
>
> For a project like this, I expect an unambiguous, across the board win
> from the committed patch, even if it isn't a huge win. I'm encouraged
> by the fact that this is starting to look credible as a
> stand-alone patch, but I have to admit that there are probably still
> significant gaps in my understanding of how it affects real-world
> performance. I don't have a lot of recent experience with benchmarking
> workloads like this one.
>
> [1] https://postgr.es/m/CAH2-Wzmf0fvVhU+SSZpGW4Qe9t--j_DmXdX3it5JcdB8FF2EsA@mail.gmail.com
>

--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
collate_bug.tar.gz application/gzip 2.3 KB
updatable_views.tar.gz application/gzip 2.4 KB
