Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

From: Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Subject: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Date: 2020-02-10 16:58:22
Message-ID: CAKYtNAre3w8DNo7=Kcg7Pt3hZ6_YJ-Fb7_kBFk9BxOks9vSvNQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, 8 Feb 2020 at 00:27, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
>
> On Thu, 6 Feb 2020 at 09:44, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Feb 6, 2020 at 1:57 AM Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
> > >
> > > On Wed, 5 Feb 2020 at 12:07, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > > On Mon, Feb 3, 2020 at 8:03 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > On Fri, Apr 27, 2018 at 4:25 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > > > > > > On Thu, Apr 26, 2018 at 3:10 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > > > > >>> I think the real question is whether the scenario is common enough to
> > > > > > >>> worry about. In practice, you'd have to be extremely unlucky to be
> > > > > > >>> doing many bulk loads at the same time that all happened to hash to
> > > > > > >>> the same bucket.
> > > > > > >>
> > > > > > >> With a bunch of parallel bulkloads into partitioned tables that really
> > > > > > >> doesn't seem that unlikely?
> > > > > > >
> > > > > > > It increases the likelihood of collisions, but probably decreases the
> > > > > > > number of cases where the contention gets really bad.
> > > > > > >
> > > > > > > For example, suppose each table has 100 partitions and you are
> > > > > > > bulk-loading 10 of them at a time. It's virtually certain that you
> > > > > > > will have some collisions, but the amount of contention within each
> > > > > > > bucket will remain fairly low because each backend spends only 1% of
> > > > > > > its time in the bucket corresponding to any given partition.
> > > > > > >
> > > > > >
> > > > > > I share another result of performance evaluation between current HEAD
> > > > > > and current HEAD with v13 patch (N_RELEXTLOCK_ENTS = 1024).
> > > > > >
> > > > > > Type of table: normal table, unlogged table
> > > > > > Number of child tables: 16, 64 (all tables are located on the same tablespace)
> > > > > > Number of clients: 32
> > > > > > Number of trials: 100
> > > > > > Duration: 180 seconds for each trial
> > > > > >
> > > > > > The hardware spec of the server is Intel Xeon 2.4GHz (HT 160 cores), 256GB
> > > > > > RAM, NVMe SSD 1.5TB.
> > > > > > Each client loads 10kB random data across all partitioned tables.
> > > > > >
> > > > > > Here is the result.
> > > > > >
> > > > > > childs | type | target | avg_tps | diff with HEAD
> > > > > > --------+----------+---------+------------+------------------
> > > > > > 16 | normal | HEAD | 1643.833 |
> > > > > > 16 | normal | Patched | 1619.5404 | 0.985222
> > > > > > 16 | unlogged | HEAD | 9069.3543 |
> > > > > > 16 | unlogged | Patched | 9368.0263 | 1.032932
> > > > > > 64 | normal | HEAD | 1598.698 |
> > > > > > 64 | normal | Patched | 1587.5906 | 0.993052
> > > > > > 64 | unlogged | HEAD | 9629.7315 |
> > > > > > 64 | unlogged | Patched | 10208.2196 | 1.060073
> > > > > > (8 rows)
> > > > > >
> > > > > > For normal tables, loading tps decreased 1% ~ 2% with this patch
> > > > > > whereas it increased 3% ~ 6% for unlogged tables. There were
> > > > > > collisions at 0 ~ 5 relation extension lock slots between 2 relations
> > > > > > in the 64 child tables case but it didn't seem to affect the tps.
> > > > > >
> > > > >
> > > > > AFAIU, this resembles the workload that Andres was worried about. I
> > > > > think we should once run this test in a different environment, but
> > > > > considering this to be correct and repeatable, where do we go with
> > > > > this patch especially when we know it improves many workloads [1] as
> > > > > well. We know that on a pathological case constructed by Mithun [2],
> > > > > this causes regression as well. I am not sure if the test done by
> > > > > Mithun really mimics any real-world workload as he has tested by
> > > > > making N_RELEXTLOCK_ENTS = 1 to hit the worst case.
> > > > >
> > > > > Sawada-San, if you have a script or data for the test done by you,
> > > > > then please share it so that others can also try to reproduce it.
> > > >
> > > > Unfortunately the environment I used for performance verification is
> > > > no longer available.
> > > >
> > > > I agree to run this test in a different environment. I've attached the
> > > > rebased version patch. I'm measuring the performance with/without
> > > > patch, so will share the results.
> > > >
> > >
> > > Thanks Sawada-san for patch.
> > >
> > > For the last few days, I have been reading this thread and reviewing the v13
> > > patch. To debug and test, I rebased the v13 patch and compared my rebased
> > > patch with the v14 patch. I think the ordering of header files is not
> > > alphabetical in the v14 patch. (I haven't reviewed the v14 patch fully because
> > > before reviewing, I wanted to test false sharing.) While debugging, I didn't
> > > notice any hang or lock-related issue.
> > >
> > > I did some testing to check false sharing (bulk insert, COPY data, bulk
> > > insert into partitioned tables). Below is the testing summary.
> > >
> > > Test setup (bulk insert into partitioned tables):
> > > autovacuum=off
> > > shared_buffers=512MB -c max_wal_size=20GB -c checkpoint_timeout=12min
> > >
> > > Basically, I created a table with 13 partitions. Using pgbench, I
> > > inserted bulk data. I used the below pgbench command:
> > > ./pgbench -c $threads -j $threads -T 180 -f insert1(dot)sql(at)1 -f insert2(dot)sql(at)1 -f insert3(dot)sql(at)1 -f insert4(dot)sql(at)1 postgres
> > >
> > > I took scripts from previous mails and modified them. For reference, I am
> > > attaching the test scripts. I tested with the default 1024 slots (N_RELEXTLOCK_ENTS = 1024).
> > >
> > > Clients    HEAD (tps)    With v14 patch (tps)    %change (time: 180s)
> > > 1          92.979796     100.877446              +8.49 %
> > > 32         392.881863    388.470622              -1.12 %
> > > 56         551.753235    528.018852              -4.30 %
> > > 60         648.273767    653.251507              +0.76 %
> > > 64         645.975124    671.322140              +3.92 %
> > > 66         662.728010    673.399762              +1.61 %
> > > 70         647.103183    660.694914              +2.10 %
> > > 74         648.824027    676.487622              +4.26 %
> > >
> > > From the above results, we can see that in most cases, TPS is slightly
> > > increased with the v14 patch. I am still testing and will post my results.
> > >
> >
> > The number at 56 and 74 client count seem slightly suspicious. Can
> > you please repeat those tests? Basically, I am not able to come up
> > with a theory why at 56 clients the performance with the patch is a
> > bit lower and then at 74 it is higher.
>
> Okay. I will repeat the test.

I re-tested on a different machine because on the previous machine the results
were inconsistent.

*My testing machine:*
$ lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Thread(s) per core: 8
Core(s) per socket: 1
Socket(s): 24
NUMA node(s): 4
Model: IBM,8286-42A
L1d cache: 64K
L1i cache: 32K
L2 cache: 512K
L3 cache: 8192K
NUMA node0 CPU(s): 0-47
NUMA node1 CPU(s): 48-95
NUMA node2 CPU(s): 96-143
NUMA node3 CPU(s): 144-191

./pgbench -c $threads -j $threads -T 180 -f insert1(dot)sql(at)1 -f insert2(dot)sql(at)1 -f insert3(dot)sql(at)1 -f insert4(dot)sql(at)1 postgres
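
Roughly, each insertN.sql script is a plain bulk-insert script along the lines
of the sketch below; the table and column names here are only illustrative
(the actual scripts are the ones attached with my earlier mail), but each
script targets a different subset of the partitions so that concurrent clients
keep extending different relations:

-- insert1.sql (illustrative sketch only, not the attached script)
INSERT INTO part_tab (pkey, payload)
SELECT 1, repeat(md5(random()::text), 32)   -- ~1kB of data per row
FROM generate_series(1, 1000);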

Clients    HEAD (tps)    With v14 patch (tps)    %change (time: 180s)
1          41.491486     41.375532               -0.27%
32         335.138568    330.028739              -1.52%
56         353.783930    360.883710              +2.00%
60         341.741925    359.028041              +5.05%
64         338.521730    356.511423              +5.13%
66         339.838921    352.761766              +3.80%
70         339.305454    353.658425              +4.23%
74         332.016217    348.809042              +5.05%

From the above results, it seems that there is very little difference with the
patch (within +-5%), which can be attributed to run-to-run variation.

>
> >
> > > I want to test the extension lock by blocking use of the FSM (use_fsm=false
> > > in the code). I think, if we block use of the FSM, then the load on the
> > > extension lock will increase. Is this the correct way to test?
> > >
> >
> > Hmm, I think instead of directly hacking the code, you might want to
> > use the operation (probably cluster or vacuum full) where we set
> > HEAP_INSERT_SKIP_FSM. I think along with this you can try with
> > unlogged tables because that might stress the extension lock.
>
> Okay. I will test.

I tested with unlogged tables also. There too, I was getting a 3-6% gain in
tps.
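
For the unlogged-table runs, the setup was along the lines of the sketch below
(names are again illustrative). As Amit suggested above, a rewrite such as
VACUUM FULL or CLUSTER inserts with HEAP_INSERT_SKIP_FSM, so new pages come
from relation extension rather than the free space map:

-- Illustrative sketch of the unlogged-table variant of the test.
-- Unlogged tables skip WAL, so relation extension is a larger fraction
-- of the insert cost and the extension lock is stressed harder.
CREATE UNLOGGED TABLE ul_tab (pkey int, payload text);

-- Concurrent bulk loads run from many pgbench clients:
INSERT INTO ul_tab
SELECT g, repeat(md5(random()::text), 32)
FROM generate_series(1, 1000) AS g;

-- Heap rewrite that uses HEAP_INSERT_SKIP_FSM (per Amit's suggestion):
VACUUM FULL ul_tab;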

>
> >
> > In the above test, you might want to test with a higher number of
> > partitions (say up to 100) as well. Also, see if you want to use the
> > Copy command.
>
> Okay. I will test.

I tested with 500, 1000, and 2000 partitions. I observed at most a +-5% change
in tps and there was no performance degradation.
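
For reference, a many-partition setup like the 2000-partition case can be
created with something along these lines (hash partitioning and the names are
only illustrative; the actual test scripts may differ):

-- Illustrative sketch: parent table with 2000 hash partitions.
CREATE TABLE part_tab (pkey int, payload text) PARTITION BY HASH (pkey);

DO $$
BEGIN
  FOR i IN 0..1999 LOOP
    EXECUTE format(
      'CREATE TABLE part_tab_%s PARTITION OF part_tab
         FOR VALUES WITH (MODULUS 2000, REMAINDER %s)', i, i);
  END LOOP;
END $$;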

*For example:*
I created a table with 2000 partitions and then I checked for false sharing.
Slot Number  Slot Freq.   Slot Number  Slot Freq.   Slot Number  Slot Freq.
156          13           973          11           446          10
627          13           52           10           488          10
782          12           103          10           501          10
812          12           113          10           701          10
192          11           175          10           737          10
221          11           235          10           754          10
367          11           254          10           781          10
546          11           314          10           790          10
814          11           419          10           833          10
917          11           424          10           888          10

From the above table, we can see that a total of 13 child tables fall into the
same bucket (slot 156), so I did bulk-loading only into those 13 child tables
to check the tps under false sharing, but I noticed that there was no
performance degradation.
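
For completeness, per-slot counts like the table above can be gathered with a
query along the following lines, assuming the extension lock slot is derived
from a hash of the relation's OID modulo N_RELEXTLOCK_ENTS (1024 here).
hashoid() is only a stand-in for whatever hash the patch actually uses, so the
exact slot numbers can differ from what I reported:

-- Illustrative query: how many child tables of part_tab map to each slot?
SELECT hashoid(c.oid) & (1024 - 1) AS slot,
       count(*)                    AS slot_freq,
       array_agg(c.relname)        AS child_tables
FROM pg_class c
JOIN pg_inherits i ON i.inhrelid = c.oid
WHERE i.inhparent = 'part_tab'::regclass
GROUP BY 1
HAVING count(*) > 1
ORDER BY slot_freq DESC, slot;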

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
