Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Date: 2020-02-03 11:03:22
Message-ID: CAA4eK1+C6JQvf=_oW7=GfVec0KKa5GL8uzMqQ9FtUZ3e2gKBdQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 26, 2018 at 12:47 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Fri, Apr 27, 2018 at 4:25 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > On Thu, Apr 26, 2018 at 3:10 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >>> I think the real question is whether the scenario is common enough to
> >>> worry about. In practice, you'd have to be extremely unlucky to be
> >>> doing many bulk loads at the same time that all happened to hash to
> >>> the same bucket.
> >>
> >> With a bunch of parallel bulkloads into partitioned tables that really
> >> doesn't seem that unlikely?
> >
> > It increases the likelihood of collisions, but probably decreases the
> > number of cases where the contention gets really bad.
> >
> > For example, suppose each table has 100 partitions and you are
> > bulk-loading 10 of them at a time. It's virtually certain that you
> > will have some collisions, but the amount of contention within each
> > bucket will remain fairly low because each backend spends only 1% of
> > its time in the bucket corresponding to any given partition.
> >
>
> Here is another performance comparison between current HEAD and
> current HEAD with the v13 patch (N_RELEXTLOCK_ENTS = 1024).
>
> Type of table: normal table, unlogged table
> Number of child tables : 16, 64 (all tables are located on the same tablespace)
> Number of clients : 32
> Number of trials : 100
> Duration: 180 seconds for each trial
>
> The server hardware is an Intel Xeon 2.4GHz (HT 160 cores), 256GB
> RAM, and a 1.5TB NVMe SSD.
> Each client loads 10kB of random data across all partitioned tables.
>
> Here is the result.
>
>  childs |   type   | target  |  avg_tps   | diff with HEAD
> --------+----------+---------+------------+----------------
>      16 | normal   | HEAD    |   1643.833 |
>      16 | normal   | Patched |  1619.5404 |       0.985222
>      16 | unlogged | HEAD    |  9069.3543 |
>      16 | unlogged | Patched |  9368.0263 |       1.032932
>      64 | normal   | HEAD    |   1598.698 |
>      64 | normal   | Patched |  1587.5906 |       0.993052
>      64 | unlogged | HEAD    |  9629.7315 |
>      64 | unlogged | Patched | 10208.2196 |       1.060073
> (8 rows)
>
> For normal tables, loading tps decreased 1% ~ 2% with this patch
> whereas it increased 3% ~ 6% for unlogged tables. There were
> collisions at 0 ~ 5 relation extension lock slots between 2 relations
> in the 64 child tables case but it didn't seem to affect the tps.
>
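To make the "diff with HEAD" column concrete: it appears to be the patched average TPS divided by the HEAD average TPS, so values below 1.0 are regressions. A minimal Python sketch that recomputes it from the figures quoted above (the names `results` and `tps_ratio` are only illustrative, not part of any test script from the thread):

```python
# Average TPS copied from the quoted table:
# (number of child tables, table type) -> (HEAD, patched)
results = {
    (16, "normal"): (1643.833, 1619.5404),
    (16, "unlogged"): (9069.3543, 9368.0263),
    (64, "normal"): (1598.698, 1587.5906),
    (64, "unlogged"): (9629.7315, 10208.2196),
}


def tps_ratio(head, patched):
    """Return patched throughput relative to HEAD (1.0 means no change)."""
    return patched / head


ratios = {key: tps_ratio(*tps) for key, tps in results.items()}

for (children, kind), ratio in ratios.items():
    # Print the ratio and the same value expressed as a percentage change.
    change = 100 * (ratio - 1)
    print(f"{children:>2} {kind:<8} ratio={ratio:.6f} change={change:+.2f}%")
```

Printing the percentage change next to the ratio makes it easier to compare the numbers with the ranges summarized in the quoted text.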

AFAIU, this resembles the workload that Andres was worried about. I
think we should run this test once more in a different environment, but
assuming the results are correct and repeatable, where do we go with
this patch, especially when we know that it also improves many
workloads [1]? We also know that it causes a regression in the
pathological case constructed by Mithun [2]. I am not sure that
Mithun's test really mimics any real-world workload, as he set
N_RELEXTLOCK_ENTS = 1 to hit the worst case.
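
For a rough sense of how the slot count affects collisions, here is a back-of-the-envelope sketch. It assumes each relation maps to a slot uniformly and independently, which is a simplification and not a model of the patch's actual hash function; the function names are hypothetical.

```python
from math import comb


def expected_colliding_pairs(n_relations, n_slots):
    """Expected number of relation pairs that land in the same slot."""
    return comb(n_relations, 2) / n_slots


def expected_shared_slots(n_relations, n_slots):
    """Expected number of slots that hold two or more relations."""
    p = 1 / n_slots
    p_empty = (1 - p) ** n_relations
    p_single = n_relations * p * (1 - p) ** (n_relations - 1)
    return n_slots * (1 - p_empty - p_single)


# Compare the tested configurations with the worst case of a single slot.
for n_relations, n_slots in [(16, 1024), (64, 1024), (64, 1)]:
    pairs = expected_colliding_pairs(n_relations, n_slots)
    shared = expected_shared_slots(n_relations, n_slots)
    print(f"relations={n_relations:>2} slots={n_slots:>4} "
          f"colliding_pairs={pairs:.3f} shared_slots={shared:.3f}")
```

Under this assumption, 64 relations over 1024 slots give about 2 colliding pairs on average, which is in line with the 0 ~ 5 colliding slots reported above, whereas a single slot forces every relation onto the same lock.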

Sawada-San, if you have the script or data for your test, please
share it so that others can also try to reproduce it.

[1] - https://www.postgresql.org/message-id/4c171ffe-e3ee-acc5-9066-a40d52bc5ae9%40postgrespro.ru
[2] - https://www.postgresql.org/message-id/CAD__Oug52j%3DDQMoP2b%3DVY7wZb0S9wMNu4irXOH3-ZjFkzWZPGg%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
