Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Date: 2017-12-11 20:55:42
Message-ID: CA+TgmoYOf6MWAMR79fj_K2_6fO2syDx7DbyNahOQQc2mmB6XhQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 11, 2017 at 3:25 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> For me "very short periods of time" and journaled metadata-changing
> filesystem operations don't quite mesh. Language lawyering aside, this
> seems quite likely to bite us down the road.
>
> It's imo perfectly fine to say that there's only a limited number of
> file extension locks, but that there's a far from negligible chance of
> conflict even without the array being full doesn't seem nice. Think this
> needs to use some open-addressing-like conflict handling or something
> alike.

I guess we could consider that, but I'm not really convinced that it's
solving a real problem. Right now, you start having a meaningful chance
of lock-manager lock contention when the number of concurrent
processes in the system requesting heavyweight locks is still in the
single digits, because there are only 16 lock-manager locks. With
this, there are effectively 1024 partitions.
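
To put rough numbers on that comparison, here is a minimal sketch (not
taken from the patch) that treats each concurrently requested lock as
landing in a uniformly random slot and computes the birthday-problem
probability that at least two of them share one. The function name
collision_probability and the uniform-hashing model are assumptions made
for illustration; only the figures 16 and 1024 come from the text above.

```python
def collision_probability(num_requests, num_slots):
    """Probability that at least two of num_requests uniformly hashed
    requests land in the same one of num_slots slots (birthday problem).

    Assumption: hashing is uniform and requests are independent.
    """
    p_no_collision = 1.0
    for i in range(num_requests):
        # The i-th request must avoid the i slots already taken.
        p_no_collision *= (num_slots - i) / num_slots
    return 1.0 - p_no_collision


if __name__ == "__main__":
    for k in (2, 4, 8, 16, 32):
        print(k,
              round(collision_probability(k, 16), 3),
              round(collision_probability(k, 1024), 3))
```

Each output row lists the number of concurrent requests, then the chance
of a shared slot with 16 slots, then with 1024 slots. This models only
where requests land, not how long each lock is held.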

Now I realize you're going to point out, not wrongly, that we're
contending on the locks themselves rather than the locks protecting
the locks, and that this makes everything worse because the hold time
is much longer. Fair enough. On the other hand, what workload would
actually be harmed? I think you basically have to imagine a lot of
relations being extended simultaneously, like a parallel bulk load,
and an underlying filesystem which performs individual operations
slowly but scales really well. I'm slightly skeptical that's how
real-world filesystems behave.

It might be a good idea, though, to test how parallel bulk loading
behaves with this patch applied, maybe even after reducing
N_RELEXTLOCK_ENTS to simulate an unfortunate number of collisions.
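
Before running a real benchmark, a toy Monte Carlo model can suggest how
far the slot count would have to be reduced to provoke collisions. This
is only a sketch: shared_slot_fraction is a hypothetical helper, slot
assignment is modeled as uniformly random rather than with the hash the
patch uses, and it says nothing about actual filesystem timing.

```python
import random


def shared_slot_fraction(num_relations, num_slots, trials=2000, seed=0):
    """Estimate the fraction of concurrently extended relations that
    share their slot with at least one other relation.

    Assumption: each relation maps to a uniformly random slot.
    """
    rng = random.Random(seed)
    shared = 0
    for _ in range(trials):
        slots = [rng.randrange(num_slots) for _ in range(num_relations)]
        counts = {}
        for s in slots:
            counts[s] = counts.get(s, 0) + 1
        # Count relations whose slot is also used by another relation.
        shared += sum(1 for s in slots if counts[s] > 1)
    return shared / (trials * num_relations)


if __name__ == "__main__":
    # 32 relations extended at once, with progressively fewer slots.
    for n in (1024, 256, 64, 16):
        print(n, round(shared_slot_fraction(32, n), 3))
```

The smaller slot counts stand in for a reduced N_RELEXTLOCK_ENTS; the
relation count of 32 is an arbitrary choice for the example.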

This isn't a zero-sum game. If we add collision resolution, we're
going to slow down the ordinary uncontended case; the bookkeeping will
get significantly more complicated. That is only worth doing if the
current behavior produces pathological cases on workloads that are
actually somewhat realistic.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
