Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Date: 2017-12-11 21:32:31
Message-ID: 20171211213231.rzgjt7llfv4sfj6s@alap3.anarazel.de
Lists: pgsql-hackers

On 2017-12-11 15:55:42 -0500, Robert Haas wrote:
> On Mon, Dec 11, 2017 at 3:25 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > For me "very short periods of time" and journaled, metadata-changing
> > filesystem operations don't quite mesh. Language lawyering aside, this
> > seems quite likely to bite us down the road.
> >
> > It's imo perfectly fine to say that there's only a limited number of
> > file extension locks, but that there's a far from negligible chance of
> > conflict even without the array being full doesn't seem nice. I think
> > this needs to use some open-addressing-like conflict handling or
> > something similar.
>
> I guess we could consider that, but I'm not really convinced that it's
> solving a real problem. Right now, you start having a meaningful chance
> of lock-manager lock contention when the number of concurrent
> processes in the system requesting heavyweight locks is still in the
> single digits, because there are only 16 lock-manager locks. With
> this, there are effectively 1024 partitions.
>
> Now I realize you're going to point out, not wrongly, that we're
> contending on the locks themselves rather than the locks protecting
> the locks, and that this makes everything worse because the hold time
> is much longer.

Indeed.

> Fair enough. On the other hand, what workload would actually be
> harmed? I think you basically have to imagine a lot of relations
> being extended simultaneously, like a parallel bulk load, and an
> underlying filesystem which performs individual operations slowly but
> scales really well. I'm slightly skeptical that's how real-world
> filesystems behave.

Or just two independent relations on two different filesystems.
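
As a back-of-envelope illustration of the collision concern: if relation
IDs are assumed to hash uniformly and independently into
N_RELEXTLOCK_ENTS slots (an assumption, not something measured in this
thread), the chance that at least two of k concurrently extended
relations share a slot follows the birthday-problem formula. The
function name below is made up for illustration.

```python
def collision_probability(k, n_slots):
    """Probability that at least two of k relations map to the same slot.

    Assumes each relation hashes uniformly and independently into one of
    n_slots entries (an idealization, not a property of any real patch).
    """
    if k > n_slots:
        # Pigeonhole: more relations than slots guarantees a collision.
        return 1.0
    p_no_collision = 1.0
    for i in range(k):
        # The (i+1)-th relation must avoid the i slots already taken.
        p_no_collision *= (n_slots - i) / n_slots
    return 1.0 - p_no_collision


if __name__ == "__main__":
    for k in (2, 8, 16, 32, 64):
        print(k, collision_probability(k, 1024))
```

Under that assumption, two relations collide with probability 1/1024
when there are 1024 slots, but the figure climbs quickly with k:
reaching even odds takes only a few dozen concurrently extended
relations, far fewer than the number of slots.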

> It might be a good idea, though, to test how parallel bulk loading
> behaves with this patch applied, maybe even after reducing
> N_RELEXTLOCK_ENTS to simulate an unfortunate number of collisions.

Yea, that sounds like a good plan. Measure two COPYs to relations on
different filesystems, reduce N_RELEXTLOCK_ENTS to 1, and measure
performance. Then increase the concurrency of the copies to each
relation.

> This isn't a zero-sum game. If we add collision resolution, we're
> going to slow down the ordinary uncontended case; the bookkeeping will
> get significantly more complicated. That is only worth doing if the
> current behavior produces pathological cases on workloads that are
> actually somewhat realistic.

Yea, measuring sounds like a good plan.
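
To make the bookkeeping cost of collision resolution concrete, here is a
minimal single-threaded sketch of open addressing with linear probing
over a fixed-size slot array. It is not taken from the patch: the class
and method names are hypothetical, and a real implementation would need
atomic operations rather than plain Python lists.

```python
class RelExtLockTable:
    """Toy fixed-size table mapping relation IDs to lock slots.

    Hypothetical illustration only: single-threaded, no real locking.
    """

    def __init__(self, n_slots):
        self.keys = [None] * n_slots  # relation ID owning each slot
        self.refs = [0] * n_slots     # number of current users per slot

    def acquire_slot(self, relid):
        n = len(self.keys)
        start = hash(relid) % n
        reusable = None
        for step in range(n):
            i = (start + step) % n
            if self.keys[i] == relid:
                # Slot already belongs to this relation: share it.
                self.refs[i] += 1
                return i
            if self.refs[i] == 0:
                # Unused slot: remember the first one, but keep probing
                # in case relid already lives further along the chain.
                if reusable is None:
                    reusable = i
                if self.keys[i] is None:
                    break  # a never-used slot ends the probe chain
        if reusable is None:
            raise RuntimeError("all slots in use")
        self.keys[reusable] = relid
        self.refs[reusable] = 1
        return reusable

    def release_slot(self, i):
        assert self.refs[i] > 0
        # The stale key stays behind so later probe chains are not cut.
        self.refs[i] -= 1
```

Even in this toy form, every acquire may walk several slots and has to
distinguish never-used slots from released ones, which is the extra
work in the uncontended path that the measurements would need to
justify.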

Greetings,

Andres Freund
