Re: Relation extension scalability

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Relation extension scalability
Date: 2016-02-02 15:12:38
Message-ID: CA+TgmobTZSFBZr+RBSUqOSk5mk0JW=6rT8zbKvRyn8gwRzL7Rw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 28, 2016 at 6:23 AM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> [ new patch ]

This patch contains a useless hunk and also code not in PostgreSQL
style. Get pgindent set up and it will do it correctly for you, or
look at the style of the surrounding code.

What I'm a bit murky about is *why* this should be any better than the
status quo. I mean, the obvious explanation that pops to mind is that
extending the relation by two pages at a time relieves pressure on the
relation extension lock somehow. One other possible explanation is
that calling RecordPageWithFreeSpace() allows multiple backends to get
access to that page at the same time, while otherwise each backend
would try to conduct a separate extension. But in the first case,
what we ought to do is try to make the locking more efficient; and in
the second case, we might want to think about recording the first page
in the free space map too. I don't think the problem here is that w

Here's a sketch of another approach to this problem. Get rid of the
relation extension lock. Instead, have an array of, say, 256 lwlocks.
Each one protects the extension of relations where hash(relfilenode) %
256 maps to that lock. To extend a relation, grab the corresponding
lwlock, do the work, then release the lwlock. You might occasionally
have a situation where two relations are both being extended very
quickly and happen to map to the same bucket, but that shouldn't be
much of a problem in practice, and the way we're doing it presently is
worse, not better, since two relation extension locks may very easily
map to the same lock manager partition. The only problem with this is
that acquiring an LWLock holds off interrupts, and we don't want
interrupts to be locked out across a potentially lengthy I/O. We
could partially fix that if we call RESUME_INTERRUPTS() after
acquiring the lock and HOLD_INTERRUPTS() just before releasing it, but
there's still the problem that you might block non-interruptibly while
somebody else has the lock. I don't see an easy solution to that
problem right off-hand, but if something like this performs well we
can probably conjure up some solution to that problem.

I'm not saying that we need to do that exact thing - but I am saying
that I don't think we can proceed with an approach like this without
first understanding why it works and whether there's some other way
that might be better to address the underlying problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2016-02-02 15:13:07 Re: Support for N synchronous standby servers - take 2
Previous Message Magnus Hagander 2016-02-02 14:56:18 Re: pg_lsn cast to/from int8