Re: Proposal for CSN based snapshots

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andres Freund <andres(at)2ndquadrant(dot)com>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, Ants Aasma <ants(at)cybertec(dot)at>, Bruce Momjian <bruce(at)momjian(dot)us>, obartunov <obartunov(at)postgrespro(dot)ru>, Teodor Sigaev <teodor(at)postgrespro(dot)ru>, Borodin Vladimir <root(at)simply(dot)name>
Subject: Re: Proposal for CSN based snapshots
Date: 2017-08-02 02:10:24
Message-ID: CAA4eK1JFuRnA3fQLtWYRH5f1bSAWMbOdHADkhwJVpxt8=sSK6A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 1, 2017 at 7:41 PM, Alexander Kuzmenkov
<a(dot)kuzmenkov(at)postgrespro(dot)ru> wrote:
> Hi all,
>
> So I did some more experiments on this patch.
>
> * I fixed the bug with duplicate tuples I mentioned in the previous letter.
> Indeed, the oldestActiveXid could be advanced past the transaction's xid
> before it set the clog status. This happened because the oldestActiveXid is
> calculated based on the CSN log contents, and we wrote to CSN log before
> writing to clog. The fix is to write to clog before CSN log
> (TransactionIdAsyncCommitTree)
>
> * We can remove the exclusive locking on CSNLogControlLock when setting the
> CSN for a transaction (CSNLogSetPageStatus). When we assign a CSN to a
> transaction and its children, the atomicity is guaranteed by using an
> intermediate state (COMMITSEQNO_COMMITTING), so it doesn't matter if this
> function is not atomic in itself. The shared lock should suffice here.
>
> * On throughputs of about 100k TPS, we allocate ~1k CSN log pages per
> second. This is done with exclusive locking on CSN control lock, and
> noticeably increases contention. To alleviate this, I allocate new pages in
> batches (ExtendCSNLOG).
>
> * When advancing oldestActiveXid, we scan CSN log to find an xid that is
> still in progress. To do that, we increment the xid and query its CSN using
> the high level function, acquiring and releasing the lock and looking up the
> log page for each xid. I wrote some code to acquire the lock only once and
> then scan the pages (CSNLogSetPageStatus).
>
> * On bigger buffers the linear page lookup code that the SLRU uses now
> becomes slow. I added a shared dynahash table to speed up this lookup.
>
> * I merged in recent changes from master (up to 7e1fb4). Unfortunately I
> didn't have enough time to fix the logical replication and snapshot import,
> so now it's completely broken.
>
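To make the intermediate-state argument in the second point concrete, here is a minimal single-file sketch of how I read it. It is not code from the patch: the array csn_log, the counter next_csn, the sentinel values, and the functions commit_tree, take_snapshot and xid_visible are all hypothetical names, and the rule that a reader who sees the intermediate state must wait and re-check is my assumption. The idea is that every xid in the tree is marked as committing before the CSN is allocated, so a snapshot taken after the allocation can never find one xid of the tree still "in progress" and another one committed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t CommitSeqNo;

/* Hypothetical sentinel values; the patch's real constants may differ. */
#define CSN_INPROGRESS   ((CommitSeqNo) 0)
#define CSN_COMMITTING   ((CommitSeqNo) 1)
#define CSN_FIRST_NORMAL ((CommitSeqNo) 2)

#define MAX_XIDS 64

/* Toy stand-in for the CSN log: one slot per xid, zero means in progress. */
static _Atomic CommitSeqNo csn_log[MAX_XIDS];
static _Atomic CommitSeqNo next_csn = CSN_FIRST_NORMAL;

/* A snapshot is just the next CSN to be handed out at the time it is taken. */
static CommitSeqNo
take_snapshot(void)
{
    return atomic_load(&next_csn);
}

/*
 * Commit a parent xid and its children without any exclusive lock.
 * Step 1: mark the whole tree COMMITTING.
 * Step 2: allocate the CSN (only after every xid is marked).
 * Step 3: store the final CSN, in any order.
 */
static CommitSeqNo
commit_tree(uint32_t parent, const uint32_t *children, size_t nchildren)
{
    CommitSeqNo csn;
    size_t      i;

    assert(parent < MAX_XIDS);
    atomic_store(&csn_log[parent], CSN_COMMITTING);
    for (i = 0; i < nchildren; i++)
    {
        assert(children[i] < MAX_XIDS);
        atomic_store(&csn_log[children[i]], CSN_COMMITTING);
    }

    csn = atomic_fetch_add(&next_csn, 1);

    for (i = 0; i < nchildren; i++)
        atomic_store(&csn_log[children[i]], csn);
    atomic_store(&csn_log[parent], csn);
    return csn;
}

/*
 * Returns 1 if xid is visible to the snapshot, 0 if it is not, and -1 if
 * the xid is mid-commit and the caller has to wait and look again
 * (assumed behaviour, see the note above).
 */
static int
xid_visible(uint32_t xid, CommitSeqNo snapshot_csn)
{
    CommitSeqNo csn = atomic_load(&csn_log[xid]);

    if (csn == CSN_INPROGRESS)
        return 0;
    if (csn == CSN_COMMITTING)
        return -1;
    return csn < snapshot_csn ? 1 : 0;
}
```

With this ordering the stores in step 3 do not need to be atomic as a group, which is the property the quoted point relies on when it says a shared lock should suffice. The sketch ignores aborts, waiting, and page boundaries entirely.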
> I ran some pgbench with these tweaks (tpcb-like, 72 cores, scale 500). The
> throughput is good at lower client counts (at 50 clients it's 35% higher
> than on master), but then it degrades steadily. After 250 clients it's
> already lower than master; see the attached graph. In perf reports the
> CSN-related things have almost vanished, and I see lots of time spent
> working with clog. This is probably a situation where making some parts
> faster makes contention in other parts worse, so overall we have a
> performance loss.

Yeah, this happens sometimes and I have also observed this behavior.

> Hilariously, at some point I saw a big performance
> increase after adding some debug printfs. I wanted to try some things with
> the clog next, but for now I'm out of time.
>

What exactly is the problem you are seeing in the clog? Is it contention
around CLOGControlLock, or is accessing CLOG generally slower? If the
former, we already have a patch [1] to address it.

[1] - https://commitfest.postgresql.org/14/358/

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
