Further XLogInsert scaling tweaking

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Further XLogInsert scaling tweaking
Date: 2013-09-02 07:14:03
Message-ID: 52243ABB.5070903@vmware.com
Lists: pgsql-hackers

Now that I've had a little break from the big XLogInsert scaling patch,
I went back to do some testing and profiling of it. I saw a lot of
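contention on the first access of RedoRecPtr and
forcePageWrites/fullPageWrites, which made me realize that I had put
those variables right next to the heavily-contended insertpos_lck
spinlock and the variables that it protects. They end up sharing a cache
line, so every update made under the spinlock also invalidates that line
for backends that only want to read RedoRecPtr or the page-write flags.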

I added some padding between those two, per attached patch, and got a
big improvement. So we should definitely do that. I just added a
char[128] field between them, which is enough to put them on different
cache lines on the server I'm testing on. I don't think there is any
platform-independent way to get the current cache line size,
unfortunately. Since this doesn't affect correctness, I'm inclined to
just add the 128-byte padding field there.
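
To make the layout idea concrete, here is a minimal sketch of the padding
approach. It is not the attached patch: the struct name, the curr_pos and
prev_pos fields, the plain int standing in for the spinlock type, and the
helper function are all hypothetical, and only the 128-byte pad size and the
variable names insertpos_lck, RedoRecPtr, forcePageWrites and fullPageWrites
come from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Pad size taken from the char[128] field described in the text. */
#define SKETCH_PAD_BYTES 128

/* Hypothetical stand-in for the shared insertion-control struct. */
typedef struct SketchInsertCtl
{
    /* Hot fields: written by every inserter while holding the spinlock. */
    volatile int        insertpos_lck;  /* stand-in for the real spinlock type */
    unsigned long long  curr_pos;       /* hypothetical protected field */
    unsigned long long  prev_pos;       /* hypothetical protected field */

    /* Padding so the fields below do not share a cache line with the above. */
    char                pad[SKETCH_PAD_BYTES];

    /* Read-mostly fields: read on every insert, rarely written. */
    unsigned long long  RedoRecPtr;
    int                 forcePageWrites;
    int                 fullPageWrites;
} SketchInsertCtl;

/*
 * The pad guarantees at least SKETCH_PAD_BYTES between the start of the last
 * hot field and the start of the first read-mostly field, so they cannot share
 * a cache line of that size or smaller, whatever the struct's base alignment.
 */
_Static_assert(offsetof(SketchInsertCtl, RedoRecPtr) -
               offsetof(SketchInsertCtl, prev_pos) >= SKETCH_PAD_BYTES,
               "padding too small to separate hot and read-mostly fields");

/*
 * Returns 1 if two byte offsets fall on the same cache line, assuming the
 * struct itself starts on a cache-line boundary and line_size is nonzero.
 */
static int
same_cache_line(size_t off_a, size_t off_b, size_t line_size)
{
    return (off_a / line_size) == (off_b / line_size);
}
```

The compile-time check only covers line sizes up to the pad size; a platform
with larger cache lines would need a bigger pad, which is the portability
caveat mentioned above.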

Attached is a graph generated by pgbench-tools. Full results are
available here: http://hlinnaka.iki.fi/xloginsert-scaling/padding/. The
test query used was:

insert into foo:client_id select generate_series(1, 100);

That is, each client inserts a lot of rows into a different table. The
table has no indexes. This is pretty much the worst-case scenario for
stressing XLogInsert.

The "master-b03d196" is unpatched git master, "xlog-padding-fb741c0" is
with the padding. The -16 plots are the same, but with
xloginsert_slots=16 (the default is 8). The server has 16 cores, 32 with
hyperthreading.

It's interesting that although the peak performance is better with 8
slots than with 16, it doesn't scale as gracefully with 16 slots. I
believe that's because with more insertion slots, you have more backends
fighting over the insertpos_lck. With fewer slots, the extra work is
queued behind the insertion slots instead, and sleeping is better than
spinning.

In either case, the performance with the padding is better than without.

- Heikki

Attachments:
  (unnamed)                    image/png    7.1 KB
  xloginsert-padding-1.patch   text/x-diff  1.1 KB
