Quick Links

Re: Avoiding repeated snapshot computation

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Subject:	Re: Avoiding repeated snapshot computation
Date:	2011-12-13 17:52:30
Message-ID:	CA+TgmoaKu-h=1oHFqwUw+9Qg4zkyvS8bWcW6Kiw2qxFp3JvC1w@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Sat, Nov 26, 2011 at 5:49 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On Saturday, November 26, 2011 11:39:23 PM Robert Haas wrote:
>> On Sat, Nov 26, 2011 at 5:28 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> > On Saturday, November 26, 2011 09:52:17 PM Tom Lane wrote:
>> >> I'd just as soon keep the fields in a logical order.
>> >
>> > Btw, I don't think the new order is necessarily worse than the old one.
>>
>> You forget to attach the benchmark results.
>>
>> My impression is that cache lines on modern hardware are 64 or 128
>> *bytes*, in which case this wouldn't matter a bit.
> All current x86 cpus use 64bytes. The 2nd 128bit reference was a typo. Sorry
> for that.
> And why is 72=>56 *bytes* (I even got that one right) not relevant for 64byte
> cachelines?

OK, so I got around to looking at this again today; sorry for the
delay. I agree that 72 -> 56 bytes is very relevant for 64-byte cache
lines. I had not realized the structure was as big as that.

> And yea. I didn't add benchmark results. I don't think I *have* to do that
> when making suggestions to somebody trying to improve something specific. I
> also currently don't have hardware where I can sensibly run at a high enough
> concurrency to see that GetSnapshotData takes ~40% of runtime.
> Additional cacheline references around synchronized access can hurt to my
> knowledge...

I tested this on Nate Boley's 32-core box today with the 32 clients
doing a select-only pgbench test. Results of individual 5 minute
runs:

results.master.32:tps = 171701.947919 (including connections establishing)
results.master.32:tps = 227777.430112 (including connections establishing)
results.master.32:tps = 218257.478461 (including connections establishing)
results.master.32:tps = 226425.964855 (including connections establishing)
results.master.32:tps = 218687.662373 (including connections establishing)
results.master.32:tps = 219819.451605 (including connections establishing)
results.master.32:tps = 216800.131634 (including connections establishing)
results.snapshotdata-one-cacheline.32:tps = 181997.531357 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 171505.145440 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 226970.348605 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 169725.115763 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 219548.690522 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 175440.705722 (including
connections establishing)
results.snapshotdata-one-cacheline.32:tps = 217154.173823 (including
connections establishing)

There's no statistically significant difference between those two data
sets; if anything, the results with the patch look like they might be
worse, although I believe that's an artifact - some runs seem to
mysteriously come out in the 170-180 rangeinstead of the 215-225
range, with nothing in between. But even if you only average the good
runs out of each set there's no material difference.

Having said that, I am coming around to the view that we should apply
this patch anyway, just because it reduces memory consumption. Since
the size change crosses a power-of-two boundary, I believe it will
actually cut down the size of a palloc'd chunk for a SnapshotData
object from 128 bytes to 64 bytes. I doubt we can measure the benefit
of that even on a microbenchmark unless someone has a better idea for
making PostgreSQL take ridiculous numbers of snapshots than I do. It
still seems like a good idea, though: a penny saved is a penny earned.

With response to Tom's objections downthread, I don't think that the
new ordering is significantly more confusing than the old one.
xcnt/subxcnt/xip/subxip doesn't seem measurably less clear than
xcnt/xip/subxcnt/subxip, and we're not normally in the habit of
letting such concerns get in the way of squeezing out alignment
padding.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Avoiding repeated snapshot computation at 2011-11-26 22:49:19 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Greg Smith	2011-12-13 17:58:05	Re: Command Triggers
Previous Message	Jim Nasby	2011-12-13 17:48:22	Re: xlog location arithmetic