Re: WIP patch for latestCompletedXid method of computing snapshot xmax

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: WIP patch for latestCompletedXid method of computing snapshot xmax
Date: 2007-09-08 20:21:57
Message-ID: 28198.1189282917@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

I wrote:
> This patch implements Florian's idea about how to manage snapshot xmax
> without the ugly and performance-losing tactic of taking XidGenLock and
> ProcArrayLock at the same time. I had to do a couple of slightly klugy
> things to get bootstrap and prepared transactions to work, but on the
> whole it seems at least as clean as the code we have now. Comments?

I spent a fair amount of time this afternoon trying to measure the
performance impact of this patch using pgbench, without a lot of success
--- as far as average transaction rates go, it's a wash compared to CVS
HEAD. However, it eventually struck me to look at the distribution of
transaction times, using pgbench's -l logging output, and on that basis
it is clear that getting rid of the XidGenLock interaction does a good
deal to eliminate outlier times. My desktop machine has a
single consumer-grade IDE drive, and even with fsync off and
synchronous_commit off, it can barely make 190 tps sustained pgbench
throughput --- it's just disk write bound all the time. On a run with 8
clients, 10000 transactions per client, DB scale factor 25, I get this
distribution of transaction times from CVS HEAD:

postgres=# select usec/1000000, count(*) from plhead group by 1 order by 1;
 ?column? | count
----------+-------
        0 | 79306
        1 |   290
        2 |   116
        3 |    65
        4 |    82
        5 |    30
        6 |    31
        7 |    32
       10 |     8
       11 |     8
       13 |    16
       14 |     5
       15 |     3
       20 |     8
(14 rows)

and this from HEAD plus the patch:

postgres=# select usec/1000000, count(*) from plpatch group by 1 order by 1;
 ?column? | count
----------+-------
        0 | 79305
        1 |   325
        2 |    85
        3 |    49
        4 |    68
        5 |    50
        6 |    45
        7 |    35
        8 |    14
        9 |     8
       10 |     6
       11 |    10
(12 rows)
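
For anyone who wants to reproduce these histograms without first loading
the log into a table, here is a minimal Python sketch of the same
bucketing that usec/1000000 does in the queries above. It assumes each
pgbench -l log line has six whitespace-separated fields in the column
order shown in the tables in this message (client, trans, usec, f, epoch,
lsb). The function name bucket_by_second and the sample lines are only
for illustration.

```python
from collections import Counter


def bucket_by_second(log_lines):
    """Count transactions per whole-second latency bucket.

    Assumes the third whitespace-separated field of each line is the
    transaction time in microseconds (an assumption about the log layout).
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        usec = int(fields[2])
        counts[usec // 1_000_000] += 1  # integer division, like usec/1000000 in SQL
    return dict(sorted(counts.items()))


# Illustrative sample: four lines in the assumed six-field layout.
sample = [
    "2 6379 20621910 0 1189280557 664207",
    "6 8182 15191784 0 1189280655 87859",
    "6 6008 11833702 0 1189281093 646851",
    "3 9223 9158554 0 1189281258 202934",
]

histogram = bucket_by_second(sample)
for seconds, count in histogram.items():
    print(seconds, count)
```

To run it against a real log, pass an open file object instead of the
sample list; the function only needs an iterable of lines.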

The worst-case transaction time has dropped by nearly a factor of 2,
from about 20.6 seconds to about 11.8 seconds:

postgres=# select * from plhead order by usec desc limit 20;
 client | trans |   usec   | f |   epoch    |  lsb
--------+-------+----------+---+------------+--------
      2 |  6379 | 20621910 | 0 | 1189280557 | 664207
      6 |  5992 | 20621175 | 0 | 1189280557 | 665970
      7 |  5795 | 20621024 | 0 | 1189280557 | 666353
      1 |  6327 | 20620833 | 0 | 1189280557 | 663606
      3 |  6463 | 20620277 | 0 | 1189280557 | 663895
      4 |  6383 | 20620260 | 0 | 1189280557 | 664000
      5 |  6209 | 20620077 | 0 | 1189280557 | 665060
      0 |  6269 | 20619875 | 0 | 1189280557 | 664935
      6 |  8182 | 15191784 | 0 | 1189280655 |  87859
      3 |  8810 | 15191637 | 0 | 1189280655 |  86802
      2 |  8700 | 15185120 | 0 | 1189280655 |  86742
      5 |  8479 | 14078513 | 0 | 1189280653 | 978339
      1 |  8618 | 14077106 | 0 | 1189280653 | 978216
      7 |  7930 | 14076905 | 0 | 1189280653 | 978832
      4 |  8704 | 14076429 | 0 | 1189280653 | 977877
      0 |  8557 | 14076249 | 0 | 1189280653 | 977477
      0 |  6717 | 13932179 | 0 | 1189280576 |  65288
      1 |  6775 | 13931973 | 0 | 1189280576 |  65387
      6 |  6410 | 13931493 | 0 | 1189280576 |  67192
      7 |  6201 | 13931140 | 0 | 1189280576 |  69247
(20 rows)

postgres=# select * from plpatch order by usec desc limit 20;
 client | trans |   usec   | f |   epoch    |  lsb
--------+-------+----------+---+------------+--------
      6 |  6008 | 11833702 | 0 | 1189281093 | 646851
      0 |  6140 | 11833041 | 0 | 1189281093 | 645738
      2 |  6289 | 11809343 | 0 | 1189281093 | 616734
      4 |  6315 | 11808044 | 0 | 1189281093 | 617505
      3 |  6344 | 11807762 | 0 | 1189281093 | 616970
      7 |  5802 | 11807641 | 0 | 1189281093 | 617932
      5 |  6183 | 11806964 | 0 | 1189281093 | 618060
      1 |  6163 | 11805494 | 0 | 1189281093 | 616679
      7 |  8175 | 11027973 | 0 | 1189281239 | 675499
      2 |  8725 | 11019066 | 0 | 1189281239 | 674305
      5 |  8828 | 10997331 | 0 | 1189281239 | 674953
      0 |  8541 | 10987629 | 0 | 1189281239 | 673773
      1 |  8799 | 10713734 | 0 | 1189281239 | 673217
      4 |  8897 | 10705975 | 0 | 1189281239 | 672743
      6 |  8364 | 10702875 | 0 | 1189281239 | 677163
      3 |  8814 | 10701467 | 0 | 1189281239 | 674369
      3 |  9223 |  9158554 | 0 | 1189281258 | 202934
      5 |  9234 |  9143744 | 0 | 1189281258 | 203073
      7 |  8493 |  9099174 | 0 | 1189281258 | 204092
      4 |  9306 |  9097074 | 0 | 1189281258 | 202402
(20 rows)
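
As a quick check on the worst-case comparison, the ratio can be worked
out directly from the top rows of the two listings above. This is only a
sketch of the arithmetic: the helper name worst_case_ratio is made up
for illustration, and the lists hold just the three largest usec values
from each run rather than the full logs.

```python
def worst_case_ratio(before_usec, after_usec):
    """Return the ratio of the slowest transaction before vs. after."""
    return max(before_usec) / max(after_usec)


# Largest usec values copied from the two listings above.
head_usec = [20621910, 20621175, 20621024]
patched_usec = [11833702, 11833041, 11809343]

ratio = worst_case_ratio(head_usec, patched_usec)
print(f"{ratio:.2f}")  # prints 1.74
```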

So on the strength of that, I'm going to go ahead and commit the patch,
but I'd be interested to see benchmarks from people with access to
better hardware.

regards, tom lane
