Re: MVCC for massively parallel inserts

From: Vivek Khera <khera(at)kcilink(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: MVCC for massively parallel inserts
Date: 2004-01-08 21:48:25
Message-ID: x7r7yayuvq.fsf@yertle.int.kciLink.com
Lists: pgsql-general

>>>>> "GS" == Greg Stark <gsstark(at)mit(dot)edu> writes:

GS> I would agree and if you really need the I/O bandwidth you can go
GS> to much larger stripe sets than even this. The documentation I've
GS> seen before suggested there were benefits up to stripe sets as
GS> large as twelve disks across. That would be 24 drives if you're
GS> also doing mirroring.

I did a bunch of testing with a 14-disk SCSI array and found that RAID5
performed better than both RAID10 and RAID50.

GS> Ideally separating WAL, index, and heap files is good, but you
GS> would have to experiment to see which works out fastest for a
GS> given number of drives.

I found that putting the WAL on its own array (in my case a mirror on
the other RAID controller channel) helped quite a bit. I don't think
it is easy to split off index files to alternate locations with Postgres.

Increasing the number of checkpoint segments was one of the biggest
improvements I observed for mass-insert performance (as tested while
doing a restore on a multi-million-row database).

The combination of having the WAL on a separate disk and letting it
grow quite large has been very good for my performance, and it has also
reduced disk bandwidth requirements.
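
To get a feel for how large the WAL area can get as the checkpoint
segments setting goes up, here is a rough sizing sketch. It assumes the
rule of thumb from the PostgreSQL documentation for releases that have
the checkpoint_segments setting: normally no more than
2 * checkpoint_segments + 1 segment files, each 16 MB by default. The
function names and the sample values are only illustrative and are not
taken from the tests described above.

```python
# Rough upper bound on WAL disk usage for a given checkpoint_segments value.
# Assumption: the documented rule of thumb of at most
# 2 * checkpoint_segments + 1 segment files, each 16 MB by default.

WAL_SEGMENT_MB = 16  # default WAL segment size in megabytes


def max_wal_segments(checkpoint_segments: int) -> int:
    """Return the normal upper bound on the number of WAL segment files."""
    if checkpoint_segments < 1:
        raise ValueError("checkpoint_segments must be at least 1")
    return 2 * checkpoint_segments + 1


def max_wal_megabytes(checkpoint_segments: int,
                      segment_mb: int = WAL_SEGMENT_MB) -> int:
    """Return the approximate upper bound on WAL disk usage in megabytes."""
    return max_wal_segments(checkpoint_segments) * segment_mb


if __name__ == "__main__":
    # Hypothetical settings, chosen only to show how the bound scales.
    for segments in (3, 32, 64):
        print(segments, "segments ->", max_wal_megabytes(segments), "MB")
```

The bound grows linearly with the setting, so even a small dedicated
mirror has room for a fairly generous value.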

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D. Khera Communications, Inc.
Internet: khera(at)kciLink(dot)com Rockville, MD +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/
