Re: exceptionally large UPDATE

From: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: exceptionally large UPDATE
Date: 2010-10-29 16:08:54
Message-ID: 20101029180854.5ca09e88@dawn.webthatworks.it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, 29 Oct 2010 10:21:14 -0400
Vick Khera <vivek(at)khera(dot)org> wrote:

> On Thu, Oct 28, 2010 at 1:06 PM, Ivan Sergio Borgonovo
> <mail(at)webthatworks(dot)it> wrote:
> > What I'm planning to do is:
> > max_connections = 5
> > shared_buffers = 240M
> > work_mem = 90MB
> > maintenance_work_mem = 1GB
> > max_fsm_pages = 437616
> > max_fsm_relations = 1200
> > checkpoint_segments = 70
> > default_statistics_target = 30
> > #log_min_duration_statement = 1000

> default_statistics_target = 100 is the new "default" for newer
> postgres, and with good reason... try that.
>
> if you boost your checkpoint_segments, also twiddle the
> checkpoint_timeout (increase it) and checkpoint_completion_target
> (something like 0.8 would be good, depending on how fast your disks
> are) values to try to smooth out your I/O (ie, keep it from
> bursting at checkpoint timeout). Is 5 connections really enough
> for you?

No. 5 is too few.
OK... this is what I end up with:

max_connections = 100

shared_buffers = 240M
work_mem = 90MB
maintenance_work_mem = 1GB

max_fsm_pages = 437616
max_fsm_relations = 1200

default_statistics_target = 100

checkpoint_segments = 70
checkpoint_timeout = 10min
checkpoint_completion_target = 0.6 #(not very fast drives in raid5)
#log_min_duration_statement = 1000

random_page_cost = 3.0

I tested this on a RAID10 SATA, 8Gb RAM and 2x4cores Xeons

- updating 227985 records over roughly 1.4M took 197744.374 ms
- recreating the gin index took 313962.162 ms
- commit took 7699.595 ms
- vacuum analyse 188261.481 ms

The total update took around 13min.
I've just heard that a similar update on a slower box (RAID1 SAS,
4Gb, 2x2Cores Xeon) running MS SQL took over 30min.
Considering MUCH less pk/fk, constraint and actions where defined on
the MS SQL DB, things now look much better for postgres.

Furthermore postgresql full text search kicks ass to the MS SQL box
even on the slowest box of all (RAID5 SATA, 4Gb, 2x1core HT Xeon,
over 6 years old).

I'll take note of performance even on the slower box as soon as I'll
have large updates, still I'm looking how to make it faster.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Dean Gibson (DB Administrator) 2010-10-29 17:33:30 9.0 replication -- multiple hot_standby servers
Previous Message Cole, Tavin 2010-10-29 16:08:31 xor(bytea,bytea)