Re: Scaling PostgreSQL-9

From: "Igor Neyman" <ineyman(at)perceptron(dot)com>
To: "sandeep prakash dhumale" <sandy9940(at)rediffmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Scaling PostgreSQL-9
Date: 2010-09-28 19:13:34
Message-ID: F4C27E77F7A33E4CA98C19A9DC6722A20690CC8F@EXCHANGE.corp.perceptron.com
Lists: pgsql-general

> -----Original Message-----
> From: sandeep prakash dhumale [mailto:sandy9940(at)rediffmail(dot)com]
> Sent: Tuesday, September 28, 2010 6:32 AM
> To: pgsql-general(at)postgresql(dot)org
> Subject: Scaling PostgreSQL-9
>
> Hello All,
>
> Need some help in scaling PostgreSQL:
>
> I have a table with 400M records with 5 int columns having
> index only on 1 column.
>
> Rows are updated by a perl script which takes 10k numbers in
> one transaction and fires a single update in a loop on the
> database, keeping track of the result returned. If zero is
> returned, then at a later stage it does an insert. In short, if
> the record is present in the DB then it gets updated, and if
> not then it gets inserted. > 80% of the records are always in
> the DB, so updates are more common.
>
> We need to speed up this process as it takes about 150 sec to
> complete 10k batch. From database logs on the avg each update
> takes about 15ms.
>

Your problem is that you process one record at a time in your loop,
meaning you have to make 10k round trips to the database to process
10k records.

Try creating a "staging" table in the database and copying all the
records from your source into it, e.g. using the COPY command if your
source is a file.
Then insert the new records and update the existing ones with a couple
of SQL statements:
insert ... where not exists (select ...)
update ... where exists ...

Here you work with data sets inside the database,
which should be much faster than a procedural perl script.
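A minimal sketch of that approach (the table and column names below are
assumptions for illustration; the real table has 5 int columns, and on
9.0 there is no single-statement upsert, so two statements are needed):

```sql
-- Staging table mirroring the target's layout (names are hypothetical).
CREATE TEMP TABLE staging (id int, val int);

-- Bulk-load the 10k batch in one round trip instead of 10k updates.
COPY staging FROM '/path/to/batch.csv' WITH CSV;

-- Update the rows that already exist in the target (> 80% of the batch).
UPDATE target t
SET    val = s.val
FROM   staging s
WHERE  t.id = s.id;

-- Insert only the rows not yet present in the target.
INSERT INTO target (id, val)
SELECT s.id, s.val
FROM   staging s
WHERE  NOT EXISTS (SELECT 1 FROM target t WHERE t.id = s.id);
```

Running the UPDATE before the INSERT avoids updating the rows you just
inserted; both statements operate on the whole set at once inside the
server.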

> I tried to do a bulk delete of 1M numbers and a copy of the
> same but no luck so far. Delete and copy also take a long
> time, more than 1 hour each.
>

How did you do the copy? Again using a perl script to loop through 1M
records one at a time?


> Few Details:
>
> PostgreSQL 9.0.0 on 11.0 open SuSe-64bit,Shared
> buffer=2.5GB,effective_cache_size=20GB,checkpoint_segments=200
> (raised for bulkloading)
>
> Server: dell dual quadcore ,32GB RAM, DB partition on RAID 10
> and pg_xlog on RAID 1.
>
> p.s. Previously we had a slony read-only slave on 8.4.2,
> where the delete was fast, about 7 min, and the copy 6 min. We
> moved to PostgreSQL 9 for a read-only standby slave to remove
> the overhead caused by slony's triggers (also the slave was
> always lagging during bulk loads on the master), in the hope
> of speeding up the process.
>
> Any help would be much appreciated ...
>
> With Regards
> sandy
>
>

Regards,
Igor Neyman
