Skip site navigation (1) Skip section navigation (2)

Re: insert vs select into performance

From: Thomas Finneid <tfinneid(at)student(dot)matnat(dot)uio(dot)no>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: insert vs select into performance
Date: 2007-07-18 19:08:08
Message-ID: 469E6518.5000203@ifi.uio.no (view raw or flat)
Thread:
Lists: pgsql-performance
PFC wrote:
>> Unfortunately its not fast enough, it needs to be done in no more than 
>> 1-2 seconds, ( and in production it will be maybe 20-50 columns of 
>> data, perhaps divided over 5-10 tables.)
>> Additionally it needs to scale to perhaps three times as many columns 
>> and perhaps 2 - 3 times as many rows in some situation within 1 seconds.
>> Further on it needs to allow for about 20 - 50 clients reading much of 
>> that data before the next batch of data arrives.
> 
>     Wow. What is the application ?

Geological surveys, where they perform realtime geo/hydro-phone shots of 
areas of the size of 10x10km every 3-15 seconds.


> test=> CREATE OR REPLACE FUNCTION test_insert( )
>     RETURNS VOID
>     LANGUAGE plpgsql
>     AS
> $$
> DECLARE
>     _i INTEGER;
> BEGIN
>     FOR _i IN 0..100000 LOOP
>         INSERT INTO test (a,b,c,d,e,f) VALUES (1,2,3,4,5, _i);
>     END LOOP;
> END;
> $$;
> CREATE FUNCTION
> Temps : 51,948 ms
> 
> test=> SELECT test_insert();
>  test_insert
> -------------
> 
> (1 ligne)
> 
> Temps : 1885,382 ms

I tested this one and it took 4 seconds, compared to the jdbc insert 
which took 14 seconds, so its a lot faster. but not as fast as the 
SELECT INTO.

I also tested an INSERT INTO FROM SELECT, which took 1.8 seconds, now we 
are starting to talk about real performance.


>     However COPY is much faster because the parsing overhead and 
> de-escaping of data is faster. COPY is optimized for throughput.
> 
>     So, advice :
> 
>     For optimum throughput, have your application build chunks of data 
> into text files and use COPY. Or if your client lib supports the copy 
> interface, use it.

I did test COPY, i.e. the jdbc COPY patch for pg 8.1, it performs at 
approx 1.8 seconds :) The test was done with text input, I am going to 
test it with binary input, which I expect will increase the performance 
with 20-50%.

All these test have ben performed on a laptop with a Kubuntu 6.10 
version of pg 8.1 without any special pg performance tuning. So I expect 
that compiling lates pg and doing some tuning on it and testing it on 
the a representative server will give it an additional boost in performance.

The key here is that with abundance in performance, I can experiment 
with the solution in a completely different way than if I had any 
"artificial" restrictions.

>     You will need a fast disk system with xlog and data on separate 
> disks, several CPU cores (1 insert thread will max out 1 core, use the 
> others for selects), lots of RAM so index updates don't need to seek, 
> and tuning of bgwriter and checkpoints to avoid load spikes.

will have a look at it.

regards

thomas

In response to

pgsql-performance by date

Next:From: Thomas FinneidDate: 2007-07-18 19:11:20
Subject: Re: insert vs select into performance
Previous:From: Jim C. NasbyDate: 2007-07-18 18:52:51
Subject: Re: When/if to Reindex

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group