
Re: Populating large tables with occasional bad values

From: Craig Ringer <craig@postnewspapers.com.au>
To: "John T. Dow" <john@johntdow.com>
Cc: pgsql-jdbc@postgresql.org
Subject: Re: Populating large tables with occasional bad values
Date: 2008-06-11 16:20:46
Message-ID: 484FFB5E.8000203@postnewspapers.com.au
Lists: pgsql-jdbc
Sorry for the self-reply, but a correction and an example are really 
required.

Craig Ringer wrote:

> OK, so you're waiting for each INSERT to complete before issuing the 
> next. That means that over the Internet you add *at* *least* (2*latency) 
> to the time it takes to complete each insert, where `latency' is the 
> round trip time for a packet between the DB client and server.

Whoops. That should be at least `latency', not 2*latency: since `latency' 
is defined above as the round trip time, the factor of two double-counted 
it. I was thinking of one-way latency and got confused at some point into 
saying round trip instead. Sorry.

> That gets expensive fast. My latency from my home DSL to my work's DSL 
> is 12ms - and I'm in the same city and on the same ISP as work is, as 
> well as connected to the same peering point. I regularly see latencies 
> of > 150ms to hosts within Australia.
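
To put rough numbers on that (the 10,000-row load is made up purely for
illustration), single-row INSERTs spend at least

   10,000 x  12 ms round trip =  120 s, about 2 minutes
   10,000 x 150 ms round trip = 1500 s, about 25 minutes

just waiting on the network, before the server does any work at all.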

Here's an example. The psql client is running on my desktop, and 
connected to work's Pg server.

I set \timing in psql so psql reports the total time from when the local 
client issues the query to when it receives the reply.

I then issue an INSERT with EXPLAIN ANALYZE so the server reports how 
long it took to execute the query.

test=# \timing
Timing is on
test=# explain analyze insert into dummy (pk, value)
test-# values (101,'thing');
                                      QUERY PLAN
------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
 Total runtime: 0.090 ms
(2 rows)

Time: 21.914 ms


You can see that the server took only 0.09 ms to execute the query, but 
to the client it appeared to take 21 ms.

If I ssh to the server and connect with psql locally over a Unix domain 
socket or the loopback interface, I get these results instead:

test=# explain analyze insert into dummy (pk, value)
test-# values (102,'thing');
                                      QUERY PLAN
------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
 Total runtime: 0.056 ms
(2 rows)

Time: 0.530 ms



... which is about 40 times faster from query issue to query completion 
as seen by the client, despite the negligible difference in server 
execution time.
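
Scaling those measured timings up (again, 100,000 rows is an arbitrary
figure for illustration):

   remote: 100,000 x 21.9 ms ~= 2190 s, about 36 minutes
   local:  100,000 x 0.53 ms ~=   53 s

Nearly all of that 36 minutes is round-trip wait, not server work.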

Multi-row INSERTs, use of COPY, etc. will all help combat this.
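
On the JDBC side, statement batching is the usual way to get many rows per
network round trip. Here's a minimal sketch - the connection URL,
credentials, row count and 1000-row chunk size are all placeholder
assumptions, and the table matches the `dummy' example above:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchInsert {
    public static void main(String[] args) throws Exception {
        // Older pgjdbc releases may need an explicit
        // Class.forName("org.postgresql.Driver") here.
        // Placeholder connection details - substitute your own.
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost/test", "user", "password");
        conn.setAutoCommit(false);   // one transaction, not one per row

        PreparedStatement ps = conn.prepareStatement(
            "INSERT INTO dummy (pk, value) VALUES (?, ?)");
        for (int i = 0; i < 100000; i++) {
            ps.setInt(1, i);
            ps.setString(2, "thing");
            ps.addBatch();
            if ((i + 1) % 1000 == 0) {
                ps.executeBatch();   // roughly one round trip per 1000 rows
            }
        }
        ps.executeBatch();           // send any remaining rows
        conn.commit();
        ps.close();
        conn.close();
    }
}

If a row fails, executeBatch() throws BatchUpdateException, and because
PostgreSQL aborts the whole transaction on an error you then have to roll
back and retry, or wrap each chunk in a savepoint - which matters for the
"occasional bad values" case this thread is about. For clean data, COPY
(exposed through the CopyManager API in later pgjdbc releases) is faster
still.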

--
Craig Ringer
