Re: Populating large tables with occasional bad values

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: "John T(dot) Dow" <john(at)johntdow(dot)com>
Cc: "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Populating large tables with occasional bad values
Date: 2008-06-11 16:20:46
Message-ID: 484FFB5E.8000203@postnewspapers.com.au
Lists: pgsql-jdbc

Sorry for the self-reply, but a correction and an example are really
required.

Craig Ringer wrote:

> OK, so you're waiting for each INSERT to complete before issuing the
> next. That means that over the Internet you add *at* *least* (2*latency)
> to the time it takes to complete each insert, where `latency' is the
> round trip time for a packet between the DB client and server.

Whoops. At least `latency', not 2*latency. I was thinking of one-way latency
and at some point got confused into saying round trip instead. Sorry.

> That gets expensive fast. My latency from my home DSL to my work's DSL
> is 12ms - and I'm in the same city and on the same ISP as work is, as
> well as connected to the same peering point. I regularly see latencies
> of > 150ms to hosts within Australia.

Here's an example. The psql client is running on my desktop, and
connected to work's Pg server.

I set \timing in psql so psql reports the total time from when the local
client issues the query to when it receives the reply.

I then issue an INSERT under EXPLAIN ANALYZE so the server reports how
long the query took to execute on the server side.

test=# \timing
Timing is on
test=# explain analyze insert into dummy (pk, value)
test-# values (101,'thing');
                                     QUERY PLAN
--------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1)
 Total runtime: 0.090 ms
(2 rows)

Time: 21.914 ms

You can see that the server took only 0.09 ms to execute the query, but to
the client it appeared to take almost 22 ms.
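
In other words (my framing, not part of the measurement itself): the
elapsed time the client sees per statement is roughly the server execution
time plus one network round trip, so here the round trip accounts for
about 21.914 - 0.090 = 21.8 ms of the total.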

If I ssh to the server and connect with psql locally over a unix domain
socket or the loopback interface, I get these results instead:

test=# explain analyze insert into dummy (pk, value)
test-# values (102,'thing');
                                       QUERY PLAN
------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1)
 Total runtime: 0.056 ms
(2 rows)

Time: 0.530 ms

... which is 40 times faster from client query issue to client query
completion despite the negligible server execution speed difference.
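
Scaled up (my arithmetic, assuming those per-statement times hold):
loading 100,000 rows one INSERT at a time would take roughly
100,000 * 21.9 ms = 36.5 minutes over the remote link, versus about
100,000 * 0.53 ms = 53 seconds locally, with essentially all of the
difference being time spent waiting on the network.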

Multi-row INSERTs, use of COPY, etc. will all help combat this; a JDBC
batching sketch follows below.
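
Since this list is about JDBC, here is a minimal sketch of statement
batching with the standard JDBC batch API. The connection URL,
credentials, table name, row count and batch size are all illustrative
assumptions, not anything from the original poster's code:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BatchInsertSketch {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection details -- adjust for your environment.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "password")) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO dummy (pk, value) VALUES (?, ?)")) {
                for (int pk = 0; pk < 10000; pk++) {
                    ps.setInt(1, pk);
                    ps.setString(2, "thing");
                    ps.addBatch();           // queued client-side; no round trip yet
                    if ((pk + 1) % 1000 == 0) {
                        ps.executeBatch();   // one network flush per 1000 rows
                    }
                }
                ps.executeBatch();           // flush any remaining rows
            }
            conn.commit();
        }
    }
}

With a batch size of 1000 you pay roughly one network flush per 1000 rows
instead of one round trip per row; COPY (via the PostgreSQL driver's
CopyManager) cuts the per-row overhead down further still.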

--
Craig Ringer
