Re: [HACKERS] 8.2 features?

From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject: Re: [HACKERS] 8.2 features?
Date: 2006-07-19 04:20:05
Message-ID: 44BDB2F5.7080003@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com> writes:
>
>>Strange. Last time I checked I thought MySQL dump used 'multivalue
>>lists in inserts' for dumps, for the same reason that we use COPY
>
> I think Andrew identified the critical point upthread: they don't try
> to put an unlimited number of rows into one INSERT, only a megabyte
> or so's worth. Typical klugy-but-effective mysql design approach ...

OK, so given that we don't need to be able to do 1 million
multi-targetlist insert statements, here is rev 2 of the patch.

It is just slightly more invasive, but performs *much* better. In fact,
it can handle as many targetlists as you have memory to deal with. It
also deals with DEFAULT values in the targetlist.

I've attached a php script that I used to do crude testing. Basically I
tested 3 cases in this order:

single-INSERT-multi-statement:
------------------------------
"INSERT INTO foo2a (f1,f2) VALUES (1,2);"
-- repeat statement $loopcount times

single-INSERT-at-once:
----------------------
"INSERT INTO foo2b (f1,f2) VALUES (1,2);INSERT INTO foo2a (f1,f2)
VALUES (1,2);INSERT INTO foo2a (f1,f2) VALUES (1,2)..."
-- build a single SQL string by looping $loopcount times,
-- and execute it all at once

multi-INSERT-at-once:
---------------------
"INSERT INTO foo2c (f1,f2) VALUES (1,2),(1,2),(1,2)..."
-- build a single SQL string by looping $loopcount times,
-- and execute it all at once

Here are the results:
$loopcount = 100000;
single-INSERT-multi-statement Elapsed time is 34 seconds
single-INSERT-at-once Elapsed time is 7 seconds
multi-INSERT-at-once Elapsed time is 4 seconds
about 370MB peak memory usage

$loopcount = 200000;
single-INSERT-multi-statement Elapsed time is 67 seconds
single-INSERT-at-once Elapsed time is 12 seconds
multi-INSERT-at-once Elapsed time is 9 seconds
about 750MB peak memory usage

$loopcount = 300000;
single-INSERT-multi-statement Elapsed time is 101 seconds
single-INSERT-at-once Elapsed time is 18 seconds
multi-INSERT-at-once Elapsed time is 13 seconds
about 1.1GB peak memory usage

Somewhere beyond this, my machine goes into swap hell, and I didn't have
the patience to wait for it to complete :-)

It would be interesting to see a side-by-side comparison with MySQL
since that seems to be our benchmark on this feature. I'll try to do
that tomorrow if no one beats me to it.

There is only one downside to the current approach that I'm aware of.
The command-result tag is only set by the "original" query, meaning that
even if you insert 300,000 rows using this method, the command-result
tag looks like "INSERT 0 1"; e.g.:

regression=# create table foo2(f1 int default 42,f2 int default 6);
CREATE TABLE
regression=# insert into foo2 (f1,f2) values
(default,12),(default,10),(115,21);
INSERT 0 1
regression=# select * from foo2;
f1 | f2
-----+----
42 | 12
42 | 10
115 | 21
(3 rows)

Any thoughts on how to fix that?

Thanks,

Joe

Attachment Content-Type Size
multi-insert-r2.diff text/x-patch 4.7 KB
test-insert.php application/x-php 2.0 KB

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Peter Eisentraut 2006-07-19 16:54:22 Re: I'd like to translate the english manual into Korean!
Previous Message Tom Lane 2006-07-19 03:32:34 Re: [HACKERS] 8.2 features?

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2006-07-19 04:24:55 Re: Progress bar updates
Previous Message Alvaro Herrera 2006-07-19 03:52:36 Re: url for TODO item, is it right?

Browse pgsql-patches by date

  From Date Subject
Next Message Martijn van Oosterhout 2006-07-19 06:45:38 Re: [HACKERS] pg_regress in C
Previous Message Tom Lane 2006-07-19 03:32:34 Re: [HACKERS] 8.2 features?