From: | Joe Conway <mail(at)joeconway(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de> |
Subject: | Re: [HACKERS] 8.2 features? |
Date: | 2006-07-19 04:20:05 |
Message-ID: | 44BDB2F5.7080003@joeconway.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-docs pgsql-hackers pgsql-patches |
Tom Lane wrote:
> Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com> writes:
>
>>Strange. Last time I checked I thought MySQL dump used 'multivalue
>>lists in inserts' for dumps, for the same reason that we use COPY
>
> I think Andrew identified the critical point upthread: they don't try
> to put an unlimited number of rows into one INSERT, only a megabyte
> or so's worth. Typical klugy-but-effective mysql design approach ...
OK, so given that we don't need to be able to do 1 million
multi-targetlist insert statements, here is rev 2 of the patch.
It is just slightly more invasive, but performs *much* better. In fact,
it can handle as many targetlists as you have memory to deal with. It
also deals with DEFAULT values in the targetlist.
I've attached a php script that I used to do crude testing. Basically I
tested 3 cases in this order:
single-INSERT-multi-statement:
------------------------------
"INSERT INTO foo2a (f1,f2) VALUES (1,2);"
-- repeat statement $loopcount times
single-INSERT-at-once:
----------------------
"INSERT INTO foo2b (f1,f2) VALUES (1,2);INSERT INTO foo2a (f1,f2)
VALUES (1,2);INSERT INTO foo2a (f1,f2) VALUES (1,2)..."
-- build a single SQL string by looping $loopcount times,
-- and execute it all at once
multi-INSERT-at-once:
---------------------
"INSERT INTO foo2c (f1,f2) VALUES (1,2),(1,2),(1,2)..."
-- build a single SQL string by looping $loopcount times,
-- and execute it all at once
Here are the results:
$loopcount = 100000;
single-INSERT-multi-statement Elapsed time is 34 seconds
single-INSERT-at-once Elapsed time is 7 seconds
multi-INSERT-at-once Elapsed time is 4 seconds
about 370MB peak memory usage
$loopcount = 200000;
single-INSERT-multi-statement Elapsed time is 67 seconds
single-INSERT-at-once Elapsed time is 12 seconds
multi-INSERT-at-once Elapsed time is 9 seconds
about 750MB peak memory usage
$loopcount = 300000;
single-INSERT-multi-statement Elapsed time is 101 seconds
single-INSERT-at-once Elapsed time is 18 seconds
multi-INSERT-at-once Elapsed time is 13 seconds
about 1.1GB peak memory usage
Somewhere beyond this, my machine goes into swap hell, and I didn't have
the patience to wait for it to complete :-)
It would be interesting to see a side-by-side comparison with MySQL
since that seems to be our benchmark on this feature. I'll try to do
that tomorrow if no one beats me to it.
There is only one downside to the current approach that I'm aware of.
The command-result tag is only set by the "original" query, meaning that
even if you insert 300,000 rows using this method, the command-result
tag looks like "INSERT 0 1"; e.g.:
regression=# create table foo2(f1 int default 42,f2 int default 6);
CREATE TABLE
regression=# insert into foo2 (f1,f2) values
(default,12),(default,10),(115,21);
INSERT 0 1
regression=# select * from foo2;
f1 | f2
-----+----
42 | 12
42 | 10
115 | 21
(3 rows)
Any thoughts on how to fix that?
Thanks,
Joe
Attachment | Content-Type | Size |
---|---|---|
multi-insert-r2.diff | text/x-patch | 4.7 KB |
test-insert.php | application/x-php | 2.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2006-07-19 16:54:22 | Re: I'd like to translate the english manual into Korean! |
Previous Message | Tom Lane | 2006-07-19 03:32:34 | Re: [HACKERS] 8.2 features? |
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2006-07-19 04:24:55 | Re: Progress bar updates |
Previous Message | Alvaro Herrera | 2006-07-19 03:52:36 | Re: url for TODO item, is it right? |
From | Date | Subject | |
---|---|---|---|
Next Message | Martijn van Oosterhout | 2006-07-19 06:45:38 | Re: [HACKERS] pg_regress in C |
Previous Message | Tom Lane | 2006-07-19 03:32:34 | Re: [HACKERS] 8.2 features? |