From: | Robert Poor <rdpoor(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | configuring queries for concurrent updates |
Date: | 2012-06-23 06:33:02 |
Message-ID: | CAGHqdqWJdw0cG6+-HjpX+v2wE4-4+udm5aCMUMFS5rSaGpQ1uw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
[std_disclaimer]I'm not a DBA and I'm running PostgreSQL on a
quad-core Intel laptop. You may read on after you stop
laughing...[/std_disclaimer]
I've written a version of UPSERT designed to import a large number of
records. It works in two steps: it UPDATEs incumbent records that
share common keys with the new records, then INSERTs new records that
do not share common keys with the incumbent records. The gist of it
is:
SAVEPOINT ...;
-- update records from subquery that share common keys with tbl
UPDATE tbl
SET f1 = X.f1, f2 = X.f2, ...
FROM (subquery) AS X
WHERE tbl.k1 = X.k1 AND tbl.k2 = X.k2 AND ...;
-- insert records from subquery that do NOT share common keys with tbl
INSERT INTO tbl (f1, f2, ...)
SELECT Y.f1, Y.f2, ...
FROM (subquery) AS Y
LEFT OUTER JOIN tbl
ON tbl.k1 = Y.k1 AND tbl.k2 = Y.k2 AND ...
WHERE tbl.id IS NULL;
RELEASE SAVEPOINT ...;
The (subquery) frequently generates 5000 records for update/insert.
Three Unix processes are running the import process concurrently, all
updating/inserting records into a common table.
Most of the time this works, but I've observed examples where an
import process will
* get a "PG::Error: ERROR: current transaction is aborted, commands
ignored until end of transaction block"
* get a "PG::Error: ERROR: deadlock detected"
* go catatonic for several minutes
This leads me to believe that my SAVEPOINT / RELEASE SAVEPOINT is not
the correct way to protect the system from multiprocessing mischief.
I've read Tom Lane's presentation on concurrency
(http://www.postgresql.org/files/developer/concurrency.pdf) but I
haven't been able to figure out the right solution.
[For what it's worth, with the data I'm importing, INSERTs will far
outnumber the UPDATES.]
Since I someday plan to spin this code up on a big system with >> 3
import processes, I'd like to get this right.
Hints and solutions warmly accepted. Thanks.
- rdp
From | Date | Subject | |
---|---|---|---|
Next Message | Jasen Betts | 2012-06-23 08:51:33 | Re: efficiency of wildcards at both ends |
Previous Message | Jeff Davis | 2012-06-23 00:11:53 | Re: retrieving function raise messages in ecpg embedded sql code |