INSERT .. SELECT should redo SELECT if a duplicate key is found

From: Cesar Eduardo Barros <cesarb(at)elnetcorp(dot)com(dot)br>
To: pgsql-bugs(at)postgresql(dot)org
Subject: INSERT .. SELECT should redo SELECT if a duplicate key is found
Date: 2002-07-06 15:18:56
Message-ID: 20020706151856.GA26802@cerberus.elnet.grupomk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs


When you are trying to insert a row in a table with a UNIQUE constraint,
unless it already exists, you can try something like:

INSERT INTO table (id) SELECT val WHERE NOT EXISTS (SELECT 1 FROM table WHERE id = val)

However, this does not work as expected if another backend inserts a row
with the same unique column(s).

Example:

psql 1:
teste=# select version();
version
---------------------------------------------------------------
PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.95.4
(1 row)

teste=# create table teste (id integer primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index 'teste_pkey' for table 'teste'
CREATE
teste=# begin;
BEGIN
teste=# insert into teste select 5 where not exists (select 1 from teste where id = 5);
INSERT 826780 1

psql 2:
teste=# begin;
BEGIN
teste=# insert into teste select 5 where not exists (select 1 from teste where id = 5);
[sits there waiting for the other backend]

psql 1:
teste=# commit;
COMMIT

psql 2:
ERROR: Cannot insert a duplicate key into unique index teste_pkey
teste=# commit;
COMMIT

The result I expected would be that it would redo the whole query after
the lock is released, instead of just redoing the INSERT part (and not
the SELECT).

Without it, I'm forced to use table locks to avoid the concurrent insert
(which was what I was trying to avoid with the subselect).

Adding FOR UPDATE on the subselect (done by repeating the subselect with
FOR UPDATE as a standalone query before the insert, since subselects
can't have FOR UPDATE) makes no difference (of course, there's no row
for it to lock yet).

The problem is that the INSERT is being split in two atomic pieces
(obtaining the data to insert with a SELECT and inserting it) when it
should be just one atomic piece. Postgres tries to do the right thing by
waiting if another transaction has inserted something with the same
unique values, but it should check if the conditions in the SELECT part
of the query still hold true after the blocking transaction is commited.
If that was done, it would notice that the "not exists (select 1 from
teste where id = 5)" part is now false for that row, and refrain from
inserting it, which is the Right Thing.

The documentation says reevaluating the WHERE condition is done for
UPDATE, DELETE and SELECT FOR UPDATE, but says nothing about INSERT. My
opinion is that INSERT should also reevaluate the WHERE condition.

--
Cesar Eduardo Barros
ElNet Hightech -- Administrador de Sistemas Unix
cesarb(at)elnetcorp(dot)com(dot)br

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Cesar Eduardo Barros 2002-07-06 16:10:36 Move constant evaluation to inside IN subselect
Previous Message Bruce Momjian 2002-07-03 15:52:23 Re: Bug #703: granted permisions lost next day