Re: insert fail gracefully if primary key already exists

From: Bartosz Dmytrak <bdmytrak(at)eranet(dot)pl>
To: Alessandro Gagliardi <alessandro(at)path(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: insert fail gracefully if primary key already exists
Date: 2012-02-17 21:18:48
Message-ID: CAD8_UcbeNKuj5szmVfR=P9AeDG2aopLjAvOcLbKPkJ20FsFopw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

To answer Your trigger question:
in my example trigger is devided into 2 parts:

1. trigger itself: checkTrigger
2. trigger funtion: "myschema"."checkTriggerFunction"

Trigger is discussed in doc:
http://www.postgresql.org/docs/9.1/static/sql-createtrigger.html
according to this part of doumentation: "*The trigger will be associated
with the specified table or view and will execute the specified function
function_name** when certain events occur."* I think this describes
trigger in essential part, so my intention is to create object (trigger)
related with table and execute certain action (trigger function) when
specific event occurs (this event is INSERT into this table). I also would
like to perform this action before insert to ignore insert in some case
(that is why trigger function returns NULL under certain condition)

Another part is trigger function -
http://www.postgresql.org/docs/9.1/static/plpgsql-trigger.html
this is special kind of function (has to be marked as return TRIGGER). This
function is intended to do the "job". Plpgsql trigger functions are very
smart - they could recognize the table on which trigger is fired, event
(INSERT, UPDATE, DELETE), and other interesting things - full description
of possible trigger variables is under above link.
Very interesting are NEW and OLD variables. These variables contain record
of the table on which trigger function is called (only for row level
trigger - in my example trigger declaration FOR EACH ROW).
NEW holds row being inserted - just before it is written to the table,
could be modified in fly (is null for delete). OLD holds existing row -
just before update (is null for insert). That is why in my example it is
possible to query against Your table (with known structure) like this:

SELECT 1 FROM "myschema".mytable WHERE "MyKey" = NEW."MyKey"

this exactly means: select whatever (1) from required table (mytable) where
Primary Key column ("MyKey") value match value from inserted row Primary
Key column (NEW).

It is possible to write one generic trigger functions to work with few
tables, so eg. for audit table and log all changes in the rows, it is
possible to write generic trigger function using TG_TABLE_NAME, TG_TABLE_SCHEMA
and TG_OP variables - simple example is here:
http://www.postgresql.org/docs/9.1/static/plpgsql-trigger.html#PLPGSQL-TRIGGER-AUDIT-EXAMPLE

if trigger function (row level) returns NEW then NEW record is written to
table (could be modified by trigger function). If trigger function returns
NULL then nothing is written to table, because "NEW" record is modified to
be nothing.

hope this will help You to understand my example.

Regards,
Bartek

2012/2/17 Alessandro Gagliardi <alessandro(at)path(dot)com>

> First, to answer your question, in this case, I do not expect the data to
> be different from what is already stored, so I should be able to safely
> ignore the insert if the primary key already exists. (Though an answer to
> the alternative "UPSERT" question would also be welcome as that may prove
> to be an issue as well.)
>
> I tried to do something like
> http://www.postgresql.org/docs/9.0/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPINGonly like this:
>
> BEGIN
> INSERT INTO blocks (block_id, user_id, created, shared, private,
> moment_type) VALUES (E'4f3adf9cfa63b31c77000b7c',
> E'4ee02f599c6c3f6d360076ce', '2012-02-14T22:26:36.721980'::timestamp,
> false, false, E'ambient')
>

Looks like semicolon here is missed :) Semicolon is required after SQL
statement.

> EXCEPTION
> WHEN unique_violation THEN
> NULL;
> END;
>
> But I get:
>
> ERROR: syntax error at or near "INSERT"
> SQL state: 42601
> Character: 11
>
> I feel like I must be doing something stupid here, but I can't see what it
> is.
>
> At the same time, I was also thinking using a trigger might be preferable
> so I don't have to mess with application code. But I'm a little mystified
> by how your example would work. The checkTriggerFunction makes reference
> to NEW."MyKey" and yet there is no mention of that in checkTrigger. Does
> it automagically divine that from the insert statement?
>
> Thanks,
> -Alessandro
>
>
> On Tue, Feb 14, 2012 at 12:40 AM, Bartosz Dmytrak <bdmytrak(at)eranet(dot)pl>wrote:
>
>> You can user few different solutions:
>>
>> 1. You can perform insert using stored procedure. Inside this procedure
>> You can use BEGIN...EXCEPTION...END construct - this will handle duplicate
>> key error. You can find this useful:
>> http://www.postgresql.org/docs/9.1/static/plpgsql-control-structures.html - there
>> is example of such approach.
>>
>> 2. You can also check if such row exists before insert (also in stored
>> procedure):
>> CREATE OR REPLACE FUNCTION "myschema"."InsertIfNotExists" (IN "vKeyValue"
>> int, IN "vValue1" text, IN "vValue2" text)
>> RETURNS VOID
>> AS
>> $BODY$
>> BEGIN
>> IF NOT EXISTS (SELECT 1 FROM "myschema".mytable WHERE "MyKey" =
>> "vKeyValue") THEN
>> INSERT INTO "myschema".mytable ("MyKey", "Value1", "Value2")
>> VALUES ("vKeyValue", "vValue1", "vValue2");
>>
>> END IF;
>> END;
>> $BODY$ LANGUAGE plpgsql
>>
>> 3. You can create trigger before insert which will check if record
>> already exists:
>> trigger function could look like this:
>> CREATE OR REPLACE FUNCTION "myschema"."checkTriggerFunction" ()
>> RETURNS TRIGGER
>> AS
>> $BODY$
>> BEGIN
>> IF EXISTS (SELECT 1 FROM "myschema".mytable WHERE "MyKey" = NEW."MyKey")
>> THEN
>> RETURN NULL;
>> ELSE
>> RETURN NEW;
>> END IF;
>> END;
>> $BODY$
>> LANGUAGE plpgsql;
>>
>> and trigger:
>> CREATE TRIGGER "checkTrigger"
>> BEFORE INSERT
>> ON "myschema".mytable
>> FOR EACH ROW
>> EXECUTE PROCEDURE "myschema"."checkTriggerFunction"();
>>
>>
>> Another question is: do You want to update record if primary key exists
>> or simply ignore insert?
>> It should be noticed, every check impacts performance.
>>
>> Regards,
>> Bartek
>>
>>
>>
>> 2012/2/13 Alessandro Gagliardi <alessandro(at)path(dot)com>
>>
>>> Sorry I wasn't clear. The application is written in Python. It services
>>> requests that may go to an external API or to our production database
>>> (which is Mongo). I should add that we have a somewhat unusual situation in
>>> which we have two parallel databases: MongoDB for OLTP and newer PostgreSQL
>>> for OLAP. Because Postgres is only consuming, it would waste precious
>>> cycles to have the client check to see if a record already exists in
>>> Postgres before trying to insert it. I'd rather let Postgres deal with that
>>> (which it does well enough anyway with my primary key constraints). My
>>> Postgres instance is hosted by Heroku and they provide me with a log which
>>> is currently being spammed by these "duplicate key value violates unique
>>> constraint" errors making it impossible to see if there are other errors I
>>> need to be paying more attention to.
>>>
>>>
>>> On Sat, Feb 11, 2012 at 12:08 AM, Bartosz Dmytrak <bdmytrak(at)eranet(dot)pl>wrote:
>>>
>>>> Could You provide more details?
>>>> PostgreSQL does not allow cross database queries.
>>>>
>>>>
>>>> Regards,
>>>> Bartek
>>>>
>>>>
>>>> 2012/2/10 Alessandro Gagliardi <alessandro(at)path(dot)com>
>>>>
>>>>> I have a situation where I am gradually filling in a table from
>>>>> another database as the need requires. As it currently is, whenever a
>>>>> record comes up, it tries to insert it into my database. If the row is
>>>>> already there, the primary key collides and the insert fails. And that's
>>>>> fine except that it makes for a really cluttered log (making it hard to
>>>>> find errors that I really do need to pay attention to). I'm wondering if
>>>>> there's a better (i.e. cleaner, but possibly also more efficient) way to do
>>>>> this.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> Thank you in advance,
>>>>> -Alessandro
>>>>>
>>>>
>>>>
>>>
>>
>

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Alessandro Gagliardi 2012-02-17 21:24:11 Re: insert fail gracefully if primary key already exists
Previous Message Merlin Moncure 2012-02-17 19:39:27 Re: insert fail gracefully if primary key already exists