Re: insert fail gracefully if primary key already exists

From: Alessandro Gagliardi <alessandro(at)path(dot)com>
To: Bartosz Dmytrak <bdmytrak(at)eranet(dot)pl>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: insert fail gracefully if primary key already exists
Date: 2012-02-17 21:24:11
Message-ID: CAAB3BB+=Nnxh2kRthWmyN1trHUC-ri3rnYp58dvpRB92TZUHwg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Very cool. I ended up trying it and it seems to be working great. And no
need to change anything in the API!

Thank you!

-Alessandro

On Fri, Feb 17, 2012 at 1:18 PM, Bartosz Dmytrak <bdmytrak(at)eranet(dot)pl> wrote:

> To answer Your trigger question:
> in my example trigger is devided into 2 parts:
>
> 1. trigger itself: checkTrigger
> 2. trigger funtion: "myschema"."checkTriggerFunction"
>
> Trigger is discussed in doc:
> http://www.postgresql.org/docs/9.1/static/sql-createtrigger.html
> according to this part of doumentation: "*The trigger will be associated
> with the specified table or view and will execute the specified function
> function_name** when certain events occur."* I think this describes
> trigger in essential part, so my intention is to create object (trigger)
> related with table and execute certain action (trigger function) when
> specific event occurs (this event is INSERT into this table). I also would
> like to perform this action before insert to ignore insert in some case
> (that is why trigger function returns NULL under certain condition)
>
> Another part is trigger function -
> http://www.postgresql.org/docs/9.1/static/plpgsql-trigger.html
> this is special kind of function (has to be marked as return TRIGGER).
> This function is intended to do the "job". Plpgsql trigger functions are
> very smart - they could recognize the table on which trigger is fired,
> event (INSERT, UPDATE, DELETE), and other interesting things - full
> description of possible trigger variables is under above link.
> Very interesting are NEW and OLD variables. These variables contain record
> of the table on which trigger function is called (only for row level
> trigger - in my example trigger declaration FOR EACH ROW).
> NEW holds row being inserted - just before it is written to the table,
> could be modified in fly (is null for delete). OLD holds existing row -
> just before update (is null for insert). That is why in my example it is
> possible to query against Your table (with known structure) like this:
>
> SELECT 1 FROM "myschema".mytable WHERE "MyKey" = NEW."MyKey"
>
> this exactly means: select whatever (1) from required table (mytable)
> where Primary Key column ("MyKey") value match value from inserted row
> Primary Key column (NEW).
>
> It is possible to write one generic trigger functions to work with few
> tables, so eg. for audit table and log all changes in the rows, it is
> possible to write generic trigger function using TG_TABLE_NAME, TG_TABLE_SCHEMA
> and TG_OP variables - simple example is here:
> http://www.postgresql.org/docs/9.1/static/plpgsql-trigger.html#PLPGSQL-TRIGGER-AUDIT-EXAMPLE
>
> if trigger function (row level) returns NEW then NEW record is written to
> table (could be modified by trigger function). If trigger function returns
> NULL then nothing is written to table, because "NEW" record is modified to
> be nothing.
>
> hope this will help You to understand my example.
>
> Regards,
> Bartek
>
>
> 2012/2/17 Alessandro Gagliardi <alessandro(at)path(dot)com>
>
>> First, to answer your question, in this case, I do not expect the data to
>> be different from what is already stored, so I should be able to safely
>> ignore the insert if the primary key already exists. (Though an answer to
>> the alternative "UPSERT" question would also be welcome as that may prove
>> to be an issue as well.)
>>
>> I tried to do something like
>> http://www.postgresql.org/docs/9.0/static/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPINGonly like this:
>>
>> BEGIN
>> INSERT INTO blocks (block_id, user_id, created, shared, private,
>> moment_type) VALUES (E'4f3adf9cfa63b31c77000b7c',
>> E'4ee02f599c6c3f6d360076ce', '2012-02-14T22:26:36.721980'::timestamp,
>> false, false, E'ambient')
>>
>
> Looks like semicolon here is missed :) Semicolon is required after SQL
> statement.
>
>
>> EXCEPTION
>> WHEN unique_violation THEN
>> NULL;
>> END;
>>
>> But I get:
>>
>> ERROR: syntax error at or near "INSERT"
>> SQL state: 42601
>> Character: 11
>>
>> I feel like I must be doing something stupid here, but I can't see what
>> it is.
>>
>> At the same time, I was also thinking using a trigger might be preferable
>> so I don't have to mess with application code. But I'm a little mystified
>> by how your example would work. The checkTriggerFunction makes reference
>> to NEW."MyKey" and yet there is no mention of that in checkTrigger. Does
>> it automagically divine that from the insert statement?
>>
>> Thanks,
>> -Alessandro
>>
>>
>> On Tue, Feb 14, 2012 at 12:40 AM, Bartosz Dmytrak <bdmytrak(at)eranet(dot)pl>wrote:
>>
>>> You can user few different solutions:
>>>
>>> 1. You can perform insert using stored procedure. Inside this procedure
>>> You can use BEGIN...EXCEPTION...END construct - this will handle duplicate
>>> key error. You can find this useful:
>>> http://www.postgresql.org/docs/9.1/static/plpgsql-control-structures.html - there
>>> is example of such approach.
>>>
>>> 2. You can also check if such row exists before insert (also in stored
>>> procedure):
>>> CREATE OR REPLACE FUNCTION "myschema"."InsertIfNotExists" (IN
>>> "vKeyValue" int, IN "vValue1" text, IN "vValue2" text)
>>> RETURNS VOID
>>> AS
>>> $BODY$
>>> BEGIN
>>> IF NOT EXISTS (SELECT 1 FROM "myschema".mytable WHERE "MyKey" =
>>> "vKeyValue") THEN
>>> INSERT INTO "myschema".mytable ("MyKey", "Value1", "Value2")
>>> VALUES ("vKeyValue", "vValue1", "vValue2");
>>>
>>> END IF;
>>> END;
>>> $BODY$ LANGUAGE plpgsql
>>>
>>> 3. You can create trigger before insert which will check if record
>>> already exists:
>>> trigger function could look like this:
>>> CREATE OR REPLACE FUNCTION "myschema"."checkTriggerFunction" ()
>>> RETURNS TRIGGER
>>> AS
>>> $BODY$
>>> BEGIN
>>> IF EXISTS (SELECT 1 FROM "myschema".mytable WHERE "MyKey" = NEW."MyKey")
>>> THEN
>>> RETURN NULL;
>>> ELSE
>>> RETURN NEW;
>>> END IF;
>>> END;
>>> $BODY$
>>> LANGUAGE plpgsql;
>>>
>>> and trigger:
>>> CREATE TRIGGER "checkTrigger"
>>> BEFORE INSERT
>>> ON "myschema".mytable
>>> FOR EACH ROW
>>> EXECUTE PROCEDURE "myschema"."checkTriggerFunction"();
>>>
>>>
>>> Another question is: do You want to update record if primary key exists
>>> or simply ignore insert?
>>> It should be noticed, every check impacts performance.
>>>
>>> Regards,
>>> Bartek
>>>
>>>
>>>
>>> 2012/2/13 Alessandro Gagliardi <alessandro(at)path(dot)com>
>>>
>>>> Sorry I wasn't clear. The application is written in Python. It services
>>>> requests that may go to an external API or to our production database
>>>> (which is Mongo). I should add that we have a somewhat unusual situation in
>>>> which we have two parallel databases: MongoDB for OLTP and newer PostgreSQL
>>>> for OLAP. Because Postgres is only consuming, it would waste precious
>>>> cycles to have the client check to see if a record already exists in
>>>> Postgres before trying to insert it. I'd rather let Postgres deal with that
>>>> (which it does well enough anyway with my primary key constraints). My
>>>> Postgres instance is hosted by Heroku and they provide me with a log which
>>>> is currently being spammed by these "duplicate key value violates unique
>>>> constraint" errors making it impossible to see if there are other errors I
>>>> need to be paying more attention to.
>>>>
>>>>
>>>> On Sat, Feb 11, 2012 at 12:08 AM, Bartosz Dmytrak <bdmytrak(at)eranet(dot)pl>wrote:
>>>>
>>>>> Could You provide more details?
>>>>> PostgreSQL does not allow cross database queries.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Bartek
>>>>>
>>>>>
>>>>> 2012/2/10 Alessandro Gagliardi <alessandro(at)path(dot)com>
>>>>>
>>>>>> I have a situation where I am gradually filling in a table from
>>>>>> another database as the need requires. As it currently is, whenever a
>>>>>> record comes up, it tries to insert it into my database. If the row is
>>>>>> already there, the primary key collides and the insert fails. And that's
>>>>>> fine except that it makes for a really cluttered log (making it hard to
>>>>>> find errors that I really do need to pay attention to). I'm wondering if
>>>>>> there's a better (i.e. cleaner, but possibly also more efficient) way to do
>>>>>> this.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> Thank you in advance,
>>>>>> -Alessandro
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message Bartosz Dmytrak 2012-02-17 21:39:18 Re: insert fail gracefully if primary key already exists
Previous Message Bartosz Dmytrak 2012-02-17 21:18:48 Re: insert fail gracefully if primary key already exists