Re: Simple Atomic Relationship Insert

From: Roxanne Reid-Bennett <rox(at)tara-lu(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Simple Atomic Relationship Insert
Date: 2015-01-16 04:57:40
Message-ID: 54B89A44.9070906@tara-lu.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 1/15/2015 6:12 PM, Robert DiFalco wrote:
> FWIW I was using the select_hometown_id FUNCTION like this:
>
> INSERT INTO users(...) values(..., select_hometown_id('Portland, OR'));

try this: (if you still get deadlocks, uncomment the advisory lock
[thanks Daniel] and try again)
Logically I suppose it might run faster to do the select, then insert
"if". I almost always write these as insert first - because it's the
more restrictive lock.

CREATE OR REPLACE FUNCTION select_hometown_id(hometown_name VARCHAR) RETURNS
INTEGER AS
$BODY$
DECLARE
v_id integer;
BEGIN
-- perform pg_advisory_xact_lock(hashtext(hometown_name));
BEGIN
insert into hometowns (name)
select hometown_name where not exists (select id from hometowns
where name = hometown_name)
returning id into v_id;
IF (v_id IS NULL) THEN
select id into v_id from hometowns where name = hometown_name;
END IF;
return v_id;
EXCEPTION
WHEN OTHERS THEN
-- choose your poison, this really shouldn't get here
END;

return null;
END;
$BODY$
LANGUAGE plpgsql;

>
>
>
> On Tue, Jan 13, 2015 at 2:21 PM, Brian Dunavant <brian(at)omniti(dot)com
> <mailto:brian(at)omniti(dot)com>> wrote:
>
> The loop to run it twice handles that yes. I don't think that buys
> you anything over a more traditional non-cte method though. I'd run
> them a few thousand times to see if there's any difference in runtimes
> but my guess is the CTE version would be slightly slower here.
>
> v_id integer;
> BEGIN;
> select id into v_id from hometowns where name = hometown_name;
> BEGIN
> insert into hometowns (name)
> select hometown_name where v_id is null
> returning id into v_id;
> EXCEPTION WHEN unique_violation
> THEN
> select id into v_id from hometowns where name = hometown_name;
> END;
> insert into users (name, hometown_id)
> values ('Robert', v_id);
> END;
>
> On Tue, Jan 13, 2015 at 5:06 PM, Robert DiFalco
> <robert(dot)difalco(at)gmail(dot)com <mailto:robert(dot)difalco(at)gmail(dot)com>> wrote:
> > This seems to get rid of the INSERT race condition.
> >
> > CREATE OR REPLACE FUNCTION select_hometown_id(hometown_name
> VARCHAR) RETURNS
> > INTEGER AS $
> > DECLARE hometown_id INTEGER;
> > BEGIN
> > LOOP
> > BEGIN
> > WITH sel AS (
> > SELECT id FROM hometowns WHERE name = hometown_name
> > ), ins AS (
> > INSERT INTO hometowns (name)
> > SELECT hometown_name
> > WHERE NOT EXISTS(SELECT 1 FROM sel)
> > RETURNING id
> > )
> > SELECT id INTO hometown_id FROM ins UNION ALL SELECT id
> FROM sel;
> > RETURN hometown_id;
> >
> > EXCEPTION WHEN unique_violation
> > THEN
> > END;
> > END LOOP;
> > END;
> > $ LANGUAGE plpgsql;
> >
> >
> > On Tue, Jan 13, 2015 at 1:53 PM, Brian Dunavant
> <brian(at)omniti(dot)com <mailto:brian(at)omniti(dot)com>> wrote:
> >>
> >> With the single CTE I don't believe you can do a full upsert
> loop. If
> >> you're doing this inside of a postgres function, your changes are
> >> already atomic, so I don't believe by switching you are buying
> >> yourself much (if anything) by using a CTE query instead of
> something
> >> more traditional here.
> >>
> >> The advantages of switching to a CTE would be if this code was all
> >> being done inside of the app code with multiple queries.
> >>
> >> On Tue, Jan 13, 2015 at 4:39 PM, Robert DiFalco
> >> <robert(dot)difalco(at)gmail(dot)com <mailto:robert(dot)difalco(at)gmail(dot)com>> wrote:
> >> > Well, traditionally I would create a LOOP where I tried the
> SELECT, if
> >> > there
> >> > was nothing I did the INSERT, if that raised an exception I
> would repeat
> >> > the
> >> > LOOP.
> >> >
> >> > What's the best way to do it with the CTE? Currently I have the
> >> > following
> >> > which gives me Duplicate Key Exceptions when two sessions try
> to insert
> >> > the
> >> > same record at the same time.
> >> >
> >> > CREATE OR REPLACE FUNCTION select_hometown_id(hometown_name
> VARCHAR)
> >> > RETURNS
> >> > INTEGER AS $
> >> > DECLARE hometown_id INTEGER;
> >> > BEGIN
> >> > WITH sel AS (
> >> > SELECT id FROM hometowns WHERE name = hometown_name
> >> > ), ins AS (
> >> > INSERT INTO hometowns (name)
> >> > SELECT hometown_name
> >> > WHERE NOT EXISTS(SELECT 1 FROM sel)
> >> > RETURNING id
> >> > )
> >> > SELECT id INTO hometown_id FROM ins UNION ALL SELECT id
> FROM sel;
> >> > RETURN hometown_id;
> >> > END;
> >> > $ LANGUAGE plpgsql;
> >> >
> >> > And that is no bueno. Should I just put the whole thing in a
> LOOP?
> >
> >
>
>

--
[At other schools] I think the most common fault in general is to teach students how to pass exams instead of teaching them the science.
Donald Knuth

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2015-01-16 05:09:35 Re: Casting hstore to json
Previous Message Sameer Kumar 2015-01-16 04:16:28 Re: Check that streaming replica received all data after master shutdown