Re: Danger of idiomatic plpgsql loop for merging data

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: "J(dot) Greg Davidson" <jgd(at)well(dot)com>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>, lynn <lynn(at)creditlink(dot)com>, "/Blank Page/" <blankpage2008(at)gmail(dot)com>
Subject: Re: Danger of idiomatic plpgsql loop for merging data
Date: 2010-07-29 14:06:58
Message-ID: AANLkTi=n7=h=O3RXi-rGC2CUt7h2==fPyDXHX7H3RYHa@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Jul 28, 2010 at 5:27 PM, J. Greg Davidson <jgd(at)well(dot)com> wrote:
> Hi fellow PostgreSQL hackers,
>
> I just got burned by the idiomatic loop
> documented in the PostgreSQL manual as
>
> Example 39-2. Exceptions with UPDATE/INSERT
>
> I have now replaced this "standard" idiom
> with a safer one described below.
>
> What went wrong:
>
> It seems that the table I was either
> inserting into or selecting from had
> a trigger inserting some associated
> data which was sometimes raising a
> unique_violation exception, turning the
> "standard" idiom into an infinite loop!
>
> My (simplified) old code looked like this:
>
> CREATE TABLE foos (
>  foo_ foo PRIMARY KEY DEFAULT next_foo();
>  name_ text UNIQUE NOT NULL;
> );
>
> CREATE OR REPLACE
> FUNCTION get_foo(text) RETURNS foo AS $$
> DECLARE
>  _foo foo;
> BEGIN
>  LOOP
>    SELECT foo_ INTO _foo
>      FROM foos WHERE name_ = $1;
>    IF FOUND THEN RETURN _foo; END IF;
>    BEGIN
>      INSERT INTO foos(name_) VALUES($1);
>    EXCEPTION
>      WHEN unique_violation THEN
>      -- maybe another thread?
>    END;
>  END LOOP;
> END;
> $$ LANGUAGE plpgsql STRICT;
>
> My (simplified) new code is longer but
> more flexible, safer and adds logging:
>
> CREATE OR REPLACE
> FUNCTION old_foo(text) RETURNS foo AS $$
>  SELECT foo_ FROM foos WHERE name_ = $1
> $$ LANGUAGE SQL STRICT;
>
> CREATE OR REPLACE
> FUNCTION new_foo(text) RETURNS foo AS $$
> DECLARE
>  this regprocedure := 'new_foo(text)';
>  _foo foo;
> BEGIN
>  INSERT INTO foos(name_) VALUES ($1)
>    RETURNING foo_ INTO _foo;
>  RETURN _ref;
> EXCEPTION
>  WHEN unique_violation THEN
>    -- maybe another thread?
>    RAISE NOTICE '% "%" unique_violation', this, $1;
>    RETURN NULL;
> END;
> $$ LANGUAGE plpgsql STRICT;
>
> CREATE OR REPLACE
> FUNCTION get_foo(text) RETURNS foo AS $$
>  SELECT COALESCE(
>    old_foo($1), new_foo($1), old_foo($1)
>  )
> $$ LANGUAGE sql STRICT;

hm. It's theoretically possible, although highly unlikely, in read
committed for this to return null. The loop is the *only* way to
guarantee insert/update without retrying in the application.

I do however like the use of coalesce as a sugary SQL way if doing 'if
else'. Also nice use of regprocedure in the logging. However the
original problem of the exception handler catching the wrong unique
violation is an bug ISTM in your exception handling -- that should
have been caught in the cascaded handler and handled/rethrown in such
a way as to not loop you.

Couple of style notes (my 0.02$). you've obviously put a lot of
thought into your style but I found your example a bit hard to follow
at first, so I'll present an alternative:
*) Highly prefer named input args in pl/pgsql (don't use $1, etc).
*) I prefer to suffix composite types _t -- your example would be
easier to read:

create type foo_t as (...)
CREATE TABLE foos (
foo foo_t PRIMARY KEY DEFAULT next_foo();
name_ text UNIQUE NOT NULL;
);

*) Don't like the suffix underscore (what does it mean?).
*) I personally reserve plurals for arrays not tables. most people
don't though so no big deal:

create table foo
(
foo foo_t, -- you can do this
foos foo_t[]
);

merlin

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Craig Ringer 2010-07-29 15:22:09 Re: Dynamic data model, locks and performance
Previous Message Tom Lane 2010-07-29 14:03:07 Re: [GENERAL] Incorrect FTS result with GIN index