Re: Danger of idiomatic plpgsql loop for merging data

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: "J(dot) Greg Davidson" <jgd(at)well(dot)com>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>, lynn <lynn(at)creditlink(dot)com>, "/Blank Page/" <blankpage2008(at)gmail(dot)com>
Subject: Re: Danger of idiomatic plpgsql loop for merging data
Date: 2010-07-29 16:11:37
Message-ID: AANLkTikQ9wYuWzZPUAh1egLOvxS5woZ0cAG+LY+OP1E5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Jul 29, 2010 at 10:06 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Wed, Jul 28, 2010 at 5:27 PM, J. Greg Davidson <jgd(at)well(dot)com> wrote:
>> Hi fellow PostgreSQL hackers,
>>
>> I just got burned by the idiomatic loop
>> documented in the PostgreSQL manual as
>>
>> Example 39-2. Exceptions with UPDATE/INSERT
>>
>> I have now replaced this "standard" idiom
>> with a safer one described below.
>>
>> What went wrong:
>>
>> It seems that the table I was either
>> inserting into or selecting from had
>> a trigger inserting some associated
>> data which was sometimes raising a
>> unique_violation exception, turning the
>> "standard" idiom into an infinite loop!
>>
>> My (simplified) old code looked like this:
>>
>> CREATE TABLE foos (
>>  foo_ foo PRIMARY KEY DEFAULT next_foo();
>>  name_ text UNIQUE NOT NULL;
>> );
>>
>> CREATE OR REPLACE
>> FUNCTION get_foo(text) RETURNS foo AS $$
>> DECLARE
>>  _foo foo;
>> BEGIN
>>  LOOP
>>    SELECT foo_ INTO _foo
>>      FROM foos WHERE name_ = $1;
>>    IF FOUND THEN RETURN _foo; END IF;
>>    BEGIN
>>      INSERT INTO foos(name_) VALUES($1);
>>    EXCEPTION
>>      WHEN unique_violation THEN
>>      -- maybe another thread?
>>    END;
>>  END LOOP;
>> END;
>> $$ LANGUAGE plpgsql STRICT;
>>
>> My (simplified) new code is longer but
>> more flexible, safer and adds logging:
>>
>> CREATE OR REPLACE
>> FUNCTION old_foo(text) RETURNS foo AS $$
>>  SELECT foo_ FROM foos WHERE name_ = $1
>> $$ LANGUAGE SQL STRICT;
>>
>> CREATE OR REPLACE
>> FUNCTION new_foo(text) RETURNS foo AS $$
>> DECLARE
>>  this regprocedure := 'new_foo(text)';
>>  _foo foo;
>> BEGIN
>>  INSERT INTO foos(name_) VALUES ($1)
>>    RETURNING foo_ INTO _foo;
>>  RETURN _ref;
>> EXCEPTION
>>  WHEN unique_violation THEN
>>    -- maybe another thread?
>>    RAISE NOTICE '% "%" unique_violation', this, $1;
>>    RETURN NULL;
>> END;
>> $$ LANGUAGE plpgsql STRICT;
>>
>> CREATE OR REPLACE
>> FUNCTION get_foo(text) RETURNS foo AS $$
>>  SELECT COALESCE(
>>    old_foo($1), new_foo($1), old_foo($1)
>>  )
>> $$ LANGUAGE sql STRICT;
>
> hm.  It's theoretically possible, although highly unlikely, in read
> committed for this to return null.  The loop is the *only* way to
> guarantee insert/update without retrying in the application.
>
> I do however like the use of coalesce as a sugary SQL way if doing 'if
> else'.  Also nice use of regprocedure in the logging.  However the
> original problem of the exception handler catching the wrong unique
> violation is an bug ISTM in your exception handling -- that should
> have been caught in the cascaded handler and handled/rethrown in such
> a way as to not loop you.

Probably better would be to have the unique constraint handler grep
sqlerrm to ensure that the violation is coming from the table being
merged (otherwise simply rethrow). That's a lot safer/faster than
using sub-transaction in every dependent trigger. Also ISTM any
infinite loop like that could use a safety mechanism for paranoia
purposes. If you _do_ rethrow, make sure to use the version of raise
that supports throwing the specific sql error code.

The fact that this is so difficult to get right is really an
indictment of the SQL language TBH (it was only introduced in 2008).
This is a perfectly reasonable thing to want to do but a proper
implementation is quite nasty. This is highly requested feature and
our workaround is (let's be frank) a kludge.

merlin

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Oleg Bartunov 2010-07-29 16:13:32 Re: [GENERAL] Incorrect FTS result with GIN index
Previous Message Jacqui Caren-home 2010-07-29 15:59:07 Re: How Big is Too Big for Tables?