Re: coalesce in plpgsql, and other style questions

From: Daniel Staal <DStaal(at)usa(dot)net>
To: <pgsql-novice(at)postgresql(dot)org>
Subject: Re: coalesce in plpgsql, and other style questions
Date: 2012-06-13 13:52:03
Message-ID: 0a0291c5528621c342d1559cd6ea34f1@mail.magehandbook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

On 2012-06-12 20:42, Ross Boylan wrote:
> Thanks for your responses. Some followup below.
> On Tue, 2012-06-12 at 16:30 -0700, Jeff Davis wrote:
>> On Tue, 2012-06-12 at 10:46 -0700, Ross Boylan wrote:
>> > I just wrote my first pl/pgsql function, and would appreciate any
>> > comments people have on it. I'll be writing a bunch of similar
>> > functions, with semantics "give me the id of the object if exists,
>> > otherwise create it and give me the id."
>> >
>> > My solution seems excessively procedural to me.
>>
>> I think what you're trying to do is procedural, and that's not
>> necessarily bad. You are trying to get an existing ID if available,
>> or
>> assign a new ID if not; and I assume you are doing so to save the
>> space
>> associated with storing the full names (essentially like dictionary
>> encoding).
>>
>> > I thought I could get
>> > the right semantics with something like
>> > select coalesce((select id from mytable where name='foo'),
>> > (insert into mytable ('name') values('foo') returning id))
>> > but I could not get that to work in plgsql.
>>
>> In pl/pgsql, you can't just use SELECT, you have to use something
>> like
>> SELECT INTO the_id ... or use PERFORM.
>>
>> Also, you have a potential race condition there, because someone
>> might
>> insert "foo" into mytable after the select (first arg to coalesce)
>> and
>> before the insert (second arg to coalesce).
> Practically, it's just me so there shouldn't be any risk. But I'd
> like
> to understand the general issue. I thought transaction would take
> care
> of this, so that within a transaction the state of the database does
> not
> change from actions in other sessions. Then if I commit and have
> conflict, the commit fails.
>
> I guess the sequence I'm using for did assures that did is unique
> across
> all transactions, and so the 2 transactions would not be in conflict,
> since they have different primary keys.
>
> As you said later, a unique constraint on one/some of the other
> fields
> will at least prevent bogus records that are "the same" from my point
> of
> view. I was a bit on the fence about making it unique, since I
> thought
> I might have different hosts that shared the same name. But those
> scenarios are unlikely, and I really don't want to deal with it if it
> does happen.
>
> But is my whole model of how transactions are operating off? I'm
> basically generalizing from Gemstone, an object database.

A transaction means you see a consistent view of the data in the
database - that view is consistent and valid through the life of the
transaction, but no other transaction needs to see the same view. Other
transactions may be going on at the same time, and changing data as they
see fit - you will see those changes within your transaction, but only
once they've finished their transaction. Transactions do not lock
tables or rows, even at the serializable level.

What you appear to be thinking is that the transaction locks the
tables, and then tries to do a resolution of the lock at the end of the
transaction - serializable comes close to that, if everyone is modifying
the same table, but not quite. Even under serializable, you'd still be
able to insert invalid data using the 'select, then insert if not in
select' logic. (The difference is that under serializable, you couldn't
check to see if it had happened afterwards - at least not in the same
transaction.)

You need either a unique constraint - so the database checks as part of
the insert operation whether there is another of the same value - or a
table lock. A table lock would allow you to set that no one else is
allowed to modify the table until you are done with it.

Your basic assumption for how transactions work is close to how simple
databases work, that basically lock the whole table or database while
you are working on it. Which is simple and cheap to do - as long as you
aren't doing a whole lot in the database. If you have a lot of action
going on in the database, the lock resolution eventually overwhelms the
simplicity - which is the whole reason why there are databases like
Postgresql, which can maintain good performance and data integrity
without that locking.

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author. Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes. This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Merlin Moncure 2012-06-13 14:02:22 Re: coalesce in plpgsql, and other style questions
Previous Message Ross Boylan 2012-06-13 01:57:17 Re: coalesce in plpgsql, and other style questions