Re: Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function

From: Greg Stark <gsstark(at)mit(dot)edu>
To: justin <justin(at)emproshunts(dot)com>
Cc: APseudoUtopia <apseudoutopia(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Switching from MySQL: ON DUPLICATE KEY UPDATE, plpgsql function
Date: 2009-06-28 03:02:51
Message-ID: 407d949e0906272002m3a208defhf374b9b3612fee26@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sun, Jun 28, 2009 at 2:13 AM, justin<justin(at)emproshunts(dot)com> wrote:
>
> if you want to do something like this either do a test first to see if the
> key is present in the table, update or do an insert like this
> There is no reason to do a loop in the function waiting for a lock to
> clear.   Postgresql Locks do not work like MySQL.

The latter is exactly what he posted, you *do* have to loop because
two sessions can try to do the update, find no records, and then both
try to insert and fail.

The problem is that the example posted is for a single update/insert.
The problem you're solving is for merging in a whole set of changes.
That's a bit more painful.

I think you're going to want something like;

UPDATE forums_readposts
SET lastpostread=(
select lastpost
from forums_topics
where id=threadid
)
WHERE userid=_id

INSERT INTO forums_readposts
(userid,threadid,lastpostread)
(select _userid, id, lastpost
from forums_topics
where id not in (
select threadid
from forum_readposts existing
where existing.userid=_userid
)
)

(you might want to experiment with that as an NOT EXISTS as there are
still cases where one is optimized better than the other due to the
standard's required null behaviour)

You have a few options here. You could just decide concurrency for
this operation really isn't important and use something to serialize
this operation. For example you could lock the user record with an
explicit select for update on the user record and the commit
immediately afterward.

Or you could catch the exception around the insert and assume if that
happened you don't have to bother retrying because the other
transaction you collided with is presumably doing the same thing. That
would break if a user hit "catch up" and simultaneously clicked on a
new thread he hadn't read before in another window.

Or you could do the same kind of loop around this, just always doing
the insert since it should insert 0 records if there are no missing
threads.

You could skip the insert entirely if the number of records updated in
matches the number of threads and you have that number handy. That
would be an especially good idea if you catch the exception around the
insert since exceptions are moderately expensive. They create a
subtransaction. Probably not a factor for an operation like this which
isn't dominating the workload.

--
greg
http://mit.edu/~gsstark/resume.pdf

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Adam Rich 2009-06-28 03:52:22 Date math
Previous Message Tom Lane 2009-06-28 02:19:15 Re: [Q] sequence and index name limits in 8.4