Re: Revisiting UPDATE FROM ... ORDER BY not respected

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Revisiting UPDATE FROM ... ORDER BY not respected
Date: 2011-04-13 00:14:50
Message-ID: 19952.1302653690@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca> writes:
> A few years ago I asked about creating a single UPDATE statement to assign
> id's from a sequence, with the sequences applied in a particular order. In
> other words, order the table, then apply nextval-generated id's to the id
> field in question.
> Here is the original post:
> http://archives.postgresql.org/pgsql-general/2009-04/msg01123.php
> The explanation and solution provided made total sense. To my surprise,
> every now and then users still report sequences being applied out of order.

What PG versions are they using? Pre-8.2 this couldn't have been relied
on at all, because the planner would still try to flatten the subselect.

> Below is the code in question. I admit that I am completely baffled. Does
> anyone have any clues?

I don't think this really works if multiple processes try to update the
table concurrently --- does that ever happen in your apps? Two
processes starting to execute this query at about the same time would
assign distinct sets of IDs for the same (or at least overlapping) sets
of rows, and then only one of them would actually get to update any
particular row. Depending on chances of timing, that could easily
result in an out-of-order set of updates reaching commit.

Another thought is that in READ COMMITTED mode, you're at risk of extra
evaluations of nextval(), because the query tree will be re-evaluated
after waiting out a conflicting update to a target row. The query as
given seems safe against that effect when competing against other
occurrences of itself because of the "WHERE my_table.id IS NULL" bit ---
if somebody else beats you to a row update, the newer nextval value will
just get dropped on the floor (if it's even generated at all, which it
might not be). However, if there are ever concurrent updates to the
target row that don't change the id column to non-null, you'd possibly
have a problem from re-evaluations of nextval().

You could probably defend against both of those effects by taking a
table lock that prevents other updates while you do this. On the whole,
though, I rather wonder why you're insisting on sequential assignments
at all. If this operation can be done on a whim by independent
processes then sequentiality is likely to be approximate at best anyway.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Vinzenz Bildstein 2011-04-13 00:36:46 Postgres 9.0 + LDAP
Previous Message Shianmiin 2011-04-12 23:09:21 Re: PostgreSQL backend process high memory usage issue