Re: delete statement returning too many results

From: Arlo Louis O'Keeffe <gnomelver(at)k5d(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Harmen <harmen(at)lijzij(dot)de>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Ron <ronljohnsonjr(at)gmail(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: delete statement returning too many results
Date: 2022-12-04 21:07:50
Message-ID: EA76C5B7-8B31-4966-95B4-ED253D88AFA7@k5d.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> On 29. Nov 2022, at 18:35, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Harmen <harmen(at)lijzij(dot)de> writes:
>> On Mon, Nov 28, 2022 at 12:11:53PM -0500, Tom Lane wrote:
>>> So basically it's unsafe to run the sub-select more than once,
>>> but the query as written leaves it up to the planner whether
>>> to do that. I'd suggest rephrasing as [...]
>
>> I'm not the original poster, but I do use similar constructions for simple
>> postgres queues. I've been trying for a while, but I don't understand where the
>> extra rows come from, or what's "silent" about SKIP LOCKED.
>
> Sorry, I should not have blamed SKIP LOCKED in particular; this
> construction will misbehave with or without that. The issue is with
> using SELECT FOR UPDATE inside a DELETE or UPDATE that then modifies
> the row that the subquery returned. The next execution of the subquery
> will, or should, return a different row: either some not-deleted row,
> or the modified row. So in this context, the result of the subquery
> is volatile. The point of putting it in a MATERIALIZED CTE is to
> lock the result down regardless of that.
>
>> Because we get different results depending on the plan postgres picks, I can
>> see two options: either the query is broken, or postgres is broken.
>
> You can argue that the planner should treat volatile subqueries
> differently than it does today. But the only reasonable way of
> tightening the semantics would be to force re-execution of such a
> subquery every time, even when it's not visibly dependent on the
> outer query. That would be pretty bad for performance, and I doubt
> it would make the OP happy in this example, because what it would
> mean is that his query "fails" every time not just sometimes.
> (Because of that, I don't have too much trouble concluding that
> the query is broken, whether or not you feel that postgres is
> also broken.)
>
> The bigger picture here is that we long ago decided that the planner
> should not inquire too closely into the volatility of subqueries,
> primarily because there are use-cases where people intentionally rely
> on them not to be re-executed. As an example, these queries give
> different results:
>
> regression=# select random() from generate_series(1,3);
> random
> ---------------------
> 0.7637195395988317
> 0.09569374432524946
> 0.490132093120365
> (3 rows)
>
> regression=# select (select random()) from generate_series(1,3);
> random
> --------------------
> 0.9730230633436501
> 0.9730230633436501
> 0.9730230633436501
> (3 rows)
>
> In the second case, the sub-select is deemed to be independent
> of the outer query and executed only once. You can argue that
> if that's what you want you should be forced to put the sub-select
> in a materialized CTE to make that plain. But we felt that that
> would make many more people unhappy than happy, so we haven't
> done it. Maybe the question could be revisited once all PG
> versions lacking the MATERIALIZED syntax are long dead.
>
> regards, tom lane

Thanks for the thorough explanation. That seems very reasonable.

The CTE query works well for my use case.

Thanks!

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Karsten Hilbert 2022-12-04 22:25:27 Re: Q: error on updating collation version information
Previous Message Ron 2022-12-04 19:30:40 Re: Stored procedure code no longer stored in v14 and v15, changed behaviour