Quick Links

Re: PL/pgSQL 2

From:	Kevin Grittner <kgrittn(at)ymail(dot)com>
To:	Marko Tiikkaja <marko(at)joh(dot)to>, Joel Jacobson <joel(at)trustly(dot)com>
Cc:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: PL/pgSQL 2
Date:	2014-09-02 15:08:29
Message-ID:	1409670509.90792.YahooMailNeo@web122302.mail.ne1.yahoo.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Marko Tiikkaja <marko(at)joh(dot)to> wrote:
> On 9/2/14 4:26 PM, Kevin Grittner wrote:
>> Joel Jacobson <joel(at)trustly(dot)com> wrote:
>>> The common use-case I have in mind is when you have a function
>>> which takes some kind of ID as an input param, which maps to a
>>> primary key in some table, which you want to update.
>>
>> In that case FOUND works just fine. A primary key value can't have
>> more than one matching row.
>
> No, but your code can have a bug.

So the main use case is to allow buggy functions which are deployed
to production without adequate testing to be detected? Bugs like
not getting the primary key column(s) right? I think it would be
great to have some way to generate an error if a given statement
doesn't affect exactly one row, but the above is a pretty weak
argument for making it a default behavior.

> INTO rejecting any queries returning more than one row helps,
> though, but having to write RETURNING TRUE INTO _OK; is not
> pretty either.

No, that sure would not be.

>>> If the where-clause would be incorrect and the update would
>>> update all rows in the table, that would be a disaster, which is
>>> what I want to prevent.
>>
>> By the time you find out that the number of rows affected is every
>> row in the table, you have horribly bloated the table and all its
>> indexes. Causing a DML statement to abort when it sees a second
>> row is a completely different issue than what I (and I suspect most
>> others on the list) thought we were talking about, and would need
>> to affect far more than the PL.
>
> Updating even two rows instead of one can have catastrophic effects.

That's a different problem than Joel just said was his main
concern. I was pointing out that the solution he was proposing was
a very poor solution to the problem he said he was trying to solve.
Can you imagine the damage if a function that updated every row in
a table whenever anyone tried to update a single row by primary key
made it past testing and staging phases into production? Depending
on the table, it might not need to run more than a few times before
the bloat ate all disk space and your production environment was
totally hosed to the point of needing to delete everything from
$PGDATA and restore from your last known good backup.

Accidentally updating a single unintended row is a whole different
class of problem, with potentially completely different solutions.
We can talk about both, but let's not conflate them. The proposed
new behavior seems like it would only detect a small percentage of
ways you can accidentally update unintended rows, but I agree it
would catch enough of them to be a potentially useful option. If
it were a new option on the DML statement syntax, once could
certainly have code review or some sort of "lint" software to look
for omissions. If you don't have a code review process before
things hit production, well, mechanical solutions like this can
only be expected to catch a small percentage of the damage from
application bugs deployed to production.

>>> It's the same type of mistake I want to prevent from in a
>>> convenient way, and there is nothing more convenient than the
>>> default behavour. That also means *all* users will get that
>>> behaviour even if they don't explicitly request it, which is a
>>> good thing, because then they are protected against the danger of
>>> not knowing how to make sure it updated/deleted only one row.
>>
>> I think that changing the default behavior of SQL from set oriented
>> to something else is a horrible idea. I absolutely, unequivocally
>> oppose that at the SQL or plpgsql level as harmful. I understand
>> the need to check for this in various cases, and in fact the
>> application framework I designed at my previous job had Java
>> methods for doing DML with such a check included, named
>> InsertOneRow(), UpdateOneRow(), and DeleteOneRow(). Very useful.
>> If we can agree on a way to allow users to do the same in plpgsql,
>> fine -- but certainly not as the default default (word
>> intentionally repeated).
>
> Yeah, it doesn't necessarily need to be the default default (and I see a
> lot of people saying it shouldn't be). Even having a per-query modifier
> would be better than the current behaviour.

There we seem to agree. I definitely think it is a useful option
if we can sort out a good way to allow it.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: PL/pgSQL 2 at 2014-09-02 14:33:44 from Marko Tiikkaja

Responses

Re: PL/pgSQL 2 at 2014-09-02 15:33:08 from Marko Tiikkaja
Re: PL/pgSQL 2 at 2014-09-02 15:44:22 from Joel Jacobson

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Neil Tiffin	2014-09-02 15:10:02	Re: PL/pgSQL 2
Previous Message	Christoph Berg	2014-09-02 15:08:26	Re: ALTER SYSTEM RESET?