Re: What happened to the is_<type> family of functions proposal?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, "Colin 't Hart" <colinthart(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: What happened to the is_<type> family of functions proposal?
Date: 2010-09-25 03:15:49
Message-ID: AANLkTinmLjkNdPTCZC4NF8bu_b6dHr5CsYQs0OZJ=JUn@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 24, 2010 at 3:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Sep 21, 2010 at 7:05 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> There are many rules that you could possibly make for type input
>>> functions.  But "you cannot throw an error" is not one of them ---
>>> or at least, not one that you can usefully expect to be followed
>>> for anything more than trivial straightline code.
>
>> OK.  This is one of the things I don't understand.  Why does throwing
>> an error imply that we need to abort the current transaction?  Why
>> can't we just catch the longjmp() and trundle onwards?  Obviously,
>> that's unsafe in a pretty wide variety of cases, but if you're just
>> scrutinizing the input string (even with a little bit of read-only
>> database access) it's not obvious to me what can go wrong.
>
> The problem is to know that "all you did" was scrutinize the input
> string.  If it's simple straightline code (even with some C library
> calls) then you can know that, but then you can write such code without
> including any elog(ERROR) in it in the first place.  If you are trapping
> longjmps then what you'd need to assert is that no error thrown from
> anywhere in any of the code reachable from that place represents a
> problem that requires transaction abort to clean up after.  This gets
> unmaintainable remarkably quickly, especially if you invoke anything
> as complicated as database access.  And then there are asynchronous
> error reasons (query cancel) which you shouldn't trap in any case.

Hmm. So the problem is that we don't want to accidentally catch an
error that isn't actually safe to catch. We could probably mitigate
this problem to a considerable degree by throwing data validation
errors using some special flag that says "this is a recoverable error".
And if that flag isn't set then we abort the whole transaction, but
if it is then we continue on. It's still possible for the person
writing the typinput function to set that flag when they should not,
but at least it's less likely to happen by accident. Another
alternative would be to create some kind of explicit way for the
function to RETURN an error instead of throwing it.
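
As a rough illustration of both alternatives, here is a minimal standalone
sketch in plain C. Nothing in it is PostgreSQL code: ParseResult,
parse_int_soft and is_integer are hypothetical names, and abort() merely
stands in for "abort the whole transaction". The input routine returns its
error instead of throwing it, and marks that error as recoverable only when
the problem is bad input data.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical result type; not a PostgreSQL API. */
typedef struct ParseResult
{
    bool        ok;             /* true if the input was valid */
    bool        recoverable;    /* meaningful only when ok is false */
    const char *message;        /* static description of the failure */
    int         value;          /* parsed value when ok is true */
} ParseResult;

/*
 * Hypothetical input routine that reports bad input by returning an
 * error rather than throwing one.
 */
ParseResult
parse_int_soft(const char *str)
{
    ParseResult res = {false, true, NULL, 0};
    char       *end;
    long        val;

    if (str == NULL)
    {
        /* Assumption for illustration: treat this as a caller bug. */
        res.recoverable = false;
        res.message = "null input";
        return res;
    }

    errno = 0;
    val = strtol(str, &end, 10);
    if (end == str || *end != '\0')
        res.message = "invalid input syntax for integer";
    else if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
        res.message = "value out of range for integer";
    else
    {
        res.ok = true;
        res.value = (int) val;
    }
    return res;
}

/*
 * is_<type>-style predicate built on top of it.  A failure that is not
 * flagged recoverable is not swallowed; abort() stands in for aborting
 * the transaction.
 */
bool
is_integer(const char *str)
{
    ParseResult res = parse_int_soft(str);

    if (!res.ok && !res.recoverable)
        abort();
    return res.ok;
}
```

The caveat from the paragraph above still applies: nothing stops the author
of parse_int_soft from setting recoverable when they should not.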

But neither of these things is totally bullet-proof, because you could
still do something that requires clean-up and then lie about it. To
protect against that, you'd presumably need to set some kind of a flag
whenever, say, a heap tuple gets modified, and then you could assert
said flag false. What, other than writing to the database, requires
subtransaction cleanup?
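
The flag idea can be sketched the same way. This is a toy model using
setjmp()/longjmp() directly; every name in it (needs_cleanup,
note_database_write, throw_error, try_validate and the two example
validators) is hypothetical and does not correspond to any PostgreSQL
function. Instead of asserting that the flag is false, the sketch returns a
distinct result when a caught error cannot safely be ignored.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical; none are PostgreSQL APIs. */
static jmp_buf *catch_point = NULL;     /* innermost active trap, if any */
static bool     needs_cleanup = false;  /* set by anything needing abort cleanup */

typedef enum { INPUT_VALID, INPUT_INVALID, MUST_ABORT } TrapResult;
typedef void (*validator_fn) (const char *input);

/* Stand-in for modifying a heap tuple. */
void
note_database_write(void)
{
    needs_cleanup = true;
}

/* Stand-in for throwing an error: jump to the active trap, if there is one. */
void
throw_error(void)
{
    if (catch_point != NULL)
        longjmp(*catch_point, 1);
    abort();
}

/* Run a validator, trapping any error it throws. */
TrapResult
try_validate(validator_fn fn, const char *input)
{
    jmp_buf     local;
    jmp_buf    *saved = catch_point;
    TrapResult  result;

    needs_cleanup = false;
    catch_point = &local;
    if (setjmp(local) == 0)
    {
        fn(input);
        result = INPUT_VALID;
    }
    else
    {
        /* Continuing is safe only if nothing was done that needs cleanup. */
        result = needs_cleanup ? MUST_ABORT : INPUT_INVALID;
    }
    catch_point = saved;
    return result;
}

/* Well-behaved validator: only scrutinizes the input string. */
void
check_nonempty(const char *input)
{
    if (input[0] == '\0')
        throw_error();
}

/* Misbehaving validator: does something needing cleanup, then throws. */
void
write_then_fail(const char *input)
{
    (void) input;
    note_database_write();
    throw_error();
}
```

Here try_validate(check_nonempty, "") reports INPUT_INVALID and execution
carries on, while try_validate(write_then_fail, "abc") reports MUST_ABORT.
The model only catches cleanup-requiring actions that remember to set the
flag, which is exactly the open question.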

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
