Re: Is MinMaxExpr really leakproof?

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: Is MinMaxExpr really leakproof?
Date: 2018-12-31 17:25:51
Message-ID: 20181231172551.GA206480@gust.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

This thread duplicates https://postgr.es/m/flat/16539.1431472961%40sss.pgh.pa.us

On Sun, Dec 30, 2018 at 01:24:02PM -0500, Tom Lane wrote:
> So this coding amounts to an undocumented
> assumption that every non-cross-type btree comparison function is
> leakproof.

> select p.oid::regprocedure from pg_proc p, pg_amproc a, pg_opfamily f
> where p.oid=a.amproc and f.oid=a.amprocfamily and opfmethod=403 and not proleakproof and amproclefttype=amprocrighttype and amprocnum=1;
>
> bpcharcmp(character,character)
> btarraycmp(anyarray,anyarray)
> btbpchar_pattern_cmp(character,character)
> btoidvectorcmp(oidvector,oidvector)
> btrecordcmp(record,record)
> btrecordimagecmp(record,record)
> bttext_pattern_cmp(text,text)
> bttextcmp(text,text)
> enum_cmp(anyenum,anyenum)
> jsonb_cmp(jsonb,jsonb)
> numeric_cmp(numeric,numeric)
> pg_lsn_cmp(pg_lsn,pg_lsn)
> range_cmp(anyrange,anyrange)
> tsquery_cmp(tsquery,tsquery)
> tsvector_cmp(tsvector,tsvector)
>
> so this assumption is, on its face, wrong.
>
> In practice it might be all right, because it's hard to see a reason why
> a btree comparison function would ever throw an error except for internal
> failures, which are probably outside the scope of leakproofness guarantees
> anyway. Nonetheless, if we didn't mark these functions as leakproof,
> why not?

pg_lsn_cmp() and btoidvectorcmp() surely could advertise leakproofness. I'm not
sure about enum_cmp(), numeric_cmp(), tsquery_cmp() or tsvector_cmp(). I can't
think of a reason those would leak, though. btrecordcmp() and other polymorphic
cmp functions can fail:

create type boxrec as (a box); select '("(1,1),(0,0)")'::boxrec = '("(1,1),(0,0)")'::boxrec;
=> ERROR: could not identify an equality operator for type box

The documentation says, "a function which throws an error message for some
argument values but not others ... is not leakproof." I would be comfortable
amending that to allow the "could not identify an equality operator" error,
because that error follows from type specifics, not value specifics.

bttextcmp() and other varstr_cmp() callers fall afoul of the same restriction
with their "could not convert string to UTF-16" errors
(https://postgr.es/m/CADyhKSXPwrUv%2B9LtqPAQ_gyZTv4hYbr2KwqBxcs6a3Vee1jBLQ%40mail.gmail.com).
Leaking the binary fact that an unspecified string contains an unspecified rare
Unicode character is not a serious leak, however. Also, those errors would be a
substantial usability impediment if they happened much in practice; you couldn't
index affected values.

> I think that we should either change contain_leaked_vars_walker to
> explicitly verify leakproofness of the comparison function, or decide
> that it's project policy that btree comparison functions are leakproof,
> and change the markings on those (and their associated operators).

Either of those solutions sounds fine. Like last time, I'll vote for explicitly
verifying leakproofness.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2018-12-31 17:29:22 pg_regress: promptly detect failed postmaster startup
Previous Message Adam Brusselback 2018-12-31 16:20:11 Re: Implementing Incremental View Maintenance