Re: [sqlsmith] Crash in mcv_get_match_bitmap

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andreas Seltenreich <seltenreich(at)gmx(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Tomas Vondra <tomas(dot)vondra(at)postgresql(dot)org>
Subject: Re: [sqlsmith] Crash in mcv_get_match_bitmap
Date: 2019-07-13 00:11:37
Message-ID: 20190713001137.7saj5g46qpdump4r@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 11, 2019 at 05:08:22PM +0200, Tomas Vondra wrote:
>On Wed, Jul 10, 2019 at 06:48:16PM -0400, Tom Lane wrote:
>>Oh ... while we're piling on here, it just sunk into me that
>>mcv_get_match_bitmap is deciding what the semantics of an operator
>>are by seeing what it's using for a selectivity estimator.
>>That is just absolutely, completely wrong. For starters, it
>>means that the whole mechanism fails for any operator that wants
>>to use a specialized estimator --- hardly an unreasonable thing
>>to do. For another, it's going to be pretty unreliable for
>>extensions, because I do not think they're all careful about using
>>the right estimator --- a lot of 'em probably still haven't adapted
>>to the introduction of separate <= / >= estimators, for instance.
>>
>>The right way to determine operator semantics is to look to see
>>whether they are in a btree opclass. That's what the rest of the
>>planner does, and there is no good reason for the mcv code to
>>do it some other way.
>>
>
>Hmmm, but that will mean we're unable to estimate operators that are not
>part of a btree opclass. Which is a bit annoying, because that would also
>affect equalities (and thus functional dependencies), in which case the
>type may easily have just hash opclass or something.
>
>But maybe I just don't understand how the btree opclass detection works
>and it'd be fine for sensibly defined operators. Can you point me to code
>doing this elsewhere in the planner?
>
>We may need to do something about code in pg10, because functional
>dependencies this too (although just with F_EQSEL). But maybe we should
>leave pg10 alone, we got exactly 0 reports about it until now anyway.
>

Here are WIP patches addressing two of the issues:

1) determining operator semantics by matching them to btree opclasses

2) deconstructing OpExpr into Var/Const nodes

I'd appreciate some feedback particularly on (1).

For the two other issues mentioned in this thread:

a) I don't think unary-argument OpExpr are an issue, because this is
checked when determining which clauses are compatible (and the code only
allows the case with 2 arguments).

b) Const with constisnull=true - I'm not yet sure what to do about this.
The easiest fix would be to deem those clauses incompatible, but that
seems a bit too harsh. The right thing is probably passing the NULL to
the operator proc (but that means we can't use FunctionCall).

Now, looking at this code, I wonder if using DEFAULT_COLLATION_OID when
calling the operator is the right thing. We're using type->typcollation
when building the stats, so maybe we should do the same thing here.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
0001-Determine-operator-semantics-using-btree-opclasses.patch text/plain 9.0 KB
0002-Fix-processing-of-OpExpr-in-extended-statistics.patch text/plain 5.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2019-07-13 00:33:51 Some thoughts on heaps and freezing
Previous Message Tom Lane 2019-07-12 23:59:13 Re: Check-out mutable functions in check constraints