Re: Less-silly selectivity for JSONB matching operators

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Less-silly selectivity for JSONB matching operators
Date: 2020-02-28 22:09:34
Message-ID: 14469.1582927774@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> This patch is not complete, because I didn't look at changing
> the contrib modules, and grep says at least some of them are using
> contsel for non-geometric data types. But I thought I'd put it up
> for discussion at this stage.

Hearing nothing, I went ahead and hacked on the contrib code.
The attached 0002 updates hstore, ltree, and pg_trgm to get them
out of using contsel/contjoinsel for anything. (0001 is the same
patch I posted before.)

In ltree, I noted that < <= >= > were using contsel even though
those are part of a btree opclass, meaning they could perfectly
well use scalarltsel and friends. So now they do. Everything
else now uses matchsel/matchjoinsel, leaving ltreeparentsel as
an unused backward-compatibility feature. I didn't think that
the default selectivity in ltreeparentsel was particularly sane,
so having those operators use their own selectivity logic
instead of using matchsel like everything else seemed pointless
(and certainly pairing a custom ltreeparentsel with contjoinsel
isn't something to encourage).

In pg_trgm, the change of default selectivity estimate causes one
plan to change, but I think that's fine; looking at the data hidden
by COSTS OFF shows the new estimate is closer to reality anyway.
(That test is meant to exercise some gist consistent-function logic,
which it still does, so no worries there.)

The cube and seg extensions still make significant use of contsel and
the other geometric estimator stubs. Although we could in principle
change those operators to use matchsel, I'm hesitant to do so without
closer analysis. The sort orderings imposed by their default btree
opclasses correlate strongly with cube/seg size, which is related to
overlap/containment outcomes, so I'm not sure that the histogram
entries would provide a plausibly random sample for this purpose.
So those modules are not touched here.

There are a few other random uses of geometric join estimators
paired with non-geometric restriction estimators, including
these in the core core:

@>(anyrange,anyelement) | range_contains_elem | rangesel | contjoinsel
@>(anyrange,anyrange) | range_contains | rangesel | contjoinsel
<@(anyelement,anyrange) | elem_contained_by_range | rangesel | contjoinsel
<@(anyrange,anyrange) | range_contained_by | rangesel | contjoinsel
&&(anyrange,anyrange) | range_overlaps | rangesel | areajoinsel

plus the @@ and ~~ operators in intarray. While this is ugly,
it's probably not worth changing until somebody creates non-stub
join selectivity code that will work for these cases.

regards, tom lane

Attachment Content-Type Size
0001-better-jsonb-selectivity-1.patch text/x-diff 16.3 KB
0002-better-contrib-selectivity-1.patch text/x-diff 9.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2020-02-28 22:40:30 Re: Portal->commandTag as an enum
Previous Message Alvaro Herrera 2020-02-28 20:54:16 Re: Portal->commandTag as an enum