| From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Marc Cousin <cousinmarc(at)gmail(dot)com> | 
| Subject: | Re: Avoid full GIN index scan when possible | 
| Date: | 2019-06-28 19:54:01 | 
| Message-ID: | 20190628195401.frwhcga76rrytbc4@development | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Fri, Jun 28, 2019 at 03:03:19PM -0400, Tom Lane wrote:
>Julien Rouhaud <rjuju123(at)gmail(dot)com> writes:
>> On Fri, Jun 28, 2019 at 6:10 PM Tomas Vondra
>> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>> 2) I'm not sure it's a good idea to add dependency on a specific AM type
>>> into indxpath.c. At the moment there are only two places, both referring
>>> to BTREE_AM_OID, do we really hard-code another OID here?
>>>
>>> I wonder if this could be generalized to another support proc in the
>>> inde AM API, with just GIN implementing it.
>
>> Yes, this patch was more a POC than anything, to discuss the approach
>> before spending too much time on infrastructure.  I considered another
>> support function, but I'm still unclear of how useful it'd be for
>> custom AM (as I don't see any use for that for the vanilla one I
>> think), or whether if this should be opclass specific or not.
>
>I just spent a lot of sweat to get rid of (most of) indxpath.c's knowledge
>about specific AMs' capabilities; I'd be very sad if we started to put any
>back.  Aside from being a modularity violation, it's going to fall foul
>of the principle that if index AM X wants something, some index AM Y is
>going to want it too, eventually.
>
>Also, I'm quite unhappy about including index_selfuncs.h into indxpath.c
>at all, never mind whether you got the alphabetical ordering right.
>I have doubts still about how we ought to refactor the mess that is
>*selfuncs.c, but this isn't going in the right direction.
>
Right.
>>> 3) selfuncs.c is hardly the right place for gin_get_optimizable_quals,
>>> as it's only for functions computing selectivity estimates (and funcs
>>> directly related to that). And the new function is not related to that
>>> at all, so why not to define it in indxpath.c directly?
>
>I not only don't want that function in indxpath.c, I don't even want
>it to be known/called from there.  If we need the ability for the index
>AM to editorialize on the list of indexable quals (which I'm not very
>convinced of yet), let's make an AM interface function to do it.
>
Wouldn't it be better to have a function that inspects a single qual and
says whether it's "optimizable" or not? That could be part of the AM
implementation, and we'd call it and it'd be us messing with the list.
That being said, is this really a binary thing - if you have a value
that matches 99% of rows, that probably is not much better than a full
scan.  It may be more difficult to decide (compared to the 'short
trigram' case), but perhaps we should allow that too? Essentially,
instead of 'optimizable' returning true/false, it might return value
between 0.0 and 1.0, as a measure of 'optimizability'.
But that kinda resembles stuff we already have - selectivity/cost. So
why shouldn't this be considered as part of costing? That is, could
gincostestimate look at the index quals and decide what will be used for
scanning the index? Of course, this would make the logic GIN-specific,
and other index AMs would have to implement their own version ...
regards
-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Amit Kapila | 2019-06-28 19:56:20 | Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits | 
| Previous Message | Tom Lane | 2019-06-28 19:03:19 | Re: Avoid full GIN index scan when possible |