Re: Avoid full GIN index scan when possible

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Marc Cousin <cousinmarc(at)gmail(dot)com>
Subject: Re: Avoid full GIN index scan when possible
Date: 2019-06-28 16:43:40
Message-ID: CAOBaU_ZCJBm3mGvJpMN4VX4Q2oQvgPEwMA1jJyrb7-P0HJbnQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 28, 2019 at 6:10 PM Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> I've briefly looked at the patch today. I think the idea is worthwhile,

Thanks!

> 2) I'm not sure it's a good idea to add dependency on a specific AM type
> into indxpath.c. At the moment there are only two places, both referring
> to BTREE_AM_OID, do we really hard-code another OID here?
>
> I wonder if this could be generalized to another support proc in the
> inde AM API, with just GIN implementing it.

Yes, this patch was more a POC than anything, to discuss the approach
before spending too much time on infrastructure. I considered another
support function, but I'm still unclear of how useful it'd be for
custom AM (as I don't see any use for that for the vanilla one I
think), or whether if this should be opclass specific or not.

> 3) selfuncs.c is hardly the right place for gin_get_optimizable_quals,
> as it's only for functions computing selectivity estimates (and funcs
> directly related to that). And the new function is not related to that
> at all, so why not to define it in indxpath.c directly?

I kept this function in selfuncs.c as it's using some private
functions (gincost_opexpr and gincost_scalararrayopexpr) used by
gincostestimate. That seemed the simplest approach at this stage.
BTW there's also an ongoing discussion to move the (am)estimate
functions in AM-specific files [1], so that'll directly impact this
too.

> 4) The gin_get_optimizable_quals is quite misleading. Firstly, it's not
> very obvious what "optimizable" means in this context, but that's a
> minor issue. The bigger issue is that it's a lie - when there are no
> "optimizable" clauses (so all clauses would require full scan) the
> function returns the original list, which is by definition completely
> non-optimizable.

The comment is hopefully clearer about what this function does, but
definitely this name is terrible. I'll try to come up with a better
one.

[1] https://www.postgresql.org/message-id/4079.1561661677%40sss.pgh.pa.us

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2019-06-28 17:06:28 Re: SimpleLruTruncate() mutual exclusion
Previous Message Li, Zheng 2019-06-28 16:18:52 NOT IN subquery optimization