Re: [HACKERS] Surjective functional indexes

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Christoph Berg <myon(at)debian(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Surjective functional indexes
Date: 2018-01-18 10:08:53
Message-ID: CANP8+jJoQCGXWdhHjvGxu2J5A2hDOWo6xpZ0NseV8YQypsqn_g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 18 January 2018 at 08:59, Konstantin Knizhnik
<k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>
>
> On 18.01.2018 11:38, Simon Riggs wrote:
>>
>> On 10 January 2018 at 09:54, Konstantin Knizhnik
>> <k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>>
>>> Sorry, issue with documentation is fixed.
>>
>> OK, thanks.
>>
>> Patch appears to work cleanly now.
>>
>> I'm wondering now about automatically inferring "recheck_on_update =
>> true" for certain common datatype/operators. It doesn't need to be an
>> exhaustive list, but it would be useful if we detected the main use
>> case of
>>
>> (JSONB datatype column)->>CONSTANT
>>
>> Seems like we could do a test to see if the index function is
>> FUNCTION(COLUMNNAME, CONSTANTs...)
>> {JSONB, ->>} or
>> {jsonb_object_field_text(Columnname, Constant)}
>> {substring(Columname, Constants...)}
>>
>> It would be a shame if people had to remember to use this for the
>> common and obvious cases.
>>
> Right now by default index is considered as projective. So even if you do
> not specify "recheck_on_update" option, then recheck will be done.

That's good

> This decision is based on the assumption that most of functional indexes are
> actually projective and looks likes (JSONB datatype column)->>CONSTANT.
> So do you think that this assumption is not correct and we should switch
> disable recheck_on_update by default?

No thanks

> If not, then there is an opposite challenge: find out class of functions
> which definitely are not projective and recheck on them will have no sense.

If there are some.

Projective is not quite correct, since sin((col->>'angle')::numeric))
could stay same but the result is not a subset of the input.

I think it would be better to avoid the use of mathematical terms and
keep the description simple

"If the indexed value depends upon only a subset of the data, it is
possible that the function value will remain constant after an UPDATE
that changes the non-indexed data.
e.g.

If a column is updated from '/some/url/before' to '/some/url/after'
then the value of substing(col, 1, 5) will not change when updated

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2018-01-18 10:20:09 Re: Index-only scan returns incorrect results when using a composite GIST index with a gist_trgm_ops column.
Previous Message Amit Langote 2018-01-18 09:54:18 Re: [Sender Address Forgery]Re: pg_(total_)relation_size and partitioned tables