Re: Surjective functional indexes

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Christoph Berg <myon(at)debian(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Surjective functional indexes
Date: 2017-09-14 15:53:56
Message-ID: CANP8+jJHNuBHGPzmvYi5gbTB3Ofx5nufZo5HC6N89OwptPMucg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 14 September 2017 at 16:37, Konstantin Knizhnik
<k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>
>
> On 14.09.2017 13:19, Simon Riggs wrote:

>> This works by looking at overall stats, and only looks at the overall
>> HOT %, so its too heavyweight and coarse.
>>
>> I suggested storing stat info on the relcache and was expecting you
>> would look at how often the expression evaluates to new == old. If we
>> evaluate new against old many times, then if the success rate is low
>> we should stop attempting the comparison. (<10%?)
>>
>> Another idea:
>> If we don't make a check when we should have done then we will get a
>> non-HOT update, so we waste time extra time difference between a HOT
>> and non-HOT update. If we check and fail we waste time take to perform
>> check. So the question is how expensive the check is against how
>> expensive a non-HOT update is. Could we simply say we don't bother to
>> check functions that have a cost higher than 10000? So if the user
>> doesn't want to perform the check they can just increase the cost of
>> the function above the check threshold?
>>
> Attached pleased find one more patch which calculates hot update check hit
> rate more precisely: I have to extended PgStat_StatTabEntry with two new
> fields:
> hot_update_hits and hot_update_misses.

It's not going to work, as already mentioned above. Those stats are at
table level and very little to do with this particular index.

But you've not commented on the design I mention that can work: index relcache.

> Concerning your idea to check cost of index function: it certainly makes
> sense.
> The only problems: I do not understand now how to calculate this cost.
> It can be easily calculated by optimizer when it is building query execution
> plan.
> But inside BuildIndexInfo I have just reference to Relation and have no idea
> how
> I can propagate here information about index expression cost from optimizer.

We could copy at create index, if we took that route. Or we can look
up the cost for the index expression and cache it.

Anyway, this is just jumping around because we still have a parameter
and the idea was to remove the parameter entirely by autotuning, which
I think is both useful and possible, just as HOT itself is autotuned.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-09-14 15:59:04 Re: Optimise default partition scanning while adding new partition
Previous Message Stephen Frost 2017-09-14 15:41:17 Re: Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b