Re: gincostestimate and hypothetical indexes

From: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: gincostestimate and hypothetical indexes
Date: 2015-12-01 00:08:41
Message-ID: 565CE509.4070607@dalibo.com
Lists: pgsql-hackers

On 01/12/2015 00:37, Tom Lane wrote:
> Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com> writes:
>> I figured out that it's not possible to use a hypothetical gin index, as
>> the gincostestimate function tries to retrieve some statistical data from
>> the index meta page.
>
> Good point.
>
>> Attached patch fixes this. I believe this should be back-patched as was
>> a2095f7fb5a57ea1794f25d029756d9a140fd429.
>
> I don't much care for this patch though. The core problem is that just
> returning all zeroes seems quite useless: it will probably result in silly
> cost estimates. The comment in the patch claiming that this would be the
> situation in a never-vacuumed index is wrong, because ginbuild() updates
> those stats too. But I'm not sure exactly what to do instead :-(.
>

Oops, it looks like this is only true for pre-9.1 indexes (per the comment
shown below).

> Ideally we'd put it on the head of the hypothetical-index plugin to invent
> some numbers, but I dunno if we want to create such an API or not ... and
> we certainly couldn't back-patch such a change.
>
> Maybe we could do something along the lines of pretending that 90% of the
> index size given by the plugin is entry pages? Don't know what a good
> ratio would be exactly, but we could probably come up with one with a bit
> of testing.
>

I used zero values because gincostestimate already handles empty
statistics, and pretends that 100% of the pages are entry pages:

/*
 * nPendingPages can be trusted, but the other fields are as of the last
 * VACUUM. Scale them by the ratio numPages / nTotalPages to account for
 * growth since then. If the fields are zero (implying no VACUUM at all,
 * and an index created pre-9.1), assume all pages are entry pages.
 */
if (ginStats.nTotalPages == 0 || ginStats.nEntryPages == 0)
{
    numEntryPages = numPages;
    numDataPages = 0;
    numEntries = numTuples; /* bogus, but no other info available */
}

But I have no idea what a better ratio would be either.
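
To make the ratio idea concrete, here is a rough standalone sketch (not a
patch, and not the actual gincostestimate code) of how the zero-stats
fallback could take a configurable entry-page ratio instead of the
hard-wired 100%. The struct names, the function name and the entryRatio
parameter are all hypothetical; the scaling branch only follows what the
comment above describes, without the rounding a real implementation would
likely need.

```c
#include <assert.h>

/* Hypothetical stand-in for the statistics read from the GIN meta page. */
typedef struct GinStatsSketch
{
    double nTotalPages;
    double nEntryPages;
    double nDataPages;
    double nEntries;
} GinStatsSketch;

/* Hypothetical result type: the three values the excerpt above computes. */
typedef struct GinPageEstimate
{
    double numEntryPages;
    double numDataPages;
    double numEntries;
} GinPageEstimate;

/*
 * Sketch only. entryRatio is the assumed fraction of index pages that are
 * entry pages when no statistics are available: 1.0 reproduces the
 * existing fallback, 0.9 would be the ratio floated upthread.
 */
GinPageEstimate
sketch_gin_page_estimate(const GinStatsSketch *stats,
                         double numPages, double numTuples,
                         double entryRatio)
{
    GinPageEstimate est;

    assert(entryRatio >= 0.0 && entryRatio <= 1.0);

    if (stats->nTotalPages == 0 || stats->nEntryPages == 0)
    {
        /* No usable stats (e.g. hypothetical index): split by the ratio. */
        est.numEntryPages = numPages * entryRatio;
        est.numDataPages = numPages - est.numEntryPages;
        est.numEntries = numTuples; /* still bogus, no other info */
    }
    else
    {
        /* Stats as of last VACUUM: scale by numPages / nTotalPages. */
        double scale = numPages / stats->nTotalPages;

        est.numEntryPages = stats->nEntryPages * scale;
        est.numDataPages = stats->nDataPages * scale;
        est.numEntries = stats->nEntries * scale;
    }

    return est;
}
```

With all-zero stats and entryRatio = 1.0 this gives the same numbers as the
current fallback, so the only open question is still which ratio to pick.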

> regards, tom lane
>

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org
