Re: BRIN cost estimate

From: Emre Hasegeli <emre(at)hasegeli(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BRIN cost estimate
Date: 2017-03-17 10:50:26
Message-ID: CAE2gYzz_skZOEOrW62juJ5ShTNQ-iVN_5CXiSZfwv24i9_=BBg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

> 1.
>
> + Assert(nnumbers == 1);
>
> I think its a bad idea to Assert() this. The stat tuple can come from
> a plugin which could do anything. Seems like if we need to be certain
> of that then it should be an elog(ERROR), maybe mention that we
> expected a 1 element array, but got <nnumbers> elements.

But it is not coming from a plugin which can do anything. It is the
plugin we know. Assert() is exact same coding with btcostestimate().

> 2.
>
> + Assert(varCorrelation >= 0 && varCorrelation <= 1);
>
> same goes for that. I don't think we want to Assert() that either.
> elog(ERROR) seems better, or perhaps we should fall back on the old
> estimation method in this case?

Again the correlation value is expected to be in this range. I don't
think we even need to bother with Assert(). Garbage in garbage out.

> The Asserted range also seems to contradict the range mentioned in
> pg_statistic.h:

We are using Abs() of the value.

> 3.
>
> + numBlocks = 1 + baserel->pages / BrinGetPagesPerRange(indexRel);
>
> should numBlocks be named numRanges? After all we call the option
> "pages_per_range".

It was named this way before.

> 4.
>
> + * correlated table is copied 4 times, the correlation would be 0.25,
> + * although the index would be almost as good as the version on the
>
> What's meant by "table is copied 4 times" ?
>
> As best as I can work out, it means.
>
> INSERT INTO t SELECT * FROM t;
> INSERT INTO t SELECT * FROM t;
>
> but I'm uncertain, and it's not very clear to me what it means.

This was exactly what I meant. I included the INSERT INTO statement
to make it more clear.

> 5.
>
> + blockSelectivity = (blockProportion + 1 - *indexCorrelation) / 2;
>
> I missed the comment that explains why we divide by two here.

To get the arithmetic mean. The paragraph explaining this had a word
missing. I fixed it.

> 6.
>
> Might want to consider not applying this estimate when the statistics
> don't contain any STATISTIC_KIND_CORRELATION array.

I am not sure what would be better than using the value as 0 in this case.

> 7.
>
> + blockProportion = (double) BrinGetPagesPerRange(indexRel) /
> + baserel->pages;
>
> Perhaps the blockProportion should also be clamped to the number of
> pages in the relation. Even a 1 page relation is estimated to have 128
> pages with the default pages_per_range, by the method used in the
> patch.

I have failed to consider the case when there are less pages than
pages_per_range. New version considers for this.

> Good news is, I'm seeing much better plans coming out in cases when
> the index is unlikely to help. So +1 for fixing this issue.

It is also useful when same columns are indexed with different access
methods. If we let HOT-updates with BRIN indexes, people can freely
spread BRIN indexes around. I have also noticed while testing again
that the patch positively effects selecting parallel bitmap index
scans.

The most doubtful part of the patch is the "(1 + logical_selectivity)
* (1 + physical_selectivity) - 1" equation I made up.
logical_selectivity is the value btree would return. I am sure BRIN
should always return something greater. physical_selectivity is the
value calculated to fix the logical_selectivity using correlation and
pages_per_range. I had no idea how to combine them together, and
picked the current one after random tests with different datasets. We
can try to make an empirical formula using the actual values, but it
would be too much tied with the tests and datasets we would use.

One of the things the patch makes worse is the case when the
selectivity is correctly estimated as 0. For example, searching for
id = 101 where ids are changing between 0 and 100. That is bad, but I
don't see a way to improve it without pushing down the job to the
opclasses and going through a lot more complication. The adjustment
would be useful when searching for id = 101 where ids are changing
between 0 and 100 or 102 and 200.

I can look into supporting expression indexes, if you think something
very much incomplete like this has a chance to get committed.

Attachment Content-Type Size
brin-correlation-v4.patch application/octet-stream 10.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2017-03-17 11:03:35 Re: Radix tree for character conversion
Previous Message Amit Khandekar 2017-03-17 10:37:16 Re: UPDATE of partition key