Re: BRIN cost estimate breaks geometric indexes

From: Darafei "Komяpa" Praliaskouski <me(at)komzpa(dot)net>
To: Egor Rogov <e(dot)rogov(at)postgrespro(dot)ru>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BRIN cost estimate breaks geometric indexes
Date: 2020-02-14 15:20:50
Message-ID: CAC8Q8t+g2XV1o6iXAr9_ySyqfJrvW_mWo+VXTPtRbQHnHbfymQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Patch may look as simple as this one:
https://patch-diff.githubusercontent.com/raw/postgres/postgres/pull/49.diff

Previous mention in -hackers is available at
https://postgrespro.com/list/id/CAKJS1f9n-Wapop5Xz1dtGdpdqmzeGqQK4sV2MK-zZugfC14Xtw(at)mail(dot)gmail(dot)com
-
seems everyone overlooked that patch there breaks geometric indexing back
then.

On Tue, Jan 21, 2020 at 2:07 AM Egor Rogov <e(dot)rogov(at)postgrespro(dot)ru> wrote:

> On 21.01.2020 0:00, Darafei "Komяpa" Praliaskouski wrote:
> > Hi,
> >
> > Found out today that BRIN indexes don't really work for PostGIS and
> > box datatypes.
> >
> > Since
> >
> https://github.com/postgres/postgres/commit/7e534adcdc70866e7be74d626b0ed067c890a251 Postgres
>
> > requires datatype to provide correlation statistics. Such statistics
> > wasn't provided by PostGIS and box types.
> >
> > Today I tried to replace a 200gb gist index with 8mb brin index and
> > queries didn't work as expected - it was never used. set
> > enable_seqscan=off helped for a bit but that's not a permanent solution.
> > Plans for context:
> > https://gist.github.com/Komzpa/2cd396ec9b65e2c93341e9934d974826
> >
> > Debugging session on #postgis IRC channel leads to this ticket to
> > create a (not that meaningful) correlation statistics for geometry
> > datatype: https://trac.osgeo.org/postgis/ticket/4625#ticket
> >
> > Postgres Professional mentioned symptoms of the issue in their
> > in-depth manual:
> > https://habr.com/ru/company/postgrespro/blog/346460/ - box datatype
> > showed same unusable BRIN symptoms for them.
>
>
> (Translated to English:
> https://habr.com/en/company/postgrespro/blog/452900/)
>
>
> > A reasonable course of action on Postgres side seems to be to not
> > assume selectivity of 1 in absence of correlation statistics, but
> > something that would prefer such an index to a parallel seq scan, but
> > higher than similar GIST.
> >
> > Any other ideas?
>
>
> As far as I understand, correlation is computed only for sortable types,
> which means that the current concept of correlation works as intended
> only for B-tree indexes.
>
> Ideally, correlation should be computed for (attribute, index) pair,
> taking into account order of values returned by the index scan. Less
> ideal but more easier approach can be to ignore the computed correlation
> for any index access except B-tree, and just assume some predefined
> constant.
>
>
>
>
>

--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-02-14 15:28:29 Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Previous Message Alvaro Herrera 2020-02-14 15:13:01 Re: assert pg_class.relnatts is consistent