Re: Optimization of range queries

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimization of range queries
Date: 2018-04-09 17:05:31
Message-ID: 28d7508c-bd3d-4597-a4c6-d0b37355a137@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

12 years ago I proposed patch to which could "union" OR clauses into one
range clause if it's possible. In that time pgsql could not use IS NULL
as index clause, so patch doesn't support that

https://www.postgresql.org/message-id/flat/45742C51.9020602%40sigaev.ru

option number 4), all other are already committed.

Konstantin Knizhnik wrote:
> Hi hackers,
>
> Postgres optimizer is not able to build efficient execution plan for the
> following query:
>
> explain select * from  people_raw where not ("ID"<2068113880 AND "INN"
> is not null) and "ID"<=2068629726 AND "INN" is not null;
>                                          QUERY PLAN
> --------------------------------------------------------------------------------------------
>
>  Bitmap Heap Scan on people_raw  (cost=74937803.72..210449640.49
> rows=121521030 width=336)
>    Recheck Cond: ("ID" <= 2068629726)
>    Filter: (("INN" IS NOT NULL) AND (("ID" >= 2068113880) OR ("INN" IS
> NULL)))
>    ->  Bitmap Index Scan on "People_pkey" (cost=0.00..74907423.47
> rows=2077021718 width=0)
>          Index Cond: ("ID" <= 2068629726)
> (5 rows)
>
>
> Here the table is very large, but query effects only relatively small
> number of rows located in the range: [2068113880,2068629726]
> But unfortunately optimizer doesn't take it into the account.
> Moreover, using "is not null" and "not null" is both operands of AND is
> not smart:
>      (("INN" IS NOT NULL) AND (("ID" >= 2068113880) OR ("INN" IS NULL)))
>
> If I remove "is not null" condition, then plan is perfect:
>
> explain select * from  people_raw where not ("ID"<2068113880) and
> "ID"<=2068629726;
>                                          QUERY PLAN
> --------------------------------------------------------------------------------------------
>
>  Index Scan using "People_pkey" on people_raw  (cost=0.58..196745.57
> rows=586160 width=336)
>    Index Cond: (("ID" >= 2068113880) AND ("ID" <= 2068629726))
> (2 rows)
>
> Before starting  investigation of the problem, I will like to know
> opinion and may be some advise of people familiar with optimizer:
> how difficult will be to handle this case and where to look.
>
> Thanks in advance,
>

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2018-04-09 17:06:59 pgsql: Revert "Allow on-line enabling and disabling of data checksums"
Previous Message Robert Haas 2018-04-09 16:45:00 Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS