Re: Index scan optimization

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Index scan optimization
Date: 2014-09-22 07:05:17
Message-ID: 541FCA2D.5090508@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/22/2014 07:47 AM, Rajeev rastogi wrote:
> I have observed a scope of considerable performance improvement in-case of index by a very minor code change.
> Consider the below schema:
>
> create table tbl2(id1 int, id2 varchar(10), id3 int);
> create index idx2 on tbl2(id2, id3);
>
> Query as:
> select count(*) from tbl2 where id2>'a' and id3>990000;
>
> As per current design, it takes following steps to retrieve index tuples:
>
> 1. Find the scan start position by searching first position in BTree as per the first key condition i.e. as per id2>'a'
>
> 2. Then it fetches each tuples from position found in step-1.
>
> 3. For each tuple, it matches all scan key condition, in our example it matches both scan key condition.
>
> 4. If condition match, it returns the tuple otherwise scan stops.
>
> Now problem is here that already first scan key condition is matched to find the scan start position (Step-1), so it is obvious that any further tuple also will match the first scan key condition (as records are sorted).
> So comparison on first scan key condition again in step-3 seems to be redundant.
>
> So my proposal is to skip the condition check on the first scan key condition for every tuple.

The same happens in a single-column case. If you have a query like
"SELECT * FROM tbl2 where id2 > 'a'", once you've found the start
position of the scan, you know that all the rows that follow match too.

> I would like to submit the patch for this improvement.
> Please provide your feedback. Also let me know if I am missing something.

Yeah, sounds like a good idea. This scenario might not arise very often,
but it should be cheap to check, so I doubt it will add any measurable
overhead to the cases where the optimization doesn't help.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitriy Igrishin 2014-09-22 07:17:32 Re: libpq connection status and closed fd
Previous Message Amit Kapila 2014-09-22 06:55:15 Re: Scaling shared buffer eviction