Re: Quesion on the use of indexes

From: <gnuoytr(at)rcn(dot)com>
To: "pgsql-performance" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Quesion on the use of indexes
Date: 2010-08-17 15:27:56
Message-ID: 20100817112756.AHO63900@ms14.lnh.mail.rcn.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Here's a quote from the docs:

To combine multiple indexes, the system scans each needed index and prepares a bitmap in memory giving the locations of table rows that are reported as matching that index's conditions. The bitmaps are then ANDed and ORed together as needed by the query. Finally, the actual table rows are visited and returned. The table rows are visited in physical order, because that is how the bitmap is laid out; this means that any ordering of the original indexes is lost, and so a separate sort step will be needed if the query has an ORDER BY clause. For this reason, and because each additional index scan adds extra time, the planner will sometimes choose to use a simple index scan even though additional indexes are available that could have been used as well

So, if you have a multi-column index, supply values from the major component on down, not skipping any (but not necessarily supplying all components, leaving the tail components), and the index is clustered, then you will get the best performance on a range scan. For equality scans, who knows? For high selectivity (meaning here, few hits) of single indexes the cost of preparing the bitmaps and such may be less than traversing the multi-index and visiting the table. For non-clustered multi-column, my bet would be on the single indexes up to some small number of indexes.

And, as the docs say, the optimizer may well decide that it isn't worth the effort to use more than the most selective single index.

Robert

---- Original message ----
>Date: Tue, 17 Aug 2010 11:07:39 -0400
>From: pgsql-performance-owner(at)postgresql(dot)org (on behalf of Alvaro Herrera <alvherre(at)commandprompt(dot)com>)
>Subject: Re: [PERFORM] Quesion on the use of indexes
>To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
>Cc: Benjamin Krajmalnik <kraj(at)servoyant(dot)com>,pgsql-performance <pgsql-performance(at)postgresql(dot)org>
>
>Excerpts from Tom Lane's message of lun ago 16 23:33:29 -0400 2010:
>> "Benjamin Krajmalnik" <kraj(at)servoyant(dot)com> writes:
>> > A little background - I have various multi-column indexes whenever I
>> > have queries which restrict the output based on the values of the 2
>> > fields (for example, a client code and the date of a transaction).
>>
>> > Is there a performance gain using this approach as opposed to using 2
>> > separate indexes, one on the first column and one on the second column?
>>
>> Maybe, maybe not ... it's going to depend on a bunch of factors, one of
>> which is what your update load is like compared to the queries that read
>> the indexes. There's a bit of coverage of this in the fine manual: see
>> http://www.postgresql.org/docs/8.4/static/indexes-multicolumn.html
>> and the next few pages.
>
>Another important factor is how selective is each clause in isolation
>compared to how selective they are together. We have found that doing
>BitmapAnd of two bitmap-scanned indexes is sometimes much too slow
>compared to a two-column index. (I have yet to see a case where indexes
>beyond two columns are useful; at this point, combined bitmap indexscans
>are enough.)
>
>--
>Álvaro Herrera <alvherre(at)commandprompt(dot)com>
>The PostgreSQL Company - Command Prompt, Inc.
>PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>
>--
>Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
>To make changes to your subscription:
>http://www.postgresql.org/mailpref/pgsql-performance

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2010-08-17 15:43:33 Re: Very poor performance
Previous Message Maciek Sakrejda 2010-08-17 15:16:25 Re: Search query is curious