Re: Multicolumn index - WHERE ... ORDER BY

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Lucas Maystre <lum(at)open(dot)ch>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Multicolumn index - WHERE ... ORDER BY
Date: 2009-12-24 15:56:40
Message-ID: 530.1261670200@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Lucas Maystre <lum(at)open(dot)ch> writes:
> Example of a query I might have:
> SELECT id FROM mail WHERE from_address LIKE 'bill%'
> ORDER BY time DESC LIMIT 50;

> The solution I had in mind was to create a multicolumn index over
> 'from_address' and 'time':
> CREATE INDEX idx_from_time ON mail (from_address, time DESC);
> so that it could directly use the 'time' ordering and lookup only the
> first 50 rows using the index.

> but... it doesn't work :-) i.e. my multicolumn index is never used. So:
> - do you guys have any ideas why it doesn't work?

The from_address condition isn't simple equality, so the output of a
scan wouldn't be sorted by time --- it would have subranges that are
sorted, but that's no help overall. You still have to read the whole
scan output and re-sort. So this index has no advantage over the
smaller index on just from_address.

> - do you see an alternative solution?

There might be some use in an index on (time, from_address). That
gives the correct time ordering, and at least the prefix part of the
from_address condition can be checked in the index without visiting the
heap.

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Smith 2009-12-24 16:42:27 Re: SATA drives performance
Previous Message Greg Smith 2009-12-24 15:51:41 Re: SATA drives performance