Re: PostgreSQL using the wrong Index

From: Alex Stapleton <alexs(at)advfn(dot)com>
To: John A Meinel <john(at)arbash-meinel(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL using the wrong Index
Date: 2005-06-13 15:20:19
Message-ID: A07C7E6A-6E84-476A-AEA3-1FE9344870BF@advfn.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


On 13 Jun 2005, at 15:47, John A Meinel wrote:

> Alex Stapleton wrote:
>
>
>> Oh, we are running 7.4.2 btw. And our random_page_cost = 1
>>
>>
> Which is only correct if your entire db fits into memory. Also, try
> updating to a later 7.4 version if at all possible.
>

I am aware of this, I didn't configure this machine though
unfortuantely.

>> On 13 Jun 2005, at 14:02, Alex Stapleton wrote:
>>
>>
>>> We have two index's like so
>>>
>>> l1_historical=# \d "N_intra_time_idx"
>>> Index "N_intra_time_idx"
>>> Column | Type
>>> --------+-----------------------------
>>> time | timestamp without time zone
>>> btree
>>>
>>>
> Just so you are aware, writing this as: "We have an index on
> N_intra(time) and one on N_Intra(symbol, time)" is a lot more
> succinct.
>

Sorry, I happened to have them there in my clipboard at the time so I
just blindly pasted them in.

>>>
>>> l1_historical=# \d "N_intra_pkey"
>>> Index "N_intra_pkey"
>>> Column | Type
>>> --------+-----------------------------
>>> symbol | text
>>> time | timestamp without time zone
>>> unique btree (primary key)
>>>
>>> and on queries like this
>>>
>>> select * from "N_intra" where symbol='SOMETHING WHICH DOESNT EXIST'
>>> order by time desc limit 1;
>>>
>>> PostgreSQL takes a very long time to complete, as it effectively
>>> scans the entire table, backwards. And the table is huge, about 450
>>> million rows. (btw, there are no triggers or any other exciting
>>> things like that on our tables in this db.)
>>>
>>> but on things where the symbol does exist in the table, it's
>>> more or
>>> less fine, and nice and fast.
>>>
>>
>>
> What happens if you do:
> SELECT * FROM "N_intra" WHERE symbol='doesnt exist' ORDER BY symbol,
> time DESC LIMIT 1;

Hurrah! I should of thought of this, considering i've done it in the
past :) Thanks a lot, that's great.

> Yes, symbol is constant, but it frequently helps the planner
> realize it
> can use an index scan if you include all terms in the index in the
> ORDER
> BY clause.

>
>>>
>>> Whilst the option the planner has taken might be faster most of the
>>> time, the worst case scenario is unacceptable for obvious reasons.
>>> I've googled for trying to force the use of a specific index, but
>>> can't find anything relevant. Does anyone have any suggestions on
>>> getting it to use an index which hopefully will have better worst
>>> case performance?
>>>
>>
>>
> Try the above first. You could also create a new index on symbol
> CREATE INDEX "N_intra_symbol_idx" ON "N_intra"(symbol);
>
> Then the WHERE clause should use the symbol index, which means it can
> know quickly that an entry doesn't exist. I'm not sure how many
> entries
> you have per symbol, though, so this might cause problems in the ORDER
> BY time portion.
>
> I'm guessing what you really want is to just do the ORDER BY
> symbol, time.
>
> John
> =:->
>
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Wei Weng 2005-06-13 15:47:30 Re: PostgreSQL using the wrong Index
Previous Message Tom Lane 2005-06-13 14:54:52 Re: PostgreSQL using the wrong Index