Re: indexed column not working as fast as expected

From: Amir Zicherman <amir(dot)zicherman(at)gmail(dot)com>
To: "Gregory S(dot) Williamson" <gsw(at)globexplorer(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: indexed column not working as fast as expected
Date: 2004-08-18 03:04:42
Message-ID: 27a5b7d10408172004752df544@mail.gmail.com
Lists: pgsql-general

Thanks for the advice, guys. I didn't know you could run an EXPLAIN. I'll
look into what that gives me.

Greg, I still don't see why a hash index is not ideal for this situation.
I'm actually looking to do a select with a limit of X, so it should
just go into the hash bucket for the value I want and get the first
X rows from it. In the case of doing a select * on the value
that only appears in 5 rows of the table, a hash should be really
fast. Do I need to vacuum pretty often to make sure my index is
working OK?
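
To make the "go into the bucket and take the first X rows" idea concrete, here is a toy sketch in Python. It is only an illustration of the lookup pattern described above, not how PostgreSQL implements hash indexes; the helper names (`build_hash_index`, `select_with_limit`) and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import islice

def build_hash_index(rows, column):
    """Map each distinct column value to the positions of the rows holding it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return index

def select_with_limit(rows, index, value, limit):
    """Return at most `limit` rows whose indexed column equals `value`."""
    positions = index.get(value, [])
    return [rows[p] for p in islice(positions, limit)]

# Hypothetical stand-in for table1: col1 is mostly 1..3, with five rows set to 4.
table1 = [{"id": i, "col1": 1 + i % 3} for i in range(1000)]
for i in (10, 200, 400, 600, 800):
    table1[i]["col1"] = 4

col1_index = build_hash_index(table1, "col1")
rare = select_with_limit(table1, col1_index, 4, limit=3)
```

In this toy model the lookup touches only the bucket for the requested value, which is the behaviour being hoped for. Whether a real database actually takes that path depends on the planner's estimates, which is what EXPLAIN would show.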

Thanks for the help, Amir

On Tue, 17 Aug 2004 16:56:08 -0700, Gregory S. Williamson
<gsw(at)globexplorer(dot)com> wrote:
> Amir,
>
> The index lacks much specificity, so it probably won't help very much at all. Ideally an indexed column has a wide range of values to be useful.
>
> 1000000 rows with one value --> all rows are in the same "bucket"
> 1000000 rows with 2 values --> if evenly split, 500000 in each division; if not, you might have 10 in one and 999990 in the other. Hence, an index on a boolean column would be of little use ...
>
> I would suspect that in your case a query against the value with only 5 rows might be fast, as the planner would use the index. If the planner sees that it needs 5000000 rows of data, it's not going to use the index, since that would greatly increase the amount of work needed (e.g. get the index value, then get the real data, instead of simply getting data in sequential reads and discarding the non-interesting data).
>
> HTH clarify things, although it's not much help in speeding up your queries ...
>
> Greg Williamson
> DBA
> GlobeXplorer LLC
>
> -----Original Message-----
> From: Amir Zicherman [mailto:amir(dot)zicherman(at)gmail(dot)com]
> Sent: Tue 8/17/2004 4:24 PM
> To: pgsql-general(at)postgresql(dot)org
> Cc:
> Subject: [GENERAL] indexed column not working as fast as expected
> Hi,
>
> I have a btree index on col1 in table1. The column has only the values
> 1, 2, 3, or 4. 4 does not appear much in the table (only 5 times).
> There are about 20 million rows in the table. When I do a "select *
> from table1 where col1=4" it takes a very long time to get back to me
> (around 4 minutes). Why is it taking so long if I have an index on
> it? I also tried this with a hash index and it was still slow.
>
> Thanks, Amir
>
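
Greg's point about when the planner would prefer the index can be sketched as a rough page-read comparison. This is a deliberately simplified model under stated assumptions, not PostgreSQL's actual cost model: it assumes 100 rows per page, one random page read per matching row on the index path, and ignores the index pages themselves. The function names are hypothetical.

```python
def estimated_page_reads(total_rows, matching_rows, rows_per_page=100):
    """Very rough page-read estimates for the two access paths.

    Assumptions (not PostgreSQL's real cost model): every matching row
    reached through the index costs one random page read, and a
    sequential scan reads every table page exactly once.
    """
    seq_scan = -(-total_rows // rows_per_page)  # ceiling division
    index_scan = matching_rows
    return seq_scan, index_scan

def cheaper_path(total_rows, matching_rows, rows_per_page=100):
    """Name the access path with the lower estimated page-read count."""
    seq_scan, index_scan = estimated_page_reads(
        total_rows, matching_rows, rows_per_page
    )
    return "index scan" if index_scan < seq_scan else "sequential scan"

TOTAL = 20_000_000  # table size mentioned in the original question
print(cheaper_path(TOTAL, 5))          # the rare value (col1 = 4)
print(cheaper_path(TOTAL, 5_000_000))  # a common value
```

Under these assumptions the rare value favours the index and the common value favours a sequential scan, which matches the reasoning in the quoted reply. It does not explain a 4-minute query for 5 rows; that suggests the index was not being used at all, which EXPLAIN would confirm.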
