Re: [HACKERS] Help/advice/suggestions on query optimizer for a large table

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: Martin Weinberg <weinberg(at)osprey(dot)phast(dot)umass(dot)edu>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Help/advice/suggestions on query optimizer for a large table
Date: 1999-05-10 16:40:40
Message-ID: 199905101640.MAA07611@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


OK, so here is the answer to this question. Thanks Thomas.

> > Anyway, if I submit a query such as:
> > select * from mytable where x=3.14 and y=6.28;
> > it takes about 3 minutes to return the record. Both x and y are indexed:
> > And "explain" on the select query above says it's doing a sequential scan.
> > However if I say:
> > select * from mytable where x='3.14'::float4 and y='6.28'::float4;
> > it takes about 3 seconds! And now "explain" says it's doing an indexed
> > scan.
> > My understanding is that the query optimizer should pick the index
> > scan for this query based on the cost.
>
> This is a known feature. The Postgres parser converts an unquoted 3.14
> to a float8, which is not the same as the float4 column you have
> indexed. And the optimizer is not (yet) bright enough to convert
> constants to the column type, and then use the available indexes.
>
> In fact, the apparently more desirable strategy is not particularly
> easy to get right. Look at this example:
>
> create table t1 (i int4);
> (insert a bunch of data)
> create index tx on t1 using btree(i);
> vacuum;
> select * from t1 where i < 3.5;
>
> In this case, we can't convert the 3.5 to an integer (3) without
> changing the comparison operator to "<=". And in your case,
> "downconverting" the float8 to a float4 probably would risk the same
> problem. So Postgres *promotes* the float4s to float8s, and has to do
> a sequential scan along the way.
>
> Anyway, afaik you have two options. The first is to surround the
> "3.14" in your example with single quotes (probably the coersion to
> float4 is unnecessary). The second is to create a function index on
> your table, to allow your queries to use float8 arguments
> transparently:
>
> create index mx on mytable using btree (float8(x));
> create index my on mytable using btree (float8(y));
>
> If you are hiding all of the queries inside an app, then I'd suggest
> the first technique. If you are going to be doing a lot of direct SQL,
> then you might want to use the second.
>
> - Tom
>
> --
> Thomas Lockhart lockhart(at)alumni(dot)caltech(dot)edu
> South Pasadena, California
>
>

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 1999-05-10 16:44:37 Re: [HACKERS] "op must return bool to be used with subquery"?
Previous Message Vadim Mikheev 1999-05-10 16:39:22 Re: [HACKERS] INSERT INTO ... SELECT eats all my memory